Re: [Imdbpy-devel] DOM Parser

Davide Alberani Tue, 02 Sep 2008 12:05:49 -0700

On Sep 02, James Rubino <[EMAIL PROTECTED]> wrote:

> What is the DOM parser you are working on?


To tell the truth, it's mostly H. Turgut Uyar who is working
on it... :-)

Anyway, it's a new approach to parsing the web pages of the IMDb.com
site (it has nothing to do with the local/sql data access systems).

Basically, the new parsers see the pages as a structured document (a
hierarchy of nested tags), while the old parsers just iterated over
one tag at a time, without seeing the document as "a whole".

It should simplify the task of writing the many, many required
parsers and keeping them up to date.
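
To make the difference concrete, here is a minimal sketch in Python using
only the standard library. It is not the IMDbPY code: the PAGE snippet, the
class and the function names are all made up for illustration, and real
IMDb pages are neither this small nor guaranteed to be well-formed. The
first parser reacts to one tag at a time and has to remember where it is
with flags; the second builds the whole tree and then asks for the nodes
by their position in the hierarchy.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical, well-formed snippet; not the markup of a real IMDb page.
PAGE = """
<html><body>
  <div id="title"><h1>Example Movie</h1></div>
  <div id="cast">
    <ul>
      <li><a href="/name/nm0000001/">First Actor</a></li>
      <li><a href="/name/nm0000002/">Second Actor</a></li>
    </ul>
  </div>
</body></html>
"""


class CastEventParser(HTMLParser):
    """Tag-at-a-time style: track the current position with manual flags."""

    def __init__(self):
        super().__init__()
        self.in_cast = False
        self.in_link = False
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag == "div" and dict(attrs).get("id") == "cast":
            self.in_cast = True
        elif tag == "a" and self.in_cast:
            self.in_link = True

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_link = False
        elif tag == "div":
            self.in_cast = False

    def handle_data(self, data):
        # Only text found inside a link inside the cast block is kept.
        if self.in_link:
            self.names.append(data.strip())


def cast_from_events(page):
    """Collect cast names by reacting to each tag as it goes by."""
    parser = CastEventParser()
    parser.feed(page)
    return parser.names


def cast_from_tree(page):
    """Collect cast names by building the tree and querying its structure."""
    root = ET.fromstring(page.strip())
    links = root.findall(".//div[@id='cast']/ul/li/a")
    return [a.text.strip() for a in links]
```

Both functions return the same list for this snippet, but the tree version
states what it wants in a single path expression, which is the part that
tends to be easier to fix when the site changes its layout.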

If you have the cvs command, you can check it out with:
  cvs -z3 -d:ext:[EMAIL PROTECTED]:/cvsroot/imdbpy co -P -r dom -d imdbpy.dom imdbpy

The above command will create the "imdbpy.dom" directory in your
working directory.


> (I have the mysql bugs ironed out)

In the sense that it's working now, or that you just gave up? :-)


-- 
Davide Alberani <[EMAIL PROTECTED]> [PGP KeyID: 0x465BFD47]
http://erlug.linux.it/~da/
