I've committed some changes to CVS: on the page for a series
episode, the lists of people that follow the "Series Crew" label are
now stored under "series DUTY" keywords.
E.g.: an episode can have a "writer" keyword for the writers who
definitely worked on that specific episode, and a "series writer"
keyword for the writers who have worked regularly on the series (for
whom it's not yet clear whether they worked on the given episode).
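
To make the distinction concrete, here is a minimal sketch of how the
two keywords might be consumed. The names and the plain-dict shape are
hypothetical, used only for illustration; real data comes from the
parsed episode page.

```python
# Hypothetical episode data: placeholder names, plain dict for illustration.
episode_data = {
    'writer': ['Writer A'],                     # confirmed for this episode
    'series writer': ['Writer A', 'Writer B'],  # regular writers of the series
}

def split_writers(data):
    """Return (confirmed, regular_only) lists of writers.

    confirmed: people under the "writer" keyword.
    regular_only: people under "series writer" who are not already confirmed.
    """
    confirmed = list(data.get('writer', []))
    regular_only = [name for name in data.get('series writer', [])
                    if name not in confirmed]
    return confirmed, regular_only

confirmed, regular_only = split_writers(episode_data)
```

Here "Writer B" ends up only in regular_only: a regular writer of the
series whose involvement in this episode is not confirmed.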

On the technical side, I've added the preprocess_dom method to the
DOMParserBase class: the DOM is now created, passed to
preprocess_dom for (optional) modification/handling, and then
handed on to the usual parse_dom method.
In short: I've introduced a new step between preprocess_string
and parse_dom.
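
A rough sketch of the resulting flow, not the actual DOMParserBase
code. It uses the standard library's xml.etree.ElementTree as a
stand-in for the real DOM, and the class names, the get_dom and parse
methods and the "seen" attribute are assumptions made up for the
example; only preprocess_string, preprocess_dom and parse_dom are
names from the real code.

```python
import xml.etree.ElementTree as ET

class SketchParserBase:
    """Illustrative stand-in for a DOM parser base class."""

    def preprocess_string(self, html_string):
        # String-level clean-up; does nothing by default.
        return html_string

    def get_dom(self, html_string):
        # Assumption: build the DOM with ElementTree for the sake of the sketch.
        return ET.fromstring(html_string)

    def preprocess_dom(self, dom):
        # The new, optional step: subclasses may modify the DOM here.
        return dom

    def parse_dom(self, dom):
        raise NotImplementedError

    def parse(self, html_string):
        html_string = self.preprocess_string(html_string)
        dom = self.get_dom(html_string)
        dom = self.preprocess_dom(dom)  # runs between DOM creation and parse_dom
        return self.parse_dom(dom)

class SketchLabelParser(SketchParserBase):
    def preprocess_dom(self, dom):
        # Tag every <b> node so that parse_dom can recognize it later.
        for node in dom.iter('b'):
            node.set('seen', 'yes')
        return dom

    def parse_dom(self, dom):
        return [b.text for b in dom.iter('b') if b.get('seen') == 'yes']

labels = SketchLabelParser().parse(
    '<div><b>Writer</b><i>x</i><b>Director</b></div>')
```

A subclass that doesn't override preprocess_dom gets the DOM passed
through unchanged, so existing parsers should not be affected.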

Another, minor, change is the addition of the getattribute and
setattribute functions to lxmladapter and bsoupadapter: they get
and set an attribute of a given node (an HTML tag, in short).
They are used by preprocess_dom, as overridden in the
DOMHTMLMovieParser class.
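
For illustration, a minimal sketch of what such a pair of adapter
functions could look like. The signatures are my assumption (check
lxmladapter and bsoupadapter for the real ones), and
xml.etree.ElementTree is used here only as a stand-in backend.

```python
import xml.etree.ElementTree as ET

def getattribute(node, attr_name):
    """Return the value of attr_name on node, or None if it is missing."""
    return node.get(attr_name)

def setattribute(node, attr_name, attr_value):
    """Set attr_name to attr_value on node."""
    node.set(attr_name, attr_value)

# Hypothetical fragment: the markup and the class name are invented.
dom = ET.fromstring('<div><a href="/name/x">Someone</a></div>')
link = dom.find('a')
setattribute(link, 'class', 'series-crew')
```

Keeping these behind adapter functions means that code like
preprocess_dom doesn't need to know which backend built the node.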

The next step is to move the gather_refs method (and, if possible,
_fix_rowspans too) to _after_ the DOM creation, so that
there's no need to parse the HTML more than once.
For _fix_rowspans it will probably be necessary to write some more
"adapter functions" to add/replace specific nodes, but I don't
think that would be too difficult.


Enjoy,
-- 
Davide Alberani <[EMAIL PROTECTED]> [PGP KeyID: 0x465BFD47]
http://erlug.linux.it/~da/

_______________________________________________
Imdbpy-devel mailing list
Imdbpy-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/imdbpy-devel