Davide Alberani wrote:
> Maybe a good candidate is the movieParser.HTMLOfficialsitesParser
> class: it handles 7 different pages for movies (and another one
> for persons); the only thing that changes, is the key in the
> returned dictionary.
> E.g.:
> {'official sites': [list, of, official, sites]}
> {'external reviews': [list, of, external, reviews]}
> ...
>
I've implemented a parser for this one too. It was interesting
because I had to add a feature to the section specification so that
it can access instance attributes.
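To illustrate the idea (this is only a rough sketch, not the code in CVS; the class name, 'section_spec' and 'key_from_attr' are made up for the example): the specification names an attribute instead of hard-coding the key, and the parser reads that attribute from the instance when it builds the result.

```python
class LinkListParser:
    """Hypothetical stand-in for one parser that serves several similar pages."""

    # Assumption: the spec stores the NAME of an instance attribute,
    # not a literal dictionary key.
    section_spec = {'key_from_attr': 'kind'}

    def __init__(self, kind):
        # 'kind' differs per page, e.g. 'official sites' or 'external reviews'.
        self.kind = kind

    def parse(self, links):
        # Resolve the key at parse time by looking it up on the instance.
        key = getattr(self, self.section_spec['key_from_attr'])
        return {key: list(links)}


official = LinkListParser('official sites')
reviews = LinkListParser('external reviews')
```

So the same class can return {'official sites': [...]} for one page and {'external reviews': [...]} for another; only the attribute value changes.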
I did some more work on the parsers today; it's all in CVS. One
important change is naming the fields of a composite attribute (like
birthdate=birthday+birthyear). When writing the postprocessor, I kept
forgetting which element in the tuple corresponded to which piece of
information. I think the code is more readable now.
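A small sketch of what named fields buy in the postprocessor (again only an illustration: 'extract_composite', the sample record and its keys are invented, and plain dictionary keys stand in for the real paths):

```python
def extract_composite(record, paths, postprocess):
    """Collect the named sub-fields, then hand them to the postprocessor."""
    # Each sub-field gets a name, so the postprocessor does not need to
    # remember positions in a tuple.
    fields = {name: record.get(path) for name, path in paths.items()}
    return postprocess(fields)


# Made-up sample data, only for the illustration.
sample_record = {'day': '7 June', 'year': '1952'}

birthdate = extract_composite(
    record=sample_record,
    paths={'birthday': 'day', 'birthyear': 'year'},
    # With a tuple this would be f[0] and f[1]; with names it reads itself.
    postprocess=lambda f: "%s %s" % (f['birthday'], f['birthyear']),
)
```

Here birthdate ends up as the day and year joined by a space, and the postprocessor refers to 'birthday' and 'birthyear' by name instead of by index.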
Maybe the most important modification is that I've renamed the infamous
'elem' to 'path' :-) I figured, if attributes and extractors can both
have 'postprocess', they can also both have 'path'. I'll change it back
if you disagree.
Turgut
_______________________________________________
Imdbpy-devel mailing list
https://lists.sourceforge.net/lists/listinfo/imdbpy-devel