Davide Alberani wrote:
> Maybe a good candidate is the movieParser.HTMLOfficialsitesParser
> class: it handles 7 different pages for movies (and another one
> for persons); the only thing that changes, is the key in the
> returned dictionary.
> E.g.:
>   {'official sites': [list, of, official, sites]}
>   {'external reviews': [list, of, external, reviews]}
>   ...
> 

I've implemented a parser for this one too. It was interesting 
because I had to add a feature to the section specification so that 
it can access instance attributes.
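
Roughly, the idea looks like the sketch below. This is a simplified illustration, not the code that is in CVS: the class name OfficialSitesSketch, the 'kind' attribute and the parse_links method are all made up here. The only point is that the dictionary key is read from the instance instead of being hard-coded in the parser.

```python
import re


class OfficialSitesSketch:
    """Hypothetical stand-in, NOT the real movieParser.HTMLOfficialsitesParser."""

    # Naive link matcher, good enough for this illustration only.
    _link_re = re.compile(r'<a href="([^"]+)">([^<]*)</a>')

    def __init__(self, kind):
        # 'kind' is the (assumed) instance attribute that selects the key,
        # e.g. 'official sites' or 'external reviews'.
        self.kind = kind

    def parse_links(self, html):
        # Collect (link text, URL) pairs and file them under self.kind.
        links = [(text, url) for url, text in self._link_re.findall(html)]
        return {self.kind: links}


page = '<li><a href="http://example.com/">Example</a></li>'
official = OfficialSitesSketch('official sites').parse_links(page)
reviews = OfficialSitesSketch('external reviews').parse_links(page)
```

The same parsing logic serves both pages; only the key of the returned dictionary differs.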

I did some more work on the parsers today; the changes are all in CVS. 
An important one is naming the fields of a composite attribute (like 
birthdate=birthday+birthyear). When writing the postprocessor, I kept 
forgetting which element in the tuple corresponded to which piece of 
information. I think the code is more readable now.
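
To show the difference, here is a small sketch of a postprocessor written both ways. The function names and the sample values are invented for the example, and the real postprocessors in CVS may receive their data in a different shape; only the field names birthday and birthyear come from the case above.

```python
def postprocess_by_position(parts):
    # Positional version: the reader has to remember that parts[0] is the
    # birthday and parts[1] is the birth year.
    return '%s %s' % (parts[0], parts[1])


def postprocess_by_name(parts):
    # Named version: each piece of the composite attribute is looked up
    # by the name of its field.
    return '%s %s' % (parts['birthday'], parts['birthyear'])


by_position = postprocess_by_position(('24 June', '1947'))
by_name = postprocess_by_name({'birthday': '24 June', 'birthyear': '1947'})
```

Both calls build the same string, but the second one does not depend on remembering the order of the fields.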

Maybe the most important modification is that I've renamed the infamous 
'elem' to 'path' :-) I figured, if attributes and extractors can both 
have 'postprocess', they can also both have 'path'. I'll change it back 
if you disagree.
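
As a sketch of what the symmetry looks like, assuming heavily simplified classes (the constructor signatures, the 'key', 'label' and 'attrs' parameters and the sample paths below are hypothetical and do not mirror the actual code in CVS):

```python
class Attribute:
    """Hypothetical: a single value located by 'path', optionally cleaned up."""

    def __init__(self, key, path, postprocess=None):
        self.key = key
        self.path = path
        self.postprocess = postprocess


class Extractor:
    """Hypothetical: a group of attributes located under a common 'path'."""

    def __init__(self, label, path, attrs, postprocess=None):
        self.label = label
        self.path = path
        self.attrs = attrs
        self.postprocess = postprocess


# Both objects now use the same two names: 'path' and 'postprocess'.
title = Attribute(key='title', path='./a/text()', postprocess=str.strip)
links = Extractor(label='links', path='//li', attrs=[title])
```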

Turgut


_______________________________________________
Imdbpy-devel mailing list
Imdbpy-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/imdbpy-devel
