Davide Alberani wrote:
> Maybe a good candidate is the movieParser.HTMLOfficialsitesParser
> class: it handles 7 different pages for movies (and another one
> for persons); the only thing that changes, is the key in the
> returned dictionary.
> E.g.:
>   {'official sites': [list, of, official, sites]}
>   {'external reviews': [list, of, external, reviews]}
>   ...
>
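For illustration only, the pattern described in the quote could be sketched roughly as follows. The class name `LinkListParser`, the `kind` attribute and the regex-based extraction are all hypothetical and are not IMDbPY's actual code; the only point is that one class can serve several pages when the dictionary key comes from an instance attribute.

```python
import re


class LinkListParser:
    """Hypothetical parser: one class for many pages; only the key differs."""

    # Naive pattern, for illustration; a real parser would use an HTML library.
    _link_re = re.compile(r'<a href="([^"]+)">([^<]*)</a>')

    def __init__(self, kind):
        # 'kind' is an assumed instance attribute, e.g. 'official sites'.
        self.kind = kind

    def parse(self, html):
        # Collect (text, url) pairs and file them under the instance's key.
        links = [(text, url) for url, text in self._link_re.findall(html)]
        return {self.kind: links}


# The same class, configured per page type.
official = LinkListParser('official sites')
reviews = LinkListParser('external reviews')
```

Both instances share the extraction logic; only the key in the returned dictionary changes.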
I've implemented a parser for this one too. It has been interesting
because I had to add a feature to the section specification to access
instance attributes.

I did some more work on the parsers today; they're all in CVS. An
important change is naming the fields of a composite attribute (like
birthdate=birthday+birthyear). When writing the postprocessor, I kept
forgetting which element in the tuple corresponded to which piece of
information. I think the code is more readable now.

Maybe the most important modification is that I've renamed the
infamous 'elem' to 'path' :-) I figured, if attributes and extractors
can both have 'postprocess', they can also both have 'path'. I'll
change it back if you disagree.

Turgut
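
To illustrate the field-naming point, here is a minimal sketch that uses a plain `namedtuple` rather than the actual parser classes in CVS. The names `BirthDate`, `postprocess_positional` and `postprocess_named` are hypothetical; only the `birthday`/`birthyear` split comes from the message above.

```python
from collections import namedtuple

# Hypothetical composite attribute: birthdate = birthday + birthyear.
BirthDate = namedtuple('BirthDate', ['birthday', 'birthyear'])


def postprocess_positional(parts):
    # Positional access: easy to forget which index holds which piece.
    return '%s %s' % (parts[0], parts[1])


def postprocess_named(parts):
    # Named access: the field names document themselves.
    return '%s %s' % (parts.birthday, parts.birthyear)


raw = BirthDate(birthday='3 June', birthyear='1970')
```

Both functions produce the same string, but the named version does not rely on remembering the tuple order.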