On Jun 26, "H. Turgut Uyar" <[EMAIL PROTECTED]> wrote:

> I did some minor cleanup and simplification. I think my changes to
> the original branch are now minimal and the current state could be
> the base for a new parser class.

I'm committing my changes.

Basically I've moved your _paths structure to "extractors", a list/tuple of Extractor instances, each of which in turn contains a list of Attribute instances. The design is very close to yours and may be a bit more verbose, but in the long term it can be more readable - I hope.

I've slightly modified the parse_dom method, adding some minor features (they are absolutely untested - some are still unused by the code!). I hope you won't find it a complete mess. :-)

Basically, now, the parse method calls a set of other methods (including parse_dom), so that subclasses can modify the output where they need to. If something is not clear, ask (I wrote the code in very little time).

Every name/structure can still be changed: if you have other ideas and/or better names for classes and methods, now is the time to make these changes.

Many things are not handled, like name/title references (but the add_refs method is there).

I've removed the "result" parameter: it was too prone to side effects. Now parse_dom always returns a dictionary; other methods, called later, can return whatever they want.

In general, I'm amazed by the amount of code saved by this approach. Just incredible. :-)

Obviously there are still many things to do: error handling, for one (and checking that everything is unicode, and managing things like numeric values, and taking care of html/xml references, and so on...)

> > Thank you (and good luck for the match against Germany ;-)
>
> Thanks :-) It didn't turn out as we hoped it would but it was an
> entertaining game after all.

I've seen it; great match. After the first half of Italy-Spain, I had to put needles under my nails to keep myself awake... ;-)

-- 
Davide Alberani <[EMAIL PROTECTED]> [PGP KeyID: 0x465BFD47]
http://erlug.linux.it/~da/
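
For readers following the thread, a minimal sketch of the layout described above may help. It is not the committed code: only the names Extractor, Attribute, extractors, parse, parse_dom and add_refs come from the message. The constructor arguments (label, path, attrs, key, postprocess), the get_dom and postprocess_data hooks, the MovieParser subclass, the sample page, and the use of xml.etree.ElementTree as the DOM are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


class Attribute:
    """One value to pull out of a matched element (fields are hypothetical)."""

    def __init__(self, key, path, postprocess=None):
        self.key = key                  # key used in the resulting dictionary
        self.path = path                # path relative to the extractor's element
        self.postprocess = postprocess  # optional callable applied to the text


class Extractor:
    """Groups the attributes that share one starting point in the document."""

    def __init__(self, label, path, attrs):
        self.label = label
        self.path = path
        self.attrs = attrs


class SketchParser:
    """Hypothetical base class: parse() chains small, overridable methods."""

    extractors = ()

    def parse(self, text):
        dom = self.get_dom(text)
        data = self.parse_dom(dom)          # always a dictionary
        data = self.postprocess_data(data)  # later steps may return anything
        return self.add_refs(data)

    def get_dom(self, text):
        # Assumption: well-formed markup, parsed with the standard library.
        return ET.fromstring(text)

    def parse_dom(self, dom):
        result = {}
        for extractor in self.extractors:
            for element in dom.findall(extractor.path):
                for attr in extractor.attrs:
                    value = element.findtext(attr.path)
                    if value is None:
                        continue  # attribute not present in this element
                    value = value.strip()
                    if attr.postprocess is not None:
                        value = attr.postprocess(value)
                    result[attr.key] = value
        return result

    def postprocess_data(self, data):
        return data  # hook for subclasses

    def add_refs(self, data):
        return data  # name/title references are not handled in this sketch


class MovieParser(SketchParser):
    """Hypothetical subclass: declares what to extract, tweaks the output."""

    extractors = (
        Extractor(label='header', path='.//div[@id="header"]',
                  attrs=[Attribute(key='title', path='h1'),
                         Attribute(key='year', path='span', postprocess=int)]),
    )

    def postprocess_data(self, data):
        return {'data': data}


SAMPLE_PAGE = ('<html><body><div id="header"><h1> Example Movie </h1>'
               '<span>2008</span></div></body></html>')

print(MovieParser().parse(SAMPLE_PAGE))
# {'data': {'title': 'Example Movie', 'year': 2008}}
```

The point of the sketch is the division of labour: a subclass only declares its extractors and overrides the hooks it cares about, while parse_dom stays free of side effects because it builds and returns its own dictionary instead of filling a "result" argument.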
_______________________________________________
Imdbpy-devel mailing list
Imdbpy-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/imdbpy-devel