On Jul 03, "H. Turgut Uyar" <[EMAIL PROTECTED]> wrote:

> Yes, this one's tricky.

My first try is in the CVS. Maybe I've oversimplified it, but it works.

I think that our first priority now is to settle the functionality of the
parse_dom() method and the scheme of the "extractors" attribute.
Unfortunately I'm still not confident enough in my abilities with
DOM/XPath, so I'm a bit confused.

Keeping an eye on the features we need, extractors can be used in too many
different ways: "attribute.key" behaves in one way if it's None and in
another way if it's a string (my fault), "path" can be a list or a
dictionary, "section" can interfere with "attribute.key", and so on...

Obviously I'm not saying these are bad things: these features are
_fundamental_ and must stay. What we need is a cleaner usage schema: it
should be clearer (or at least documented) that if you just want to
extract the text from a list of <li> tags inside an <ol> tag, you must
write "extractors" in a given way. On the other hand, if you have a
complex data structure inside a <div> tag, "extractors" must be written
in another way.

Generally speaking, the type of the returned items should also be clearer:
a list, a string, another dictionary...

Maybe the code is already good enough and I just have to grasp it. :-)

-- 
Davide Alberani <[EMAIL PROTECTED]> [PGP KeyID: 0x465BFD47]
http://erlug.linux.it/~da/

_______________________________________________
Imdbpy-devel mailing list
Imdbpy-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/imdbpy-devel
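
To make the two usage patterns described in the message concrete, here is
a minimal sketch of what a cleaner schema might look like. It is only an
illustration, not the code in CVS: the extract() helper, the spec
dictionaries and the sample markup are all hypothetical, and the standard
library's xml.etree.ElementTree stands in for whatever DOM/XPath backend
parse_dom() really uses. The rule it assumes is that a string "path"
always returns a list of strings, while a dictionary "path" together with
a "section" always returns a list of dictionaries.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup (not real IMDb output).
PAGE = """<html><body>
  <ol id="goofs"><li>First goof</li><li>Second goof</li></ol>
  <div class="info"><h5>Runtime</h5><span>120 min</span></div>
</body></html>"""


def extract(root, spec):
    """Apply one hypothetical extractor spec to an ElementTree root.

    Assumed rule, so the return type is predictable:
      - spec["path"] is a string -> list of stripped text strings
      - spec["path"] is a dict   -> list of dicts, one per element
        matched by spec["section"]; each sub-path is resolved
        relative to that section
    """
    path = spec["path"]
    if isinstance(path, str):
        # Simple case: just collect the text of every matched tag.
        return [(el.text or "").strip() for el in root.findall(path)]

    # Complex case: build one dictionary per matched section.
    results = []
    for section in root.findall(spec["section"]):
        item = {}
        for key, sub_path in path.items():
            found = section.find(sub_path)
            item[key] = (found.text or "").strip() if found is not None else None
        results.append(item)
    return results


root = ET.fromstring(PAGE)

# Case 1: plain text from the <li> tags inside an <ol> tag.
simple_spec = {"path": ".//ol[@id='goofs']/li"}
goofs = extract(root, simple_spec)

# Case 2: a structured record inside a <div> tag.
complex_spec = {
    "section": ".//div[@class='info']",
    "path": {"name": "h5", "value": "span"},
}
info = extract(root, complex_spec)

print(goofs)  # ['First goof', 'Second goof']
print(info)   # [{'name': 'Runtime', 'value': '120 min'}]
```

Tying the return type to the type of "path" is one possible way to
document what each style of extractor gives back; whether that fits
the existing code is an open question.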