On Jul 03, "H. Turgut Uyar" <[EMAIL PROTECTED]> wrote:

> Yes, this one's tricky.

My first try is in CVS.  Maybe I've oversimplified it, but it works.

I think our first priority now is to settle the functionality of the
parse_dom() method and the schema of the "extractors" attribute.

Unfortunately I'm still not confident enough in my abilities with
DOM/XPath, so I'm a bit confused.

Looking at the features we need, extractors can be used in too many
different ways: "attribute.key" behaves one way if it's None and
another way if it's a string (my fault), path can be a list or a
dictionary, "section" can interfere with "attribute.key", and so on...

Obviously I'm not saying these are bad things: these features are
_fundamental_ and must stay.
What we need is a cleaner usage schema: it should be clearer (or at
least documented) that if you just want to extract the text from a
list of <li> tags inside an <ol> tag, you must write the "extractors"
in one given way.
On the other hand, if you have a complex data structure inside a
<div> tag, the "extractors" must be written in another way.
Generally speaking, the type of the returned items should also be
clearer: a list, a string, another dictionary...
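
To make the two cases concrete, here is a minimal sketch of what I
mean.  It is NOT the current parse_dom() code: extract_list(),
extract_mapping() and the sample markup are hypothetical names made
up for illustration, and it uses the standard library's
xml.etree.ElementTree instead of our DOM/XPath layer.  The only point
is that each kind of extractor has one way to be written and one
well-known return type.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup (not a real IMDb page).
HTML = """
<html><body>
  <ol id="trivia">
    <li>First item</li>
    <li>Second item</li>
  </ol>
  <div id="info">
    <h5>Runtime</h5><p>120 min</p>
    <h5>Country</h5><p>Italy</p>
  </div>
</body></html>
"""


def extract_list(root, path):
    """Simple case: always return a list of text strings."""
    return ["".join(el.itertext()).strip() for el in root.findall(path)]


def extract_mapping(root, path, key_path, value_path):
    """Complex case: always return a dict of key/value text pairs."""
    container = root.find(path)
    if container is None:
        return {}
    keys = [(k.text or "").strip() for k in container.findall(key_path)]
    values = [(v.text or "").strip() for v in container.findall(value_path)]
    return dict(zip(keys, values))


root = ET.fromstring(HTML)

# <li> tags inside an <ol>: one path, result is a list.
trivia = extract_list(root, ".//ol[@id='trivia']/li")

# Structured data inside a <div>: container path plus key/value paths,
# result is a dictionary.
info = extract_mapping(root, ".//div[@id='info']", "h5", "p")

print(trivia)  # ['First item', 'Second item']
print(info)    # {'Runtime': '120 min', 'Country': 'Italy'}
```

Whether this should be two functions or one "extractors" entry with
an explicit "kind" is open; the sketch only shows the documented
input/output pairing I'd like us to have.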

Maybe the code is already good enough and I just still have to
grasp it. :-)

-- 
Davide Alberani <[EMAIL PROTECTED]> [PGP KeyID: 0x465BFD47]
http://erlug.linux.it/~da/

_______________________________________________
Imdbpy-devel mailing list
Imdbpy-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/imdbpy-devel
