On 08/12/2008 06:10 PM, H. Turgut Uyar wrote:
> At the risk of complicating the DOM parser, I've added an extractor
> grouping feature that should improve the parser performance. The aim is
> to eliminate repeated xpath lookups. Now, if the extractor has a 'group'
> attribute (which is supposed to be an xpath), that will be applied first
> to get a set of group elements together with a group key. Then the
> extractor path will be applied to get the final elements.
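As a rough illustration of the lookup order described above, here is a small sketch. It is not the actual parser code: it uses the standard library's ElementTree, and the function name extract_grouped, the sample markup, and the class='_imdbpy' attribute are all made up for the example. ElementTree has no text() step, so the sketch reads the group key with findtext() instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup; the real pages and attributes may differ.
SAMPLE = """
<html><body>
  <div class="_imdbpy"><h5>Director</h5><a href="/name/1">Alice</a></div>
  <div class="_imdbpy"><h5>Writers</h5><a href="/name/2">Bob</a><a href="/name/3">Carol</a></div>
</body></html>
""".strip()


def extract_grouped(root, group, group_key, path):
    """Return {group key: [elements]} using one group lookup, then one
    relative lookup per group element."""
    results = {}
    for group_elem in root.findall(group):        # 1. find the group elements
        key = group_elem.findtext(group_key)      # 2. read the key of this group
        results[key] = group_elem.findall(path)   # 3. apply the path inside the group
    return results


root = ET.fromstring(SAMPLE)
grouped = extract_grouped(
    root,
    group=".//div[@class='_imdbpy']",
    group_key="./h5",
    path="./a",
)
names = {key: [a.text for a in elems] for key, elems in grouped.items()}
print(names)  # {'Director': ['Alice'], 'Writers': ['Bob', 'Carol']}
```

The point of the sketch is only the order of the steps: the group xpath runs once against the document, and the extractor path then runs once per group element.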
I think I should clear up one point here: the extractor path is applied to each group element, so it can (should?) be relative to that element. For example:

  path="//[EMAIL PROTECTED]'_imdbpy']/a",
  group="//[EMAIL PROTECTED]'_imdbpy']",
  group_key="./h5/text()",

can also be written as:

  group="//[EMAIL PROTECTED]'_imdbpy']",
  group_key="./h5/text()",
  path="./a",

I'm not sure whether the first form might produce incorrect results or cause extra work for the parser, since it might traverse to other _imdbpy divs as well (other than the div of the current group).

Turgut

_______________________________________________
Imdbpy-devel mailing list
Imdbpy-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/imdbpy-devel