On Oct 13, "H. Turgut Uyar" <[EMAIL PROTECTED]> wrote:

> I wonder how I never noticed that the reference gathering parser
> caused the same page to be parsed twice.

Not a big deal: it was done only when needed, and it was already
really fast.

> Nice idea to manipulate the dom object to get to the series info
> for episodes without touching the xpaths.

I think it can be useful in some corner-case situations, to have
a way to manipulate the DOM: there are cases where it's too hard
to modify the HTML with regular expressions.

> I think there is an unnecessary -and hazardous- lxml import in the
> preprocess_dom method of the DOMHTMLMovieParser.

Wooops. I've played too much. :-)
Fixed, thanks.

> I was going to look at this today, but man you're fast :-)

To be honest, I've just copied your _fix_rowspans function written
for lxml. :-)

-- 
Davide Alberani <[EMAIL PROTECTED]> [PGP KeyID: 0x465BFD47]
http://erlug.linux.it/~da/

_______________________________________________
Imdbpy-devel mailing list
Imdbpy-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/imdbpy-devel