Re: [Imdbpy-devel] Simplifying the DOM parser

H. Turgut Uyar Mon, 11 Aug 2008 09:27:36 -0700

On 08/10/2008 06:18 PM, Davide Alberani wrote:
> We'll see what's the best solution (the old parser has a lot of
> its own bugs, after all ;-)
>


I've done most of the remaining person parsers today. There might be a 
problem with the old parser: if a page has multiple nicknames or mini 
biographies, only the first one seems to be collected. I've added a test 
for this (using the page http://akas.imdb.com/name/nm0000022/bio).
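
To illustrate the kind of check I mean, here is a rough sketch. The markup
and the function names are made up for the example (they are not taken from
IMDbPY or from the real bio page); it only shows the difference between
stopping at the first match and collecting all of them.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; the real bio page is structured differently.
SAMPLE = """
<div>
  <h5>Nickname</h5>
  <span class="nick">First Nick</span>
  <span class="nick">Second Nick</span>
</div>
""".strip()


def first_nickname(root):
    """Mimic the suspected behaviour: keep only the first match."""
    node = root.find(".//span[@class='nick']")
    return [node.text] if node is not None else []


def all_nicknames(root):
    """Collect every match, which is what the test should expect."""
    return [node.text for node in root.findall(".//span[@class='nick']")]


root = ET.fromstring(SAMPLE)
print(first_nickname(root))
print(all_nicknames(root))
```

A test along these lines would compare the parser output against the full
list, so a parser that stops after the first entry fails it.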

> Just as a note: the last DOMHTMLMovieParser begins to be good
> enough, but I've encountered a problem: the 'blackcatheader'
> Extractor fails, if lxmladapter is used (while it works with bsoup).
> Looks like the 'comp-link' ("./a/@href") path is not collected, for
> some reason.
> 

I couldn't reproduce this. On my setup, both parsers produce the same 
result. But I might be looking in the wrong place. Can you tell me how 
to reproduce it?

Turgut

> 
> Thanks!

