Re: [Imdbpy-devel] french query

Sébastien RAGONS Sun, 21 Mar 2010 07:06:50 -0700

> What about "so difficult it's not worth it"? ;-)
> 
lol ;)
but I prefer my saying ;)



> A different approach: use IMDbPY ('http' or 'mobile') to fetch the bulk
> of information you need - if the translated ones are not too many, write
> specific scrapers to access the French site and override the English
> entries.  This way you don't have to modify IMDbPY, but just write a
> slightly more complex script.
> 
ok thanks


Finally it works:

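# Imports assumed for this snippet (Python 2, IMDbPY 4.x, lxml)
import logging
import StringIO
from urllib import quote_plus
from lxml import etree
from imdb.parser.http import IMDbURLopener

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

title = 'avatar'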
imdbURL_base = 'http://www.imdb.fr%s'
imdbURL_find = imdbURL_base % '/find?%s'

params = 's=%s;mx=%s;q=%s' % ("tt", str(10), quote_plus(title))

# IMDbPY looks for td[3], but here td[3] is empty (?), so match any td instead:
# path = "//td[3]/a[starts-with(@href, '/title/tt')]/.."
path = "//td/a[starts-with(@href, '/title/tt')]/.."

url_opener = IMDbURLopener()
ustring = url_opener.retrieve_unicode(imdbURL_find % params)
some_file_like = StringIO.StringIO(ustring)
parser = etree.XMLParser(recover=True)
tree = etree.parse(some_file_like, parser)
elements = tree.xpath(path)
logger.info(imdbURL_base % elements[0].xpath("./a/@href")[0])


This gives me:
http://www.imdb.fr/title/tt0499549/
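
For anyone who wants to try the same idea without lxml or a network connection, here is a minimal sketch in Python 3 using only the standard library. It swaps the XPath query for `html.parser`, so it is not the same code as above. The helper names (`build_find_url`, `TitleLinkParser`, `first_title_url`) and the sample HTML are made up for illustration; the real results page markup may differ.

```python
from html.parser import HTMLParser
from urllib.parse import quote_plus

IMDB_URL_BASE = 'http://www.imdb.fr%s'


def build_find_url(title, kind='tt', max_results=10):
    """Build the search URL with the same s/mx/q parameters as the script above."""
    params = 's=%s;mx=%s;q=%s' % (kind, max_results, quote_plus(title))
    return IMDB_URL_BASE % ('/find?%s' % params)


class TitleLinkParser(HTMLParser):
    """Collect href values of <a> tags that point at /title/tt... pages."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        href = dict(attrs).get('href') or ''
        if href.startswith('/title/tt'):
            self.hrefs.append(href)


def first_title_url(html):
    """Return the absolute URL of the first title link, or None if there is none."""
    parser = TitleLinkParser()
    parser.feed(html)
    if not parser.hrefs:
        return None
    return IMDB_URL_BASE % parser.hrefs[0]


# Hypothetical fragment standing in for a results page (assumption, not real markup).
SAMPLE_HTML = """
<table>
  <tr>
    <td><a href="/name/nm0000000/">Some Person</a></td>
    <td><a href="/title/tt0499549/">Avatar</a> (2009)</td>
  </tr>
</table>
"""

print(build_find_url('avatar'))
print(first_title_url(SAMPLE_HTML))
```

Unlike `elements[0]` in the script above, `first_title_url` returns `None` when nothing matches instead of raising an `IndexError`, which seems safer for titles the French site does not know.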

After that it will be easy to find the cover,
and for now that is the only thing I need.

sebastien











_______________________________________________
Imdbpy-devel mailing list
Imdbpy-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/imdbpy-devel
