> What about "so difficult it's not worth it"? ;-)
>
lol ;)
but I prefer my saying ;)
> A different approach: use IMDbPY ('http' or 'mobile') to fetch the bulk
> of information you need - if the translated ones are not too many, write
> specific scrapers to access the French site and override the English
> entries. This way you don't have to modify IMDbPY, but just write a
> slightly more complex script.
>
OK, thanks.
I finally got it working:
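# Python 2 script; needs IMDbPY and lxml installed.
import logging
import StringIO
from urllib import quote_plus

from lxml import etree
from imdb.parser.http import IMDbURLopener  # import path assumed

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

title = 'avatar'
imdbURL_base = 'http://www.imdb.fr%s'
imdbURL_find = imdbURL_base % '/find?%s'
params = 's=%s;mx=%s;q=%s' % ('tt', str(10), quote_plus(title))
# IMDbPY looks for td[3], but here td[3] is empty (?), so match any td:
# path = "//td[3]/a[starts-with(@href, '/title/tt')]/.."
path = "//td/a[starts-with(@href, '/title/tt')]/.."
url_opener = IMDbURLopener()
ustring = url_opener.retrieve_unicode(imdbURL_find % params)
some_file_like = StringIO.StringIO(ustring)
parser = etree.XMLParser(recover=True)  # recover=True tolerates broken markup
tree = etree.parse(some_file_like, parser)
elements = tree.xpath(path)
# First matching cell: join its relative link with the base URL.
logger.info(imdbURL_base % elements[0].xpath("./a/@href")[0])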
gives me:
http://www.imdb.fr/title/tt0499549/
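For anyone without lxml at hand, the same link extraction can be sketched with the standard library only. This is a minimal Python 3 sketch, not what IMDbPY does internally: the names TitleLinkParser and first_title_url and the sample HTML are made up for illustration, and the real search page markup may well differ.

```python
from html.parser import HTMLParser

IMDB_URL_BASE = 'http://www.imdb.fr%s'  # same base URL as in the script above


class TitleLinkParser(HTMLParser):
    """Collect href values of <a> tags that point to a title page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        # attrs is a list of (name, value) pairs; value can be None.
        href = dict(attrs).get('href') or ''
        if href.startswith('/title/tt'):
            self.hrefs.append(href)


def first_title_url(html_text):
    """Return the absolute URL of the first title link, or None."""
    parser = TitleLinkParser()
    parser.feed(html_text)
    if not parser.hrefs:
        return None
    return IMDB_URL_BASE % parser.hrefs[0]


# Hypothetical, hand-written stand-in for a search result page.
SAMPLE_HTML = """
<table>
  <tr><td><a href="/name/nm0000116/">a person</a></td></tr>
  <tr><td></td><td><a href="/title/tt0499549/">Avatar</a></td></tr>
</table>
"""

print(first_title_url(SAMPLE_HTML))
```

Like the XPath with plain td, this does not care which column the link sits in, so an empty td[3] is not a problem.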
From there it will be easy to find the cover,
and for now that's the only thing I need.
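For the record, the override approach suggested above (English data from IMDbPY, French values on top) could look roughly like this. It is only a Python 3 sketch under assumptions: merge_localized and the sample dictionaries are hypothetical, and plain dicts stand in for whatever IMDbPY and a French scraper would actually return.

```python
def merge_localized(english, french):
    """Return a copy of `english` with non-empty `french` values on top."""
    merged = dict(english)
    for key, value in french.items():
        if value:  # skip fields the French scraper could not fill
            merged[key] = value
    return merged


# Hypothetical sample data, not real scraper output.
english = {'title': 'Avatar', 'cover url': 'http://example.com/en.jpg'}
french = {'title': '', 'cover url': 'http://example.com/fr.jpg'}

movie = merge_localized(english, french)
print(movie['title'], movie['cover url'])
```

Empty French fields fall back to the English entry, so a partial scraper is enough.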
sebastien