On Mon, Sep 10, 2012 at 10:27 PM, Rick Summerhill <rr...@summerhill.org> wrote:
> David, is there a write-up on the web page about the issues related to the 
> imdbIDs/sqlIDs, and are there work arounds that people use,

Not much, as a matter of fact.
As said, the real imdbIDs are not in the plain text data files, so we have to
make up our own IDs for movies, persons, etc.

IMDbPY's database has a field, 'imdb_id', that is used to store the
real imdbID; it's updated transparently whenever your code does
something that requires a connection to the web to fetch data for the
given movie/person/...
For example, these methods will store the real imdbID: get_imdbMovieID,
get_imdbID, get_imdbURL.
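
A minimal sketch of how that could look from user code. It assumes `ia` is
an IMDbPY instance opened with the 'sql' access system; the helper name
resolve_real_ids and the database URI in the comments are only
illustrative, and each lookup may go out to the web the first time.

```python
def resolve_real_ids(ia, local_movie_ids):
    """Map local (made-up) movieIDs to real imdbIDs.

    `ia` is expected to expose get_imdbMovieID(movieID), as IMDbPY
    instances do. IDs that cannot be resolved are left out of the result.
    """
    resolved = {}
    for local_id in local_movie_ids:
        # With the 'sql' access system this may need a web connection;
        # the result is then stored in the 'imdb_id' field.
        imdb_id = ia.get_imdbMovieID(local_id)
        if imdb_id is not None:
            resolved[local_id] = imdb_id
    return resolved


# Hypothetical usage (needs IMDbPY and a database built by imdbpy2sql.py):
#   import imdb
#   ia = imdb.IMDb('sql', uri='sqlite:///path/to/imdb.db')
#   print(resolve_real_ids(ia, [1, 2, 3]))
```

Keep the list of IDs short: this is meant for the handful of titles you
actually need, not for walking the whole database.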

imdbpy2sql.py tries to be smart enough, when doing an upgrade, to preserve
the real IDs (the association between movies/persons/... and real imdbIDs is
made by checking the md5sum of the row).
Unfortunately I think that there are still some bugs in the code, especially
when using PostgreSQL and/or SQLAlchemy. :-|
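
To give an idea of the md5sum approach, here is a rough sketch. It is not
the actual imdbpy2sql.py code: the field names (title, year, kind), the
dict-based rows and both function names are assumptions made up for the
example.

```python
import hashlib


def row_digest(row, fields=("title", "year", "kind")):
    """Return an md5 hex digest identifying a row by its content."""
    # The local id is deliberately left out: it changes on every rebuild.
    key = "|".join(str(row.get(f)) for f in fields)
    return hashlib.md5(key.encode("utf-8")).hexdigest()


def carry_over_imdb_ids(old_rows, new_rows):
    """Copy known imdb_id values from old rows to matching new rows."""
    known = {
        row_digest(r): r["imdb_id"]
        for r in old_rows
        if r.get("imdb_id") is not None
    }
    for r in new_rows:
        # Rows with no match in the old database get imdb_id = None.
        r["imdb_id"] = known.get(row_digest(r))
    return new_rows
```

Since the match is made on the content of the row, any change in the
hashed fields between two releases of the data files loses the association
for that row.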

Besides this, there's not much you can do, and I'd advise against using
IMDbPY to scan each and every title on the IMDb web site... you can't really
DoS them (hey, it's Amazon ;) but it's for sure against their policies. :-)


Davide Alberani <davide.alber...@gmail.com>  [PGP KeyID: 0x465BFD47]

Imdbpy-help mailing list
