Thanks a lot for the explanations, much appreciated. I'm still mostly poking around the database right now just to see what data is in there, but I definitely appreciate the tip on using IMDbPY to extract information from the database.
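
As a rough illustration of the search approach described in the quoted message below (pre-filter by phonetic code, then rank by similarity), here is a minimal Python sketch. It rests on several assumptions: it uses the classic 4-character Soundex rather than whatever variant IMDbPY actually stores in phonetic_code, it uses difflib.SequenceMatcher (which the Python documentation describes as based on Ratcliff and Obershelp's "gestalt pattern matching") in place of IMDbPY's own similarity routine, and the soundex() and search() helpers and the sample titles are made up for the example.

```python
from difflib import SequenceMatcher

# Classic Soundex letter-to-digit table. Assumption: IMDbPY's own
# phonetic code may differ in length and details.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(text, length=4):
    """Return the classic Soundex code of the ASCII letters in text."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    code = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        # Skip vowels and consecutive letters that share a digit.
        if digit and digit != prev:
            code += digit
        # H and W do not separate two letters with the same code.
        if ch not in "HW":
            prev = digit
    # Pad with zeros, then cut to the requested length.
    return (code + "000")[:length]


def search(query, titles, limit=5):
    """Keep titles that "sound similar" to query, best match first."""
    wanted = soundex(query)
    # Step 1: cheap phonetic pre-filter (in a database this would be a
    # lookup on the stored phonetic code column).
    candidates = [t for t in titles if soundex(t) == wanted]
    # Step 2: rank the survivors with a Ratcliff/Obershelp-style ratio.
    scored = [(SequenceMatcher(None, query.lower(), t.lower()).ratio(), t)
              for t in candidates]
    scored.sort(reverse=True)
    return [t for _, t in scored[:limit]]


# Hypothetical sample data, not taken from the real database.
TITLES = ["The Matrix", "The Matricks", "The Matrix Reloaded", "Heat"]
results = search("The Matriks", TITLES)
```

With this sample data, "Heat" is dropped by the phonetic pre-filter because its code differs from the query's, and only the remaining titles are compared character by character.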

On Mon, Mar 22, 2010 at 5:07 PM, Davide Alberani <davide.alber...@gmail.com> wrote:
> On Mar 22, Michael Liu <mikel...@gmail.com> wrote:
>
>> In the title table generated by imdbpy2sql, what are the meanings of
>> the columns titled imdb_index and phonetic_code?
>
> They are internally used, so they're not documented.
>
> phonetic_code (and the various *_pcode columns in other tables) is
> used when you search for a given title; its value is calculated
> at insert-time, and is a SOUNDEX phonetic code (i.e. a representation
> of how a given word/phrase sounds).
> So, when handling a search, we can select a subset of the database of
> titles that "sound similar" to the one we're searching for - this subset
> is then ordered using a Ratcliff-Obershelp similarity metric.
> You can find the layout of the database in the imdb.parser.sql.dbschema
> module (abstracted: we're pretty naive and support both SQLObject and
> SQLAlchemy... ;-)
> An old message about soundex/Ratcliff-Obershelp:
> http://sourceforge.net/mailarchive/message.php?msg_name=20060407152643.GB4376%40libero.it
>
> imdb_index is what a long time ago I decided to call the "imdbIndex"
> (probably not a very good name...): it's used when two movies,
> produced the same year, share the same title. It's the one you may see
> in the imdb.com page after a title, inside the parentheses containing
> the production year, separated by a slash.
>
> Example:
> 10 Bullets (2007/I)
> 10 Bullets (2007/II)
>
> It's also used to disambiguate persons' names.
>
>
> Now... a question: do you really need to understand the internals
> of IMDbPY? It's perfectly legit, but IMDbPY is not only a tool to
> put the plain text data files into a SQL database: it's perfectly
> able to extract the information from the database, too. :-)
> Are you sure that you need to directly access the database, without
> using IMDbPY?
>
>
> Bye!
> --
> Davide Alberani <davide.alber...@gmail.com> [GPG KeyID: 0x465BFD47]
> http://www.mimante.net/

_______________________________________________
Imdbpy-help mailing list
Imdbpy-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/imdbpy-help