Hi, this has been a recurring theme: sometimes the "canonical title" of a movie (the one in the "Movie Title, The" format) comes out wrong, because the code makes some simplistic assumptions about what an article (in the grammatical sense) is.
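To make the problem concrete, here is a minimal sketch of how a language-blind article list mangles an English title, and how a language hint (guessed from the first production country, then from the first language) avoids it. This is not the actual IMDbPY code: GENERIC_ARTICLES, ARTICLES_BY_LANG, COUNTRY_LANG, canonical_title and guess_lang are hypothetical names, and the data is made up for illustration.

```python
# Hypothetical data, for illustration only; not the contents of articles.py.
GENERIC_ARTICLES = ('the', 'a', 'an', 'der', 'die', 'das', 'le', 'la', 'il')
ARTICLES_BY_LANG = {
    'English': ('the', 'a', 'an'),
    'German': ('der', 'die', 'das'),
}
COUNTRY_LANG = {'USA': 'English', 'UK': 'English', 'Germany': 'German'}


def canonical_title(title, lang=None):
    """Move a leading article to the end: 'The Matrix' -> 'Matrix, The'.

    With lang=None (or an unknown lang) every article in GENERIC_ARTICLES
    is tried, which is the language-blind behaviour.
    """
    articles = ARTICLES_BY_LANG.get(lang, GENERIC_ARTICLES)
    first, sep, rest = title.partition(' ')
    if sep and rest and first.lower() in articles:
        return '%s, %s' % (rest, first)
    return title


def guess_lang(countries, languages):
    """Try the first production country, then fall back to the first language."""
    if countries and countries[0] in COUNTRY_LANG:
        return COUNTRY_LANG[countries[0]]
    if languages:
        return languages[0]
    return None


# Language-blind: "Die" is taken for the German article.
print(canonical_title('Die Hard'))                                   # Hard, Die
# With a language guessed from the production country, the title is left alone.
print(canonical_title('Die Hard', guess_lang(['USA'], ['English'])))  # Die Hard
# A real German article is still moved.
print(canonical_title('Die Welle', guess_lang(['Germany'], [])))      # Welle, Die
```

The guess can still be wrong (a US production with a German title, for instance), which is why no approach of this kind can be perfect.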
Talking with Turgut, we came up with a draft of a possible solution; it _can't be_ perfect, but it applies some heuristics to guess the language of (the title of) a movie.

What we need from you is:
 1. the name of your language and a list of countries in which it's the main language (see the _LANG_COUNTRIES dictionary in the new articles.py file [1]).
 2. a list of articles (if possible in order of frequency of use) in that language; see the LANG_ARTICLES dictionary.

An example that actually works:

    from imdb import IMDb
    ia = IMDb('http')
    m = ia.get_movie('0095016')  # Die Hard (1988), a well known problematic case
    print m['canonical title'], '::', m['smart canonical title']
    print m['long imdb canonical title'], '::', m['smart long imdb canonical title']

How it works:
 - the utils.canonicalTitle and the utils.build_title functions now accept a 'lang' argument, which can be None (in which case the old behaviour is triggered; the same happens if the given lang is not known) or any language listed in the articles.LANG_ARTICLES dictionary.
 - the Movie class has a 'smartCanonicalTitle' method, which tries to guess the language of the movie from its first production country. If that fails, it falls back to the first of the movie's languages.
 - the smartCanonicalTitle method is called by some keys, like:
     movie['smart canonical title']
     movie['smart long imdb canonical title']
     movie['smart canonical series title']
     movie['smart canonical episode title']

As usual, I'm open to any suggestion; the current implementation is far from perfect and was done in a very short amount of time. I'm not yet sure that testing the production country first, and only later the language of a movie, is a good idea. What do you think?

+++
[1] http://imdbpy.svn.sourceforge.net/viewvc/imdbpy/trunk/imdbpy/imdb/articles.py?view=markup
    Contribute to it for fun, glory and money! Pick only one. And forget about money...
;-)

-- 
Davide Alberani <davide.alber...@gmail.com> [GPG KeyID: 0x465BFD47]
http://erlug.linux.it/~da/