Hi,
this has been a recurring theme: sometimes the "canonical title" for a
movie (the one in the "Movie Title, The" format) comes out wrong,
because the code makes some simplistic assumptions about what
an article (in the grammatical sense) is.

Talking with Turgut, we came up with a draft of a possible solution;
it _can't be_ perfect, but it uses some heuristics to guess the
language of (the title of) a movie.

What we need from you is:
1. the name of your language and a list of countries in which it's
   the main language (see the _LANG_COUNTRIES dictionary in the new
   articles.py file [1]).
2. a list of articles in that language (if possible, ordered by
   frequency of use) - see the LANG_ARTICLES dictionary.

An example that actually works:
  from imdb import IMDb
  ia = IMDb('http')
  m = ia.get_movie('0095016') # Die Hard (1988), a well known problematic case
  print m['canonical title'], '::', m['smart canonical title']
  print m['long imdb canonical title'], '::', m['smart long imdb canonical title']

How it works:
- the utils.canonicalTitle and the utils.build_title functions now
  accept the 'lang' argument, which can be None (in this case the old
  behaviour is triggered - and the same is true if the given lang is
  not known) or can be any language specified in the articles.LANG_ARTICLES
  dictionary.
- the Movie class has a 'smartCanonicalTitle' method, which tries to
  guess the language of the movie using the first production country.  If
  that country is not mapped to a known language, it falls back to the
  first language listed for the movie.
- the smartCanonicalTitle method is called by some keys, like:
    movie['smart canonical title']
    movie['smart long imdb canonical title']
    movie['smart canonical series title']
    movie['smart canonical episode title']
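
The steps above can be sketched in a few self-contained lines. This is
not the code in utils.py or Movie.py: the names canonical_title and
guess_lang and the tiny tables are hypothetical, and articles joined to
the next word by an apostrophe (like the Italian "l'") are ignored to
keep it short.

```python
# Hypothetical tables; the real ones live in articles.py and may differ.
ARTICLES = {
    'English': ('the', 'a', 'an'),
    'German': ('der', 'die', 'das', 'ein', 'eine'),
}
COUNTRY_LANG = {'USA': 'English', 'UK': 'English', 'Germany': 'German'}

# Old behaviour: every known article, regardless of language.
ALL_ARTICLES = tuple(a for arts in ARTICLES.values() for a in arts)


def canonical_title(title, lang=None):
    """Move a leading article to the end: 'The Matrix' -> 'Matrix, The'.

    With lang=None, or a language missing from ARTICLES, the articles
    of every language are tried (the old behaviour).
    """
    articles = ARTICLES.get(lang, ALL_ARTICLES)
    first, sep, rest = title.partition(' ')
    if sep and rest and first.lower() in articles:
        return '%s, %s' % (rest, first)
    return title


def guess_lang(countries, languages):
    """Try the first production country, then the first language."""
    if countries:
        lang = COUNTRY_LANG.get(countries[0])
        if lang is not None:
            return lang
    if languages and languages[0] in ARTICLES:
        return languages[0]
    return None


# 'Die' is a German article, so the old behaviour mangles the title.
old = canonical_title('Die Hard')
smart = canonical_title('Die Hard', guess_lang(['USA'], ['English']))
```

Here old is 'Hard, Die', while smart stays 'Die Hard', because 'Die' is
not in the English list.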

As usual, I'm open to any suggestion; the current implementation is far
from perfect and was written in a very short amount of time.

I'm not yet sure that testing the production country first, and only
later the language of a movie, is a good idea.  What do you think?


+++
[1] http://imdbpy.svn.sourceforge.net/viewvc/imdbpy/trunk/imdbpy/imdb/articles.py?view=markup
    Contribute to it for fun, glory and money!
    Pick only one.
    And forget about money... ;-)
-- 
Davide Alberani <davide.alber...@gmail.com> [GPG KeyID: 0x465BFD47]
http://erlug.linux.it/~da/

_______________________________________________
Imdbpy-devel mailing list
Imdbpy-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/imdbpy-devel
