Re: [Imdbpy-help] getting grossing data

2011-11-13 Thread Davide Alberani
On Sat, Nov 12, 2011 at 20:44, Zsolt Ero zsolt@gmail.com wrote:

 I have just started using IMDbPY. I would like to get the grossing
 field of a given movie, but I don't know how.

This information is included in the 'business' data set, which
you have to retrieve first.

An example:
import imdb
ia = imdb.IMDb()
# Pass the ID as a string: 0499549 is not a valid integer literal.
avatar = ia.get_movie('0499549')
ia.update(avatar, 'business') # get the business information

business = avatar.get('business') or {} # may be empty
print(business.get('gross'))
# to see what else is available: print(business.keys())


Unfortunately I notice now that a lot of garbage is collected, too:
you should just get a list of information... :-/
I'll try to fix it ASAP.
When the data is retrieved from a SQL db, there are no problems.
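
A minimal sketch of the lookup above wrapped in a helper, in case it is
useful. It only relies on a dict-style .get(), so a plain dict stands in
for a real Movie object here and nothing is fetched from the network. The
helper name gross_entries and the sample string are invented for
illustration; they are not IMDbPY output.

```python
def gross_entries(movie):
    """Return the list of 'gross' strings from a movie's business data.

    `movie` only needs a dict-style .get(); with IMDbPY that would be a
    Movie object already updated with ia.update(movie, 'business').
    """
    business = movie.get('business') or {}   # the data set may be missing
    gross = business.get('gross') or []      # the key may be missing, too
    # Assumption: tolerate a single string as well as a list of strings.
    if isinstance(gross, str):
        gross = [gross]
    return list(gross)


# Stand-in for a Movie object; the value is an invented example.
fake_movie = {'business': {'gross': ['$1,000,000 (USA) (1 January 2010)']}}
print(gross_entries(fake_movie))
print(gross_entries({}))   # no business data at all
```

Returning an empty list instead of None keeps the calling code free of
extra checks before looping over the entries.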


-- 
Davide Alberani davide.alber...@gmail.com  [PGP KeyID: 0x465BFD47]
http://www.mimante.net/

--
RSA(R) Conference 2012
Save $700 by Nov 18
Register now
http://p.sf.net/sfu/rsa-sfdev2dev1
___
Imdbpy-help mailing list
Imdbpy-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/imdbpy-help


Re: [Imdbpy-help] getting grossing data

2011-11-13 Thread Zsolt Ero
Yesterday I finally figured out how to get the grossing data out.

But I had to convert it to an int, and that wasn't easy.

Finally I implemented this:

match = re.match(r"\$([1-9][0-9,]+)", movie['business']['gross'][0])
gross = match.group()[1:]
grossint = int(gross.replace(',', ''))

Are you saying that you have now implemented this in SVN?

Also, can you help me get the English title of a movie, the one
listed on the www site, not on the akas site? So far I haven't found any
option to find out the www title. I know there is the akas list, but
it's almost impossible to handle, as its entries are all different
kinds of long strings. Can you recommend a way to return the title from
the www site? Or to figure out which is the www title from the akas
list?

Zsolt





On Sun, Nov 13, 2011 at 9:46 AM, Davide Alberani
davide.alber...@gmail.com wrote:
 On Sun, Nov 13, 2011 at 10:20, Davide Alberani
 davide.alber...@gmail.com wrote:

 Unfortunately I notice now that a lot of garbage is collected, too:
 you should just get a list of information... :-/
 I'll try to fix it ASAP.

 It should be fixed in the mercurial repository, but I've not
 exactly tested it... :-)

 --
 Davide Alberani davide.alber...@gmail.com  [PGP KeyID: 0x465BFD47]
 http://www.mimante.net/




Re: [Imdbpy-help] getting grossing data

2011-11-13 Thread Davide Alberani
On Sun, Nov 13, 2011 at 15:00, Zsolt Ero zsolt@gmail.com wrote:

 Yesterday I finally figured out how to get the grossing data out.

Good. :-)

 match = re.match(r"\$([1-9][0-9,]+)", movie['business']['gross'][0])
 gross = match.group()[1:]
 grossint = int(gross.replace(',', ''))

OK, but keep in mind that the currency may also be British pounds
or anything else... (and I think the symbol can come before the value
or even after it)
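
A rough sketch of a parser that allows for this, under stated
assumptions: the currency is either one of a few symbols or a
three-letter code, it sits directly before or after the amount, and
thousands are separated by commas. The function name parse_gross and the
sample strings are made up for illustration; real entries may need more
patterns.

```python
import re

# Currency: a symbol such as $ or a three-letter code such as EUR.
_CUR = r'(?P<currency>[$£€]|[A-Z]{3})'
# Amount: digits with optional comma thousands separators.
_NUM = r'(?P<amount>\d[\d,]*)'

_PATTERNS = (
    re.compile(_CUR + r'\s*' + _NUM),   # currency before the value
    re.compile(_NUM + r'\s*' + _CUR),   # currency after the value
)


def parse_gross(text):
    """Return (currency, amount_as_int), or None if nothing matches."""
    text = text.strip()
    for pattern in _PATTERNS:
        match = pattern.match(text)
        if match:
            amount = int(match.group('amount').replace(',', ''))
            return match.group('currency'), amount
    return None


# Invented sample strings, not real IMDb data.
for sample in ('$1,234,567 (USA)', 'GBP 250,000 (UK)', '1,000 EUR', 'n/a'):
    print(sample, '->', parse_gross(sample))
```

Returning None for unparseable entries, rather than raising, lets the
caller skip them without wrapping every call in try/except.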

 Are you saying that you have now implemented this in SVN?

No: the previous code put things into the list that were not
business information at all; I just stripped those.

 Also, can you help me get the English title of a movie, the one
 listed on the www site, not on the akas site?

Hmmm... do they ever differ?  Do you have an example?

 Or to figure out which is the www title from the akas list?

I'd probably need to do some tests (and have at least an example
to work on).
Maybe you can use the list of akas and the 'guessLanguage' of
the Movie instances (it tries to guess the language of the title/movie),
but I'm not too sure.


-- 
Davide Alberani davide.alber...@gmail.com  [PGP KeyID: 0x465BFD47]
http://www.mimante.net/



Re: [Imdbpy-help] getting grossing data

2011-11-13 Thread Zsolt Ero
Yes, they always differ for all international movies.

http://www.imdb.com/title/tt0060196/ - The Good, the Bad and the Ugly
- Il buono, il brutto, il cattivo. (original title)
http://akas.imdb.com/title/tt0060196/ - Il buono, il brutto, il cattivo.

Zsolt







Re: [Imdbpy-help] getting grossing data

2011-11-13 Thread Zsolt Ero
Hi Davide,

Thanks for the code, but it doesn't really work so far. Here are
the top 10 movies: for each, movie['title'] followed by the result of
guessEnglishTitle. It seems that the aka matching is not working
(except for The Good, the Bad and the Ugly :-))

The Shawshank Redemption
Втеча з Шоушенка
-
The Godfather
Mario Puzo's The Godfather
-
The Godfather: Part II
Mario Puzo's The Godfather: Part II
-
Il buono, il brutto, il cattivo.
The Good, the Bad and the Ugly
-
Pulp Fiction
Кримiнальне чтиво
-
Schindler's List
Список Шиндлера
-
One Flew Over the Cuckoo's Nest
Zbor deasupra unui cuib de cuci
-
The Dark Knight
Batman: The Dark Knight
-
The Lord of the Rings: The Return of the King
The Return of the





Zsolt





On Sun, Nov 13, 2011 at 7:11 PM, Davide Alberani
davide.alber...@gmail.com wrote:
 On Sun, Nov 13, 2011 at 18:19, Zsolt Ero zsolt@gmail.com wrote:

 Yes, they always differ for all international movies.

 It has to be this way, to have consistency with the data from the plain
 text data files.

 Your best shot is something like this (modify it as you wish):

import re
import imdb

ia = imdb.IMDb()
# Pass the ID as a string: 0060196 is not a valid integer literal.
ibibic = ia.get_movie('0060196')

# List of regexps used to search for a
# possible English title in the notes.
# Notice that the order matters: the
# first match wins.
# Right now .findall is used, but .match
# can be used, too (changing the regexps...)
# Also notice that there may be other variations
# that can be considered, like '(imdb display title)',
# '(literal English title)' and so on.
_re_english_akas_notes = (
    re.compile(r'^International \(English', re.I),
    re.compile(r'International \(English', re.I),
    re.compile(r'^USA', re.I),
    re.compile(r'USA', re.I),
    re.compile(r'^UK', re.I),
    re.compile(r'UK', re.I),
    re.compile(r'^English', re.I),
    re.compile(r'English', re.I),
)

def guessEnglishTitle(movie, _releaseInfoToo=True):
    """Return the guessed English title
    of the movie, or the default title,
    if unable to guess."""
    # Consider both AKAs from the main
    # and release info pages.
    # Copy the list, so the movie's own 'akas' is not modified.
    akas = list(movie.get('akas') or [])
    if _releaseInfoToo:
        # FIXME: ia MUST be a parameter of this function!
        ia.update(movie, 'release dates')
        akas += movie.get('akas from release info') or []
    aka_list = []
    for aka in akas:
        aka_split = aka.split('::', 1)
        # Skip AKAs that carry no note.
        if len(aka_split) < 2:
            continue
        aka_list.append(aka_split)
    best_guess = None
    for title, note in aka_list:
        for re_ in _re_english_akas_notes:
            if re_.findall(note):
                best_guess = title
                break
        if best_guess:
            break
    # Return the guess, not the last title seen by the loop.
    return best_guess or movie.get('title')

print(guessEnglishTitle(ibibic))
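
A variation on the idea above, offered only as a sketch: it works on a
plain list of 'title::note' strings, so it needs neither the global ia
nor network access. It differs from the function above in two ways. It
tries the patterns in priority order across all AKAs, instead of taking
the first AKA that matches any pattern, and it uses word boundaries,
because a bare 'UK' with re.I also matches notes such as 'Ukraine'. The
name pick_english_title and the sample AKA strings are invented for
illustration.

```python
import re

# Highest priority first; \b keeps 'UK' from matching 'Ukraine'.
_NOTE_PATTERNS = tuple(
    re.compile(pattern, re.I)
    for pattern in (
        r'International \(English',
        r'\bUSA\b',
        r'\bUK\b',
        r'\bEnglish\b',
    )
)


def pick_english_title(akas, default=None):
    """Pick the AKA whose note best matches the patterns, else `default`.

    `akas` is a list of 'title::note' strings; entries without a note
    are ignored.
    """
    pairs = [aka.split('::', 1) for aka in akas if '::' in aka]
    for pattern in _NOTE_PATTERNS:
        for title, note in pairs:
            if pattern.search(note):
                return title
    return default


# Invented sample data in the 'title::note' form used above.
sample_akas = [
    'Il buono, il brutto, il cattivo.::Italy (original title)',
    'Втеча з Шоушенка::Ukraine',
    'The Good, the Bad and the Ugly::USA',
]
print(pick_english_title(sample_akas, default='(no guess)'))
```

With IMDbPY, the input would be the combined 'akas' and
'akas from release info' lists, and the default would be
movie.get('title').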




 --
 Davide Alberani davide.alber...@gmail.com  [PGP KeyID: 0x465BFD47]
 http://www.mimante.net/

