On Tue, Jan 15, 2013 at 1:51 PM, Valerio Marra <valerio.ma...@me.com> wrote:
> the ratings.list.gz file would be perfect if the rating distributions were 
> not rounded at steps of 10%.
> We do need the actually rating distribution for our analysis.

I see.
Your best option is to use IMDbPY to parse the 'ratings' page of a movie, then.
Something like:
import imdb
ia = imdb.IMDb()

m = ia.get_movie('0133093')
print m.get('number of votes')

Obviously you also have to 'votes' and 'rating' keys.
There are also the 'mean and median' and 'demographic voters'
keys, but seems to be broken, at the moment.

Problem is, this method will be really slow, if you have to parse
many movies.  Plus, you also need to know their IDs (or do a search).

Have you tried asking directly IMDb, for these information?

Davide Alberani <davide.alber...@gmail.com>  [PGP KeyID: 0x465BFD47]

Master SQL Server Development, Administration, T-SQL, SSAS, SSIS, SSRS
and more. Get SQL Server skills now (including 2012) with LearnDevNow -
200+ hours of step-by-step video tutorials by Microsoft MVPs and experts.
SALE $99.99 this month only - learn more at:
Imdbpy-help mailing list

Reply via email to