MovieLens would be the one that's most commonly used by researchers; they have 100K, 1 million, and 10 million ratings datasets.

On Fri, Jul 8, 2011 at 10:26 AM, Lance Norskog <[email protected]> wrote:
> Thanks.
>
> Netflix & Yahoo KDD were my first choice, but are gone. It did not
> occur to me that stashing such things away would be wise; packrat
> though I am.
>
> Purpose is testing large user/item or document'/term databases.
>
> On Fri, Jul 8, 2011 at 12:44 AM, Sebastian Schelter <[email protected]> wrote:
> > Another dataset to play with is this compilation of song listenings
> > scraped from the last.fm API:
> >
> > http://mtg.upf.edu/node/1671.
> >
> > Should include about 20M ratings.
> >
> > --sebastian
> >
> > On 08.07.2011 09:17, Sean Owen wrote:
> >>
> >> The link is http://www.occamslab.com/petricek/data/
> >>
> >> The KDD or Netflix data are plenty big to play with. How big is big
> >> for your purpose?
> >>
> >> On Fri, Jul 8, 2011 at 7:05 AM, web service <[email protected]> wrote:
> >>
> >>> Is it taken offline as well ?
> >>>
> >>> On Thu, Jul 7, 2011 at 10:40 PM, Alex Kozlov <[email protected]> wrote:
> >>>
> >>>> There is still a libimseti dataset
> >>>> http://www.occamslab.com/petricek/data with 17,359,346 ratings.
> >>>> People are scared after the Netflix lawsuit.
> >>>>
> >>>> On Thu, Jul 7, 2011 at 10:17 PM, Ted Dunning <[email protected]> wrote:
> >>>>
> >>>>> Those are both reasonably large, but not commercial in scale.
> >>>>>
> >>>>> At Veoh, we had about 10 non-zero elements in our raw data. I think
> >>>>> Netflix has 100 million.
> >>>>>
> >>>>> On Thu, Jul 7, 2011 at 8:05 PM, Lance Norskog <[email protected]> wrote:
> >>>>>
> >>>>>> What recommendation datasets, that are available, are considered
> >>>>>> "large" by Mahout testing standards? Yahoo KDD Cup is offline, the
> >>>>>> Netflix data went under a cloud...
> >>>>>>
> >>>>>> --
> >>>>>> Lance Norskog
> >>>>>> [email protected]
>
> --
> Lance Norskog
> [email protected]
