It's not a traditional ratings corpus, but the ASF mail archives I put up all have clear provenance and are freely available. I don't think it would be too hard to make a recommender problem out of them, likely based on the replies. There are 6M+ items in the archive. And now that Amazon has free inbound data transfer, I may well set up a job to refresh the archive on a more regular basis, perhaps quarterly.
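
As a rough illustration of the "based on the replies" idea, here is a minimal Python sketch that treats each reply as an implicit preference of the sender for the thread being replied to. It assumes the archive messages are available as raw RFC 822 text with `Message-ID` and `In-Reply-To` headers; the function name `reply_preferences` and the choice of reply counts as the preference value are hypothetical, not anything the archive or Mahout prescribes.

```python
from collections import Counter
from email import message_from_string
from email.utils import parseaddr


def reply_preferences(raw_messages):
    """Count how often each sender replies into each thread.

    Hypothetical helper: returns a Counter keyed by
    (sender address, Message-ID of the thread's first message).
    """
    parsed = [message_from_string(raw) for raw in raw_messages]

    # Map each Message-ID to the Message-ID it replies to (None for thread starters).
    parent = {}
    for msg in parsed:
        mid = (msg.get("Message-ID") or "").strip()
        if mid:
            parent[mid] = (msg.get("In-Reply-To") or "").strip() or None

    def thread_root(mid):
        # Walk up the reply chain; the seen set guards against malformed cycles.
        seen = set()
        while parent.get(mid) and mid not in seen:
            seen.add(mid)
            mid = parent[mid]
        return mid

    prefs = Counter()
    for msg in parsed:
        reply_to = (msg.get("In-Reply-To") or "").strip()
        sender = parseaddr(msg.get("From", ""))[1].lower()
        if reply_to and sender:
            prefs[(sender, thread_root(reply_to))] += 1
    return prefs
```

The resulting (user, item, count) triples could then be written out in whatever form a recommender expects. Whether raw reply counts make a good preference signal is an open question; a boolean "replied or not" may work just as well.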

-Grant

On Jul 7, 2011, at 11:05 PM, Lance Norskog wrote:

> What recommendation datasets, that are available, are considered
> "large" by Mahout testing standards? Yahoo KDD Cup is offline, the
> Netflix data went under a cloud...
>
> --
> Lance Norskog
> [email protected]

--------------------------
Grant Ingersoll
