The simple answer is that Mahout absorbed a non-distributed recommender project called Taste, which scales up to a point that may be sufficient for a lot of users. It is certainly a lot simpler. Yes, it is realistic to do near-real-time recommendations, though it gets progressively harder and requires more tuning, tradeoffs, and optimization, as this thread shows.
The rest, written from scratch, is almost all distributed and Hadoop-based, including distributed re-implementations of the same algorithms.

On Wed, Nov 30, 2011 at 8:23 PM, Dan Beaulieu <[email protected]> wrote:

> Hi all, this is a tangent and can mostly be ignored by the people
> interested in this problem.
>
> I'm new to Machine Learning and especially Mahout. Following this
> discussion has made me a bit confused.
> Isn't Mahout used for large datasets where it makes sense to distribute the
> work? Why then isn't anyone pointing
> out that the problem may be the use of one single Mahout node? Is it
> because it's boolean based? Is it because the data set
> isn't really that large?
>
> Even if for whatever reason a single node will do for this case, is it
> really expected that the recommendation process would finish in less than
> half a second?
> This makes me think if that is the expectation then the data set is
> actually small and Mahout might be overkill...
>
> What obvious piece of the Mahout puzzle am I missing?
>
> Thanks.
>
> Dan
>
> On Wed, Nov 30, 2011 at 11:56 AM, Sean Owen <[email protected]> wrote:
>
> > Have you used CachingItemSimilarity? That will hold common similarities in
> > memory. It's a lot easier than pre-computing and might help.
> >
> > I think something like your change is a good one (Sebastian what do you
> > think) in that it gives you the ultimate lever to control how many
> > candidates are evaluated. That ought to make it go as fast as you like, but
> > it trades off quality. Still I'd be really surprised if there's no viable
> > middle ground -- this works fine at smaller scale, where 100s of candidates
> > are evaluated, perhaps, and you can use your lever to get to 100s of
> > candidates at your scale too. Is that still both slow and inaccurate?
> >
> > On Wed, Nov 30, 2011 at 3:18 PM, Daniel Zohar <[email protected]> wrote:
> >
> > > I just tested the app with Mahout 0.6.
> > > There seems to be a small performance improvement, but still
> > > recommendations for the 'heavy users' take between 1-5 seconds.
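
For anyone following along, the idea behind the caching suggestion quoted above (hold common similarities in memory instead of recomputing or pre-computing them) can be sketched in plain Java. This is only an illustration of the memoization idea, not Mahout's CachingItemSimilarity: the names PairSimilarity and CachingPairSimilarity are hypothetical, the similarity is assumed to be symmetric, and the cache here is unbounded, which a real deployment would probably want to cap.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for an item-item similarity; not a Mahout interface. */
interface PairSimilarity {
    double similarity(long itemA, long itemB);
}

/** Memoizing wrapper: computes each unordered item pair once, then serves it from memory. */
final class CachingPairSimilarity implements PairSimilarity {
    private final PairSimilarity delegate;
    private final Map<String, Double> cache = new HashMap<>();
    private int misses = 0;

    CachingPairSimilarity(PairSimilarity delegate) {
        this.delegate = delegate;
    }

    @Override
    public synchronized double similarity(long itemA, long itemB) {
        // Assumption: similarity is symmetric, so (a, b) and (b, a) share one entry.
        long lo = Math.min(itemA, itemB);
        long hi = Math.max(itemA, itemB);
        String key = lo + ":" + hi;

        Double cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        misses++; // the expensive computation only happens on a cache miss
        double value = delegate.similarity(lo, hi);
        cache.put(key, value);
        return value;
    }

    /** Number of times the underlying similarity was actually computed. */
    int misses() {
        return misses;
    }
}

final class CachingSimilarityDemo {
    public static void main(String[] args) {
        // Toy similarity for illustration only: closer item IDs count as more similar.
        PairSimilarity toy = (a, b) -> 1.0 / (1 + Math.abs(a - b));
        CachingPairSimilarity cached = new CachingPairSimilarity(toy);

        cached.similarity(3L, 7L);
        cached.similarity(7L, 3L); // same unordered pair, served from the cache
        System.out.println("underlying computations: " + cached.misses());
    }
}
```

The tradeoff is memory for latency: pairs that come up often (for example, items shared by many 'heavy users') are computed once, while rarely seen pairs still pay the full cost the first time.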
