Re: Mahout performance issues

Sean Owen Wed, 30 Nov 2011 12:29:01 -0800

The simple answer is this:

Mahout absorbed a non-distributed recommender project called Taste, which
scales up to a point that may be sufficient for a lot of users. It
certainly is a lot simpler. Yes, it is realistic to do near-real-time
recommendations, though it gets harder and harder and requires more tuning,
tradeoffs and optimization, as this thread shows.


The rest, written from scratch, is almost all distributed and Hadoop-based,
including distributed re-implementations of the same algorithms.

On Wed, Nov 30, 2011 at 8:23 PM, Dan Beaulieu
<[email protected]> wrote:

> Hi all, this is a tangent and can mostly be ignored by the people
> interested in this problem.
>
> I'm new to Machine Learning and especially Mahout. Following this
> discussion has made me a bit confused.
> Isn't Mahout used for large datasets where it makes sense to distribute the
> work? Why then isn't anyone pointing out that the problem may be the use
> of one single Mahout node? Is it because it's boolean based? Is it because
> the data set isn't really that large?
>
> Even if, for whatever reason, a single node will do for this case, is it
> really expected that the recommendation process would finish in less than
> half a second?
> This makes me think that, if that is the expectation, then the data set is
> actually small and Mahout might be overkill...
>
> What obvious piece of the Mahout puzzle am I missing?
>
> Thanks.
>
> Dan
>
> On Wed, Nov 30, 2011 at 11:56 AM, Sean Owen <[email protected]> wrote:
>
> > Have you used CachingItemSimilarity? That will hold common similarities
> > in memory. It's a lot easier than pre-computing and might help.
> >
> > I think something like your change is a good one (Sebastian, what do you
> > think?) in that it gives you the ultimate lever to control how many
> > candidates are evaluated. That ought to make it go as fast as you like,
> > but it trades off quality. Still, I'd be really surprised if there's no
> > viable middle ground -- this works fine at smaller scale, where 100s of
> > candidates are evaluated, perhaps, and you can use your lever to get to
> > 100s of candidates at your scale too. Is that still both slow and
> > inaccurate?
> >
> > On Wed, Nov 30, 2011 at 3:18 PM, Daniel Zohar <[email protected]> wrote:
> >
> > > I just tested the app with Mahout 0.6.
> > > There seems to be a small performance improvement, but
> > > recommendations for the 'heavy users' still take between 1 and 5 seconds.
> > >
> > >
> >
>
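
For readers following the CachingItemSimilarity suggestion quoted above, here is
a minimal sketch of the general idea: wrap an expensive item-item similarity in
a decorator that keeps computed values in memory. It does not use Mahout's API.
The names PairSimilarity, CachingPairSimilarity and CachingSketchDemo are
hypothetical, and the toy similarity function is made up for illustration. It
also assumes similarity is symmetric and lets the cache grow without bound,
which a production cache would probably not do.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an item-item similarity; NOT Mahout's interface.
interface PairSimilarity {
    double similarity(long itemA, long itemB);
}

// Decorator that memoizes the results of an expensive similarity computation.
final class CachingPairSimilarity implements PairSimilarity {
    private final PairSimilarity delegate;
    // Assumption: unbounded map for brevity; a real cache would cap its size.
    private final Map<String, Double> cache = new HashMap<>();

    CachingPairSimilarity(PairSimilarity delegate) {
        this.delegate = delegate;
    }

    @Override
    public synchronized double similarity(long itemA, long itemB) {
        // Assumption: similarity is symmetric, so (a, b) and (b, a) share an entry.
        long lo = Math.min(itemA, itemB);
        long hi = Math.max(itemA, itemB);
        String key = lo + ":" + hi;
        Double cached = cache.get(key);
        if (cached == null) {
            cached = delegate.similarity(lo, hi);
            cache.put(key, cached);
        }
        return cached;
    }
}

final class CachingSketchDemo {
    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Toy similarity that counts how often it is actually computed.
        PairSimilarity slow = (a, b) -> {
            calls.incrementAndGet();
            return 1.0 / (1.0 + Math.abs(a - b));
        };
        PairSimilarity cached = new CachingPairSimilarity(slow);
        cached.similarity(1, 3);
        cached.similarity(3, 1); // same pair in reverse order, served from the cache
        System.out.println("underlying computations: " + calls.get());
    }
}
```

The trade-off is memory for time: pairs that recur across requests are computed
once, which is why this is less work than pre-computing every similarity up
front.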
