Re: Mahout performance issues

Daniel Zohar Fri, 02 Dec 2011 10:10:20 -0800

And how do you propose to kill these items? I mean, we should still keep
all the user-item associations, shouldn't we?
If an item is that popular, how would we recommend items to users who have
interacted only with that item?


On Fri, Dec 2, 2011 at 8:07 PM, Ted Dunning <[email protected]> wrote:

> Since touching them adds nothing but cost, then not touching them is
> better.  Kill the item!
>
> In practical terms, we had this problem at Veoh.  Everybody got the same
> intro video.  It provided no information.  Likewise at Musicmatch,
> everybody got the same startup noise during the splash screen.  It added no
> information.  Both of these cases would kill performance in lots of
> recommendation engines because a vast number of users would get sucked into
> computations where it made no difference at all.
>
> Better to kill these items.
>
> On Fri, Dec 2, 2011 at 10:03 AM, Sean Owen <[email protected]> wrote:
>
> > Yes, but those users will bring no more candidate items to consider, and
> > the apparent bottleneck is not touching those users, but later computing
> > all those similarities. That's my argument.
> >
> > On Fri, Dec 2, 2011 at 5:56 PM, Ted Dunning <[email protected]> wrote:
> > >
> > > Actually, if these users' single item is a fantastically popular item,
> > > then all of those users will be roped into the computation (with no
> > > effect).
> > >
> > > Sean's argument would be correct if the users were each interacting
> > > with some item that is way out in the low frequency tail.  By Murphy,
> > > this won't be the case.
> > >
> > > Better to dump the uninformative items using a kill list.
> > >
> >
>
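
To make the kill-list idea in the quoted thread concrete, here is a minimal
sketch in plain Python. It is not Mahout's API: the function names
(build_kill_list, co_users), the max_user_fraction threshold, and the sample
data are all hypothetical and chosen only for illustration. It assumes the
user-item associations stay in the data untouched, and that killed items are
simply skipped when gathering the users who share an item with the target
user.

```python
from collections import defaultdict


def build_kill_list(user_items, max_user_fraction=0.5):
    """Return the items seen by more than max_user_fraction of all users.

    The threshold is an assumption; the thread does not say how to pick it.
    """
    counts = defaultdict(int)
    for items in user_items.values():
        for item in set(items):
            counts[item] += 1
    threshold = max_user_fraction * len(user_items)
    return {item for item, n in counts.items() if n > threshold}


def co_users(user_items, target_user, kill_list):
    """Users sharing at least one non-killed item with target_user."""
    informative = set(user_items[target_user]) - kill_list
    return {
        user
        for user, items in user_items.items()
        if user != target_user and informative & set(items)
    }


# Hypothetical data: every user has the same "intro" item.
user_items = {
    "u1": ["intro", "a"],
    "u2": ["intro", "b"],
    "u3": ["intro", "a", "c"],
    "u4": ["intro"],
}

kill = build_kill_list(user_items, max_user_fraction=0.75)
without_kill = co_users(user_items, "u1", set())
with_kill = co_users(user_items, "u1", kill)
only_killed = co_users(user_items, "u4", kill)
```

With the kill list applied, u1 is compared only against users who share an
informative item, instead of against everyone who saw "intro". A user such as
u4, who has interacted only with a killed item, ends up with no co-users at
all; this sketch leaves open how such a user should be served.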
