Re: Add on to itemsimilarity

Ted Dunning Mon, 30 Jan 2012 10:32:47 -0800

I don't know that I have any secrets.

I have observed garbage performance from recommenders based on behavior.
 That performance got enormously better as we chose different behaviors to
indicate engagement.

As an example of what we looked at, consider a video site which records
ratings, video views and 30 second (or more) video views.

Ratings information was minute compared to the other data and thus had
little value.  Many videos never had any ratings and the vast majority of
all users never rated anything.  Even worse, it was impossible to ever
detect any improvement in performance when we added ratings information.
 Performance with ratings alone was not discernibly better than random
recommendations.

Video views were the largest data source and, after the problems with the
paucity of ratings, looked better.  Unfortunately, our users often
clicked on videos due to misleading metadata or because they were vaguely
curious.  Neither of those situations represented an expression of user
preference.  In practice, recommender performance with video views was
better than random, but still pretty poor.

30 second video views produced very good results in spite of the fact that
the data was 10x smaller than raw video views.  This was demonstrated by
heuristic examination (aka the "laugh test") and by click-through and by
user session length.  Mixing in video views degraded performance visibly.
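
To make the filtering step concrete, here is a minimal sketch of keeping
only the 30 second (or more) views as recommender input.  The event
record, its field names and the sample data are hypothetical; nothing
here reflects the actual logging format we used.

```python
from collections import namedtuple

# Hypothetical log record; the field names are assumptions for illustration.
ViewEvent = namedtuple("ViewEvent", ["user_id", "video_id", "seconds_watched"])

# Threshold taken from the text: a view counts only at 30 seconds or more.
MIN_ENGAGED_SECONDS = 30


def engaged_pairs(events, min_seconds=MIN_ENGAGED_SECONDS):
    """Return the set of (user, video) pairs with at least one long view."""
    return {
        (e.user_id, e.video_id)
        for e in events
        if e.seconds_watched >= min_seconds
    }


# Made-up sample data: short clicks are dropped, long views are kept.
events = [
    ViewEvent("u1", "v1", 45),   # kept
    ViewEvent("u1", "v2", 3),    # dropped: curiosity click
    ViewEvent("u2", "v3", 30),   # kept: exactly at the threshold
    ViewEvent("u2", "v1", 12),   # dropped
]

pairs = engaged_pairs(events)
```

The resulting pairs are boolean (no rating value), which is the kind of
input an item-item co-occurrence model works from.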

In building these systems, it was critical to incorporate a system like the
LogLikelihoodSimilarity for building the item-item model.  Direct
user-based recommenders that used cosine and similar user-user metrics were
laughably bad and were dominated by popular items.
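
For readers who have not seen the log-likelihood ratio test applied to
item co-occurrence, the following is a rough sketch of the score computed
from a 2x2 contingency table.  It is not Mahout's code; the function names
and the example counts are assumptions made for illustration.

```python
import math


def x_log_x(x):
    """x * ln(x), with the usual convention that 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    """Unnormalized entropy of a set of counts: N ln N - sum(k ln k)."""
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def llr(k11, k12, k21, k22):
    """Log-likelihood ratio score for a 2x2 co-occurrence table.

    k11: users who engaged with both item A and item B
    k12: users who engaged with A but not B
    k21: users who engaged with B but not A
    k22: users who engaged with neither
    """
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp at zero to absorb small negative values from rounding.
    return max(0.0, 2.0 * (row + col - mat))


# Made-up counts: the first table is exactly what independence predicts,
# the second shows two items that almost always occur together.
independent = llr(10, 10, 10, 10)
associated = llr(100, 1, 1, 100)
```

A score near zero means the co-occurrence is about what chance would give,
so the pair says little.  That is also why a hugely popular item does not
score highly against everything just because it co-occurs with everything,
which is the failure mode of the cosine-style metrics mentioned above.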

In earlier work at Musicmatch, we had similar results in that we had to
carefully select which interactions we used as input to the recommender.
 The overall process was much simpler, however, since we came closer to
good results in our first tries.

On Mon, Jan 30, 2012 at 1:43 PM, Lee Carroll
<[email protected]> wrote:

> >So I find that mental state estimations are the indirect way to model and
> >predict behaviors while directly modeling behaviors based on observed
> >behaviors is, well, more direct.
>
> That's a lovely switch :-) you should come and work for our business
> unit, they would love you :-)
>
> However the experience of using page behaviour to recommend product
> has been really disappointing
> never out performing simple heuristics (and i mean really simple
> market segmentation). Maybe we should look again
> but having fallen for the engagement metric stuff once what would we
> need to look out for to make it better ?
> What's your secret Ted!
