As a small follow-up on this, here's a small result that should hold: setting the sampling rate to 1/X (i.e. if you set it to 20%, X = 5) should, of course, reduce the time spent finding a neighborhood by a factor of X. Assuming users are fairly evenly scattered around your rating-space, the average distance to the users in your computed neighborhood also increases by roughly a factor of X: keep only 1 in X users, and the typical spacing between the remaining users grows by about that same factor.
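
To make that speed/quality knob concrete, here is a minimal sketch of sampled neighborhood formation. This is plain standalone Java, not the Taste API; the class name, the dense double[] rating vectors, and the choice of Euclidean distance are all simplifying assumptions for illustration:

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Random;

/**
 * Illustrative only: finds the k nearest neighbors of a target user
 * while examining each candidate with probability samplingRate (1/X).
 * Users are modeled as dense rating vectors; a real recommender would
 * use sparse preference data and a pluggable similarity measure.
 */
public class SampledNeighborhood {

  private static final Random RANDOM = new Random();

  /** Plain Euclidean distance in rating-space (an assumed metric). */
  static double distance(double[] a, double[] b) {
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      sum += d * d;
    }
    return Math.sqrt(sum);
  }

  /**
   * Returns the IDs of the k nearest sampled candidates. With
   * samplingRate = 1/X, only about a 1/X fraction of candidates is
   * examined, so the scan costs roughly 1/X of a full scan.
   */
  static List<String> findNeighborhood(double[] target,
                                       Map<String, double[]> candidates,
                                       int k,
                                       double samplingRate) {
    List<Map.Entry<String, double[]>> sampled = new ArrayList<>();
    for (Map.Entry<String, double[]> e : candidates.entrySet()) {
      // Skip ~ (1 - 1/X) of users before any distance is computed.
      if (RANDOM.nextDouble() < samplingRate) {
        sampled.add(e);
      }
    }
    sampled.sort(Comparator.comparingDouble(
        (Map.Entry<String, double[]> e) -> distance(target, e.getValue())));
    List<String> neighborhood = new ArrayList<>();
    for (int i = 0; i < Math.min(k, sampled.size()); i++) {
      neighborhood.add(sampled.get(i).getKey());
    }
    return neighborhood;
  }
}

With samplingRate = 0.2 (X = 5), the loop touches about a fifth of the candidates, which is where the factor-of-X speedup comes from; the cost is that the k survivors are drawn from a 5x sparser pool, hence the larger average distance.
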
So you get results X times faster, but the results you get are X times 'worse'. This sounds bad, but consider that users 5 times farther away in your rating-space may still be suitable neighbors and yield the same recommendations.

On Fri, May 1, 2009 at 8:32 AM, Sean Owen <[email protected]> wrote:
> It really depends on the nature of the data and what tradeoff you want
> to make. I have not studied this in detail. Anecdotally, on a
> large-ish data set you can ignore most users and still end up with an
> OK neighborhood.
>
> Actually I should do a bit of math to get an analytical result on
> this, let me do that.
