Re: Mahout not giving recommendations with large data sets

Sean Owen Fri, 14 Aug 2009 00:02:31 -0700

Yes, an item-based recommender is better -- slope-one is also
reasonable. Slope one takes more precomputation to start up but is
likely faster at runtime. Also, try LogLikelihoodSimilarity. If you
find it actually gives better results, that's good news. It doesn't
even use rating values, so, you would then be able to drop that part
of your data.
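
To make concrete why the rating values are not needed: a log-likelihood similarity can be computed purely from counts of who bought what. Below is a minimal sketch of Dunning's log-likelihood ratio over a 2x2 co-occurrence table, in plain Java. It is not Mahout's own code; the class name LlrSketch, the method names, and the final mapping 1 - 1/(1 + llr) onto the range [0, 1) are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch only; names are hypothetical, not Mahout's API.
class LlrSketch {

    // x * ln(x), with the usual convention that 0 * ln(0) = 0.
    static double xLogX(long x) {
        return x == 0 ? 0.0 : x * Math.log(x);
    }

    // Unnormalized Shannon entropy of a group of counts.
    static double entropy(long... counts) {
        long sum = 0;
        double parts = 0.0;
        for (long c : counts) {
            sum += c;
            parts += xLogX(c);
        }
        return xLogX(sum) - parts;
    }

    // k11: users with both items, k12: only A, k21: only B, k22: neither.
    static double logLikelihoodRatio(long k11, long k12, long k21, long k22) {
        double rowEntropy = entropy(k11 + k12, k21 + k22);
        double colEntropy = entropy(k11 + k21, k12 + k22);
        double matrixEntropy = entropy(k11, k12, k21, k22);
        // Clamp tiny negative values caused by floating-point round-off.
        return Math.max(0.0, 2.0 * (rowEntropy + colEntropy - matrixEntropy));
    }

    // Similarity from sets of user IDs alone; no rating values are involved.
    static double similarity(Set<Long> buyersOfA, Set<Long> buyersOfB, long totalUsers) {
        Set<Long> both = new HashSet<>(buyersOfA);
        both.retainAll(buyersOfB);
        long k11 = both.size();
        long k12 = buyersOfA.size() - k11;
        long k21 = buyersOfB.size() - k11;
        long k22 = totalUsers - k11 - k12 - k21;
        double llr = logLikelihoodRatio(k11, k12, k21, k22);
        return 1.0 - 1.0 / (1.0 + llr); // assumed mapping onto [0, 1)
    }

    public static void main(String[] args) {
        Set<Long> a = new HashSet<>(Arrays.asList(1L, 2L, 3L));
        Set<Long> b = new HashSet<>(Arrays.asList(1L, 2L, 3L));
        System.out.println(similarity(a, b, 6));
    }
}
```

The only inputs are the sets of users per item and the total user count, which is why the <value> column could be dropped if this similarity works well for you.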


Yes, the GroupLens example is still there and has been updated for the new changes.

I am still quite puzzled about not getting recommendations. Are you
doing any sampling in any part of the code? That is, if you use less
and less of the data as it grows, in order to scale, maybe at some
point you are omitting so much data that what remains is too sparse
for many similarities to be computed.
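
One way to test this guess is to count, for a pair of items, how many users have a preference for both, before and after any sampling. A similarity between two items can only be based on those shared users, and a correlation-style measure is undefined when there are fewer than two of them. The sketch below is plain Java with hypothetical names (OverlapCheck, coRatedUsers) and made-up toy data; it does not use Mahout's API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative diagnostic only; names and data are hypothetical.
class OverlapCheck {

    // Number of users who have a preference for both itemA and itemB.
    static int coRatedUsers(Map<Long, Set<Long>> itemsByUser, long itemA, long itemB) {
        int n = 0;
        for (Set<Long> items : itemsByUser.values()) {
            if (items.contains(itemA) && items.contains(itemB)) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        Map<Long, Set<Long>> full = new HashMap<>();
        full.put(1L, new HashSet<>(Arrays.asList(100L, 200L)));
        full.put(2L, new HashSet<>(Arrays.asList(100L, 200L)));
        full.put(3L, new HashSet<>(Arrays.asList(100L)));

        // Simulate sampling by dropping one user's data.
        Map<Long, Set<Long>> sampled = new HashMap<>(full);
        sampled.remove(2L);

        System.out.println("full:    " + coRatedUsers(full, 100L, 200L));
        System.out.println("sampled: " + coRatedUsers(sampled, 100L, 200L));
    }
}
```

If the overlap for most item pairs drops to zero or one once the data grows past 5 million lines, that would point to sampling as the cause.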

On Thu, Aug 13, 2009 at 11:56 PM, mishkinf<[email protected]> wrote:
>
> Well, in fact it is strange: with the same data set, when it is 5 million
> lines it produces a number of recommendation results, but when it is larger
> it simply returns no results. It does not run into memory exceptions, and
> nothing abnormal is being printed on the console. This confused me.
>
> My data is basically of the form -1 to 1. I am looking at a list of purchased
> items, i.e.
> <userid> <itemid> <# times purchased>
> but then I run a normalization algorithm on it so the data returned is
> actually
> <userid> <itemid> <value -1 to 1>
>
> In terms of users vs. products, I'm looking at many more users than
> products (millions vs. thousands), and the number of users keeps growing.
> This is why I was thinking item-based recommenders were a good fit.
>
> This normalized data is the data I feed to mahout. I basically modified the
> GroupLens example and that has been what I was working off of. If that
> example exists in the 0.2 version it might be worth my while to upgrade.
>
