After thinking about it more, I think your theory is right.

You really should use more like 90% of your data to train and 10% to test,
rather than the other way around. Here it seems fairly clear that the 10%
training set is producing a result that isn't representative of the real
performance. That's how I'd "fix" this, plain and simple.
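
Concretely, with the Taste evaluators that's just the trainingPercentage
argument to evaluate(). Here's a minimal sketch; the prefs.csv file name and
the item-based / log-likelihood setup are placeholders, not necessarily your
exact configuration:

import java.io.File;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
import org.apache.mahout.cf.taste.eval.RecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.eval.AverageAbsoluteDifferenceRecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.LogLikelihoodSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.ItemSimilarity;

public class SplitEvaluation {
  public static void main(String[] args) throws Exception {
    DataModel model = new FileDataModel(new File("prefs.csv")); // placeholder file

    RecommenderBuilder builder = new RecommenderBuilder() {
      @Override
      public Recommender buildRecommender(DataModel trainingModel) throws TasteException {
        ItemSimilarity similarity = new LogLikelihoodSimilarity(trainingModel);
        return new GenericItemBasedRecommender(trainingModel, similarity);
      }
    };

    RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator();
    // 0.9: train on 90% of each user's preferences, score against the held-out 10%.
    // 1.0: include all users in the evaluation.
    double score = evaluator.evaluate(builder, null, model, 0.9, 1.0);
    System.out.println("score = " + score);
  }
}

The 0.9 is the split you want: 90% of each user's preferences go into the
training model, and the remaining 10% are what the recommender is scored
against.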

Sean

On Wed, Jan 4, 2012 at 11:42 AM, Nick Jordan <[email protected]> wrote:

> Yeah, I'm a little perplexed.  By low-rank items I mean items that have a
> low number of preferences, not a low average preference.  Basically, if we
> don't have some level of confidence in our ItemSimilarity, because not
> many people have given a preference, good or bad, don't recommend them.
> To your point, though, LogLikelihood may already account for that, which
> makes these results even more surprising.
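If you do end up wanting to suppress low-count items regardless, the usual
hook is an IDRescorer passed to recommend(). Rough sketch; the class name
MinPreferenceCountRescorer and any particular threshold are made up:

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.IDRescorer;

// Filters out any item that fewer than minPrefs users have expressed a
// preference for, without changing the scores of the items that remain.
public class MinPreferenceCountRescorer implements IDRescorer {
  private final DataModel model;
  private final int minPrefs;

  public MinPreferenceCountRescorer(DataModel model, int minPrefs) {
    this.model = model;
    this.minPrefs = minPrefs;
  }

  @Override
  public double rescore(long itemID, double originalScore) {
    return originalScore; // no reweighting, filtering only
  }

  @Override
  public boolean isFiltered(long itemID) {
    try {
      return model.getNumUsersWithPreferenceFor(itemID) < minPrefs;
    } catch (TasteException te) {
      return true; // can't count preferences, so play it safe and filter
    }
  }
}

Then something like recommender.recommend(userID, 10,
new MinPreferenceCountRescorer(model, 5)). Depending on the DataModel it may
be worth precomputing the counts rather than calling
getNumUsersWithPreferenceFor() per candidate.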
