Re: ItemSimilarity algorithm

Sean Owen Tue, 03 Jul 2012 09:19:36 -0700

Item-item similarity is a property of the information you have on two
items and just those items. Whether there are just those 2 items over
500K users, or 2M items over 500K users, makes no difference. So no, I
don't think this skew, by itself, implies you should use any particular
algorithm.
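
To make that concrete, here is a minimal sketch of a Tanimoto coefficient
computed by hand. This is not Mahout's implementation; the class name
TanimotoSketch and the user IDs are made up for illustration. The point is
only that the inputs are the two sets of users who touched each item, so
how many other items exist in the catalogue never enters the calculation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration class, not part of Mahout.
class TanimotoSketch {

    // Tanimoto coefficient: |A and B| / |A or B|, where A and B are the
    // sets of users who expressed a preference for each of the two items.
    static double tanimoto(Set<Long> usersOfA, Set<Long> usersOfB) {
        if (usersOfA.isEmpty() && usersOfB.isEmpty()) {
            return 0.0; // assumption: treat "no data at all" as no similarity
        }
        Set<Long> intersection = new HashSet<>(usersOfA);
        intersection.retainAll(usersOfB);
        int unionSize = usersOfA.size() + usersOfB.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }

    public static void main(String[] args) {
        // Made-up user IDs for two items.
        Set<Long> usersOfA = new HashSet<>(Arrays.asList(1L, 2L, 3L, 4L));
        Set<Long> usersOfB = new HashSet<>(Arrays.asList(3L, 4L, 5L, 6L));
        System.out.println(tanimoto(usersOfA, usersOfB));
    }
}
```

Nothing here changes whether the data set holds 1000 items or 2M; only the
users attached to these two items matter.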


I think other considerations tend to dominate. For example, very sparse
data makes the Pearson and cosine measures work poorly. But with
relatively so few items, I imagine your data is not that sparse.
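
One way to see the sparsity problem, as a rough sketch and not Mahout's
code: assume Pearson is computed only over the users who rated both items.
With sparse data that overlap can shrink to two users, and then the
correlation comes out as exactly +1 or -1 no matter what the ratings were.
The class name PearsonOverlapSketch and the ratings are invented for
illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class, not part of Mahout.
class PearsonOverlapSketch {

    // Pearson correlation over the users who rated both items.
    // Returns NaN when fewer than two users overlap or a variance is zero.
    static double pearson(Map<Long, Double> ratingsA, Map<Long, Double> ratingsB) {
        int n = 0;
        double sumA = 0, sumB = 0, sumAA = 0, sumBB = 0, sumAB = 0;
        for (Map.Entry<Long, Double> entry : ratingsA.entrySet()) {
            Double b = ratingsB.get(entry.getKey());
            if (b == null) {
                continue; // this user did not rate the second item
            }
            double a = entry.getValue();
            n++;
            sumA += a;
            sumB += b;
            sumAA += a * a;
            sumBB += b * b;
            sumAB += a * b;
        }
        if (n < 2) {
            return Double.NaN;
        }
        double cov = sumAB - sumA * sumB / n;
        double varA = sumAA - sumA * sumA / n;
        double varB = sumBB - sumB * sumB / n;
        if (varA <= 0 || varB <= 0) {
            return Double.NaN;
        }
        return cov / Math.sqrt(varA * varB);
    }

    public static void main(String[] args) {
        // Made-up ratings: only users 1 and 2 rated both items.
        Map<Long, Double> itemA = new HashMap<>();
        itemA.put(1L, 5.0);
        itemA.put(2L, 3.0);
        itemA.put(7L, 1.0);
        Map<Long, Double> itemB = new HashMap<>();
        itemB.put(1L, 4.0);
        itemB.put(2L, 2.0);
        itemB.put(9L, 5.0);
        System.out.println(pearson(itemA, itemB));
    }
}
```

A "perfect" correlation built on two co-rating users says very little,
which is why a denser matrix (as seems likely with only 1000 items) makes
these measures more usable.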

On Tue, Jul 3, 2012 at 6:57 PM, Saikat Kanjilal <[email protected]> wrote:
>
> Hello Everyone, I was reading through the documentation on the different
> item similarity algorithms in Mahout and had a question: if one has a scenario
> where the number of items is significantly less than the number of users
> (say 500,000 users to 1000 items), are there particular item similarity
> coefficients (namely logLikelihood or the Tanimoto coefficient) that lend
> themselves to producing better recommendations? I've read through Mahout in
> Action and the Javadocs and can't seem to find any clues on this. Any
> insight based on your experience would be much appreciated.
> Regards
