Item-item similarity is a property of the information you have on two items and just those items. Whether there are just those 2 items over 500K users, or 2M items over 500K users, makes no difference. So no, I don't think this skew by itself implies you should use any particular algorithm.
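To make that concrete, here is a minimal sketch in plain Java. It is not Mahout's API; the class name PairwiseSimilarity and the method tanimoto are made up for illustration, and it assumes boolean preferences (a user either has the item or not). The point is only that the function takes the data for two items and nothing else, so the size of the rest of the catalog never enters into it.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Mahout.
public class PairwiseSimilarity {

    // Tanimoto (Jaccard) coefficient: |A intersect B| / |A union B|,
    // computed over the sets of user IDs that expressed a preference
    // for each of the two items. No other item's data is consulted.
    public static double tanimoto(Set<Long> usersOfA, Set<Long> usersOfB) {
        if (usersOfA.isEmpty() && usersOfB.isEmpty()) {
            return 0.0; // assumption: treat "no data at all" as no similarity
        }
        Set<Long> intersection = new HashSet<>(usersOfA);
        intersection.retainAll(usersOfB);
        int unionSize = usersOfA.size() + usersOfB.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }

    public static void main(String[] args) {
        // Toy data: users 3 and 4 have both items.
        Set<Long> itemA = new HashSet<>(Arrays.asList(1L, 2L, 3L, 4L));
        Set<Long> itemB = new HashSet<>(Arrays.asList(3L, 4L, 5L));
        System.out.println(tanimoto(itemA, itemB));
    }
}
```

Adding a third item, or two million more, would leave the value for this pair unchanged, because only the two user sets passed in are ever read.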
I think other considerations tend to dominate. For example, very sparse data makes the Pearson and cosine measures work poorly. But with relatively so few items, I imagine your data is not that sparse.

On Tue, Jul 3, 2012 at 6:57 PM, Saikat Kanjilal <[email protected]> wrote:
> Hello Everyone,
> I was reading through the documentation on the different item similarity
> algorithms in Mahout and had a question. If one has a scenario where the
> number of items is significantly less than the number of users (say
> 500,000 users to 1000 items), are there particular item similarity
> coefficients (namely logLikelihood or Tanimoto coefficient) that lend
> themselves to producing better recommendations? I've read through Mahout
> in Action and the Javadocs and can't seem to find any clues on this. Any
> insight based on your experience would be much appreciated.
> Regards
