Hi all,

I'm curious what approaches are recommended for generating user-user 
similarity when I have two (or more) distinct types of item data, each of 
which is fairly large.

E.g. let's say I have a set of users where I know both (a) which books they 
have bought on Amazon, and (b) which YouTube videos they have watched.

For each user, I want to find the 10 most similar other users.

 - I could create two separate models (one per data type), find the nearest 
   30 users for each user in each model, and combine the results (maybe with 
   weighting).
 - I could toss all of the data into one model, and use a preference value 
   below 1.0 for whichever type of data is less important.
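
To make the two options concrete, here is a rough Python sketch of what I have in mind. It is only an illustration under assumptions: cosine similarity stands in for whatever similarity measure would really be used, the brute-force neighbor search would not scale to large data, and every name and weight in it (combine_models, single_model, w_books, w_videos) is made up for this example. In the first approach I also assume that a user missing from one model's candidate list simply contributes nothing from that model.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as {item: weight}."""
    dot = sum(w * b[i] for i, w in a.items() if i in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(user, prefs, k):
    """Return the k users most similar to `user` as [(other_user, score), ...]."""
    mine = prefs.get(user, {})
    scores = [(other, cosine(mine, vec)) for other, vec in prefs.items() if other != user]
    # Highest score first; ties broken by user id so the order is deterministic.
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores[:k]


def combine_models(user, books, videos, k=10, candidates=30,
                   w_books=1.0, w_videos=0.5):
    """Approach 1: take the nearest `candidates` users from each model,
    then rank them by a weighted sum of the per-model scores."""
    merged = defaultdict(float)
    for other, score in top_k(user, books, candidates):
        merged[other] += w_books * score
    for other, score in top_k(user, videos, candidates):
        merged[other] += w_videos * score
    ranked = sorted(merged.items(), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:k]


def single_model(user, books, videos, k=10, w_books=1.0, w_videos=0.5):
    """Approach 2: put both item types into one preference vector per user,
    scaling the less important type by a value below 1.0."""
    prefs = {}
    for u in set(books) | set(videos):
        # Tag each item with its type so book and video ids cannot collide.
        vec = {("book", i): w_books * v for i, v in books.get(u, {}).items()}
        vec.update({("video", i): w_videos * v for i, v in videos.get(u, {}).items()})
        prefs[u] = vec
    return top_k(user, prefs, k)
```

One difference the sketch makes visible: in the first approach the weights are applied to similarity scores after each model has been normalized on its own, while in the second they are applied to the raw preference values, so the larger or denser data type can still dominate the combined vector even with a weight below 1.0.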

Any other suggestions? Input on the above two approaches?

Thanks!

-- Ken

--------------------------
Ken Krugler
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Mahout & Solr



