Hi Ted,

Thanks for pointing me in the right direction. I just looked more closely at the recommendation wiki and I think I can do what you proposed. To quote from this page <https://cwiki.apache.org/confluence/display/MAHOUT/Itembased+Collaborative+Filtering>:

"*org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob* computes all similar items. It expects a .csv file with the preference data as input, where each line represents a single preference in the form *userID,itemID,value* and outputs pairs of itemIDs with their associated similarity value."
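To make the input format concrete, here is a minimal sketch in plain Python (not Mahout code). It writes membership pairs as "userID,itemID,value" lines with the group ID in the item column and a constant value of 1, then computes a Tanimoto (Jaccard) coefficient for each pair of groups as one example of a 0-1 similarity. The sample data and the function names are made up for illustration, and the choice of Tanimoto is an assumption; the similarity measure that ItemSimilarityJob actually applies depends on how the job is configured.

```python
import csv
import io
from collections import defaultdict
from itertools import combinations


def to_preference_lines(memberships):
    """Turn (user_id, group_id) pairs into 'userID,itemID,value' CSV text.

    The group ID is placed in the item column and the value is always 1.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for user_id, group_id in memberships:
        writer.writerow([user_id, group_id, 1])
    return buf.getvalue()


def tanimoto_by_group_pair(memberships):
    """Return {(group_a, group_b): similarity} for groups sharing at least one user.

    Similarity is |shared users| / |users in either group|, so it lies in (0, 1].
    """
    users_in = defaultdict(set)
    for user_id, group_id in memberships:
        users_in[group_id].add(user_id)

    sims = {}
    for a, b in combinations(sorted(users_in), 2):
        shared = len(users_in[a] & users_in[b])
        if shared:  # pairs with no common user are simply left out
            sims[(a, b)] = shared / len(users_in[a] | users_in[b])
    return sims


# Hypothetical sample data: (userId, groupId) membership pairs.
memberships = [(1, 10), (2, 10), (3, 10), (2, 20), (3, 20), (4, 20), (4, 30)]

csv_text = to_preference_lines(memberships)   # text in the userID,itemID,value form
sims = tanimoto_by_group_pair(memberships)    # pairwise group similarities
```

Here groups 10 and 20 share two of four distinct users, so their similarity is 0.5, while groups 10 and 30 have no user in common and produce no pair at all.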
If I pass the data in the format "userId,groupId,1", it should output pairs of groupIDs with their similarities, or at least I hope so. Sounds easy :)

Many thanks!
Radek

On 17 February 2011 17:42, Ted Dunning <[email protected]> wrote:

> Yes.
>
> Simply transpose your data and then use standard similarity techniques.
>
> Transposition in this case means that you would reformulate your data to be
>
> group1: user ... user
>
> In practice, the standard input form for Mahout recommendations is more like
> this:
>
> user group rating
>
> where your ratings will always be 1. Simply redesignation of the two first
> columns suffices to transpose data like this.
>
> On Thu, Feb 17, 2011 at 3:34 AM, Radek Maciaszek
> <[email protected]> wrote:
>
> > I am trying to find a similarities between the groups (not the users). Some
> > simple similarity metric (e.g. 0-1, close to 0 for not similar at all, close
> > to 1 very similar) would be ideal. So essentially I need to calculate such a
> > metric for every pair of groups.
> >
> > Is it something Mahout can help me with?
