[ https://issues.apache.org/jira/browse/MAHOUT-393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved MAHOUT-393.
------------------------------

         Assignee: Sean Owen
    Fix Version/s: 0.4
       Resolution: Fixed

Done, I committed with only two substantive tweaks:

- I switched over to VLongWritable from LongWritable. Most IDs in use need far fewer than 8 bytes, so variable-length encoding saves a lot of space.
- CountUsersKeyWritable did not define non-trivial equals() and hashCode() methods, which left it inconsistent with compareTo().

Am I missing something about this?

> Distributed item similarity functions
> -------------------------------------
>
>                 Key: MAHOUT-393
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-393
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Collaborative Filtering
>            Reporter: Sebastian Schelter
>            Assignee: Sean Owen
>             Fix For: 0.4
>
>         Attachments: MAHOUT-393.patch
>
>
> To complete the work started in MAHOUT-389, I've created a distributed
> version of every item similarity function that is already available in
> non-distributed form. An additional M/R job was necessary to compute the
> total number of users, which is needed by some similarity functions
> (LogLikelihoodSimilarity, for example).
> There is still some optimization potential in the code, as not every
> similarity function needs all of the information that is currently extracted
> (the number of users, for example), but the optimization would have made the
> code much less readable, so I did not do any work on that.
> I hope you consider this a useful addition.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
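
To illustrate the second tweak mentioned in the comment (equals() and hashCode() agreeing with compareTo()), here is a minimal sketch in plain Java. The class UserKey and its single userID field are hypothetical stand-ins chosen for illustration; this is not the actual CountUsersKeyWritable code from the patch, and it leaves out the Hadoop Writable serialization methods entirely.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical key class for illustration only; NOT the real CountUsersKeyWritable.
final class UserKey implements Comparable<UserKey> {

    // Assumed field: the sketch compares keys by a single user ID.
    private final long userID;

    UserKey(long userID) {
        this.userID = userID;
    }

    // Ordering is defined by userID alone.
    @Override
    public int compareTo(UserKey other) {
        return Long.compare(userID, other.userID);
    }

    // equals() uses the same field as compareTo(), so
    // a.compareTo(b) == 0 exactly when a.equals(b).
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof UserKey)) {
            return false;
        }
        return userID == ((UserKey) o).userID;
    }

    // hashCode() is derived from the same field, so equal keys hash alike.
    @Override
    public int hashCode() {
        return Long.hashCode(userID);
    }
}

public class KeyConsistencyDemo {
    public static void main(String[] args) {
        UserKey a = new UserKey(42L);
        UserKey b = new UserKey(42L);

        // Hash-based and sort-based collections now agree on duplicates.
        Set<UserKey> hashed = new HashSet<>();
        hashed.add(a);
        hashed.add(b);

        Set<UserKey> sorted = new TreeSet<>();
        sorted.add(a);
        sorted.add(b);

        System.out.println(hashed.size() + " " + sorted.size()); // prints "1 1"
    }
}
```

Without the equals() and hashCode() overrides, the HashSet in this sketch would hold two entries while the TreeSet would hold one, which is the kind of inconsistency the comment describes.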