[ https://issues.apache.org/jira/browse/MAHOUT-344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873541#action_12873541 ]
Ankur commented on MAHOUT-344:
------------------------------
Just back from a vacation. I am catching up on a lot of things, so I won't be
able to review Cristi's changes for the next 3-4 days, but I am hoping that any
further changes will be minor.
Cristi, do you have any results to share from your testing on the last.fm
dataset?
Once this is in, we can start working towards using it to generate
recommendations.
> Minhash based clustering
> -------------------------
>
> Key: MAHOUT-344
> URL: https://issues.apache.org/jira/browse/MAHOUT-344
> Project: Mahout
> Issue Type: Bug
> Components: Clustering
> Affects Versions: 0.3
> Reporter: Ankur
> Assignee: Ankur
> Attachments: MAHOUT-344-v1.patch, MAHOUT-344-v2.patch
>
>
> Minhash clustering performs probabilistic dimension reduction of
> high-dimensional data. The essence of the technique is to hash each item using
> multiple independent hash functions such that similar items are more likely to
> collide than dissimilar ones. Multiple such hash tables can then be constructed
> to answer near-neighbor queries efficiently.
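
The idea in the description above can be illustrated with a minimal sketch.
This is not the code in the attached patches: it is plain Python rather than a
Mahout job, and every name in it (make_hash_functions, minhash_signature,
cluster_keys, the artist sets) is hypothetical. It assumes each item is a set
of string features, uses linear hash functions modulo a prime as a stand-in for
independent hash functions, and groups minhash values into keys so that items
sharing a key land in the same bucket.

```python
import random
import zlib

# Assumption: a Mersenne prime (2^31 - 1) used as the modulus for the
# illustrative linear hash family h(x) = (a * x + b) mod PRIME.
PRIME = 2_147_483_647


def make_hash_functions(num_hashes, seed=42):
    """Return num_hashes (a, b) coefficient pairs, reproducible via seed."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(num_hashes)]


def stable_hash(feature):
    """Map a feature to an integer; crc32 is stable across processes,
    unlike the built-in hash() on strings."""
    return zlib.crc32(str(feature).encode("utf-8")) % PRIME


def minhash_signature(features, hash_functions):
    """For each hash function, keep the minimum hash over all features."""
    hashed = [stable_hash(f) for f in features]
    return [min((a * h + b) % PRIME for h in hashed)
            for a, b in hash_functions]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree; this approximates the
    Jaccard similarity of the two underlying sets."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


def cluster_keys(signature, keys_per_group):
    """Concatenate consecutive minhash values into bucket keys; items that
    share any key are candidates for the same cluster."""
    return [tuple(signature[i:i + keys_per_group])
            for i in range(0, len(signature), keys_per_group)]


# Hypothetical listening histories (made-up data, not last.fm results).
hash_functions = make_hash_functions(128)
user_a = {"radiohead", "portishead", "massive attack", "bjork", "air"}
user_b = {"radiohead", "portishead", "massive attack", "bjork", "moby"}
user_c = {"metallica", "slayer", "megadeth", "anthrax", "pantera"}

sig_a = minhash_signature(user_a, hash_functions)
sig_b = minhash_signature(user_b, hash_functions)
sig_c = minhash_signature(user_c, hash_functions)

print("a vs b:", estimate_jaccard(sig_a, sig_b))
print("a vs c:", estimate_jaccard(sig_a, sig_c))
print("first bucket key for a:", cluster_keys(sig_a, 2)[0])
```

Using more hash functions tightens the similarity estimate, while putting more
minhash values into each key makes buckets stricter, so fewer dissimilar items
collide at the cost of missing some similar ones.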