[
https://issues.apache.org/jira/browse/MAHOUT-1601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14090515#comment-14090515
]
Frank Rosner commented on MAHOUT-1601:
--------------------------------------
Ok. This discussion is somewhat unrelated to the issue itself (that JavaDoc
should be added), but I believe that none of the three similarities will help
you. You can't use the DummySimilarity because it is in a test package and it
does not do anything useful. The GenericUserSimilarity seems to require a
similarity matrix (which you don't have), and LogLikelihoodSimilarity seems
rather complex.
I'm not sure which objects you want to compare using a similarity measure. Do
you want to compare different names (i.e. projects, I suppose), or all the
records, which leads to a comparison of [name1, language1, numberoflinesofcode1]
with [name1, language2, numberoflinesofcode2], for example?
You could model the objects in a vector space and use a vector similarity
(cosine, Euclidean, ...). You would then have to dummy-code the name and the
language, I suppose. Or you could bin the LOC (lines of code) into categories
and compare using set similarity, i.e. count the number of matching features.
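To make the two suggestions concrete, here is a minimal plain-Java sketch. It
does not use any Mahout API. The class name RecordSimilarity, the helper
names, the vocabularies, the LOC scaling by maxLoc and the bucket thresholds
are all assumptions made up for illustration, not anything taken from your
data or from Mahout.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Mahout.
final class RecordSimilarity {

    // Dummy-code (one-hot encode) a categorical value against a fixed vocabulary.
    // Values missing from the vocabulary yield an all-zero vector.
    static double[] oneHot(String value, List<String> vocabulary) {
        double[] v = new double[vocabulary.size()];
        int i = vocabulary.indexOf(value);
        if (i >= 0) {
            v[i] = 1.0;
        }
        return v;
    }

    // Feature vector: one-hot name, one-hot language, LOC scaled by maxLoc.
    // The scaling is an assumption so that LOC does not dominate the vector.
    static double[] encode(String name, String language, int loc,
                           List<String> names, List<String> languages, int maxLoc) {
        double[] n = oneHot(name, names);
        double[] l = oneHot(language, languages);
        double[] out = new double[n.length + l.length + 1];
        System.arraycopy(n, 0, out, 0, n.length);
        System.arraycopy(l, 0, out, n.length, l.length);
        out[out.length - 1] = (double) loc / maxLoc;
        return out;
    }

    // Cosine similarity of two equal-length vectors; 0.0 if either is all zeros.
    static double cosine(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Categorize LOC into coarse buckets; the thresholds are arbitrary assumptions.
    static String locBucket(int loc) {
        if (loc < 1000) {
            return "small";
        }
        if (loc < 100000) {
            return "medium";
        }
        return "large";
    }

    // Represent a record as a set of "attribute=value" features.
    static Set<String> features(String name, String language, int loc) {
        Set<String> f = new HashSet<>();
        f.add("name=" + name);
        f.add("language=" + language);
        f.add("loc=" + locBucket(loc));
        return f;
    }

    // Set similarity as suggested above: the number of matching features.
    static int matchingFeatures(Set<String> a, Set<String> b) {
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        return common.size();
    }
}
```

You would call encode(...) on two records and pass the results to
cosine(...), or build features(...) for both and call matchingFeatures(...).
Dividing the match count by the size of the union of both sets would give the
Jaccard similarity instead of a raw count.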
Maybe you can ask a supervisor or the user mailing list about this.
> Add javadoc for the classes - as there is no clue what the class is for .
> -------------------------------------------------------------------------
>
> Key: MAHOUT-1601
> URL: https://issues.apache.org/jira/browse/MAHOUT-1601
> Project: Mahout
> Issue Type: Documentation
> Components: Documentation
> Reporter: Harish Kayarohanam
> Priority: Minor
> Labels: documentation
>
> I found that the following classes
> org.apache.mahout.cf.taste.impl.neighborhood.DummySimilarity
> org.apache.mahout.cf.taste.impl.similarity.GenericUserSimilarity
> org.apache.mahout.cf.taste.impl.similarity.LogLikelihoodSimilarity
> did not have java doc . So I was unable to find what these classes are for .
> Shall we add java doc for the same ?
--
This message was sent by Atlassian JIRA
(v6.2#6252)