Hi, thanks Shannon for the pointers. I had been looking at org.apache.mahout.math, expecting it to be the logical place for these measures to be defined. Thanks, Suneel, for pointing out the actual location.
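For reference, the Minkowski distance discussed in this thread is simple to state. Below is a minimal sketch in plain Java, assuming dense double[] arrays rather than Mahout's Vector class. The class name MinkowskiSketch and the method minkowski are hypothetical and are not Mahout API:

```java
// Hypothetical illustration only; not part of Mahout.
final class MinkowskiSketch {
    private MinkowskiSketch() {}

    /**
     * Minkowski distance of order p between two equal-length vectors:
     * (sum_i |a[i] - b[i]|^p)^(1/p). Assumes p >= 1.
     */
    public static double minkowski(double[] a, double[] b, double p) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        if (p < 1.0) {
            throw new IllegalArgumentException("p must be >= 1");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.pow(Math.abs(a[i] - b[i]), p);
        }
        return Math.pow(sum, 1.0 / p);
    }

    public static void main(String[] args) {
        double[] x = {0.0, 0.0};
        double[] y = {3.0, 4.0};
        System.out.println(minkowski(x, y, 1.0)); // p = 1: Manhattan distance
        System.out.println(minkowski(x, y, 2.0)); // p = 2: Euclidean distance
    }
}
```

With p = 1 this reduces to the Manhattan distance and with p = 2 to the Euclidean distance, which is why a single parameterized measure is convenient for something like a nearest-neighbor classifier.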
I am new to Apache Mahout and was unaware that MapReduce code is no longer accepted. However, KNN is a very basic classification algorithm, and hence I wanted to work on it.

On Mon, May 19, 2014 at 12:24 AM, Shannon Quinn <[email protected]> wrote:

> Hi Arunav,
>
> Contributions are certainly welcome. If you can post a patch on JIRA (
> https://issues.apache.org/jira/browse/MAHOUT ), we can have a look at it.
> I don't know if you've been monitoring our mailing lists or have otherwise
> heard, but Mahout is no longer accepting new MapReduce code. We're still in
> discussions regarding the next-generation Mahout backends, but we're moving
> instead towards engine-agnostic (e.g. Mahout DSL, see
> http://mahout.apache.org/users/sparkbindings/home.html ) implementations.
>
> As for Minkowski distance, I'm not sure if someone else is working on it,
> but as I mentioned you're welcome to post a patch and we can discuss it
> from there. Thanks!
>
> Shannon
>
>
> On 5/18/14, 1:29 PM, Arunav Sanyal wrote:
>
>> Hi
>>
>> I am new to Apache Mahout and would like to contribute in whatever humble
>> way I can.
>>
>> I see that the Vector class in Apache Mahout does not have the
>> functionality of Minkowski distance.
>>
>> http://en.wikipedia.org/wiki/Minkowski_distance
>>
>> is a distance metric which generalizes distance measures between any two
>> vectors. It can represent Hamming distance, Euclidean distance depending
>> on parameters. I already have a simple solution ready for review if this
>> is approved. Similarly I am working on the more generic Mahalanobis
>> distance measure.
>>
>> My primary motive for introducing these distance measures is to come up
>> with a generic implementation of the K-nearest neighbor classifier (not to
>> be confused with K-means clustering). I will be working on that as well
>> shortly.
>>
>> If somebody else is working towards these features, I would like to
>> collaborate and donate whatever code patches that they deem necessary. If
>> not, I humbly request that the community approve these for inclusion into
>> Apache Mahout.
>>
>>
>> Yours sincerely
>> Arunav Sanyal
>>
>
>

--
Arunav Sanyal
Graduate student
B.E (Hons) Computer Science
BITS Pilani K.K Birla Goa Campus
Software Engineer
INFORMATICA BUSINESS SOLUTIONS
