Hi Nicolas,

Mahalanobis distance sounds pretty useful. If you have any favorite references containing real-world examples, definitely pass them along.
I have committed the patch with some minor modifications. I wanted to point out that contributed code should include Apache license headers and conform to the Mahout/Lucene code formatting conventions, specifically things like indenting 2 spaces per level instead of using tabs, etc. See the following pages for the details:

https://cwiki.apache.org/confluence/display/MAHOUT/How+To+Contribute
http://wiki.apache.org/lucene-java/HowToContribute

Thanks for the contribution and welcome,

Drew

On Thu, Jul 22, 2010 at 4:05 PM, Nicolas Maillot <[email protected]> wrote:
> Ted,
>
> I have just attached the patch. Tell me if you have any problem with it.
> Many thanks for your help,
>
> Nicolas
>
> On Thu, Jul 22, 2010 at 9:54 PM, Ted Dunning <[email protected]> wrote:
>> Nicolas,
>>
>> I think you have to attach the patch as a file.
>>
>> On Thu, Jul 22, 2010 at 12:52 PM, Nicolas Maillot (JIRA) <[email protected]> wrote:
>>>
>>> [ https://issues.apache.org/jira/browse/MAHOUT-446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel]
>>>
>>> Nicolas Maillot updated MAHOUT-446:
>>> -----------------------------------
>>>
>>>     Status: Patch Available (was: Open)
>>>
>>> > Mahalanobis Distance + Singular Value Decomposition
>>> > ---------------------------------------------------
>>> >
>>> >                 Key: MAHOUT-446
>>> >                 URL: https://issues.apache.org/jira/browse/MAHOUT-446
>>> >             Project: Mahout
>>> >          Issue Type: New Feature
>>> >          Components: Classification
>>> >    Affects Versions: 0.4
>>> >         Environment: GNU/Linux Ubuntu Lucid Lynx running in VMWare fusion 2.0.7.
>>> >            Reporter: Nicolas Maillot
>>> >   Original Estimate: 0h
>>> >  Remaining Estimate: 0h
>>> >
>>> > This patch contains an implementation of the Mahalanobis distance + a unit test.
>>> > As explained in Wikipedia (http://en.wikipedia.org/wiki/Mahalanobis_distance), it is a useful way of determining similarity of an unknown sample set to a known one. It differs from Euclidean distance in that it takes into account the correlations of the data set and is scale-invariant.
>>> > Also contained in the patch:
>>> > - A port of the SingularValueDecomposition Class to the Matrix data structure.
>>> > - An embryonic port of the matrix.linalg Algebra class to the Matrix/Vector data structure.
>>>
>>> --
>>> This message is automatically generated by JIRA.
>>> -
>>> You can reply to this email to add a comment to the issue online.
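
For anyone reading the thread without the patch at hand, here is a rough sketch of the two properties the issue description mentions (sensitivity to correlations and scale invariance). It is plain Python limited to 2-D points, not the code from MAHOUT-446; the function names `mean_and_covariance` and `mahalanobis` and the sample data are made up for illustration.

```python
import math


def mean_and_covariance(samples):
    """Return the mean and the 2x2 sample covariance of a list of 2-D points."""
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    sxx = sum((x - mx) ** 2 for x, _ in samples) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in samples) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in samples) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))


def mahalanobis(point, mean, cov):
    """Compute sqrt((p - m)^T S^-1 (p - m)) for a 2x2 covariance matrix S."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    # The inverse of [[a, b], [c, d]] is (1 / det) * [[d, -b], [-c, a]].
    quadratic_form = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.sqrt(quadratic_form)


# Hypothetical data set whose two coordinates are positively correlated.
samples = [(-2.0, -2.0), (2.0, 2.0), (-1.0, 1.0), (1.0, -1.0)]
mean, cov = mean_and_covariance(samples)

# Both points are at the same Euclidean distance from the mean.
along = mahalanobis((1.0, 1.0), mean, cov)     # follows the correlation
against = mahalanobis((1.0, -1.0), mean, cov)  # goes against it
print(along, against)

# Rescaling one axis (e.g. changing its unit) leaves the distance unchanged.
scaled = [(100.0 * x, y) for x, y in samples]
scaled_mean, scaled_cov = mean_and_covariance(scaled)
print(mahalanobis((100.0, 1.0), scaled_mean, scaled_cov))
```

With this data the point lying along the correlation comes out at roughly 0.61 and the one against it at roughly 1.22, although both are at Euclidean distance sqrt(2) from the mean. The last line gives the same value as `along`, up to floating-point rounding, even though the first axis was multiplied by 100.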
