Further, the MR version has NamedVectors but the non-MR version has RandomAccessSparseVectors.

On Sat, Jan 25, 2014 at 9:43 AM, Andrew Musselman <[email protected]> wrote:

> The vectors are having keys added during the MR version, which the
> reference in the test doesn't expect. See attached screenshots of
> variables during debugging.
>
>
> On Sat, Jan 25, 2014 at 9:29 AM, Andrew Musselman <[email protected]> wrote:
>
>> Still trying to understand what these tests are doing, but that is only
>> blowing up when that is called
>> during testVectorClassificationWithOutlierRemoval*MR*. Runs fine
>> during testVectorClassificationWithOutlierRemoval.
>>
>>
>> On Sat, Jan 25, 2014 at 9:14 AM, Andrew Musselman <[email protected]> wrote:
>>
>>> Trying it out, found one test failure:
>>>
>>> Failed tests:
>>>
>>> ClusterClassificationDriverTest.testVectorClassificationWithOutlierRemovalMR:102->assertVectorsWithOutlierRemoval:188->checkClustersWithOutlierRemoval:238->Assert.assertTrue:41->Assert.fail:88
>>> not expecting cluster:0:{0:1.0,1:1.0}
>>>
>>> Here's the stack trace when I run that test:
>>>
>>> java.lang.AssertionError: not expecting cluster:0:{0:1.0,1:1.0}
>>> at __randomizedtesting.SeedInfo.seed([9DD682CDC661ECA:5DEF7B1855381EF]:0)
>>> at org.junit.Assert.fail(Assert.java:88)
>>> at org.junit.Assert.assertTrue(Assert.java:41)
>>> at org.apache.mahout.clustering.classify.ClusterClassificationDriverTest.checkClustersWithOutlierRemoval(ClusterClassificationDriverTest.java:238)
>>> at org.apache.mahout.clustering.classify.ClusterClassificationDriverTest.assertVectorsWithOutlierRemoval(ClusterClassificationDriverTest.java:188)
>>> at org.apache.mahout.clustering.classify.ClusterClassificationDriverTest.testVectorClassificationWithOutlierRemovalMR(ClusterClassificationDriverTest.java:102)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>
>>>
>>> On Sat, Jan 25, 2014 at 1:31 AM, Suneel Marthi (JIRA) <[email protected]> wrote:
>>>
>>>>
>>>> [ https://issues.apache.org/jira/browse/MAHOUT-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>>>>
>>>> Suneel Marthi updated MAHOUT-1410:
>>>> ----------------------------------
>>>>
>>>>     Affects Version/s: (was: 0.9)
>>>>
>>>> > clusteredPoints do not contain a vector id
>>>> > ------------------------------------------
>>>> >
>>>> >                 Key: MAHOUT-1410
>>>> >                 URL: https://issues.apache.org/jira/browse/MAHOUT-1410
>>>> >             Project: Mahout
>>>> >          Issue Type: Bug
>>>> >          Components: Clustering
>>>> >    Affects Versions: 0.8
>>>> >         Environment: using 0.9 release candidate
>>>> >            Reporter: Pat Ferrel
>>>> >            Assignee: Suneel Marthi
>>>> >             Fix For: 0.9
>>>> >
>>>> >         Attachments: MAHOUT-1410.patch
>>>> >
>>>> >
>>>> > When clustering non-named vectors there are no vector ids in
>>>> > clusteredPoints so the other values there, cluster id, vector values,
>>>> > distance-squared, pdf, cannot be tied to any known vector.
>>>>
>>>>
>>>>
>>>> --
>>>> This message was sent by Atlassian JIRA
>>>> (v6.1.5#6160)
