[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842680#comment-13842680 ]

Hudson commented on MAHOUT-1030:
--------------------------------

SUCCESS: Integrated in Mahout-Quality #2356 (See 
[https://builds.apache.org/job/Mahout-Quality/2356/])
MAHOUT-1030: Regression: Clustered Points Should be 
WeightedPropertyVectorWritable not WeightedVectorWritable (smarthi: rev 1549089)
* /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/classify/ClusterClassificationDriver.java
MAHOUT-1030: Regression: Clustered Points Should be 
WeightedPropertyVectorWritable not WeightedVectorWritable (smarthi: rev 1549087)
* /mahout/trunk/CHANGELOG
* /mahout/trunk/core/src/main/java/org/apache/mahout/clustering/classify/ClusterClassificationMapper.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/clustering/classify/ClusterClassificationDriverTest.java
* /mahout/trunk/core/src/test/java/org/apache/mahout/clustering/kmeans/TestKmeansClustering.java
* /mahout/trunk/integration/src/main/java/org/apache/mahout/utils/clustering/AbstractClusterWriter.java
* /mahout/trunk/integration/src/main/java/org/apache/mahout/utils/clustering/CSVClusterWriter.java
* /mahout/trunk/integration/src/main/java/org/apache/mahout/utils/clustering/ClusterDumper.java
* /mahout/trunk/integration/src/main/java/org/apache/mahout/utils/clustering/ClusterDumperWriter.java
* /mahout/trunk/integration/src/main/java/org/apache/mahout/utils/clustering/GraphMLClusterWriter.java
* /mahout/trunk/integration/src/main/java/org/apache/mahout/utils/clustering/JsonClusterWriter.java
* /mahout/trunk/integration/src/main/java/org/apache/mahout/utils/vectors/lucene/ClusterLabels.java


> Regression: Clustered Points Should be WeightedPropertyVectorWritable not 
> WeightedVectorWritable
> ------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-1030
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1030
>             Project: Mahout
>          Issue Type: Bug
>          Components: Clustering, Integration
>    Affects Versions: 0.7
>            Reporter: Jeff Eastman
>            Assignee: Andrew Musselman
>             Fix For: 1.0, 0.9
>
>         Attachments: MAHOUT-1030.patch, MAHOUT-1030.patch, MAHOUT-1030.patch, 
> MAHOUT-1030.patch, MAHOUT-1030.patch, MAHOUT-1030.patch, MAHOUT-1030.patch
>
>
> Looks like this won't make it into this build. Pretty widespread impact on 
> code and tests and I don't know which properties were implemented in the old 
> version. I will create a JIRA and post my interim results.
> On 6/8/12 12:21 PM, Jeff Eastman wrote:
> > That's a reversion that evidently got in when the new 
> > ClusterClassificationDriver was introduced. It should be a pretty easy fix 
> > and I will see if I can make the change before Paritosh cuts the release 
> > bits tonight.
> >
> > On 6/7/12 1:00 PM, Pat Ferrel wrote:
> >> It appears that in kmeans the clusteredPoints are now written as 
> >> WeightedVectorWritable where in mahout 0.6 they were 
> >> WeightedPropertyVectorWritable? This means that the distance from the 
> >> centroid is no longer stored here? Why? I hope I'm wrong because that is 
> >> not a welcome change. How is one to order clustered docs by distance from 
> >> cluster centroid?
> >>
> >> I'm sure I could calculate the distance but that would mean looking up the 
> >> centroid for the cluster id given in the above WeightedVectorWritable, 
> >> which means iterating through all the clusters for each clustered doc. In 
> >> my case the number of clusters could be fairly large.
> >>
> >> Am I missing something?
> >>
> >>
> >
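
The workaround described in the quoted message (recomputing each point's distance to its centroid when the distance is not stored with the point) does not have to scan every cluster for every document. A minimal sketch, assuming the centroids can be loaded once into a map keyed by cluster id: the class `DistanceOrdering`, the type `ClusteredPoint`, and the plain `double[]` vectors are hypothetical stand-ins for illustration and are not Mahout classes; Euclidean distance is likewise an assumption, since the thread does not say which distance measure is in use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DistanceOrdering {

    /** Hypothetical stand-in for a clustered point: a cluster id plus the point's vector. */
    static final class ClusteredPoint {
        final int clusterId;
        final double[] vector;

        ClusteredPoint(int clusterId, double[] vector) {
            this.clusterId = clusterId;
            this.vector = vector;
        }
    }

    /** Euclidean distance between two vectors of equal length. */
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /**
     * Returns a copy of the points ordered by cluster id, then by distance
     * to that cluster's centroid. Each centroid lookup is a single map get,
     * not a scan over all clusters.
     */
    static List<ClusteredPoint> orderByDistance(List<ClusteredPoint> points,
                                                Map<Integer, double[]> centroids) {
        List<ClusteredPoint> sorted = new ArrayList<>(points);
        sorted.sort(Comparator
            .comparingInt((ClusteredPoint p) -> p.clusterId)
            .thenComparingDouble(p -> euclidean(p.vector, centroids.get(p.clusterId))));
        return sorted;
    }

    public static void main(String[] args) {
        // Assumed to be built once from the final cluster output.
        Map<Integer, double[]> centroids = new HashMap<>();
        centroids.put(0, new double[] {0.0, 0.0});
        centroids.put(1, new double[] {10.0, 10.0});

        List<ClusteredPoint> points = Arrays.asList(
            new ClusteredPoint(1, new double[] {13.0, 14.0}),
            new ClusteredPoint(0, new double[] {3.0, 4.0}),
            new ClusteredPoint(0, new double[] {1.0, 0.0}));

        for (ClusteredPoint p : orderByDistance(points, centroids)) {
            System.out.println(p.clusterId + " " + Arrays.toString(p.vector));
        }
    }
}
```

The comparator recomputes the distance on every comparison, which keeps the sketch short; with many points per cluster it would likely be better to compute each distance once and sort on the cached value. Storing the distance alongside the point at clustering time, as the issue requests, avoids the recomputation altogether.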



--
This message was sent by Atlassian JIRA
(v6.1#6144)
