[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2014-01-28 Thread Pat Ferrel (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13884313#comment-13884313 ] Pat Ferrel commented on MAHOUT-1030: Ran it through KMeans, FuzzyKMeans, sequential

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2014-01-27 Thread Pat Ferrel (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13883103#comment-13883103 ] Pat Ferrel commented on MAHOUT-1030: using cosine similarity for clustering I'm

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2014-01-27 Thread Suneel Marthi (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13883118#comment-13883118 ] Suneel Marthi commented on MAHOUT-1030: --- Agreed Pat. The range of Cosine Distance

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2014-01-27 Thread Andrew Musselman (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13883138#comment-13883138 ] Andrew Musselman commented on MAHOUT-1030: -- As I recall that distance is the

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2014-01-27 Thread Suneel Marthi (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13883149#comment-13883149 ] Suneel Marthi commented on MAHOUT-1030: --- That's correct Andrew. Its the distance of

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2014-01-27 Thread Pat Ferrel (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13883188#comment-13883188 ] Pat Ferrel commented on MAHOUT-1030: The distance should be measured the same way

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2014-01-27 Thread Suneel Marthi (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13883321#comment-13883321 ] Suneel Marthi commented on MAHOUT-1030: --- I hope I am wrong but someone please

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2014-01-27 Thread Andrew Musselman (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13883555#comment-13883555 ] Andrew Musselman commented on MAHOUT-1030: -- Something's wrong with those

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2014-01-27 Thread Suneel Marthi (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13883774#comment-13883774 ] Suneel Marthi commented on MAHOUT-1030: --- Andrew, looking at this issue now and u're

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2014-01-27 Thread Andrew Musselman (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13883826#comment-13883826 ] Andrew Musselman commented on MAHOUT-1030: -- Looks good. Regression: Clustered

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2014-01-25 Thread Suneel Marthi (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881762#comment-13881762 ] Suneel Marthi commented on MAHOUT-1030: --- My bad, should have caught this earlier.

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2014-01-24 Thread Pat Ferrel (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881226#comment-13881226 ] Pat Ferrel commented on MAHOUT-1030: This fixes a very literal reading of the bug.

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2014-01-24 Thread Andrew Musselman (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881273#comment-13881273 ] Andrew Musselman commented on MAHOUT-1030: -- Yes, was looking at this last night

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2014-01-24 Thread Pat Ferrel (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881283#comment-13881283 ] Pat Ferrel commented on MAHOUT-1030: Hmm, Suneel recommends creating a new Jira so I

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2014-01-24 Thread Pat Ferrel (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881287#comment-13881287 ] Pat Ferrel commented on MAHOUT-1030: adde a

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2013-12-08 Thread Suneel Marthi (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842574#comment-13842574 ] Suneel Marthi commented on MAHOUT-1030: --- Patch committed to trunk, Thanks Andrew.

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2013-12-08 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842680#comment-13842680 ] Hudson commented on MAHOUT-1030: SUCCESS: Integrated in Mahout-Quality #2356 (See

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2013-12-02 Thread Pat Ferrel (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13836677#comment-13836677 ] Pat Ferrel commented on MAHOUT-1030: as you wish, either is clear, thanks.

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2013-12-01 Thread Pat Ferrel (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13836085#comment-13836085 ] Pat Ferrel commented on MAHOUT-1030: Don't have time to fully test this right now but

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2013-12-01 Thread Pat Ferrel (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13836087#comment-13836087 ] Pat Ferrel commented on MAHOUT-1030: I hope Jeff can answer about normalized results,

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2013-12-01 Thread Andrew Musselman (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13836091#comment-13836091 ] Andrew Musselman commented on MAHOUT-1030: -- Okay; I did the distance calculation

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2013-12-01 Thread Pat Ferrel (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13836099#comment-13836099 ] Pat Ferrel commented on MAHOUT-1030: Thanks, I see now. That looks correct. This is

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2013-12-01 Thread Andrew Musselman (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13836280#comment-13836280 ] Andrew Musselman commented on MAHOUT-1030: -- Or output square-root of

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2013-11-30 Thread Andrew Musselman (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13835947#comment-13835947 ] Andrew Musselman commented on MAHOUT-1030: -- Hm, just noticed my patch is very

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2013-11-28 Thread Andrew Musselman (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13834990#comment-13834990 ] Andrew Musselman commented on MAHOUT-1030: -- I'll dig through the 0.6 code for

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2013-11-28 Thread Pat Ferrel (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13835031#comment-13835031 ] Pat Ferrel commented on MAHOUT-1030: Broken record warning: The bigger issue (I agree

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2013-11-28 Thread Andrew Musselman (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13835036#comment-13835036 ] Andrew Musselman commented on MAHOUT-1030: -- I'm planning on fixing this for

Re: [jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2013-10-31 Thread Andrew Musselman
So it sounds like there are a few things going on: (1) The quick fix would be to revert to the WeightedPropertyVectorWritable so we could hold on to the key or distance to centroid, e.g., for each vector (2) But WeightedPropertyVectorWritable is not sufficient or general enough for how people

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2013-10-30 Thread Andrew Musselman (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13809686#comment-13809686 ] Andrew Musselman commented on MAHOUT-1030: -- Grant, who should I talk to to get

Re: [jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2013-10-30 Thread Pat Ferrel
As I recall this goes back to 0.6 where the clustered points used to store the distance to the centroid in the properties of a WeightedPropertyVectorWritable then 0.7 changed the output to WeightedVectorWritable and the distance to centroid could not be stored in the new class. I modified my

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2013-10-30 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13809813#comment-13809813 ] Grant Ingersoll commented on MAHOUT-1030: - Andrew, I suppose it depends on what

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2013-06-10 Thread Pat Ferrel (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13679620#comment-13679620 ] Pat Ferrel commented on MAHOUT-1030: Not really. What I do now is recalculate the

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2013-06-10 Thread Pat Ferrel (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13679624#comment-13679624 ] Pat Ferrel commented on MAHOUT-1030: BTW you can look at the code of 0.6 or 0.7 to

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2013-06-09 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13679170#comment-13679170 ] Grant Ingersoll commented on MAHOUT-1030: - Pat, do you have a patch for this that

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2013-06-08 Thread Suneel Marthi (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678785#comment-13678785 ] Suneel Marthi commented on MAHOUT-1030: --- [~pferrel]Did you have a chance to try the

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2013-06-08 Thread Pat Ferrel (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678847#comment-13678847 ] Pat Ferrel commented on MAHOUT-1030: +1 to Lance's point. My point of 01/Jul/12 was

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2012-07-01 Thread Pat Ferrel (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13404737#comment-13404737 ] Pat Ferrel commented on MAHOUT-1030: A couple other thoughts: 1) With

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2012-06-30 Thread Pat Ferrel (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13404581#comment-13404581 ] Pat Ferrel commented on MAHOUT-1030: Personally I have a work around that's

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2012-06-30 Thread Lance Norskog (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13404650#comment-13404650 ] Lance Norskog commented on MAHOUT-1030: --- bq. If this fixes the regression

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2012-06-28 Thread Pat Ferrel (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403570#comment-13403570 ] Pat Ferrel commented on MAHOUT-1030: Jeff said; It is trivial to back-calculate the

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2012-06-09 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292427#comment-13292427 ] Hudson commented on MAHOUT-1030: Integrated in Mahout-Quality #1537 (See

[jira] [Commented] (MAHOUT-1030) Regression: Clustered Points Should be WeightedPropertyVectorWritable not WeightedVectorWritable

2012-06-09 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292438#comment-13292438 ] Hudson commented on MAHOUT-1030: Integrated in Mahout-Quality #1538 (See