[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13884313#comment-13884313
]
Pat Ferrel commented on MAHOUT-1030:
Ran it through KMeans, FuzzyKMeans, sequential
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13883103#comment-13883103
]
Pat Ferrel commented on MAHOUT-1030:
using cosine similarity for clustering I'm
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13883118#comment-13883118
]
Suneel Marthi commented on MAHOUT-1030:
---
Agreed, Pat. The range of Cosine Distance
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13883138#comment-13883138
]
Andrew Musselman commented on MAHOUT-1030:
--
As I recall that distance is the
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13883149#comment-13883149
]
Suneel Marthi commented on MAHOUT-1030:
---
That's correct, Andrew. It's the distance of
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13883188#comment-13883188
]
Pat Ferrel commented on MAHOUT-1030:
The distance should be measured the same way
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13883321#comment-13883321
]
Suneel Marthi commented on MAHOUT-1030:
---
I hope I am wrong but someone please
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13883555#comment-13883555
]
Andrew Musselman commented on MAHOUT-1030:
--
Something's wrong with those
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13883774#comment-13883774
]
Suneel Marthi commented on MAHOUT-1030:
---
Andrew, looking at this issue now and you're
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13883826#comment-13883826
]
Andrew Musselman commented on MAHOUT-1030:
--
Looks good.
Regression: Clustered
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881762#comment-13881762
]
Suneel Marthi commented on MAHOUT-1030:
---
My bad, should have caught this earlier.
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881226#comment-13881226
]
Pat Ferrel commented on MAHOUT-1030:
This fixes a very literal reading of the bug.
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881273#comment-13881273
]
Andrew Musselman commented on MAHOUT-1030:
--
Yes, was looking at this last night
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881283#comment-13881283
]
Pat Ferrel commented on MAHOUT-1030:
Hmm, Suneel recommends creating a new Jira so I
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13881287#comment-13881287
]
Pat Ferrel commented on MAHOUT-1030:
added a
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842574#comment-13842574
]
Suneel Marthi commented on MAHOUT-1030:
---
Patch committed to trunk. Thanks, Andrew.
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13842680#comment-13842680
]
Hudson commented on MAHOUT-1030:
SUCCESS: Integrated in Mahout-Quality #2356 (See
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13836677#comment-13836677
]
Pat Ferrel commented on MAHOUT-1030:
As you wish; either is clear. Thanks.
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13836085#comment-13836085
]
Pat Ferrel commented on MAHOUT-1030:
Don't have time to fully test this right now but
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13836087#comment-13836087
]
Pat Ferrel commented on MAHOUT-1030:
I hope Jeff can answer about normalized results,
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13836091#comment-13836091
]
Andrew Musselman commented on MAHOUT-1030:
--
Okay; I did the distance calculation
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13836099#comment-13836099
]
Pat Ferrel commented on MAHOUT-1030:
Thanks, I see now. That looks correct. This is
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13836280#comment-13836280
]
Andrew Musselman commented on MAHOUT-1030:
--
Or output square-root of
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13835947#comment-13835947
]
Andrew Musselman commented on MAHOUT-1030:
--
Hm, just noticed my patch is very
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13834990#comment-13834990
]
Andrew Musselman commented on MAHOUT-1030:
--
I'll dig through the 0.6 code for
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13835031#comment-13835031
]
Pat Ferrel commented on MAHOUT-1030:
Broken record warning: The bigger issue (I agree
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13835036#comment-13835036
]
Andrew Musselman commented on MAHOUT-1030:
--
I'm planning on fixing this for
So it sounds like there are a few things going on:
(1) The quick fix would be to revert to WeightedPropertyVectorWritable so we
could hold on to, e.g., the key or distance to centroid for each vector
(2) But WeightedPropertyVectorWritable is not sufficient or general enough
for how people
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13809686#comment-13809686
]
Andrew Musselman commented on MAHOUT-1030:
--
Grant, who should I talk to to get
As I recall, this goes back to 0.6, where the clustered points stored the
distance to the centroid in the properties of a WeightedPropertyVectorWritable.
Then 0.7 changed the output to WeightedVectorWritable, and the distance to
centroid could no longer be stored in the new class.
I modified my
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13809813#comment-13809813
]
Grant Ingersoll commented on MAHOUT-1030:
-
Andrew, I suppose it depends on what
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13679620#comment-13679620
]
Pat Ferrel commented on MAHOUT-1030:
Not really. What I do now is recalculate the
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13679624#comment-13679624
]
Pat Ferrel commented on MAHOUT-1030:
BTW you can look at the code of 0.6 or 0.7 to
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13679170#comment-13679170
]
Grant Ingersoll commented on MAHOUT-1030:
-
Pat, do you have a patch for this that
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678785#comment-13678785
]
Suneel Marthi commented on MAHOUT-1030:
---
[~pferrel] Did you have a chance to try the
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678847#comment-13678847
]
Pat Ferrel commented on MAHOUT-1030:
+1 to Lance's point.
My point of 01/Jul/12 was
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13404737#comment-13404737
]
Pat Ferrel commented on MAHOUT-1030:
A couple other thoughts:
1) With
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13404581#comment-13404581
]
Pat Ferrel commented on MAHOUT-1030:
Personally I have a workaround that's
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13404650#comment-13404650
]
Lance Norskog commented on MAHOUT-1030:
---
bq. If this fixes the regression
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13403570#comment-13403570
]
Pat Ferrel commented on MAHOUT-1030:
Jeff said: It is trivial to back-calculate the
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292427#comment-13292427
]
Hudson commented on MAHOUT-1030:
Integrated in Mahout-Quality #1537 (See
[
https://issues.apache.org/jira/browse/MAHOUT-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13292438#comment-13292438
]
Hudson commented on MAHOUT-1030:
Integrated in Mahout-Quality #1538 (See
42 matches