[ 
https://issues.apache.org/jira/browse/MAHOUT-137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12723678#action_12723678
 ] 

Jeff Eastman commented on MAHOUT-137:
-------------------------------------

Revisions 788071 and 788116 implement the Writable changes to MeanShift and 
Dirichlet clustering. MeanShift no longer has the bogus combiner, but it still 
holds all clustered points, so it really won't scale well. Dirichlet needs some 
more fixing, but that is another issue.

Some cleanup of directory structures to improve uniformity of naming is needed. 
Will do that under this issue since it is minor.

> Convert Clustering Algs to use Vector Writable
> ----------------------------------------------
>
>                 Key: MAHOUT-137
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-137
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Grant Ingersoll
>            Assignee: Grant Ingersoll
>             Fix For: 0.2
>
>         Attachments: MAHOUT-137.patch, MAHOUT-137.patch, MAHOUT-137.patch, 
> MAHOUT-137.patch, MAHOUT-137.patch, MAHOUT-137.patch, MAHOUT-137.patch
>
>
> All M/R jobs should use Vector writable instead of encoding and decoding 
> strings.  We can have a separate utility that converts serialized GSON, 
> Strings, whatever into the appropriate vectors.  See MAHOUT-136 and 
> http://www.lucidimagination.com/search/document/6a55f260826fd77f/jira_commented_mahout_136_change_canopy_mr_implementation_to_use_vector_writable

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
