[ 
https://issues.apache.org/jira/browse/MAHOUT-137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722246#action_12722246
 ] 

Jeff Eastman commented on MAHOUT-137:
-------------------------------------

MAHOUT-136 changed Canopy to use Writables between the map and reduce steps, 
but its input and output formats are still Text. In the interests of 
consistency and efficiency, it makes sense to convert all of the clustering 
jobs to use Writables for I/O too. We can have a separate utility job to 
convert from Writable form to JSON or other textual representations if that is 
needed. Since most clustering jobs will have an input step to prepare the 
points for clustering anyway, having that step output Writables instead of 
Text would be a small change.
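
To make the idea concrete, here is a minimal sketch of a point that serializes 
itself in binary form, with text conversion kept as a separate utility. It is 
not Mahout's actual vector class: DensePointSketch and its helper methods are 
hypothetical names, and the class only mirrors the write(DataOutput) / 
readFields(DataInput) contract of Hadoop's Writable interface using plain 
java.io, so that it runs without Hadoop on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical stand-in for a Writable point; NOT Mahout's actual class.
// It follows the same write/readFields shape as org.apache.hadoop.io.Writable
// but depends only on java.io.
class DensePointSketch {
    private double[] values;

    DensePointSketch() {
        this.values = new double[0];
    }

    DensePointSketch(double[] values) {
        this.values = values.clone();
    }

    double[] values() {
        return values.clone();
    }

    // Binary form: a 4-byte component count, then 8 bytes per component.
    void write(DataOutput out) throws IOException {
        out.writeInt(values.length);
        for (double v : values) {
            out.writeDouble(v);
        }
    }

    // Reads back exactly what write() produced; no string parsing involved.
    void readFields(DataInput in) throws IOException {
        int n = in.readInt();
        values = new double[n];
        for (int i = 0; i < n; i++) {
            values[i] = in.readDouble();
        }
    }

    // Helper: serialize a point to a byte array.
    static byte[] toBytes(DensePointSketch p) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            p.write(new DataOutputStream(buf));
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Helper: rebuild a point from bytes produced by toBytes().
    static DensePointSketch fromBytes(byte[] bytes) {
        try {
            DensePointSketch p = new DensePointSketch();
            p.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
            return p;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Text rendering lives outside the I/O path, as a separate conversion.
    static String toText(DensePointSketch p) {
        return Arrays.toString(p.values);
    }

    public static void main(String[] args) {
        DensePointSketch p = new DensePointSketch(new double[] {1.5, -2.0, 3.25});
        byte[] bytes = toBytes(p);
        System.out.println(bytes.length + " bytes, text form: "
                + toText(fromBytes(bytes)));
    }
}
```

In a real job the class would implement Writable and the framework would call 
write and readFields itself; the byte-array helpers here exist only so the 
round trip can be exercised standalone.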

> Convert Clustering Algs to use Vector Writable
> ----------------------------------------------
>
>                 Key: MAHOUT-137
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-137
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Grant Ingersoll
>            Assignee: Grant Ingersoll
>             Fix For: 0.2
>
>
> All M/R jobs should use Vector writable instead of encoding and decoding 
> strings.  We can have a separate utility that converts serialized GSON, 
> Strings, whatever into the appropriate vectors.  See MAHOUT-136 and 
> http://www.lucidimagination.com/search/document/6a55f260826fd77f/jira_commented_mahout_136_change_canopy_mr_implementation_to_use_vector_writable
