[ https://issues.apache.org/jira/browse/MAHOUT-137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722246#action_12722246 ]
Jeff Eastman commented on MAHOUT-137:
-------------------------------------
MAHOUT-136 changed Canopy to use Writables between the map and reduce steps,
but its input and output formats are still Text. In the interests of consistency
and efficiency, it makes sense to convert all of the clustering jobs to use
Writables for I/O too. We can have a separate utility job to convert from
Writable form to JSON or other textual representations if that is needed. Since
most clustering jobs will have an input step to prepare the points for
clustering anyway, having that step output Writables instead of Text would be a
small change.
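
To make the proposal concrete, here is a minimal sketch of what binary vector I/O looks like next to a textual form. It is only an illustration under stated assumptions: DenseVectorSketch is a hypothetical stand-in, not Mahout's actual vector class, and it mirrors the write(DataOutput) / readFields(DataInput) method shapes of Hadoop's Writable interface without depending on Hadoop. The toText() method stands in for the separate conversion utility mentioned above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical stand-in for a Writable vector; NOT Mahout's real class.
final class DenseVectorSketch {
    private double[] values = new double[0];

    DenseVectorSketch() {}

    DenseVectorSketch(double[] values) {
        this.values = values.clone();
    }

    double[] values() {
        return values.clone();
    }

    // Same method shape as Hadoop's Writable.write(DataOutput):
    // a length prefix followed by the raw doubles.
    void write(DataOutput out) throws IOException {
        out.writeInt(values.length);
        for (double v : values) {
            out.writeDouble(v);
        }
    }

    // Same method shape as Hadoop's Writable.readFields(DataInput).
    void readFields(DataInput in) throws IOException {
        int n = in.readInt();
        double[] read = new double[n];
        for (int i = 0; i < n; i++) {
            read[i] = in.readDouble();
        }
        values = read;
    }

    // Textual form, produced only on demand (the "separate utility" idea).
    String toText() {
        return Arrays.toString(values);
    }

    // Helper for the example: serialize to an in-memory byte array.
    static byte[] toBytes(DenseVectorSketch v) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            v.write(new DataOutputStream(buffer));
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Helper for the example: rebuild a vector from serialized bytes.
    static DenseVectorSketch fromBytes(byte[] bytes) {
        try {
            DenseVectorSketch v = new DenseVectorSketch();
            v.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
            return v;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch a three-element vector serializes to a 4-byte length plus 8 bytes per element, and reading it back involves no string parsing. Text is generated only when someone asks for it through toText().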
> Convert Clustering Algs to use Vector Writable
> ----------------------------------------------
>
> Key: MAHOUT-137
> URL: https://issues.apache.org/jira/browse/MAHOUT-137
> Project: Mahout
> Issue Type: Improvement
> Reporter: Grant Ingersoll
> Assignee: Grant Ingersoll
> Fix For: 0.2
>
>
> All M/R jobs should use Vector writable instead of encoding and decoding
> strings. We can have a separate utility that converts serialized GSON,
> Strings, whatever into the appropriate vectors. See MAHOUT-136 and
> http://www.lucidimagination.com/search/document/6a55f260826fd77f/jira_commented_mahout_136_change_canopy_mr_implementation_to_use_vector_writable