[ 
https://issues.apache.org/jira/browse/MAHOUT-510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12982416#action_12982416
 ] 

Ted Dunning commented on MAHOUT-510:
------------------------------------

(replied instead of commenting ... sorry for the duplicate email)

Putting data objects in the Configuration is a bit of a misuse (this has been 
argued over on the Hadoop mailing lists for a long time now).

I would leave this use in place for now and later refactor to read from HDFS.  
Reading from HDFS has more legs in any case, since it allows the clustering to 
be used on new data without retraining.
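
A rough sketch of what that refactoring could look like, assuming the state 
is a set of centroids. This is not Mahout code: java.util.Properties and a 
local temp file stand in for the Hadoop Configuration and HDFS so that the 
example runs on its own, and the class name, property key and file layout are 
made up for illustration. Only the file location goes into the configuration; 
the data itself is written in a Writable-style binary form and read back at 
setup time.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

public class ClusterStateSketch {

    // Hypothetical property key: the configuration carries a path, not the data.
    static final String CENTROIDS_PATH_KEY = "sketch.centroids.path";

    // Write the centroids in a length-prefixed binary layout, similar in
    // spirit to what a Writable's write(DataOutput) would do.
    static void writeCentroids(Path file, double[][] centroids) throws IOException {
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(file))) {
            out.writeInt(centroids.length);
            for (double[] centroid : centroids) {
                out.writeInt(centroid.length);
                for (double value : centroid) {
                    out.writeDouble(value);
                }
            }
        }
    }

    // What a mapper's setup step might do: look up the path in the
    // configuration, then load the centroids from the file system.
    static double[][] readCentroids(Properties conf) throws IOException {
        Path file = Paths.get(conf.getProperty(CENTROIDS_PATH_KEY));
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            double[][] centroids = new double[in.readInt()][];
            for (int i = 0; i < centroids.length; i++) {
                centroids[i] = new double[in.readInt()];
                for (int j = 0; j < centroids[i].length; j++) {
                    centroids[i][j] = in.readDouble();
                }
            }
            return centroids;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("centroids", ".bin");
        writeCentroids(file, new double[][] {{0.0, 1.0}, {2.5, -3.0}});

        Properties conf = new Properties();
        conf.setProperty(CENTROIDS_PATH_KEY, file.toString());

        double[][] loaded = readCentroids(conf);
        System.out.println("loaded " + loaded.length + " centroids");
        Files.delete(file);
    }
}
```

Because the centroids live in a file rather than in the job configuration, 
a later job can point at the same file to assign new data to the existing 
clusters without retraining.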


> Standardize serialization mechanisms
> ------------------------------------
>
>                 Key: MAHOUT-510
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-510
>             Project: Mahout
>          Issue Type: Task
>    Affects Versions: 0.4
>            Reporter: Sean Owen
>             Fix For: 0.5
>
>         Attachments: MAHOUT-510.patch
>
>
> At the moment this is tracking a broader concern: to standardize as much as 
> possible how we approach serialization. The long-term goal is notionally to 
> use the following "encodings" as the input/output of Mahout stuff, and by 
> extension, probably internally too.
> - Text
> - Vector Writable
> - (maybe Avro)
> not
> - Serializable
> - GSON / JSON

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
