[ 
https://issues.apache.org/jira/browse/MAHOUT-65?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720817#action_12720817
 ] 

Robert Burrell Donkin commented on MAHOUT-65:
---------------------------------------------

The canopy clustering example has been broken by the changes to the internal 
representation of Vector. The problem is that Canopy, Cluster and the example 
OutputMapper all rely on string concatenation. Judging by the code, the string 
parsing work looks inefficient and has proved fragile. (Personally speaking, I 
also find it hard to understand the code when the wire format and object 
designs are quite different.) 

IMHO adopting a binary serialization system that could be used for both Vector 
and other types would make the code more robust in this area.

I agree with Ted that either Thrift or Avro would be a good choice.
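
To make the contrast with string concatenation concrete, here is a minimal 
sketch of a binary round trip for a dense vector of doubles. It uses only 
java.io and is neither Thrift nor Avro; the class name BinaryVectorSketch and 
its encode/decode methods are hypothetical and are not part of Mahout. The 
length-prefixed layout is just an assumption chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical helper, not Mahout code: binary round trip for a dense vector.
public class BinaryVectorSketch {

    // Writes a 4-byte element count, then each element as an 8-byte double.
    public static byte[] encode(double[] values) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(values.length);
            for (double v : values) {
                out.writeDouble(v);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the count, then exactly that many doubles; no text parsing involved.
    public static double[] decode(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            double[] values = new double[in.readInt()];
            for (int i = 0; i < values.length; i++) {
                values[i] = in.readDouble();
            }
            return values;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        double[] original = {1.5, 0.0, -2.25};
        double[] copy = decode(encode(original));
        // Doubles survive bit-for-bit, so an exact comparison is safe here.
        System.out.println(Arrays.equals(original, copy));
    }
}
```

Because the reader consumes a fixed number of bytes per field, there are no 
delimiters to escape and no number formatting to get wrong, which is where 
the string-based approach tends to break. Thrift or Avro would add a schema 
and versioning on top of the same basic idea.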

> Add Element Labels to Vectors and Matrices
> ------------------------------------------
>
>                 Key: MAHOUT-65
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-65
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Matrix
>    Affects Versions: 0.1
>            Reporter: Jeff Eastman
>            Assignee: Jeff Eastman
>         Attachments: MAHOUT-65-name.patch, MAHOUT-65-name.patch, 
> MAHOUT-65-name.patch, MAHOUT-65.patch, MAHOUT-65b.patch, MAHOUT-65c.patch, 
> MAHOUT-65d.patch
>
>
> Many applications can benefit by accessing elements in vectors and matrices 
> using String labels in addition to numeric indices. Investigate adding such a 
> capability.

