If we really want to be vector-type agnostic, perhaps caching the class found in readVector would be a reasonable improvement.
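
A minimal sketch of that caching idea follows. It assumes readVector gets the implementation class name as a String and resolves it with Class.forName(); the VectorClassCache, lookup and cachedCount names are hypothetical and are not part of the attached patch or of Mahout.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of Mahout: remembers the Class resolved for
// each class name so that Class.forName() runs once per distinct name.
public final class VectorClassCache {
  private static final Map<String, Class<?>> CACHE = new ConcurrentHashMap<>();

  private VectorClassCache() {}

  // Returns the cached Class for className, resolving it on first use.
  public static Class<?> lookup(String className) throws ClassNotFoundException {
    Class<?> cached = CACHE.get(className);
    if (cached == null) {
      // Two threads may both resolve the same name here; that is harmless,
      // since both obtain the same Class object.
      cached = Class.forName(className);
      CACHE.put(className, cached);
    }
    return cached;
  }

  // Number of distinct class names resolved so far.
  public static int cachedCount() {
    return CACHE.size();
  }
}
```

With something like this, readVector would call VectorClassCache.lookup(name) instead of Class.forName(name) and instantiate the vector from the returned class as before. An explicit get/put is used rather than computeIfAbsent because Class.forName() throws a checked exception.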

Grant Ingersoll (JIRA) wrote:
[ https://issues.apache.org/jira/browse/MAHOUT-137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722666#action_12722666 ]
Grant Ingersoll commented on MAHOUT-137:
----------------------------------------

The only thing I worry about with this approach is that the forName() call is 
pretty time-consuming.

Convert Clustering Algs to use Vector Writable
----------------------------------------------

                Key: MAHOUT-137
                URL: https://issues.apache.org/jira/browse/MAHOUT-137
            Project: Mahout
         Issue Type: Improvement
           Reporter: Grant Ingersoll
           Assignee: Grant Ingersoll
            Fix For: 0.2

        Attachments: MAHOUT-137.patch, MAHOUT-137.patch, MAHOUT-137.patch


All M/R jobs should use Vector writable instead of encoding and decoding 
strings.  We can have a separate utility that converts serialized GSON, 
Strings, whatever into the appropriate vectors.  See MAHOUT-136 and 
http://www.lucidimagination.com/search/document/6a55f260826fd77f/jira_commented_mahout_136_change_canopy_mr_implementation_to_use_vector_writable

