[ https://issues.apache.org/jira/browse/MAHOUT-137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722690#action_12722690 ]
Grant Ingersoll commented on MAHOUT-137:
----------------------------------------
Also, did you look at how the patch I posted handles this? Basically, it
pushes the question off to the user.
Of course, that is less than ideal: people shouldn't have to care about the
underlying implementation. Furthermore, I don't know how likely it is that one
would need to mix dense with sparse vectors. My intuition is that if one vector
needs to be dense, then most vectors are likely to be dense, and likewise that
if one vector is sparse, the nature of the problem (text, for example) is such
that all vectors are sparse. This isn't based on any personal experience,
though; it's just a guess.
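
To make the trade-off above concrete, here is a minimal, self-contained sketch
of what "pushing the question off to the user" could look like. It is not the
code in the attached patch and it does not use Mahout's classes: SimpleVector,
DenseSimpleVector, SparseSimpleVector, Representation and newVector are all
hypothetical names made up for illustration. The caller picks the
representation once, and everything else is written against the interface, so
mixing dense with sparse still works if it ever comes up.

```java
import java.util.HashMap;
import java.util.Map;

public class VectorChoiceSketch {

    /** Hypothetical minimal vector interface; NOT Mahout's Vector. */
    interface SimpleVector {
        int size();
        double get(int index);
        void set(int index, double value);
    }

    /** Dense storage: one slot per dimension, zeros included. */
    static final class DenseSimpleVector implements SimpleVector {
        private final double[] values;
        DenseSimpleVector(int size) { values = new double[size]; }
        public int size() { return values.length; }
        public double get(int index) { return values[index]; }
        public void set(int index, double value) { values[index] = value; }
    }

    /** Sparse storage: only non-zero entries are kept (e.g. text features). */
    static final class SparseSimpleVector implements SimpleVector {
        private final int size;
        private final Map<Integer, Double> nonZeros = new HashMap<>();
        SparseSimpleVector(int size) { this.size = size; }
        public int size() { return size; }
        public double get(int index) {
            checkIndex(index);
            return nonZeros.getOrDefault(index, 0.0);
        }
        public void set(int index, double value) {
            checkIndex(index);
            if (value == 0.0) {
                nonZeros.remove(index);   // do not store explicit zeros
            } else {
                nonZeros.put(index, value);
            }
        }
        private void checkIndex(int index) {
            if (index < 0 || index >= size) {
                throw new IndexOutOfBoundsException("index " + index);
            }
        }
    }

    /** The choice the user is asked to make. */
    enum Representation { DENSE, SPARSE }

    /** Hypothetical factory: the user's choice is made here and nowhere else. */
    static SimpleVector newVector(Representation representation, int size) {
        return representation == Representation.DENSE
                ? new DenseSimpleVector(size)
                : new SparseSimpleVector(size);
    }

    /** Written against the interface, so dense and sparse can be mixed. */
    static double dot(SimpleVector a, SimpleVector b) {
        if (a.size() != b.size()) {
            throw new IllegalArgumentException("size mismatch");
        }
        double sum = 0.0;
        for (int i = 0; i < a.size(); i++) {
            sum += a.get(i) * b.get(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        SimpleVector dense = newVector(Representation.DENSE, 4);
        dense.set(0, 1.0);
        dense.set(1, 2.0);
        dense.set(2, 3.0);

        SimpleVector sparse = newVector(Representation.SPARSE, 4);
        sparse.set(1, 4.0);
        sparse.set(3, 5.0);

        System.out.println(dot(dense, sparse)); // prints 8.0
    }
}
```

The dot product here walks every index for simplicity; a real sparse
implementation would iterate only over the stored non-zero entries.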
> Convert Clustering Algs to use Vector Writable
> ----------------------------------------------
>
> Key: MAHOUT-137
> URL: https://issues.apache.org/jira/browse/MAHOUT-137
> Project: Mahout
> Issue Type: Improvement
> Reporter: Grant Ingersoll
> Assignee: Grant Ingersoll
> Fix For: 0.2
>
> Attachments: MAHOUT-137.patch, MAHOUT-137.patch, MAHOUT-137.patch
>
>
> All M/R jobs should use Vector writable instead of encoding and decoding
> strings. We can have a separate utility that converts serialized GSON,
> Strings, whatever into the appropriate vectors. See MAHOUT-136 and
> http://www.lucidimagination.com/search/document/6a55f260826fd77f/jira_commented_mahout_136_change_canopy_mr_implementation_to_use_vector_writable