[ 
https://issues.apache.org/jira/browse/MAHOUT-148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12762673#action_12762673
 ] 

Jake Mannix commented on MAHOUT-148:
------------------------------------

Another good reason to do this is HADOOP-6109, which will be fixed in 0.21 
but right now causes pretty severe problems when dealing with "big" lines of 
Text (i.e. the LineReader in 0.20 and earlier uses an O(n^2) algorithm for 
char[] copying when appending).

> Convert Classification Algs to use richer Writable syntax
> ---------------------------------------------------------
>
>                 Key: MAHOUT-148
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-148
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Classification
>    Affects Versions: 0.1, 0.2
>            Reporter: Grant Ingersoll
>            Assignee: Robin Anil
>             Fix For: 0.2
>
>         Attachments: MAHOUT-148-Work-In-Progress.patch
>
>
> Much of the classification capabilities relies on parsing values out from the 
> Text object just to determine what type of "thing" is being used.  We should 
> try to avoid having to do string manipulation for this kind of thing and 
> instead encapsulate it in Writable instances.  This should make things 
> perform faster and bring stronger typing to the problem, which should make it 
> easier to understand and debug the code.
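
To make the idea in the description concrete, here is a minimal sketch of what 
"encapsulate it in Writable instances" could look like. It is not taken from 
the attached patch. FeatureCountWritable, its three fields, and the 
toBytes/fromBytes helpers are hypothetical names chosen for illustration, and 
the Writable interface below is a local stand-in that mirrors the two methods 
of org.apache.hadoop.io.Writable so the example compiles without Hadoop on 
the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Local stand-in (assumption): same two methods as org.apache.hadoop.io.Writable.
interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Hypothetical typed record: a (label, feature, count) triple stored as
// separate fields instead of being packed into one delimited Text value.
final class FeatureCountWritable implements Writable {
    private String label = "";
    private String feature = "";
    private long count;

    // No-arg constructor so an instance can be created and then filled
    // by readFields().
    FeatureCountWritable() {}

    FeatureCountWritable(String label, String feature, long count) {
        this.label = label;
        this.feature = feature;
        this.count = count;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeUTF(label);
        out.writeUTF(feature);
        out.writeLong(count);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        // Fields are read back in the same order they were written.
        label = in.readUTF();
        feature = in.readUTF();
        count = in.readLong();
    }

    String getLabel() { return label; }
    String getFeature() { return feature; }
    long getCount() { return count; }

    // Helper for the round-trip demonstration: serialize to a byte array.
    static byte[] toBytes(FeatureCountWritable w) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            w.write(out);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Helper for the round-trip demonstration: deserialize from a byte array.
    static FeatureCountWritable fromBytes(byte[] bytes) {
        try {
            FeatureCountWritable w = new FeatureCountWritable();
            w.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
            return w;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a record like this, a mapper or reducer reads getLabel() and getCount() 
directly instead of splitting a string and calling Long.parseLong on one of 
the pieces, which is where the stronger typing mentioned in the description 
would come from.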

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
