[ https://issues.apache.org/jira/browse/MAHOUT-148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12762673#action_12762673 ]
Jake Mannix commented on MAHOUT-148:
------------------------------------

Another good reason to do this is HADOOP-6109, which will be fixed in 0.21 but right now causes pretty severe problems when dealing with "big" lines of Text (i.e. the LineReader in 0.20 and before uses an O(n^2) algorithm for char[] copying when appending).

> Convert Classification Algs to use richer Writable syntax
> ---------------------------------------------------------
>
>                 Key: MAHOUT-148
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-148
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Classification
>    Affects Versions: 0.1, 0.2
>            Reporter: Grant Ingersoll
>            Assignee: Robin Anil
>             Fix For: 0.2
>
>         Attachments: MAHOUT-148-Work-In-Progress.patch
>
>
> Much of the classification capability relies on parsing values out of the
> Text object just to determine what type of "thing" is being used. We should
> avoid doing string manipulation for this kind of thing and instead
> encapsulate it in Writable instances. This should make things perform
> faster and bring stronger typing to the problem, which should make the
> code easier to understand and debug.
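
A rough sketch of what the description is asking for, under some assumptions: instead of packing a label, a feature id and a weight into one delimited string and parsing it back, each field is written and read with its own type. The class name FeatureWeight and its fields are hypothetical and are not taken from the attached patch. To keep the example runnable on a plain JDK it does not import Hadoop; it only mirrors the two methods that Hadoop's Writable interface declares, write(DataOutput) and readFields(DataInput).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical record type, not from the MAHOUT-148 patch.
// It follows the write/readFields contract of Hadoop's Writable
// without depending on Hadoop itself.
final class FeatureWeight {
    String label;
    int featureId;
    double weight;

    // No-arg constructor so an empty instance can be filled by readFields.
    FeatureWeight() {}

    FeatureWeight(String label, int featureId, double weight) {
        this.label = label;
        this.featureId = featureId;
        this.weight = weight;
    }

    // Each field is serialized with its own type: no delimiters, no parsing.
    void write(DataOutput out) throws IOException {
        out.writeUTF(label);
        out.writeInt(featureId);
        out.writeDouble(weight);
    }

    // Fields must be read back in the same order they were written.
    void readFields(DataInput in) throws IOException {
        label = in.readUTF();
        featureId = in.readInt();
        weight = in.readDouble();
    }

    // Helper for illustration only: serialize to bytes and read back.
    static FeatureWeight roundTrip(FeatureWeight value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            value.write(new DataOutputStream(bytes));
            FeatureWeight copy = new FeatureWeight();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a real job the class would presumably declare "implements org.apache.hadoop.io.Writable" and be used as a map or reduce value type, so the int and double never pass through a string at all.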