org.apache.mahout.classifier.bayes.MultipleOutputFormat not working as
intended with Hadoop 0.20?
--------------------------------------------------------------------------------------------------
Key: MAHOUT-614
URL: https://issues.apache.org/jira/browse/MAHOUT-614
Project: Mahout
Issue Type: Bug
Components: Classification
Affects Versions: 0.4
Reporter: Sean Owen
Assignee: Robin Anil
Fix For: 0.5
I believe there might be an error in
org.apache.mahout.classifier.bayes.MultipleOutputFormat. It extends the Hadoop
class FileOutputFormat, and most of its work is done in
getRecordWriter(FileSystem, Configuration, String, Progressable). However, this
is not the method that one must override to control how FileOutputFormat writes
records; that method is getRecordWriter(TaskAttemptContext). My hunch is that
this worked against the Hadoop 0.19.x APIs but no longer takes effect with
0.20. (@Override is our friend!)
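
To illustrate the suspicion, here is a minimal sketch of how a method with the
wrong signature ends up as an overload that the framework never calls. None of
these types are the real Hadoop or Mahout classes: OverrideSketch, TaskContext,
BaseOutputFormat, OverloadingFormat and OverridingFormat are hypothetical
stand-ins, and the String return value stands in for a RecordWriter.

```java
// Stand-in types only: these are NOT the Hadoop or Mahout classes.
public class OverrideSketch {

    /** Hypothetical stand-in for the context object the framework passes in. */
    static class TaskContext {}

    /** Hypothetical stand-in for the base output format. */
    static class BaseOutputFormat {
        // The only method the framework calls.
        String getRecordWriter(TaskContext context) {
            return "default writer";
        }
    }

    /** Mimics the suspected bug: this adds an overload, it overrides nothing. */
    static class OverloadingFormat extends BaseOutputFormat {
        // Putting @Override on this method would be a compile error,
        // which is exactly how the mismatch would have been caught.
        String getRecordWriter(String name, Object progress) {
            return "custom writer";
        }
    }

    /** Same signature as the base method, so @Override compiles. */
    static class OverridingFormat extends BaseOutputFormat {
        @Override
        String getRecordWriter(TaskContext context) {
            return "custom writer";
        }
    }

    /** What the framework does: it only knows the base signature. */
    static String frameworkCall(BaseOutputFormat format) {
        return format.getRecordWriter(new TaskContext());
    }

    public static void main(String[] args) {
        // The overload is never reached, so the default behaviour wins.
        System.out.println(frameworkCall(new OverloadingFormat())); // default writer
        // The true override is picked up through dynamic dispatch.
        System.out.println(frameworkCall(new OverridingFormat()));  // custom writer
    }
}
```

If the real class behaves like OverloadingFormat here, its custom logic would
compile cleanly and simply never run.
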
I've attached a patch that I believe addresses this and cleans things up
slightly along the way. Am I on track here?