[ https://issues.apache.org/jira/browse/MAHOUT-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057945#comment-13057945 ]
Shannon Quinn edited comment on MAHOUT-537 at 6/30/11 5:36 PM:
---------------------------------------------------------------
OK, this is a total hack job, but I wanted to see if it would work: taking the
Hadoop 0.21 mapreduce.lib.join* package, tweaking it slightly to make it
0.20-compatible, and installing it directly in Mahout to make
DistributedRowMatrix 0.20-compliant.
The package and the associated tests compile, but the tests fail, and the
cause seems to be that it won't write files to DistributedCache, HDFS, etc.
Writing to DistributedCache and immediately reading the file back worked fine,
but otherwise I'm stuck and could use some help.
If this isn't an avenue worth pursuing, that's also fine. I had the idea and
wanted to give it a shot before throwing in the towel and waiting for 0.22.
> Bring DistributedRowMatrix into compliance with Hadoop 0.20.2
> -------------------------------------------------------------
>
> Key: MAHOUT-537
> URL: https://issues.apache.org/jira/browse/MAHOUT-537
> Project: Mahout
> Issue Type: Improvement
> Components: Math
> Affects Versions: 0.4, 0.5
> Reporter: Shannon Quinn
> Assignee: Shannon Quinn
> Fix For: 0.6
>
> Attachments: MAHOUT-537.patch, MAHOUT-537.patch, MAHOUT-537.patch,
> MAHOUT-537.patch, MAHOUT-537_hack.patch
>
>
> Convert the current DistributedRowMatrix to use the newer Hadoop 0.20.2 API;
> in particular, eliminate the dependence on the deprecated JobConf and use the
> separate Job and Configuration objects instead.