[ https://issues.apache.org/jira/browse/MAHOUT-537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Shannon Quinn updated MAHOUT-537:
---------------------------------
Attachment: MAHOUT-537_hack.patch
OK, this is a total hack job, but I wanted to see if it would work: take the
0.21 mapreduce.lib.join package, tweak it slightly to make it 0.20-compatible,
and install it directly in Mahout so that DistributedRowMatrix can be made
0.20-compliant.
The patch and its associated tests compile, but the tests fail. The cause
seems to be that files are not being written to DistributedCache, HDFS, etc.
I tried writing to DistributedCache and immediately reading the file back,
which worked fine, but beyond that I'm stuck and could use some help.
If this isn't an avenue worth pursuing, that's also fine. I had the idea and
wanted to give it a shot before throwing in the towel and waiting for 0.22.
> Bring DistributedRowMatrix into compliance with Hadoop 0.20.2
> -------------------------------------------------------------
>
> Key: MAHOUT-537
> URL: https://issues.apache.org/jira/browse/MAHOUT-537
> Project: Mahout
> Issue Type: Improvement
> Components: Math
> Affects Versions: 0.4, 0.5
> Reporter: Shannon Quinn
> Assignee: Shannon Quinn
> Fix For: 0.6
>
> Attachments: MAHOUT-537.patch, MAHOUT-537.patch, MAHOUT-537.patch,
> MAHOUT-537.patch, MAHOUT-537_hack.patch
>
>
> Convert the current DistributedRowMatrix to use the newer Hadoop 0.20.2 API.
> In particular, eliminate the dependence on the deprecated JobConf and use
> the separate Job and Configuration objects instead.
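For readers unfamiliar with the conversion the issue describes, here is a minimal sketch of setting up a job with the separate Job and Configuration objects from the Hadoop 0.20.2 org.apache.hadoop.mapreduce API. It is not taken from any of the attached patches: the class name NewApiJobSketch, the job name, and the property key sketch.example.key are hypothetical, and the pass-through Mapper and Reducer stand in for whatever DistributedRowMatrix actually runs.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical class for illustration; not part of Mahout or the patches.
public class NewApiJobSketch {

  // Builds a job using Configuration + Job instead of the deprecated JobConf.
  public static Job buildJob(Configuration conf, String in, String out)
      throws IOException {
    // Set properties BEFORE constructing the Job: the Job keeps its own copy
    // of the Configuration, so later changes to 'conf' are not seen by it.
    conf.set("sketch.example.key", "value"); // hypothetical property name

    Job job = new Job(conf, "new-api-sketch");
    job.setJarByClass(NewApiJobSketch.class);

    // The base Mapper and Reducer classes pass records through unchanged.
    job.setMapperClass(Mapper.class);
    job.setReducerClass(Reducer.class);

    // With the default TextInputFormat, keys are byte offsets, values lines.
    job.setOutputKeyClass(LongWritable.class);
    job.setOutputValueClass(Text.class);

    // Note the new-API lib.input / lib.output packages, not mapred.*.
    FileInputFormat.addInputPath(job, new Path(in));
    FileOutputFormat.setOutputPath(job, new Path(out));
    return job;
  }

  public static void main(String[] args) throws Exception {
    Job job = buildJob(new Configuration(), args[0], args[1]);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

After construction, further settings can still be changed through job.getConfiguration(), which returns the copy the job actually uses.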