[
https://issues.apache.org/jira/browse/MAHOUT-663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13017674#comment-13017674
]
Frank Scholten commented on MAHOUT-663:
---------------------------------------
No, I think it's too big to get done before 0.5 unless other people help me out
;-)
I'm merging in latest changes from trunk and creating a new patch for the
refactored K-Means job as we speak. I think this could serve as an example for
the refactoring of the other jobs. Once I add the patch we can see how to fit
the jar configuration into it.
> Rationalize hadoop job creation with respect to setJarByClass
> -------------------------------------------------------------
>
> Key: MAHOUT-663
> URL: https://issues.apache.org/jira/browse/MAHOUT-663
> Project: Mahout
> Issue Type: Bug
> Components: build
> Affects Versions: 0.4
> Reporter: Benson Margulies
>
> Mahout includes a series of driver classes that create Hadoop jobs via static
> methods.
> Each one of these calls job.setJarByClass(itself.class).
> Unfortunately, this subverts the Hadoop support for putting additional jars
> in the lib directory of a job jar, since the class passed in is not a class
> that lives in the ordinary section of the job jar.
> The effect of this is to force users of Mahout (and Mahout's own example job
> jar) to unpack the mahout-core jar into the main section, instead of just
> treating it as a 'lib' dependency.
> It seems to me that all the static job creators should be refactored into two
> parts: a public function that returns a job object (and does NOT call
> waitForCompletion), and the existing wrapper, which calls that function and
> runs the job. Users could call the new functions and make their own call to
> setJarByClass.
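A rough sketch of the split proposed in the description might look like the
following. This is only an illustration, not the actual patch: the nested Job
class is a minimal stand-in for org.apache.hadoop.mapreduce.Job so the example
runs without Hadoop on the classpath, and ExampleDriver, createJob, runJob,
UserApp and getJarClass are hypothetical names invented for this sketch.

```java
public class JobCreationSketch {

    /** Minimal stand-in for org.apache.hadoop.mapreduce.Job; NOT the real class. */
    public static class Job {
        private final String name;
        private Class<?> jarClass;
        private boolean completed;

        public Job(String name) { this.name = name; }

        /** Mirrors the real setJarByClass: records which class locates the job jar. */
        public void setJarByClass(Class<?> cls) { this.jarClass = cls; }

        /** Stand-in-only accessor so the effect of setJarByClass can be inspected. */
        public Class<?> getJarClass() { return jarClass; }

        public String getJobName() { return name; }

        /** Mirrors the real waitForCompletion signature; here it only marks the job done. */
        public boolean waitForCompletion(boolean verbose) {
            completed = true;
            return true;
        }

        public boolean isCompleted() { return completed; }
    }

    /** Hypothetical driver class, standing in for one of the Mahout drivers. */
    public static class ExampleDriver {

        /** New public function: configures and returns the job, without
         *  calling setJarByClass or waitForCompletion. */
        public static Job createJob(String input, String output) {
            Job job = new Job("example: " + input + " -> " + output);
            // Mapper, reducer, and input/output paths would be configured here.
            return job;
        }

        /** Existing wrapper: keeps the old behaviour for current callers. */
        public static boolean runJob(String input, String output) {
            Job job = createJob(input, output);
            job.setJarByClass(ExampleDriver.class);
            return job.waitForCompletion(true);
        }
    }

    /** Hypothetical class that lives in the main section of the user's own job jar. */
    public static class UserApp { }

    public static void main(String[] args) {
        // A user builds the job, then points Hadoop at their own jar.
        Job job = ExampleDriver.createJob("in", "out");
        job.setJarByClass(UserApp.class);
        boolean ok = job.waitForCompletion(true);
        System.out.println(job.getJarClass().getSimpleName() + " " + ok);
    }
}
```

With this shape, existing callers keep using the wrapper unchanged, while
users who ship mahout-core as a 'lib' dependency call createJob and pass a
class from their own job jar to setJarByClass.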