[
https://issues.apache.org/jira/browse/MAHOUT-593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13000569#comment-13000569
]
Sean Owen commented on MAHOUT-593:
----------------------------------
All sounds good to me.
On the last point, I agree: there is no sense in creating and re-parsing
command-line args within a program. That would indicate a design failure.
I think we agree. Mappers and Reducers rarely stand alone; they are almost
always part of a pipeline of Mappers and Reducers that work together to
accomplish something. That's a "Job" to me, one unit that invokes many
Mappers/Reducers.
My only question then is why those intermediate stages of Mappers/Reducers need
to be exposed as stand-alone units ("Jobs" in your patch)? I agree they're not
command-line "Jobs" that would be invoked independently, but they seem exposed
that way.
It's not really a better design, but I would have structured it as one
executable that kicks off many MapReduces, and that's what AbstractJob is
supporting.
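To make the "one executable that kicks off many MapReduces" structure concrete,
here is a minimal sketch in plain Java. It deliberately uses no Hadoop or Mahout
classes: Stage and PipelineDriver are hypothetical names invented for
illustration, and this is not how AbstractJob or the patch is actually written.
It only shows command-line args being parsed once, at the single entry point,
with the intermediate stages receiving already-parsed parameters.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for one MapReduce pass; not a Hadoop or Mahout type. */
interface Stage {
  /** Runs the stage on already-parsed parameters; returns true on success. */
  boolean run(Map<String, String> params);
}

/** Hypothetical driver: the single command-line "Job" that owns the pipeline. */
class PipelineDriver {
  private final Map<String, Stage> stages = new LinkedHashMap<>();
  private final List<String> completed = new ArrayList<>();

  /** Registers a stage; stages run in registration order. */
  PipelineDriver add(String name, Stage stage) {
    stages.put(name, stage);
    return this;
  }

  /** Parses "--key value" pairs once, at the only command-line boundary. */
  static Map<String, String> parseArgs(String[] args) {
    if (args.length % 2 != 0) {
      throw new IllegalArgumentException("expected --key value pairs");
    }
    Map<String, String> params = new LinkedHashMap<>();
    for (int i = 0; i < args.length; i += 2) {
      if (!args[i].startsWith("--")) {
        throw new IllegalArgumentException("not an option: " + args[i]);
      }
      params.put(args[i].substring(2), args[i + 1]);
    }
    return params;
  }

  /** Runs every stage in order; returns 0 on success, 1 at the first failure. */
  int run(String[] args) {
    completed.clear();
    Map<String, String> params = parseArgs(args);
    for (Map.Entry<String, Stage> entry : stages.entrySet()) {
      // Stages get the parsed map directly; nothing is re-serialized to args.
      if (!entry.getValue().run(params)) {
        return 1;
      }
      completed.add(entry.getKey());
    }
    return 0;
  }

  /** Names of the stages that finished in the most recent run. */
  List<String> completed() {
    return completed;
  }

  public static void main(String[] args) {
    PipelineDriver driver = new PipelineDriver()
        .add("stage-1", params -> { System.out.println("stage-1 " + params); return true; })
        .add("stage-2", params -> { System.out.println("stage-2 " + params); return true; });
    System.out.println("exit code: " + driver.run(args));
  }
}
```

In this arrangement the intermediate stages are not separately invokable
"Jobs"; only the driver has a main method.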
> Backport of Stochastic SVD patch (Mahout-376) to hadoop 0.20 to ensure
> compatibility with current Mahout dependencies.
> ----------------------------------------------------------------------------------------------------------------------
>
> Key: MAHOUT-593
> URL: https://issues.apache.org/jira/browse/MAHOUT-593
> Project: Mahout
> Issue Type: New Feature
> Components: Math
> Affects Versions: 0.4
> Reporter: Dmitriy Lyubimov
> Fix For: 0.5
>
> Attachments: MAHOUT-593.patch.gz, MAHOUT-593.patch.gz,
> MAHOUT-593.patch.gz, SSVD-givens-CLI.pdf
>
>
> Current Mahout-376 patch requires 'new' hadoop API. Certain elements of that
> API (namely, multiple outputs) are not available in standard hadoop 0.20.2
> release. As such, that may work only with either CDH or 0.21 distributions.
> In order to bring it into sync with current Mahout dependencies, a backport
> of the patch to 'old' API is needed.
> Also, some work is needed to resolve math dependencies. Existing patch relies
> on apache commons-math 2.1 for eigen decomposition of small matrices. This
> dependency is not currently set up in the mahout core. So, certain snippets
> of code are either required to go to mahout-math or use Colt eigen
> decomposition (last time I tried, my results were mixed with that one. It
> seems to produce results inconsistent with those from mahout-math
> eigensolver; at the very least, it doesn't produce singular values in sorted
> order).
> So this patch is mainly moving some Mahout-376 code around.