[ https://issues.apache.org/jira/browse/MAHOUT-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13672149#comment-13672149 ]
Jake Mannix commented on MAHOUT-874:
------------------------------------
So marking hadoop as provided is nice, and a smaller jar is great, but as I
mentioned above, the size was never my primary concern; it was the dependency
graph. It's really nice that mahout-math is a little non-hadoop-depending
package which just does stats, linear algebra, and ml, and which doesn't have to
think about hadoop stuff, even at compile time. -core is big, because it's what
mahout "is". What I have been wanting is something in between: a module that
depends on hadoop (but with provided scope) and on mahout-math, and that has the
writables, so that someone can work with mahout data inputs/outputs without
actually linking to -core.
Essentially, it's the distinction between a "mahout-api" and a "mahout-impl"
package. Since our "API" is the file format, the "mahout-api" module is really
just the set of writables needed to marshal/unmarshal our binary
data.
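To make the "API is the file format" point concrete, here is a rough sketch of
the kind of class such a module would hold. Everything in it is an assumption
for illustration: DenseVectorRecord is a hypothetical name, and the byte layout
(an int length followed by the doubles) is NOT the format of Mahout's actual
writables. The write/readFields pair mirrors the method shapes of Hadoop's
Writable interface, but the class deliberately uses only java.io so the sketch
compiles without hadoop on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical record type; not a real Mahout class and not Mahout's format.
final class DenseVectorRecord {
    private double[] values = new double[0];

    DenseVectorRecord() {}

    DenseVectorRecord(double[] values) {
        this.values = values.clone();
    }

    double[] get() {
        return values.clone();
    }

    // Same method shape as Hadoop's Writable.write(DataOutput).
    // Illustrative layout: a 4-byte length, then 8 bytes per element.
    void write(DataOutput out) throws IOException {
        out.writeInt(values.length);
        for (double v : values) {
            out.writeDouble(v);
        }
    }

    // Same method shape as Hadoop's Writable.readFields(DataInput).
    void readFields(DataInput in) throws IOException {
        int n = in.readInt();
        double[] read = new double[n];
        for (int i = 0; i < n; i++) {
            read[i] = in.readDouble();
        }
        values = read;
    }

    // Convenience helper: marshal a record to a byte array.
    static byte[] toBytes(DenseVectorRecord record) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            record.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Convenience helper: unmarshal a record from a byte array.
    static DenseVectorRecord fromBytes(byte[] bytes) {
        DenseVectorRecord record = new DenseVectorRecord();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            record.readFields(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return record;
    }
}
```

A client that only reads or writes such records needs this class and the data
stream types, and nothing from the algorithm implementations. That is the
dependency shape I am after for the in-between module.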
> Extract Writables into a separate module to allow smaller dependencies
> ----------------------------------------------------------------------
>
> Key: MAHOUT-874
> URL: https://issues.apache.org/jira/browse/MAHOUT-874
> Project: Mahout
> Issue Type: Improvement
> Reporter: Ted Dunning
>
> The theory is that we can have a smaller jar if we only include writable
> classes and their exact dependencies.
> I have a prototype, but it has some funky characteristics which I would like
> to discuss.