[ 
https://issues.apache.org/jira/browse/MAHOUT-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13672149#comment-13672149
 ] 

Jake Mannix commented on MAHOUT-874:
------------------------------------

So marking hadoop as provided is nice, and a smaller jar is great, but as I 
mentioned above, the size was never my primary concern; it was the dependency 
graph. It's really nice that mahout-math is a little non-hadoop-depending 
package which just does stats, linear algebra, and ml, and which doesn't have 
to think about hadoop stuff, even at compile time.  -core is big, because it's 
what mahout "is".  What I have been wanting is something a little in between, 
that depends on hadoop (but with provided scope) and on mahout-math, but has 
the writables, so that someone can work with mahout data inputs/outputs 
without actually linking to -core.

Essentially, it's the distinction between a "mahout-api" and a "mahout-impl" 
package.  Since our "API" is the file format, the "mahout-api" module is really 
just the set of writables needed to be able to marshal/unmarshal our binary 
data.
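
To make the "API is the file format" point concrete, here is a minimal sketch 
of the write/readFields contract that a writable follows. It uses only 
java.io, so it needs neither hadoop nor -core. DenseVectorRecord and its 
length-prefixed layout are hypothetical, invented for illustration; this is 
not Mahout's VectorWritable and not its actual binary format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

// Hypothetical stand-in for a writable; NOT Mahout's VectorWritable and NOT
// its on-disk format. It only mirrors the write/readFields contract.
public class DenseVectorRecord {
    private double[] values = new double[0];

    public DenseVectorRecord() {}

    public DenseVectorRecord(double[] values) {
        this.values = values.clone();
    }

    public double[] getValues() {
        return values.clone();
    }

    // Marshal: an int length prefix followed by the raw doubles.
    public void write(DataOutput out) throws IOException {
        out.writeInt(values.length);
        for (double v : values) {
            out.writeDouble(v);
        }
    }

    // Unmarshal: read the fields back in the order they were written.
    public void readFields(DataInput in) throws IOException {
        int n = in.readInt();
        double[] read = new double[n];
        for (int i = 0; i < n; i++) {
            read[i] = in.readDouble();
        }
        values = read;
    }

    // Helper: serialize a record to a byte array.
    public static byte[] toBytes(DenseVectorRecord record) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            record.write(out);
        }
        return buffer.toByteArray();
    }

    // Helper: rebuild a record from bytes produced by toBytes.
    public static DenseVectorRecord fromBytes(byte[] bytes) throws IOException {
        DenseVectorRecord record = new DenseVectorRecord();
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(bytes))) {
            record.readFields(in);
        }
        return record;
    }

    public static void main(String[] args) throws IOException {
        DenseVectorRecord original =
            new DenseVectorRecord(new double[] {1.0, 2.5, -3.0});
        DenseVectorRecord copy = fromBytes(toBytes(original));
        System.out.println(Arrays.toString(copy.getValues()));
    }
}
```

A client that only reads or writes such records needs the record classes and 
whatever they reference, nothing else, which is the kind of narrow dependency 
graph the in-between module is meant to give.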
                
> Extract Writables into a separate module to allow smaller dependencies
> ----------------------------------------------------------------------
>
>                 Key: MAHOUT-874
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-874
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Ted Dunning
>
> The theory is that we can have a smaller jar if we only include writable 
> classes and their exact dependencies.
> I have a prototype, but it has some funky characteristics which I would like 
> to discuss.

