On Feb 9, 2010, at 3:31 PM, Jake Mannix wrote:

> On Tue, Feb 9, 2010 at 12:20 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>>
>> I think that the real issue for me is that we have two meanings of utils.
>> One is "generally useful stuff in core" and the other is "things that use
>> mahout to do cool things".
>>
>
> This is my problem too: *examples* is "things that use mahout to do useful
> things". Because utils depends on core and not vice versa, I kinda end up
> never wanting to put *anything* in there, because then I know I can't use it
> later in core, arg!

Here's my philosophy:

- Core is the algorithms that do stuff
  Ex: LDA, Bayes, etc.
  Depends on Math/Collections

- Examples are code that demonstrates core in a specific instance
  Ex: 20 newsgroups, Traveling salesman
  Depends on Core and Utils(?)

- Utils (the module) is useful things that help particular applications get
  stuff ready for core or get stuff out of core, but that aren't for everyone.
  For instance, not everyone is going to need to extract from Lucene or even
  from text files.
  Ex: Extract vectors from Lucene/Solr, dump out clusters to the console, etc.
  Depends on core and other third-party stuff (like Lucene, Tika)

But, of course, let's be pragmatic here. It's really not that hard to move
things, esp. pre-1.0.

-Grant