I don't feel super strongly about it, but it feels like a utility to me:
something that is used to prep "raw" content for input into the actual
algorithm. That being said, I can see the NLP case too. Given this is the
only "NLP" we have so far, I'd lean towards util. If and when we have more,
we can move it.

(BTW, nice work!  This is good stuff!  Although, pretty soon we are just going 
to have to rename Mahout to be Apache LLR! :-)  )

On Feb 8, 2010, at 2:23 PM, Drew Farris wrote:

> What's the general consensus (if such exists) about what goes in core vs. util?
> 
> Over on MAHOUT-242 there is some discussion about where to put the
> n-gram / LLR collocation utilities, and since I'm relatively new here
> I don't feel like I can make a point about it going one place or
> another without an understanding of the purpose of the different
> modules.
> 
> In some ways I can see 242 being a utility -- used for the preparation
> of language models or something, upon which core algorithms depend. On
> the other hand I could see Mahout including a suite of NLP algorithms
> in core where 242 is simply a starting point.
> 
> Drew

