I've checked in some baseline work. Eyeballs appreciated. Might be interesting to build on the build-asf-email stuff and try implementing an EmailVectorizer to see how that plays out. I think once we've done a couple of these, some abstractions will fall out.

On Nov 4, 2011, at 4:43 PM, Grant Ingersoll wrote:

> MAHOUT-873. Patch shortly, along with some rework of AbstractJob and
> HadoopJob to make the prepareJob functionality useful to others outside of
> AbstractJob.
>
>
> On Nov 4, 2011, at 1:25 PM, Ted Dunning wrote:
>
>> Awesome.
>>
>> On Fri, Nov 4, 2011 at 10:07 AM, Grant Ingersoll <[email protected]> wrote:
>>
>>> I'm going to start w/ the basics of, key + blob of text. Run them through
>>> the LuceneTextValueEncoder. I have a feeling we will be able to abstract a
>>> Vectorizer framework that is more conducive to all these things.
>>>
>
> --------------------------------------------
> Grant Ingersoll
> http://www.lucidimagination.com
