My high-level view is that Hadoop was excellent for its intended use
case, and that because of this, people have abused it to do things quite
unlike what it was designed for. It's amazing that a glorified log-processing
framework could do anything like machine learning well. Mahout
embodies that interesting struggle.

I can only believe that most any of the "next gen" frameworks discussed
here, which are necessarily more general-purpose, will be better for things
like machine learning. I am not so interested in MR 2.0: there is nothing
wrong with it, but it is not conceptually any better for machine learning. I like
projects like Ciel from MS Research -- simply more general purpose graph-
and data-flow-oriented frameworks.

I personally believe that while Mahout *could* be anything, it has reached
about the level of scope it can sustain, given the amount of effort coming
in, in trying to do something interesting on top of MapReduce. It will
remain useful for a couple of years to come yet.

That is to say: I think it will be interesting to explore another
machine-learning-at-scale project in 2 years or so on top of one of these
next-gen frameworks.

(Was that the question?)
