Re: Discussion Of ML environment/MR, Mahout

Dmitriy Lyubimov Wed, 13 Mar 2013 13:18:39 -0700

On Wed, Mar 13, 2013 at 12:55 PM, Sean Owen wrote:


> I think "just classifiers on just Hadoop" is enough for a project.

I don't quite follow the rationale for that, since other areas of ML
(clustering, CF, dimensionality reduction) would share the same basis. No
need to rebuild it elsewhere. In terms of scope, yes, it is big enough,
though it can be reduced to a "popular" scope.

> I think
> "most ML on just Hadoop" is quite a big project, and even that hasn't
> nearly been completed here. I think "most ML on most platforms" is far too
> big in scope. It inevitably results in a village of lightly-connected
> experiments. Not bad, but not a project.
>

Er.. I'd summarize my vision as a "popular big data ML with coherent
underpinning" scope. As soon as you drop the "most" qualifier and
constrain yourself to the 90% in highest real-life demand, it would seem the
scope would deflate to really just a handful of techniques.

Strike "most platforms". Put in "reasonably popular enablers." Surprisingly,
even Giraph will probably not qualify at the moment, as it would fit the
"enabler" criterion but miss the "popular" one. Add "operationally
collocatable", and the scope reduces even further.
