On Wed, Mar 13, 2013 at 12:55 PM, Sean Owen <[email protected]> wrote:
> I think "just classifiers on just Hadoop" is enough for a project.

I don't quite follow the rationale for that, since other areas of ML (clustering, CF, dimensionality reduction) would share the same basis. No need to rebuild it elsewhere. In terms of scope, yes, it is a big enough scope, though it can be reduced to a "popular" scope.

> I think "most ML on just Hadoop" is quite a big project, and even that hasn't
> nearly been completed here. I think "most ML on most platforms" is far too
> big in scope. It inevitably results in a village of lightly-connected
> experiments. Not bad, but not a project.

Er.. I'd summarize my vision as a "popular big data ML with coherent underpinning" scope. As soon as you drop the "most" qualifier and constrain yourself to the 90% in highest real-life demand, the scope would seem to deflate to really just a handful of techniques.

Strike "most platforms". Put in "reasonably popular enablers." Surprisingly, even Giraph will probably not qualify at the moment, as it would fit the "enabler" criterion but miss the "popular" one.

Add "operationally collocatable". And the scope reduces even further.
