I think I'm the biggest single contributor to Mahout over time (or at least was at one point), and so, by extension, Cloudera is too. And this new project is all open source. Surely that's maximally "walking the walk" in this regard?
Mahout has served well for a long time as measured in Hadoop-years -- like 4+ years. It's still within its usable life. But I don't think the current state of the code makes it feasible to truly evolve it towards things like Hadoop 2, Spark, and real-time. That is to say, there are legitimate reasons to start fresh with a new project that has different goals. CDH5 still supports Mahout for sure. Oryx will work on any Hadoop (2) distro. I hope there is no openness foul here.

On Tue, Nov 12, 2013 at 1:04 PM, Ted Dunning <[email protected]> wrote:
> Sean writes:
>
> We release Oryx today -- get some.
> #cloudera<https://plus.google.com/s/%23cloudera>
>> #oryx <https://plus.google.com/s/%23oryx>
>> The Oryx open source project provides simple, real-time large-scale
>> machine learning infrastructure. It implements a few classes of algorithm
>> commonly used in business applications: collaborative filtering /
>> recommendation, classification / regression, and clustering. It can
>> continuously build models from a stream of data at large scale using Apache
>> Hadoop's MapReduce. It also serves queries of those models in real-time via
>> an HTTP REST API, and can update models approximately in response to new
>> data. Models are exchanged in PMML format.
>
>
>
> I personally find it a pity that Cloudera talks the open source talk, but
> doesn't walk the walk by contributing to, for example, Mahout.
>
> Their decision.
>
> Sean's decision as well, I guess.
