Personally -- note, personally -- I think that's a whole other project. I doubt Mahout will ever be anything but Hadoop-based, plus some sequential / pure Java bits. Or, put another way: that's way too much scope, to span a third (fourth?) computation model, in a project already sprawling.
I think this certainly could, should, just be another project. BSP-based or graph-based ML algorithms. No reason it can't be done by the same or similar people or reuse code, etc. It's a good idea. I don't see a reason such a thing has to intersect with Mahout directly.

Sean

On Mon, May 28, 2012 at 5:08 PM, Robin Anil <[email protected]> wrote:
> OK. So say mahout moves to using bsp. There are obviously risks you
> mentioned.
>
> if possible we need to be abstracting out the underlying execution. So an
> iterative algorithm should be written using a wrapper library that hides
> giraph, bsp and map reduce. That's something I think will be attractive to
> mahout community, because the risks would no longer be there. We would
> implement any algorithm without betting on the future of any execution
> model. And it will serve as a place where providers of each execution model
> will strive to improve benchmarking against a common platform
>
> Is this something bsp dev would be willing to push?. Because the way I see
> it things are stacked in favour of hadoop map reduce. And a common
> execution library will help bsp push people to go away from map reduce
> without the risk
>
> Robin
> On May 28, 2012 6:41 AM, "Suraj Menon" <[email protected]> wrote:
>
> > First of all we would like to mention that the ugly side in this
> > thread was totally not intended.
> > From the options you gave, (c) would be a waste of time.
> >
> > The original intention of this thread was to politely check with
> > Mahout community, if it would consider another programming model than
> > Map-Reduce to implement machine learning algorithms. My previous mail
> > was to check if there is any specific feature set (e.g.
> > fault-tolerance, proven scalability, etc.) that is required before
> > Mahout community would consider a new model.
> >
> > But, we do understand now that adoption of a new model could be based
> > on popularity of the system among ML programmers which in turn builds
> > a strong community for that project.
> >
> > Thanks,
> > Suraj
> >
> > On Sun, May 27, 2012 at 12:11 PM, Robin Anil <[email protected]> wrote:
> > > I am confused, what is the actual ask from the Hama community to Mahout
> > > community?
> > >
> > > Is that
> > > a) Port Mahout algorithms to use BSP?
> > > b) Rewrite Mahout algorithms to use BSP?
> > > c) Argue that Hama is better than Giraph and vice versa?
> > >
> > > Because the response will depend on what the actual question is? This
> > > thread seems to have lost the intended question.
> > >
> > >
> > > ------
> > > Robin Anil
> > >
> > >
> > > On Sat, May 26, 2012 at 4:03 PM, Ted Dunning <[email protected]> wrote:
> > >
> > >> The key thing to look for is implementation on a platform that is widely
> > >> accepted for practical data mining.
> > >>
> > >> We have only recently begun considering Pig as an implementation platform
> > >> after deciding not to use it before. What has changed is the fairly wide
> > >> adoption of Pig.
> > >>
> > >> On Sat, May 26, 2012 at 11:22 AM, Suraj Menon <[email protected]>
> > >> wrote:
> > >>
> > >> > Steering back to relevance, it would be nice to know if there is an
> > >> > expectation on features and benchmarks for any system to be considered
> > >> > as a platform to implement machine learning algorithms on Mahout.
> > >> >
> > >>
> > >
