Sure. I assume this should also cover statements that something crushes something else without a link to a published analysis of what exactly crushes what, and why.

On Wed, Apr 30, 2014 at 4:16 PM, Ted Dunning <[email protected]> wrote:

> It seems to me that Sebastian and Ellen have hit on the right tack.
>
> Let's get back to work making something cool here. Let's build this
> community up instead of having endlessly divisive discussions.
>
> Let's get back to the Apache emphasis on do-acracy.
>
>
> On Wed, Apr 30, 2014 at 11:36 AM, Ellen Friedman <[email protected]> wrote:
>
> > I am weighing in here on issues of great concern but non-technical.
> >
> > 1. One of the great things about Mahout is the community – not an easy
> > thing to have achieved given that people are dispersed geographically
> > and there is no single focus or company backing the project. In short,
> > the people who make Mahout are doing something cool.
> >
> > Suggestions to try to break it into different groups, Mahout-Spark and
> > Mahout2o, run counter to this success. Why fragment it at exactly the
> > moment when new contributors (from 0xdata) are coming forward? The
> > spirit of this project has been inclusive. Let's not change that now.
> >
> > 2. Sebastian pointed out:
> >
> > "We agreed to give the h2o guys a shot for exploration of a possible
> > integration into Mahout. We should be grateful that they are investing
> > a lot of time into this, and should help wherever we can. Once they
> > come up with a concrete proposal or patch, we will have a look at it,
> > have a deep, technical and polite discussion, and make a decision
> > afterwards."
> >
> > +1
> >
> > We agreed to explore the h2o option. Why use lots of time and
> > energy in re-visiting and second guessing that decision? Let it go
> > forward, likely some great things will emerge for Mahout, and if not,
> > then we say "thank you" to h2o contributors for giving it a try.
> >
> > As the guys from h2o are adding new resources to do this development,
> > it is not really detracting anything from Mahout's resources except
> > when someone opens one of these discussions that lead to fragmentation
> > and distraction. I'm not a coder and not as technical as any of you,
> > but from my view it seems to be the talk and not the development that
> > is distracting.
> >
> > 3. Over the last year, there has been growing and widespread interest
> > in Mahout from the outside world, and now, with the new changes to
> > support Scala, Spark and h2o (possibly Stratosphere later) the growing
> > interest has turned into excitement. This is a great time for the
> > project – tons of effort but moving toward a big result.
> >
> > Users will have some excellent new choices, all parts of Mahout will
> > benefit. And if in the future it is seen that some of the new features
> > are not being widely or successfully used, they will be deprecated, as
> > was done during the big clean-up of the 0.8 release. New choices, new
> > ways to use Mahout, new people getting involved – this is excellent.
> >
> > 4. My thought is, stick together, embrace change, welcome newcomers
> > and be very proud to be building the new Mahout.
> >
> >
> > On 4/29/14, Sebastian Schelter <[email protected]> wrote:
> > > For reasons of transparency in this discussion, I should add that I
> > > am a committer on the upcoming Stratosphere ASF podling, co-worker of
> > > the main developers and have contributed to it as part of my PhD.
> > >
> > > On 04/29/2014 09:23 PM, Sebastian Schelter wrote:
> > >> Anand,
> > >>
> > >> I'm trying to answer some of your questions, and my answers highlight
> > >> the points that I would like to see clarified about h2o.
> > >>
> > >> On 04/28/2014 11:13 PM, Anand Avati wrote:
> > >>
> > >>> 1. Why is the DSL claiming to have (in its vision) logical vs
> > >>> physical separation if not for providing multiple compute backends?
> > >>
> > >> This is not a claim or a vision, the DSL already has this separation.
> > >> Take for example o.a.m.sparkbindings.drm.plan.OpAtA, that's the
> > >> logical operator for executing a Transpose-Times-Self matrix
> > >> multiplication. In o.a.m.sparkbindings.blas.AtA you will find two
> > >> physical operator implementations for that. The choice of which one
> > >> to use depends on whether there is enough memory to hold certain
> > >> intermediary results in memory.
> > >>
> > >> The primary intention of a separation into logical and physical
> > >> operators is to allow for a declarative programming style on the
> > >> user's side and for an optimizer on the system side which
> > >> automatically chooses the optimal physical operator for the execution
> > >> of a specific program.
> > >>
> > >> This choice of the physical operator might depend on the shape and
> > >> amount of the data processed as well as on the underlying available
> > >> resources. *The separation into logical and physical operators
> > >> clearly doesn't imply having multiple backends*. It only makes it
> > >> very easy to support them.
> > >>
> > >>>
> > >>> 2. Does the proposal of having a new DSL backend in the future (e.g.
> > >>> stratosphere as suggested elsewhere) make you:
> > >>
> > >>> -- worry that stratosphere would be a dependency to Mahout?
> > >>
> > >> Stratosphere has been accepted as an incubator project in the ASF
> > >> recently, so the worry about such a dependency is naturally less than
> > >> about an externally managed project like h2o.
> > >>
> > >>> -- worry that as a user/committer/contributor you have to worry
> > >>> about a new framework?
> > >>
> > >> In my eyes, there is a big difference between Spark/Stratosphere and
> > >> h2o. Spark and Stratosphere have a clearly defined programming and
> > >> execution model. They execute programs that are composed of a DAG of
> > >> operators. The set of operators has clearly defined semantics and
> > >> parallelization strategies. If you compare their operators, you will
> > >> find that they offer pretty much the same in slightly different
> > >> flavors. For both, there are scientific papers that in detail explain
> > >> all these things.
> > >>
> > >> I have asked about a detailed description of h2o's programming model
> > >> and execution model and I searched the documentation, but I haven't
> > >> been able to find something that clearly describes how things are
> > >> done. I would love to read up on this, but until I'm presented with
> > >> this, I have to assume that such a principled foundation is missing.
> > >>
> > >>
> > >> --sebastian
