Sure. I assume this should include statements that something crushes
something else, made without a link to a published analysis of what
exactly crushes what, and why.


On Wed, Apr 30, 2014 at 4:16 PM, Ted Dunning <[email protected]> wrote:

> It seems to me that Sebastian and Ellen have hit on the right tack.
>
> Let's get back to work making something cool here.  Let's build this
> community up instead of having endlessly divisive discussions.
>
> Let's get back to the Apache emphasis on do-acracy.
>
>
>
> On Wed, Apr 30, 2014 at 11:36 AM, Ellen Friedman <[email protected]> wrote:
>
> > I am weighing in here on issues of great concern but non-technical.
> >
> > 1. One of the great things about Mahout is the community – not an easy
> > thing to have achieved given that people are dispersed geographically
> > and there is no single focus or company backing the project. In short,
> > the people who make Mahout are doing something cool.
> >
> > Suggestions to try to break it into different groups, Mahout-Spark and
> > Mahout2o, run counter to this success. Why fragment it at exactly the
> > moment when new contributors (from 0xdata) are coming forward? The
> > spirit of this project has been inclusive. Let's not change that now.
> >
> > 2. Sebastian pointed out:
> >
> > "We agreed to give the h2O guys a shot for exploration of a possible
> > integration into Mahout. We should be grateful that they are investing
> > a lot of time into this, and should help wherever we can. Once they
> > come up with a concrete proposal or patch, we will have a look at it,
> > have a deep, technical and polite discussion, and make a decision
> > afterwards."
> >
> > +1
> >
> > We agreed to explore the h2o option. Why use lots of time and
> > energy in re-visiting and second-guessing that decision? Let it go
> > forward, likely some great things will emerge for Mahout, and if not,
> > then we say "thank you" to h2o contributors for giving it a try.
> >
> > As the guys from h2o are adding new resources to do this development,
> > it is not really detracting anything from Mahout's resources except
> > when someone opens one of these discussions that lead to fragmentation
> > and distraction. I'm not a coder and not as technical as any of you,
> > but from my view it seems to be the talk and not the development that
> > is distracting.
> >
> > 3. Over the last year, there has been growing and widespread interest
> > in Mahout from the outside world, and now, with the new changes to
> > support Scala, Spark and h2o (possibly Stratosphere later) the growing
> > interest has turned into excitement. This is a great time for the
> > project – tons of effort but moving toward a big result.
> >
> > Users will have some excellent new choices, and all parts of Mahout will
> > benefit. And if in the future it is seen that some of the new features
> > are not being widely or successfully used, they will be deprecated, as
> > was done during the big clean-up of the 0.8 release. New choices, new
> > ways to use Mahout, new people getting involved – this is excellent.
> >
> > 4. My thought is, stick together, embrace change, welcome newcomers
> > and be very proud to be building the new Mahout.
> >
> >
> >
> > On 4/29/14, Sebastian Schelter <[email protected]> wrote:
> > > For reasons of transparency in this discussion, I should add that I am
> > > a committer on the upcoming Stratosphere ASF podling, co-worker of the
> > > main developers and have contributed to it as part of my PhD.
> > >
> > > On 04/29/2014 09:23 PM, Sebastian Schelter wrote:
> > >> Anand,
> > >>
> > >> I'm trying to answer some of your questions, and my answers highlight
> > >> the points that I would like to see clarified about h2o.
> > >>
> > >> On 04/28/2014 11:13 PM, Anand Avati wrote:
> > >>
> > >>> 1. Why is the DSL claiming to have (in its vision) logical vs
> > >>> physical separation if not for providing multiple compute backends?
> > >>
> > >> This is not a claim or a vision; the DSL already has this separation.
> > >> Take for example o.a.m.sparkbindings.drm.plan.OpAtA: that's the logical
> > >> operator for executing a Transpose-Times-Self matrix multiplication.
> > >> In o.a.m.sparkbindings.blas.AtA you will find two physical operator
> > >> implementations for that. The choice of which one to use depends on
> > >> whether there is enough memory to hold certain intermediary results.
> > >>
> > >> The primary intention of a separation into logical and physical
> > >> operators is to allow for a declarative programming style on the user's
> > >> side and for an optimizer on the system side which automatically
> > >> chooses the optimal physical operator for the execution of a specific
> > >> program.
> > >>
> > >> This choice of the physical operator might depend on the shape and
> > >> amount of the data processed as well as on the underlying available
> > >> resources. *The separation into logical and physical operators clearly
> > >> doesn't imply having multiple backends*. It only makes it very easy
> > >> to support them.
> > >>
> > >>>
> > >>> 2. Does the proposal of having a new DSL backend in the future (e.g.
> > >>> for stratosphere, as suggested elsewhere) make you:
> > >>
> > >>> -- worry that stratosphere would be a dependency to Mahout?
> > >>
> > >> Stratosphere has been accepted as an incubator project in the ASF
> > >> recently, so the worry about such a dependency is naturally less than
> > >> about an externally managed project like h2o.
> > >>
> > >>> -- worry that as a user/committer/contributor you have to worry about
> > >>> a new framework?
> > >>
> > >> In my eyes, there is a big difference between Spark/Stratosphere and
> > >> h2o. Spark and Stratosphere have a clearly defined programming and
> > >> execution model. They execute programs that are composed of a DAG of
> > >> operators. The set of operators has clearly defined semantics and
> > >> parallelization strategies. If you compare their operators, you will
> > >> find that they offer pretty much the same in slightly different
> > >> flavors. For both, there are scientific papers that explain all these
> > >> things in detail.
> > >>
> > >> I have asked for a detailed description of h2o's programming model
> > >> and execution model and I searched the documentation, but I haven't
> > >> been able to find something that clearly describes how things are
> > >> done. I would love to read up on this, but until I'm presented with
> > >> this, I have to assume that such a principled foundation is missing.
> > >>
> > >>
> > >> --sebastian
> > >>
> > >
> > >
> >
>
