That would be nice, but my statement rests on a personal, unpublished
evaluation from my own use.  This isn't a formal evaluation.

We should encourage the 0xdata team to show us what they can do.



On Thu, May 1, 2014 at 1:25 AM, Dmitriy Lyubimov <[email protected]> wrote:

> Sure. I assume this should also cover statements that something crushes
> something else without providing a link to a published analysis of what
> exactly crushes what, and why.
>
>
> On Wed, Apr 30, 2014 at 4:16 PM, Ted Dunning <[email protected]> wrote:
>
> > It seems to me that Sebastian and Ellen have hit on the right tack.
> >
> > Let's get back to work making something cool here.  Let's build this
> > community up instead of having endlessly divisive discussions.
> >
> > Let's get back to the Apache emphasis on do-acracy.
> >
> >
> >
> > On Wed, Apr 30, 2014 at 11:36 AM, Ellen Friedman <[email protected]> wrote:
> >
> > > I am weighing in here on issues of great concern but non-technical.
> > >
> > > 1. One of the great things about Mahout is the community – not an easy
> > > thing to have achieved given that people are dispersed geographically
> > > and there is no single focus or company backing the project. In short,
> > > the people who make Mahout are doing something cool.
> > >
> > > Suggestions to try to break it into different groups, Mahout-Spark and
> > > Mahout2o, run counter to this success. Why fragment it at exactly the
> > > moment when new contributors (from 0xdata) are coming forward?  The
> > > spirit of this project has been inclusive. Let's not change that now.
> > >
> > > 2. Sebastian pointed out:
> > >
> > > "We agreed to give the h2O guys a shot for exploration of a possible
> > > integration into Mahout. We should be grateful that they are investing
> > > a lot of time into this, and should help wherever we can. Once they
> > > come up with a concrete proposal or patch, we will have a look at it,
> > > have a deep, technical and polite discussion, and make a decision
> > > afterwards."
> > >
> > > +1
> > >
> > > We agreed to explore the h2o option. Why use lots of time and
> > > energy revisiting and second-guessing that decision? Let it go
> > > forward. Likely some great things will emerge for Mahout, and if not,
> > > then we say "thank you" to h2o contributors for giving it a try.
> > >
> > > As the guys from h2o are adding new resources to do this development,
> > > it is not really detracting anything from Mahout's resources except
> > > when someone opens one of these discussions that lead to fragmentation
> > > and distraction. I'm not a coder and not as technical as any of you,
> > > but from my view it seems to be the talk and not the development that
> > > is distracting.
> > >
> > > 3. Over the last year, there has been growing and widespread interest
> > > in Mahout from the outside world, and now, with the new changes to
> > > support Scala, Spark and h2o (possibly Stratosphere later) the growing
> > > interest has turned into excitement. This is a great time for the
> > > project – tons of effort but moving toward a big result.
> > >
> > > Users will have some excellent new choices, all parts of Mahout will
> > > benefit. And if in the future it is seen that some of the new features
> > > are not being widely or successfully used, they will be deprecated, as
> > > was done during the big clean-up of the 0.8 release. New choices, new
> > > ways to use Mahout, new people getting involved – this is excellent.
> > >
> > > 4. My thought is: stick together, embrace change, welcome newcomers
> > > and be very proud to be building the new Mahout.
> > >
> > >
> > >
> > > On 4/29/14, Sebastian Schelter <[email protected]> wrote:
> > > > For reasons of transparency in this discussion, I should add that I
> > > > am a committer on the upcoming Stratosphere ASF podling, co-worker of
> > > > the main developers and have contributed to it as part of my PhD.
> > > >
> > > > On 04/29/2014 09:23 PM, Sebastian Schelter wrote:
> > > >> Anand,
> > > >>
> > > >> I'm trying to answer some of your questions, and my answers highlight
> > > >> the points that I would like to see clarified about h2o.
> > > >>
> > > >> On 04/28/2014 11:13 PM, Anand Avati wrote:
> > > >>
> > > >>> 1. Why is the DSL claiming to have (in its vision) logical vs physical
> > > >>> separation if not for providing multiple compute backends?
> > > >>
> > > >> This is not a claim or a vision; the DSL already has this separation.
> > > >> Take, for example, o.a.m.sparkbindings.drm.plan.OpAtA: that's the
> > > >> logical operator for executing a Transpose-Times-Self matrix
> > > >> multiplication. In o.a.m.sparkbindings.blas.AtA you will find two
> > > >> physical operator implementations for that. The choice of which one to
> > > >> use depends on whether there is enough memory to hold certain
> > > >> intermediary results.
> > > >>
> > > >> The primary intention of a separation into logical and physical
> > > >> operators is to allow for a declarative programming style on the user's
> > > >> side and for an optimizer on the system side which automatically chooses
> > > >> the optimal physical operator for the execution of a specific program.
> > > >>
> > > >> This choice of the physical operator might depend on the shape and
> > > >> amount of the data processed as well as on the underlying available
> > > >> resources. *The separation into logical and physical operators clearly
> > > >> doesn't imply having multiple backends*. It only makes it very easy to
> > > >> support them.
> > > >>
> > > >>>
> > > >>> 2. Does the proposal of having a new DSL backend in the future (e.g.
> > > >>> stratosphere, as suggested elsewhere) make you:
> > > >>
> > > >>> -- worry that stratosphere would be a dependency to Mahout?
> > > >>
> > > >> Stratosphere has been accepted as an incubator project in the ASF
> > > >> recently, so the worry about such a dependency is naturally less than
> > > >> about an externally managed project like h2o.
> > > >>
> > > >>> -- worry that as a user/committer/contributor you have to worry about a
> > > >>> new framework?
> > > >>
> > > >> In my eyes, there is a big difference between Spark/Stratosphere and
> > > >> h2o. Spark and Stratosphere have a clearly defined programming and
> > > >> execution model. They execute programs that are composed of a DAG of
> > > >> operators. The set of operators has clearly defined semantics and
> > > >> parallelization strategies. If you compare their operators, you will
> > > >> find that they offer pretty much the same in slightly different flavors.
> > > >> For both, there are scientific papers that explain all these things in
> > > >> detail.
> > > >>
> > > >> I have asked for a detailed description of h2o's programming model and
> > > >> execution model and I searched the documentation, but I haven't been
> > > >> able to find something that clearly describes how things are done. I
> > > >> would love to read up on this, but until I'm presented with this, I have
> > > >> to assume that such a principled foundation is missing.
> > > >>
> > > >>
> > > >> --sebastian
> > > >>
> > > >
> > > >
> > >
> >
>
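
For readers following the quoted exchange, the logical/physical operator
separation that Sebastian describes can be illustrated with a rough sketch.
This is not Mahout's Scala code: only the name OpAtA comes from the thread.
The two strategy functions, the memory-budget parameter, and the
8-bytes-per-entry estimate are assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List

Matrix = List[List[float]]


@dataclass(frozen=True)
class OpAtA:
    """Logical operator: states *what* to compute (A' * A), not *how*."""
    a: Matrix


def ata_in_memory(a: Matrix) -> Matrix:
    """Hypothetical physical strategy 1: one pass over the rows of A,
    accumulating the full n x n result, which must fit in memory."""
    n = len(a[0])
    result = [[0.0] * n for _ in range(n)]
    for row in a:
        for i in range(n):
            for j in range(n):
                result[i][j] += row[i] * row[j]
    return result


def ata_rowwise(a: Matrix) -> Matrix:
    """Hypothetical physical strategy 2: produce one output row at a time,
    re-scanning A for each entry instead of keeping an n x n accumulator."""
    n = len(a[0])
    return [[sum(row[i] * row[j] for row in a) for j in range(n)]
            for i in range(n)]


def choose_physical(op: OpAtA, memory_budget_bytes: int) -> Callable[[Matrix], Matrix]:
    """Toy optimizer: pick a physical operator from the data shape and the
    available resources (assumed rule: 8 bytes per result entry)."""
    n = len(op.a[0])
    estimated_bytes = n * n * 8
    if estimated_bytes <= memory_budget_bytes:
        return ata_in_memory
    return ata_rowwise


def execute(op: OpAtA, memory_budget_bytes: int) -> Matrix:
    """Run the logical operator with whichever physical operator was chosen."""
    return choose_physical(op, memory_budget_bytes)(op.a)
```

The caller only ever builds an OpAtA; which implementation runs is decided by
the optimizer function. Nothing here requires more than one backend, which is
the point made in the quoted message.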
