Re: consensus statement?

Dmitriy Lyubimov Tue, 06 May 2014 12:09:30 -0700

I guess it would be more constructive to just formulate an alternative
here.

(1) Mahout is moving away from Java/Hadoop MapReduce programming models as
a means of algorithm creation, for both performance and semantic reasons.

(2) Mahout is moving towards creating a semantically clearer alternative
ML programming environment by growing R-like domain-specific languages
that allow a much cleaner way of doing pipeline customization, feature
preparation, algorithm creation and script-based execution.

(3) For in-core computation, the environment adopts Mahout-math APIs.

(4) For distributed computations, the environment adopts a cost-based and
rewrite-rule-based optimizer approach for translation to a particular
back-end.

(5) Ongoing work is directed at providing proper optimizers for various
back-ends; at present, this work progresses at significantly different
rates for different back-ends.

This may be too detailed for a high-level statement, but it is more in
line with the stated goals and priorities of what is currently known as "scala
and spark bindings". Obviously we need to think of a better (shorter) name
for this effort.

On Tue, May 6, 2014 at 10:32 AM, Dmitriy Lyubimov <[email protected]> wrote:

>
>
>
> On Tue, May 6, 2014 at 9:23 AM, Ted Dunning <[email protected]> wrote:
>
>> I have been involved in side conversations to try to build a bit of unity
>> among our community and would like to propose this as a statement of what
>> we are doing:
>>
>>
> This is a secondary, tactical goal. The purpose of what I did has always
> been a flexible ML platform, agnostic of any particular execution platform,
> that purposely goes away from the Java/MR paradigm.  Like I said, I have
> always emphasized that we are not building Spark-specific (or any
> platform-specific) algorithms. The continuous efforts to tie my
> investigations to Spark, esp. to Spark as the sole component of this work,
> were a total misstatement of the goals of this work.
>
>
>> Apache Mahout is moving immediately to a faster execution model. The first
>> of these is Spark. Outside contributions are always encouraged.
>>
>>
>> As a bit of commentary, it is clear that what the committers are working
>> on is Spark and it is clear that Spark will be the first new platform for
>> Mahout.  It is also clear that there are non-committers (the 0xdata crew
>> for one) who are working with the community to extend Mahout beyond just
>> Spark.  As a statement of where the community is *right* now, however, I
>> don't think we need to say much more than that we encourage contributions.
>>
>> Sound fair?  Correct?
>>
>
>
