What about machine learning? Spark is still missing some features for machine
learning, for example a parameter server.
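
To make the parameter-server request concrete, here is a rough sketch of the
push/pull pattern it provides. The names are hypothetical, not an existing
Spark API:

  // Sketch of the parameter-server push/pull pattern (hypothetical API).
  // Model state lives on the server; workers pull current values and
  // push gradient updates.
  import scala.collection.concurrent.TrieMap

  class LocalParameterServer {
    private val params = TrieMap.empty[Long, Double] // model state by key

    // Workers fetch the current values for the keys they need.
    def pull(keys: Seq[Long]): Map[Long, Double] =
      keys.map(k => k -> params.getOrElse(k, 0.0)).toMap

    // Workers send gradients; the server applies a plain SGD step.
    // The read-modify-write is not atomic -- fine for a sketch.
    def push(grads: Map[Long, Double], lr: Double = 0.1): Unit =
      grads.foreach { case (k, g) =>
        params.put(k, params.getOrElse(k, 0.0) - lr * g)
      }
  }

A real parameter server would shard this state across machines and let workers
pull and push asynchronously, which the current RDD model doesn't express well.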


> On Nov 12, 2015, at 05:32, Matei Zaharia <matei.zaha...@gmail.com> wrote:
> 
> I like the idea of popping Tachyon out into an optional component too, to
> reduce the number of dependencies. In the future it might even be useful to do
> this for Hadoop, but that requires too many API changes to be worth doing now.
> 
> Regarding Scala 2.12, we should definitely support it eventually, but I don't
> think we need to block 2.0 on that, because it can be added later too. Has
> anyone investigated what it would take to run on it? I imagine we don't
> need many code changes, just maybe some REPL stuff.
> 
> Needless to say, I'm all for the idea of making "major" releases as
> undisruptive as possible, in the model Reynold proposed. Keeping everyone
> working with the same set of releases is super important.
> 
> Matei
> 
>> On Nov 11, 2015, at 4:58 AM, Sean Owen <so...@cloudera.com> wrote:
>> 
>> On Wed, Nov 11, 2015 at 12:10 AM, Reynold Xin <r...@databricks.com> wrote:
>>> to the Spark community. A major release should not be very different from a
>>> minor release and should not be gated based on new features. The main
>>> purpose of a major release is an opportunity to fix things that are broken
>>> in the current API and remove certain deprecated APIs (examples follow).
>> 
>> Agree with this stance. Generally, a major release might also be a
>> time to replace some big old API or implementation with a new one, but
>> I don't see obvious candidates.
>> 
>> I wouldn't mind turning attention to 2.x sooner rather than later, unless
>> there's a fairly good reason to continue adding features in 1.x toward a
>> 1.7 release. The scope as of 1.6 is already pretty darned big.
>> 
>> 
>>> 1. Scala 2.11 as the default build. We should still support Scala 2.10, but
>>> it has reached end-of-life.
>> 
>> By the time 2.x rolls around, 2.12 will be the main version, 2.11 will
>> be quite stable, and 2.10 will have been EOL for a while. I'd propose
>> dropping 2.10. Otherwise it's supported for 2 more years.
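
For reference, supporting two Scala versions side by side amounts to
cross-building one artifact per version. In sbt that looks roughly like the
sketch below (illustrative only; Spark's actual build is Maven-based, and the
version numbers are just the current releases):

  // build.sbt -- an illustrative sketch, not Spark's real build
  scalaVersion := "2.11.7"                      // the default build
  crossScalaVersions := Seq("2.10.6", "2.11.7") // versions to cross-build
  // `sbt +package` then compiles and packages once per listed version

Dropping 2.10 would shrink that matrix and remove its version-specific
workarounds from the codebase.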
>> 
>> 
>>> 2. Remove Hadoop 1 support.
>> 
>> I'd go further and drop support for <2.2 for sure (2.0 and 2.1 were
>> sort of 'alpha' and 'beta' releases), and even <2.6.
>> 
>> I'm sure we'll think of a number of other small things -- shading a
>> bunch of stuff? Reviewing and updating dependencies, in light of the
>> simpler, more recent set we'd need to support from Hadoop, etc.?
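
On shading: the idea is to relocate a dependency's classes into a private
package so they can't conflict with a user's copy of the same library. A
sketch using sbt-assembly's ShadeRule API (illustrative only; Spark shades
through maven-shade-plugin, and the target package name is made up):

  // Sketch of a shading rule, with the sbt-assembly plugin applied.
  // Rewrites Guava's packages so applications can bring their own Guava.
  assemblyShadeRules in assembly := Seq(
    ShadeRule.rename("com.google.common.**" -> "org.sparkproject.guava.@1").inAll
  )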
>> 
>> Farming out Tachyon to a module? (I felt like someone proposed this?)
>> Pop out any Docker stuff to another repo?
>> Continue that same effort for EC2?
>> Farming out some of the "external" integrations to another repo?
>> (Possibly controversial.)
>> 
>> See also anything marked version "2+" in JIRA.
>> 
> 
> 
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org
