Sean Owen, Vinoo Ganesh, dev
Subject: Re: time for Apache Spark 3.0?
As far as I know, any JIRA that has implications for users is tagged this
way, but I haven't examined all of them. All that are going in for 3.0
should have it as Fix Version. Most changes won't have a user-visible
impact. Do you see any that seem to need the tag? Call them out, or even fix
them by
The release-notes label on JIRA sounds good. Can we make it a point to apply
it retroactively now, and then going forward?
On 11/12/18, 4:01 PM, "Sean Owen" wrote:
My non-definitive takes --
I would personally like to remove all deprecated methods for Spark 3.
I started by removing 'old' deprecated methods in that commit. Things
deprecated in 2.4 are maybe less clear as to whether they should be removed.
Everything's fair game for removal or change in a major
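The deprecate-then-remove cycle Sean describes can be sketched roughly like this (a hypothetical illustration; the class and method names here are invented for the example, not actual Spark APIs):

```java
// Hypothetical sketch of the deprecate-then-remove cycle discussed above.
// DeprecationSketch, count(), and countItems() are invented names.
public class DeprecationSketch {

    /**
     * Old API: kept through the 2.x line for source/binary compatibility.
     * @deprecated since 2.0.0; use {@link #countItems()}. Removed in 3.0.0.
     */
    @Deprecated
    static long count() {
        return countItems(); // delegate so both names stay consistent
    }

    /** Replacement API that survives the 3.0 cleanup. */
    static long countItems() {
        return 42L;
    }

    public static void main(String[] args) {
        // Until the major release, callers of the old name still work
        // (with a deprecation warning at compile time):
        System.out.println(count() == countItems()); // prints "true"
    }
}
```

At the major version bump, the `@Deprecated` member is simply deleted; callers who ignored the warning get a compile error rather than a silent behavior change.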
Makes sense, thanks Reynold.
From: Reynold Xin
Date: Monday, November 12, 2018 at 16:57
To: Vinoo Ganesh
Cc: Xiao Li, Matei Zaharia, Ryan Blue, Mark Hamstra, dev
Subject: Re: time for Apache Spark 3.0?
Master branch now tracks the 3.0.0-SNAPSHOT version, so the next one will be 3.0
Yes. We should create a SPIP for each major breaking change.
Reynold Xin <r...@databricks.com> wrote on Friday, September 28, 2018 at 11:05 PM:
i think we should create spips for some of them, since they are pretty
large ... i can create some tickets to start with
--
excuse the brevity and lower case due to wrist injury
On Fri, Sep 28, 2018 at 11:01 PM Xiao Li wrote:
Based on the above discussions, we have a "rough consensus" that the next
release will be 3.0. Now, we can start working on the API breaking changes
(e.g., the ones mentioned in the original email from Reynold).
Cheers,
Xiao
Matei Zaharia wrote on Thursday, September 6, 2018 at 2:21 PM:
Yes, you can start with Unstable and move to Evolving and Stable when needed.
We’ve definitely had experimental features that changed across maintenance
releases when they were well-isolated. If your change risks breaking stuff in
stable components of Spark though, then it probably won’t be
I meant flexibility beyond the point releases. I think what Reynold was
suggesting was getting v2 code out more often than the point releases every
6 months. An Evolving API can change in point releases, but maybe we should
move v2 to Unstable so it can change more often? I don't really see
Yes, that is why we have these annotations in the code and the
corresponding labels appearing in the API documentation:
https://github.com/apache/spark/blob/master/common/tags/src/main/java/org/apache/spark/annotation/InterfaceStability.java
As long as it is properly annotated, we can change or
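For reference, the stability markers Reynold points to can be approximated with a minimal, self-contained sketch. The annotation names mirror the spirit of Spark's InterfaceStability markers, but the definitions here are simplified, and `DataSourceV2Sketch` is an invented example interface, not a real Spark type:

```java
import java.lang.annotation.Documented;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Simplified sketch of stability markers in the spirit of Spark's
// InterfaceStability annotations; the real definitions live in the
// linked file and may differ in detail.
public class StabilitySketch {

    /** Contract is frozen; breaking changes only at a major release. */
    @Documented @Retention(RetentionPolicy.RUNTIME)
    @interface Stable {}

    /** May change between minor releases (e.g. 2.3 -> 2.4). */
    @Documented @Retention(RetentionPolicy.RUNTIME)
    @interface Evolving {}

    /** May change between point releases; use at your own risk. */
    @Documented @Retention(RetentionPolicy.RUNTIME)
    @interface Unstable {}

    // A hypothetical v2 data source interface marked Unstable, so it can
    // change more often than the 6-month release cadence discussed above.
    @Unstable
    interface DataSourceV2Sketch {
        String name();
    }

    public static void main(String[] args) {
        // The marker is retained at runtime, which is how documentation
        // tooling can surface the corresponding label in the API docs.
        System.out.println(
            DataSourceV2Sketch.class.isAnnotationPresent(Unstable.class)); // prints "true"
    }
}
```

Under this scheme, the Unstable -> Evolving -> Stable promotion Matei describes is just a one-line annotation change on the interface.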
...@gmail.com
To: vaquar khan <vaquar.k...@gmail.com>
Cc: Reynold Xin <r...@databricks.com>; Mridul Muralidharan <mri...@gmail.com>; Mark Hamstra <m...@clearstorydata.com>; 银狐 <andyye...@gmail.com>; user@spark.apache.org; ...@spark.apache.org
Sent: Thursday, September 6, 2018, 23:59
Subject: Re: time for Apache Spark 3.0?
It would be great to get more features out incrementally. For experimental
features, do we have more relaxed constraints?
On Thu, Sep 6, 2018 at 9:47 AM Reynold Xin wrote:
I definitely agree we shouldn't make dsv2 stable in the next release.
On Thu, Sep 6, 2018 at 9:48 AM Ryan Blue wrote:
I definitely support moving to 3.0 to remove deprecations and update
dependencies.
For the v2 work, we know that there will be major API changes and
standardization of behavior from the new logical plans going into the next
release. I think it is a safe bet that this isn't going to be
+1 on 3.0
Dsv2 stable can still evolve across major releases. DataFrame, Dataset,
dsv1 and a lot of other major features all were developed throughout the
1.x and 2.x lines.
I do want to explore ways for us to get dsv2 incremental changes out there
more frequently, to get feedback. Maybe that
I think this doesn't necessarily mean 3.0 is coming soon (thoughts on
timing? 6 months?) but simply next. Do you mean you'd prefer that change to
happen before 3.x? If it's a significant change, it seems reasonable to tie
it to a major version bump rather than a minor one. Is the concern that tying
it to 3.0 means
My concern is that the v2 data source API is still evolving and not very
close to stable. I had hoped to have stabilized the API and behaviors for a
3.0 release. But we could also wait on that for a 4.0 release, depending on
when we think that will be.
Unless there is a pressing need to move to
Yesterday, the 2.4 branch was created. Based on the above discussion, I
think we can bump the master branch to 3.0.0-SNAPSHOT. Any concern?
Thanks,
Xiao
vaquar khan wrote on Saturday, June 16, 2018 at 10:21 AM:
+1 for 2.4 next, followed by 3.0.
Where can we get the Apache Spark road map for 2.4, 2.5, and 3.0?
Is it possible to share a proposed specification for future releases, the
same as for past releases (https://spark.apache.org/releases/spark-release-2-3-0.html)?
Regards,
Vaquar khan
On Sat, Jun 16, 2018 at
Please ignore the link (YouTube) in my last email; not sure how it got added.
Apologies, I'm not sure how to delete it.
On Sat, Jun 16, 2018 at 11:58 AM, vaquar khan wrote:
+1
https://www.youtube.com/watch?v=-ik7aJ5U6kg
Regards,
Vaquar khan
On Fri, Jun 15, 2018 at 4:55 PM, Reynold Xin wrote:
+1
2018-06-15 14:55 GMT-07:00 Reynold Xin:
Yes. At this rate I think it's better to do 2.4 next, followed by 3.0.
On Fri, Jun 15, 2018 at 10:52 AM Mridul Muralidharan
wrote:
I agree, I don't see a pressing need for a major version bump either.
Regards,
Mridul
On Fri, Jun 15, 2018 at 10:25 AM Mark Hamstra wrote:
Changing major version numbers is not about new features or a vague notion
that it is time to do something that will be seen to be a significant
release. It is about breaking stable public APIs.
I still remain unconvinced that the next version can't be 2.4.0.
On Fri, Jun 15, 2018 at 1:34 AM Andy
Dear all:
It has been 2 months since this topic was proposed. Any progress? 2018 is
already about half over.
I agree that the new version should bring some exciting new features. How
about this one:
6. ML/DL framework to be integrated as a core component and feature. (Such
as Angel /
That certainly sounds beneficial, maybe to several other projects as well. If
there's no downside and it takes away API issues, seems like a win.
On Thu, Apr 19, 2018 at 5:28 AM Dean Wampler wrote:
I spoke with Martin Odersky and Lightbend's Scala Team about the known API
issue with method disambiguation. They offered to implement a small patch
in a new release of Scala 2.12 to handle the issue without requiring a
Spark API change. They would cut a 2.12.6 release for it. I'm told that
Scala
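The method-disambiguation issue Dean mentions has a close analogue in plain Java once lambdas enter the picture, which may help illustrate why Scala 2.12's SAM conversion causes trouble for Spark's overloaded APIs. A hedged sketch (all names invented; this is not Spark code):

```java
import java.util.function.Function;
import java.util.function.ToIntFunction;

// Sketch of the overload-disambiguation problem. Once a lambda can satisfy
// more than one functional-interface overload, implicitly typed lambdas
// become ambiguous; Scala 2.12 hits the analogous problem when Scala
// lambdas gain SAM conversion against overloaded Java-friendly APIs.
public class AmbiguitySketch {

    static String apply(Function<Integer, Integer> f) { return "Function"; }
    static String apply(ToIntFunction<Integer> f)     { return "ToIntFunction"; }

    public static void main(String[] args) {
        // apply(x -> x + 1);   // does NOT compile: the call is ambiguous,
        //                      // since the implicitly typed lambda matches
        //                      // both unrelated functional interfaces.

        // Callers must disambiguate with an explicit target type:
        String picked = apply((Function<Integer, Integer>) (x -> x + 1));
        System.out.println(picked); // prints "Function"
    }
}
```

The compiler patch discussed above would let Scala resolve such calls without forcing Spark to change its overloads, which is why it avoids a Spark API change.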
On Thu, Apr 5, 2018 at 10:30 AM, Matei Zaharia wrote:
On 5 Apr 2018, at 18:04, Matei Zaharia wrote:
Java 9/10 support would be great to add as well.
Be aware that the work moving Hadoop core to Java 9+ is still a big piece of
work being undertaken by Akira Ajisaka & colleagues at NTT
Oh, forgot to add, but splitting the source tree in Scala also creates the
issue of a big maintenance burden for third-party libraries built on Spark. As
Josh said on the JIRA:
"I think this is primarily going to be an issue for end users who want to use
an existing source tree to
Sorry, but just to be clear here, this is the 2.12 API issue:
https://issues.apache.org/jira/browse/SPARK-14643, with more details in this
doc:
https://docs.google.com/document/d/1P_wmH3U356f079AYgSsN53HKixuNdxSEvo8nw_tgLgM/edit.
Basically, if we are allowed to change Spark’s API a little to
I remember seeing somewhere that Scala still has some issues with Java
9/10 so that might be hard...
But on that topic, it might be better to shoot for Java 11
compatibility. 9 and 10, following the new release model, aren't
really meant to be long-term releases.
In general, agree with Sean
Java 9/10 support would be great to add as well.
Regarding Scala 2.12, I thought that supporting it would become easier if we
change the Spark API and ABI slightly. Basically, it is of course possible to
create an alternate source tree today, but it might be possible to share the
same source
Hi all,
I also agree with Mark that we should add Java 9/10 support to an eventual
Spark 3.0 release. Supporting Java 9 is not a trivial task, since we are
using some internal APIs for memory management which have changed:
either we find a solution which works on both (but I am not sure it
As with Sean, I'm not sure that this will require a new major version, but
we should also be looking at Java 9 & 10 support -- particularly with
regard to their better functionality in a containerized environment (memory
limits from cgroups, not sysconf; support for cpusets). In that regard, we
On Wed, Apr 4, 2018 at 6:20 PM Reynold Xin wrote:
> The primary motivating factor IMO for a major version bump is to support
> Scala 2.12, which requires minor API breaking changes to Spark’s APIs.
> Similar to Spark 2.0, I think there are also opportunities for other
>