Re: Apache Spark 3.4.1 Release?

2023-06-11 Thread Dongjoon Hyun
Thank you all.

I'll check and prepare `branch-3.4` for the target date, June 20th.

Dongjoon.


On Fri, Jun 9, 2023 at 10:47 PM yangjie01  wrote:

> +1
>
>
>
> Thank you Dongjoon ~
>
>
>
> *From:* Ruifeng Zheng 
> *Date:* Saturday, June 10, 2023, 09:39
> *To:* Xiao Li 
> *Cc:* Wenchen Fan , Xinrong Meng <
> xinr...@apache.org>, dev 
> *Subject:* Re: Apache Spark 3.4.1 Release?
>
>
>
> +1
>
>
>
> Thank you Dongjoon!
>
>
>
>
>
> On Fri, Jun 9, 2023 at 11:54 PM Xiao Li 
> wrote:
>
> +1
>
>
>
> On Fri, Jun 9, 2023 at 08:30 Wenchen Fan  wrote:
>
> +1
>
>
>
> On Fri, Jun 9, 2023 at 8:52 PM Xinrong Meng  wrote:
>
> +1. Thank you Dongjoon!
>
>
>
> Thanks,
>
>
>
> Xinrong Meng
>
>
>
> Mridul Muralidharan wrote on Friday, June 9, 2023 at 5:22 AM:
>
>
>
> +1, thanks Dongjoon !
>
>
>
> Regards,
>
> Mridul
>
>
>
> On Thu, Jun 8, 2023 at 7:16 PM Jia Fan 
> wrote:
>
> +1
>
>
> 
>
>
>
> Jia Fan
>
>
>
>
> On June 9, 2023, at 08:00, Yuming Wang wrote:
>
>
>
> +1.
>
>
>
> On Fri, Jun 9, 2023 at 7:14 AM Chao Sun  wrote:
>
> +1 too
>
> On Thu, Jun 8, 2023 at 2:34 PM kazuyuki tanimura
>  wrote:
> >
> > +1 (non-binding), Thank you Dongjoon
> >
> > Kazu
> >
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>
>
>


Re: ASF policy violation and Scala version issues

2023-06-11 Thread yangjie01
Yes, you're right.

From: Jungtaek Lim 
Date: Monday, June 12, 2023, 11:37
To: Dongjoon Hyun 
Cc: yangjie01 , Grisha Weintraub 
, Nan Zhu , Sean Owen 
, "dev@spark.apache.org" 
Subject: Re: ASF policy violation and Scala version issues

Are we concerned that a library has not released a new version bumping the 
Scala version, when that Scala version was announced less than a week ago?
Shall we respect the efforts of all maintainers of the open source projects we 
use as dependencies, regardless of whether they are ASF projects or 
individuals? Individual projects are run by volunteers (unlike projects backed 
by small and big companies). Please remember they have day jobs separate from 
these projects.

Also, if you look at the thread for 2.13.11, they found two regressions in 
only 3 days, even before they announced the version. Bumping a bugfix version 
is not always safe, especially for Scala, which effectively uses semver 
shifted one level down: its minor version is roughly another project's major 
version (a similar amount of pain on upgrade).

Btw, I see this is an effort to support JDK 21, but GA of JDK 21 is planned 
for September 19, according to a post in InfoQ. Do we need to be coupled to a 
Java version that is not even released yet? Shall we postpone this to Spark 
4.0, and treat supporting JDK 21 as a stretch goal for Spark 3.5 rather than a 
blocker?
This is not a complete view, but one post about JDK usage among LTS versions 
shows that JDK 17 adoption is still less than 10% although it was released 1.5 
years ago, and last year it was less than 0.5%. In the real world, Java 11 is 
still the majority and still growing, and 17 is slowly catching up. Even if 
JDK 21 were released tomorrow, we would have more than a year to add support 
for it.



On Mon, Jun 12, 2023 at 4:54 AM Dongjoon Hyun  wrote:
Yes, that's exactly the pain point. I totally agree with you.
For now, we are focusing more on other work, but we need to resolve this 
situation soon.

Dongjoon.


On Sun, Jun 11, 2023 at 1:21 AM yangjie01  wrote:
Perhaps we should reconsider our reliance on and use of Ammonite? There are 
still no new Ammonite releases one week after the release of Scala 2.12.18 
and 2.13.11. The question about a release raised in the Ammonite community 
has also not received a response, which I did not expect. Of course, we can 
also wait a while before making a decision.

```
Scala version upgrade is blocked by the Ammonite library dev cycle currently.

Although we discussed it here and it had good intentions,
the current master branch cannot use the latest Scala.

- https://lists.apache.org/thread/4nk5ddtmlobdt8g3z8xbqjclzkhlsdfk
"Ammonite as REPL for Spark Connect"
 SPARK-42884 Add Ammonite REPL integration

Specifically, the following are blocked and I'm monitoring the Ammonite 
repository.
- SPARK-40497 Upgrade Scala to 2.13.11
- SPARK-43832 Upgrade Scala to 2.12.18
- According to https://github.com/com-lihaoyi/Ammonite/issues,
  Scala 3.3.0 LTS support also looks infeasible.

Although we may be able to wait for a while, there are two fundamental 
solutions to unblock this situation from a long-term maintenance perspective.
- Replace it with a scala-shell based implementation
- Move `connector/connect/client/jvm/pom.xml` out of the Spark repo.
  Maybe we can put it into a new repo, like the Rust and Go clients.
```
From: Grisha Weintraub 
Date: Thursday, June 8, 2023, 04:05
To: Dongjoon Hyun 
Cc: Nan Zhu , Sean Owen , 
"dev@spark.apache.org" 
Subject: Re: ASF policy violation and Scala version issues

Dongjoon,

I followed the conversation, and in my opinion, your concern is totally legit.
It just feels that the discussion is focused solely on Databricks, and as I 
said above, the same issue occurs with other vendors as well.


On Wed, Jun 7, 2023 at 10:28 PM Dongjoon Hyun  wrote:
To Grisha, we are talking about what is the right way and how to comply with 
ASF legal advice 

Re: ASF policy violation and Scala version issues

2023-06-11 Thread Jungtaek Lim
Are we concerned that a library has not released a new version bumping
the Scala version, when that Scala version was announced less than a week ago?
Shall we respect the efforts of all maintainers of the open source projects
we use as dependencies, regardless of whether they are ASF projects or
individuals? Individual projects are run by volunteers (unlike projects
backed by small and big companies). Please remember they have day jobs
separate from these projects.

Also, if you look at the thread for 2.13.11, they found two regressions in
only 3 days, even before they announced the version. Bumping a bugfix version
is not always safe, especially for Scala, which effectively uses semver
shifted one level down: its minor version is roughly another project's major
version (a similar amount of pain on upgrade).

Btw, I see this is an effort to support JDK 21, but GA of JDK 21 is planned
for September 19, according to a post in InfoQ. Do we need to be coupled to a
Java version that is not even released yet? Shall we postpone this to Spark
4.0, and treat supporting JDK 21 as a stretch goal for Spark 3.5 rather than
a blocker?
This is not a complete view, but one post about JDK usage among LTS versions
shows that JDK 17 adoption is still less than 10% although it was released
1.5 years ago, and last year it was less than 0.5%. In the real world, Java
11 is still the majority and still growing, and 17 is slowly catching up.
Even if JDK 21 were released tomorrow, we would have more than a year to
add support for it.



On Mon, Jun 12, 2023 at 4:54 AM Dongjoon Hyun 
wrote:

> Yes, that's exactly the pain point. I totally agree with you.
> For now, we are focusing more on other work, but we need to resolve this
> situation soon.
>
> Dongjoon.
>
>
> On Sun, Jun 11, 2023 at 1:21 AM yangjie01  wrote:
>
>> Perhaps we should reconsider our reliance on and use of Ammonite? There
>> are still no new Ammonite releases one week after the release of Scala
>> 2.12.18 and 2.13.11. The question about a release raised in the Ammonite
>> community has also not received a response, which I did not expect. Of
>> course, we can also wait a while before making a decision.
>>
>>
>>
>> ```
>>
>> Scala version upgrade is blocked by the Ammonite library dev cycle
>> currently.
>>
>> Although we discussed it here and it had good intentions,
>> the current master branch cannot use the latest Scala.
>>
>> - https://lists.apache.org/thread/4nk5ddtmlobdt8g3z8xbqjclzkhlsdfk
>> "Ammonite as REPL for Spark Connect"
>>  SPARK-42884 Add Ammonite REPL integration
>>
>> Specifically, the following are blocked and I'm monitoring the
>> Ammonite repository.
>> - SPARK-40497 Upgrade Scala to 2.13.11
>> - SPARK-43832 Upgrade Scala to 2.12.18
>> - According to https://github.com/com-lihaoyi/Ammonite/issues,
>>   Scala 3.3.0 LTS support also looks infeasible.
>>
>> Although we may be able to wait for a while, there are two fundamental
>> solutions to unblock this situation from a long-term maintenance
>> perspective.
>> - Replace it with a scala-shell based implementation
>> - Move `connector/connect/client/jvm/pom.xml` out of the Spark repo.
>>   Maybe we can put it into a new repo, like the Rust and Go clients.
>>
>> ```
>>
>> *From:* Grisha Weintraub 
>> *Date:* Thursday, June 8, 2023, 04:05
>> *To:* Dongjoon Hyun 
>> *Cc:* Nan Zhu , Sean Owen , "dev@spark.apache.org" 
>> *Subject:* Re: ASF policy violation and Scala version issues
>>
>>
>>
>> Dongjoon,
>>
>>
>>
>> I followed the conversation, and in my opinion, your concern is totally
>> legit.
>> It just feels that the discussion is focused solely on Databricks, and as
>> I said above, the same issue occurs with other vendors as well.
>>
>>
>>
>>
>>
>> On Wed, Jun 7, 2023 at 10:28 PM Dongjoon Hyun 
>> wrote:
>>
>> To Grisha, we are talking about what is the right way and how to comply
>> with the ASF legal advice which I shared in this thread from the
>> "legal-discuss@" mailing list.
>>
>>
>>
>> https://lists.apache.org/thread/mzhggd0rpz8t4d7vdsbhkp38mvd3lty4 (legal-discuss@)
>>
>> https://www.apache.org/foundation/marks/downstream.html#source (ASF
>> 

Re: ASF policy violation and Scala version issues

2023-06-11 Thread Dongjoon Hyun
Yes, that's exactly the pain point. I totally agree with you.
For now, we are focusing more on other work, but we need to resolve this
situation soon.

Dongjoon.


On Sun, Jun 11, 2023 at 1:21 AM yangjie01  wrote:

> Perhaps we should reconsider our reliance on and use of Ammonite? There
> are still no new Ammonite releases one week after the release of Scala
> 2.12.18 and 2.13.11. The question about a release raised in the Ammonite
> community has also not received a response, which I did not expect. Of
> course, we can also wait a while before making a decision.
>
>
>
> ```
>
> Scala version upgrade is blocked by the Ammonite library dev cycle
> currently.
>
> Although we discussed it here and it had good intentions,
> the current master branch cannot use the latest Scala.
>
> - https://lists.apache.org/thread/4nk5ddtmlobdt8g3z8xbqjclzkhlsdfk
> "Ammonite as REPL for Spark Connect"
>  SPARK-42884 Add Ammonite REPL integration
>
> Specifically, the following are blocked and I'm monitoring the
> Ammonite repository.
> - SPARK-40497 Upgrade Scala to 2.13.11
> - SPARK-43832 Upgrade Scala to 2.12.18
> - According to https://github.com/com-lihaoyi/Ammonite/issues,
>   Scala 3.3.0 LTS support also looks infeasible.
>
> Although we may be able to wait for a while, there are two fundamental
> solutions to unblock this situation from a long-term maintenance
> perspective.
> - Replace it with a scala-shell based implementation
> - Move `connector/connect/client/jvm/pom.xml` out of the Spark repo.
>   Maybe we can put it into a new repo, like the Rust and Go clients.
>
> ```
>
> *From:* Grisha Weintraub 
> *Date:* Thursday, June 8, 2023, 04:05
> *To:* Dongjoon Hyun 
> *Cc:* Nan Zhu , Sean Owen , "dev@spark.apache.org" 
> *Subject:* Re: ASF policy violation and Scala version issues
>
>
>
> Dongjoon,
>
>
>
> I followed the conversation, and in my opinion, your concern is totally
> legit.
> It just feels that the discussion is focused solely on Databricks, and as
> I said above, the same issue occurs with other vendors as well.
>
>
>
>
>
> On Wed, Jun 7, 2023 at 10:28 PM Dongjoon Hyun 
> wrote:
>
> To Grisha, we are talking about what is the right way and how to comply
> with the ASF legal advice which I shared in this thread from the
> "legal-discuss@" mailing list.
>
>
>
> https://lists.apache.org/thread/mzhggd0rpz8t4d7vdsbhkp38mvd3lty4 (legal-discuss@)
>
> https://www.apache.org/foundation/marks/downstream.html#source (ASF Website)
>
>
>
> Dongjoon
>
>
>
>
>
> On Wed, Jun 7, 2023 at 12:16 PM Grisha Weintraub <
> grisha.weintr...@gmail.com> wrote:
>
> Yes, in Spark UI you have it as "3.1.2-amazon", but when you create a
> cluster it's just Spark 3.1.2.
>
>
>
> On Wed, Jun 7, 2023 at 10:05 PM Nan Zhu  wrote:
>
>
>
>  for EMR, I think they show 3.1.2-amazon in Spark UI, no?
>
>
>
>
>
> On Wed, Jun 7, 2023 at 11:30 Grisha Weintraub 
> wrote:
>
> Hi,
>
>
>
> I am not taking sides here, but just for fairness, I think it should be
> noted that AWS EMR does exactly the same thing.
>
> We choose the EMR version (e.g., 6.4.0) and it has an associated Spark
> version (e.g., 3.1.2).
>
> The Spark version here is not the original Apache version but AWS Spark
> distribution.
>
>
>
> On Wed, Jun 7, 2023 at 8:24 PM Dongjoon Hyun 
> wrote:
>
> I disagree with you in several ways.
>
>
>
> The following is not a *minor* change like the given examples (alterations
> to the start-up and shutdown scripts, configuration files, file layout
> etc.).
>
>
>
> > The change you cite meets the 4th point, minor change, made for
> integration reasons.
>
>
>
> The following is also wrong. There was no such state of Apache Spark 3.4.0
> after the 3.4.0 tag was created. The Apache Spark community did not allow
> the Scala-reverting patches in either the `master` branch or `branch-3.4`.
>
>
>
> > There is no known technical objection; this was after all at one point
> the state of Apache Spark.
>
>
>
> Is the following your main point? So, you are selling a box "including
> Harry Potter by J. K. Rowling whose main character is Barry instead of
> Harry", but it's okay because you didn't sell the book itself? And, as a
> cloud vendor, you lend the box instead of selling it, like a private
> library?
>
>
>
> > There is no standalone distribution of Apache Spark anywhere here.
>
>
>
> We are not asking a big thing. Why are you so 

Re: ASF policy violation and Scala version issues

2023-06-11 Thread yangjie01
Perhaps we should reconsider our reliance on and use of Ammonite? There are 
still no new Ammonite releases one week after the release of Scala 2.12.18 
and 2.13.11. The question about a release raised in the Ammonite community 
has also not received a response, which I did not expect. Of course, we can 
also wait a while before making a decision.

```
Scala version upgrade is blocked by the Ammonite library dev cycle currently.

Although we discussed it here and it had good intentions,
the current master branch cannot use the latest Scala.

- https://lists.apache.org/thread/4nk5ddtmlobdt8g3z8xbqjclzkhlsdfk
"Ammonite as REPL for Spark Connect"
 SPARK-42884 Add Ammonite REPL integration

Specifically, the following are blocked and I'm monitoring the Ammonite 
repository.
- SPARK-40497 Upgrade Scala to 2.13.11
- SPARK-43832 Upgrade Scala to 2.12.18
- According to https://github.com/com-lihaoyi/Ammonite/issues,
  Scala 3.3.0 LTS support also looks infeasible.

Although we may be able to wait for a while, there are two fundamental 
solutions to unblock this situation from a long-term maintenance perspective.
- Replace it with a scala-shell based implementation
- Move `connector/connect/client/jvm/pom.xml` out of the Spark repo.
  Maybe we can put it into a new repo, like the Rust and Go clients.
```
From: Grisha Weintraub 
Date: Thursday, June 8, 2023, 04:05
To: Dongjoon Hyun 
Cc: Nan Zhu , Sean Owen , 
"dev@spark.apache.org" 
Subject: Re: ASF policy violation and Scala version issues

Dongjoon,

I followed the conversation, and in my opinion, your concern is totally legit.
It just feels that the discussion is focused solely on Databricks, and as I 
said above, the same issue occurs with other vendors as well.


On Wed, Jun 7, 2023 at 10:28 PM Dongjoon Hyun  wrote:
To Grisha, we are talking about what is the right way and how to comply with 
the ASF legal advice which I shared in this thread from the "legal-discuss@" 
mailing list.

https://lists.apache.org/thread/mzhggd0rpz8t4d7vdsbhkp38mvd3lty4 (legal-discuss@)
https://www.apache.org/foundation/marks/downstream.html#source (ASF Website)

Dongjoon


On Wed, Jun 7, 2023 at 12:16 PM Grisha Weintraub  wrote:
Yes, in Spark UI you have it as "3.1.2-amazon", but when you create a cluster 
it's just Spark 3.1.2.
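A small sketch (mine, not from the thread; the version strings are just the examples above) of why the suffix matters: a vendor build is mechanically distinguishable from an upstream release only when the reported version string carries a marker such as `-amazon`.

```python
import re

# Sketch (assumption, not vendor documentation): detect a vendor suffix in a
# reported Spark version string. A bare "3.1.2" is indistinguishable from the
# upstream Apache release; "3.1.2-amazon" identifies a vendor distribution.

def parse_spark_version(version: str):
    """Split a version string into (upstream version, vendor marker or None)."""
    m = re.fullmatch(r"(\d+\.\d+\.\d+)(?:-(\w+))?", version)
    if not m:
        raise ValueError(f"unrecognized version string: {version}")
    upstream, vendor = m.groups()
    return upstream, vendor

print(parse_spark_version("3.1.2-amazon"))  # ('3.1.2', 'amazon')
print(parse_spark_version("3.1.2"))         # ('3.1.2', None)
```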

On Wed, Jun 7, 2023 at 10:05 PM Nan Zhu  wrote:

 for EMR, I think they show 3.1.2-amazon in Spark UI, no?


On Wed, Jun 7, 2023 at 11:30 Grisha Weintraub  wrote:
Hi,

I am not taking sides here, but just for fairness, I think it should be noted 
that AWS EMR does exactly the same thing.
We choose the EMR version (e.g., 6.4.0) and it has an associated Spark version 
(e.g., 3.1.2).
The Spark version here is not the original Apache version but AWS Spark 
distribution.

On Wed, Jun 7, 2023 at 8:24 PM Dongjoon Hyun  wrote:
I disagree with you in several ways.

The following is not a *minor* change like the given examples (alterations to 
the start-up and shutdown scripts, configuration files, file layout etc.).

> The change you cite meets the 4th point, minor change, made for integration 
> reasons.

The following is also wrong. There was no such state of Apache Spark 3.4.0 
after the 3.4.0 tag was created. The Apache Spark community did not allow the 
Scala-reverting patches in either the `master` branch or `branch-3.4`.

> There is no known technical objection; this was after all at one point the 
> state of Apache Spark.

Is the following your main point? So, you are selling a box "including Harry 
Potter by J. K. Rowling whose main character is Barry instead of Harry", but 
it's okay because you didn't sell the book itself? And, as a cloud vendor, you 
lend the box instead of selling it, like a private library?

> There is no standalone distribution of Apache Spark anywhere here.

We are not asking for a big thing. Why are you so reluctant to say you are not 
"Apache Spark 3.4.0" by simply saying "Apache Spark 3.4.0-databricks"? What is 
the marketing reason here?

Dongjoon.


On Wed, Jun 7, 2023 at 9:27 AM Sean Owen  wrote:
Hi Dongjoon, I think this conversation is not advancing anymore. I personally 
consider the matter closed unless you can find other support or respond with