Re: ASF policy violation and Scala version issues

2023-06-06 Thread Dongjoon Hyun
Hi, All and Matei (as the Chair of Spark PMC).

For the ASF policy violation part, here is the draft legal recommendation
documentation from `legal-discuss@`.

https://www.apache.org/foundation/marks/downstream.html#source

> A version number must be used that both clearly differentiates it from an
Apache Software Foundation release and clearly identifies the Apache
Software Foundation version on which the software is based.

In short, Databricks should not label its product "Apache Spark
3.4.0"; the version number should clearly differentiate it from Apache
Spark 3.4.0. I hope we can conclude this together along these lines and move
our focus to the other remaining issues.
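
For illustration, a hypothetical downstream build setting (the vendor suffix is
made up and not a recommendation of any particular scheme) that would satisfy
both requirements quoted above, whereas a bare "3.4.0" would not:

    // Hypothetical sbt setting for a modified downstream distribution:
    // it identifies the Apache Spark 3.4.0 base and clearly differentiates the build.
    version := "3.4.0-vendorX.1"   // based on Apache Spark 3.4.0, but clearly distinct
    // version := "3.4.0"          // not acceptable for a modified distribution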

To Matei, could you officially follow up on the legal side with Databricks
using the above information?

If anyone is to do this, I believe you are the best person to drive it.

Thank you in advance.

Dongjoon.


On Tue, Jun 6, 2023 at 2:49 PM Dongjoon Hyun  wrote:

> It goes to "legal-discuss@".
>
> https://lists.apache.org/thread/mzhggd0rpz8t4d7vdsbhkp38mvd3lty4
>
> I hope we can conclude the legal part clearly and promptly, one way or
> another, so that we can follow the conclusion with confidence.
>
> Dongjoon
>
> On 2023/06/06 20:06:42 Dongjoon Hyun wrote:
> > Thank you, Sean, Mich, Holden, again.
> >
> > For this specific part, let's ask the ASF board via bo...@apache.org to
> > find the right answer, because this is a controversial legal issue.
> >
> > > I think you'd just prefer Databricks make a different choice, which is
> > legitimate, but, an issue to take up with Databricks, not here.
> >
> > Dongjoon.
> >
>


Re: JDK version support policy?

2023-06-06 Thread Dongjoon Hyun
I'm also +1 on dropping both Java 8 and 11 in Apache Spark 4.0.

Dongjoon.

On 2023/06/07 02:42:19 yangjie01 wrote:
> +1 on dropping Java 8 in Spark 4.0, and I even hope Spark 4.0 can only 
> support Java 17 and the upcoming Java 21.
> 
> From: Denny Lee 
> Date: Wednesday, June 7, 2023, 07:10
> To: Sean Owen 
> Cc: David Li , "dev@spark.apache.org" 
> 
> Subject: Re: JDK version support policy?
> 
> +1 on dropping Java 8 in Spark 4.0, saying this as a fan of the fast-paced 
> (positive) updates to Arrow, eh?!
> 
> On Tue, Jun 6, 2023 at 4:02 PM Sean Owen  wrote:
> I haven't followed this discussion closely, but I think we could/should drop 
> Java 8 in Spark 4.0, which is up next after 3.5?
> 
> On Tue, Jun 6, 2023 at 2:44 PM David Li  wrote:
> Hello Spark developers,
> 
> I'm from the Apache Arrow project. We've discussed Java version support [1], 
> and crucially, whether to continue supporting Java 8 or not. As Spark is a 
> big user of Arrow in Java, I was curious what Spark's policy here was.
> 
> If Spark intends to stay on Java 8, for instance, we may also want to stay on 
> Java 8 or otherwise provide some supported version of Arrow for Java 8.
> 
> We've seen dependencies dropping or planning to drop support. gRPC may drop 
> Java 8 at any time [2], possibly this September [3], which may affect Spark 
> (due to Spark Connect). And today we saw that Arrow had issues running tests 
> with Mockito on Java 20, but we couldn't update Mockito since it had dropped 
> Java 8 support. (We pinned the JDK version in that CI pipeline for now.)
> 
> So at least, I am curious if Arrow could start the long process of migrating 
> Java versions without impacting Spark, or if we should continue to cooperate. 
> Arrow Java doesn't see quite so much activity these days, so it's not quite 
> critical, but it's possible that these dependency issues will start to affect 
> us more soon. And looking forward, Java is working on APIs that should also 
> allow us to ditch the --add-opens flag requirement too.
> 
> [1]: 
> https://lists.apache.org/thread/phpgpydtt3yrgnncdyv4qdq1gf02s0yj
> [2]: 
> https://github.com/grpc/proposal/blob/master/P5-jdk-version-support.md
> [3]: 
> https://github.com/grpc/grpc-java/issues/9386
> 




Re: JDK version support policy?

2023-06-06 Thread yangjie01
+1 on dropping Java 8 in Spark 4.0, and I even hope Spark 4.0 can only support 
Java 17 and the upcoming Java 21.
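
As a purely illustrative aside, if Spark 4.0 did require Java 17 or newer, a
build could fail fast on older JVMs with a guard along these lines (a sketch;
the threshold and message are assumptions for illustration only):

    // Illustrative sketch: refuse to start on a JVM older than Java 17.
    object RequireJava17 {
      def main(args: Array[String]): Unit = {
        val spec = System.getProperty("java.specification.version") // "1.8", "11", "17", "21", ...
        val major =
          if (spec.startsWith("1.")) spec.stripPrefix("1.").takeWhile(_.isDigit).toInt
          else spec.takeWhile(_.isDigit).toInt
        require(major >= 17, s"Java 17 or newer is required, but found Java $major")
        println(s"OK: running on Java $major")
      }
    }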

From: Denny Lee 
Date: Wednesday, June 7, 2023, 07:10
To: Sean Owen 
Cc: David Li , "dev@spark.apache.org" 

Subject: Re: JDK version support policy?

+1 on dropping Java 8 in Spark 4.0, saying this as a fan of the fast-paced 
(positive) updates to Arrow, eh?!

On Tue, Jun 6, 2023 at 4:02 PM Sean Owen  wrote:
I haven't followed this discussion closely, but I think we could/should drop 
Java 8 in Spark 4.0, which is up next after 3.5?

On Tue, Jun 6, 2023 at 2:44 PM David Li  wrote:
Hello Spark developers,

I'm from the Apache Arrow project. We've discussed Java version support [1], 
and crucially, whether to continue supporting Java 8 or not. As Spark is a big 
user of Arrow in Java, I was curious what Spark's policy here was.

If Spark intends to stay on Java 8, for instance, we may also want to stay on 
Java 8 or otherwise provide some supported version of Arrow for Java 8.

We've seen dependencies dropping or planning to drop support. gRPC may drop 
Java 8 at any time [2], possibly this September [3], which may affect Spark 
(due to Spark Connect). And today we saw that Arrow had issues running tests 
with Mockito on Java 20, but we couldn't update Mockito since it had dropped 
Java 8 support. (We pinned the JDK version in that CI pipeline for now.)

So at least, I am curious if Arrow could start the long process of migrating 
Java versions without impacting Spark, or if we should continue to cooperate. 
Arrow Java doesn't see quite so much activity these days, so it's not quite 
critical, but it's possible that these dependency issues will start to affect 
us more soon. And looking forward, Java is working on APIs that should also 
allow us to ditch the --add-opens flag requirement too.

[1]: 
https://lists.apache.org/thread/phpgpydtt3yrgnncdyv4qdq1gf02s0yj
[2]: 
https://github.com/grpc/proposal/blob/master/P5-jdk-version-support.md
[3]: 
https://github.com/grpc/grpc-java/issues/9386


Re: ASF policy violation and Scala version issues

2023-06-06 Thread Mich Talebzadeh
Hello,

This explanation is splendidly detailed and deserves further study.
However, as a first thought on the point raised below, which I quote:

"... There is a company claiming something non-Apache like "Apache Spark
3.4.0 minus SPARK-40436" with the name "Apache Spark 3.4.0."

There is a potential risk for consumers of this offering, which can be
described as follows:
To maintain the integrity of the Apache Spark project and ensure reliable
and secure software, it is common practice to use official releases from
the ASF. If a third-party company claims to provide a modified version of
Apache Spark (in the form of software as a service), it is strongly
recommended that consumers carefully review the modifications involved,
understand the reasoning behind these modifications and/or omissions, and
evaluate the potential implications before using and maintaining this
offering in production environments. The third-party company has to clearly
state and advertise the reasoning behind this so-called hacking,
specifically with reference to
 "### 3. Action Items ### --We should communicate and help the company to
fix the misleading messages and remove Scala-version segmentation
situations per Spark version".
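
For illustration, a minimal spark-shell sketch (assuming `spark` is the
predefined SparkSession) that prints the versions a given runtime actually
ships, for comparison against the official Apache Spark 3.4.0 release, which
is built with Scala 2.12.17:

    // Print the Spark, Scala and Java versions of the running environment.
    println(s"Spark version: ${spark.version}")
    println(s"Scala version: ${scala.util.Properties.versionNumberString}")
    println(s"Java version:  ${System.getProperty("java.version")}")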

HTH

Mich Talebzadeh,
Lead Solutions Architect/Engineering Lead
Palantir Technologies Limited
London
United Kingdom


   view my Linkedin profile



 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Mon, 5 Jun 2023 at 08:46, Dongjoon Hyun  wrote:

> Hi, All and Matei (as the Chair of Apache Spark PMC).
>
> Sorry for the long email; I want to share two topics and the corresponding
> action items.
> You can go to "Section 3: Action Items" directly for the conclusion.
>
>
> ### 1. ASF Policy Violation ###
>
> ASF has a rule for "MAY I CALL MY MODIFIED CODE 'APACHE'?"
>
> https://www.apache.org/foundation/license-faq.html#Name-changes
>
> For example, when we say `Apache Spark 3.4.0`, it's supposed to be the
> same as one of our official distributions.
>
> https://downloads.apache.org/spark/spark-3.4.0/
>
> Specifically, in terms of the Scala version, we believe it should have
> Scala 2.12.17 because of 'SPARK-40436 Upgrade Scala to 2.12.17'.
>
> There is a company claiming something non-Apache like "Apache Spark 3.4.0
> minus SPARK-40436" with the name "Apache Spark 3.4.0."
>
> - The company website shows "X.Y (includes Apache Spark 3.4.0, Scala
> 2.12)"
> - The runtime logs "23/06/05 04:23:27 INFO SparkContext: Running Spark
> version 3.4.0"
> - UI shows Apache Spark logo and `3.4.0`.
> - However, Scala Version is '2.12.15'
>
> [image: Screenshot 2023-06-04 at 9.37.16 PM.png][image: Screenshot
> 2023-06-04 at 10.14.45 PM.png]
>
> Lastly, this is not a single instance. For example, the same company also
> claims "Apache Spark 3.3.2" with a mismatched Scala version.
>
>
> ### 2. Scala Issues ###
>
> In addition to (1), although we proceeded with good intentions and great
> care
> including dev mailing list discussion, there are several concerning areas
> which
> need more attention and our love.
>
> a) Scala Spark users will experience UX inconvenience from Spark 3.5.
>
> SPARK-42493 Make Python the first tab for code examples
>
> For the record, we discussed it here.
> - https://lists.apache.org/thread/1p8s09ysrh4jqsfd47qdtrl7rm4rrs05
>   "[DISCUSS] Show Python code examples first in Spark documentation"
>
> b) The Scala version upgrade is currently blocked by the Ammonite library's
> dev cycle.
>
> Although we discussed it here and it had good intentions,
> the current master branch cannot use the latest Scala.
>
> - https://lists.apache.org/thread/4nk5ddtmlobdt8g3z8xbqjclzkhlsdfk
> "Ammonite as REPL for Spark Connect"
>  SPARK-42884 Add Ammonite REPL integration
>
> Specifically, the following are blocked and I'm monitoring the
> Ammonite repository.
> - SPARK-40497 Upgrade Scala to 2.13.11
> - SPARK-43832 Upgrade Scala to 2.12.18
> - According to https://github.com/com-lihaoyi/Ammonite/issues ,
>   Scala 3.3.0 LTS support also looks infeasible.
>
> Although we may be able to wait for a while, there are two fundamental
> solutions to unblock this situation from a long-term maintenance perspective.
> - Replace it with a Scala-shell based implementation
> - Move `connector/connect/client/jvm/pom.xml` outside the Spark repo.
>   Maybe we can put it into a new repo, like the Rust and Go clients.
>
> c) Scala 2.13 and above needs Apache Spark 4.0.
>
> In "Apache Spark 3.5.0 Expectations?" and 

Re: JDK version support policy?

2023-06-06 Thread Denny Lee
+1 on dropping Java 8 in Spark 4.0, saying this as a fan of the fast-paced
(positive) updates to Arrow, eh?!

On Tue, Jun 6, 2023 at 4:02 PM Sean Owen  wrote:

> I haven't followed this discussion closely, but I think we could/should
> drop Java 8 in Spark 4.0, which is up next after 3.5?
>
> On Tue, Jun 6, 2023 at 2:44 PM David Li  wrote:
>
>> Hello Spark developers,
>>
>> I'm from the Apache Arrow project. We've discussed Java version support
>> [1], and crucially, whether to continue supporting Java 8 or not. As Spark
>> is a big user of Arrow in Java, I was curious what Spark's policy here was.
>>
>> If Spark intends to stay on Java 8, for instance, we may also want to
>> stay on Java 8 or otherwise provide some supported version of Arrow for
>> Java 8.
>>
>> We've seen dependencies dropping or planning to drop support. gRPC may
>> drop Java 8 at any time [2], possibly this September [3], which may affect
>> Spark (due to Spark Connect). And today we saw that Arrow had issues
>> running tests with Mockito on Java 20, but we couldn't update Mockito since
>> it had dropped Java 8 support. (We pinned the JDK version in that CI
>> pipeline for now.)
>>
>> So at least, I am curious if Arrow could start the long process of
>> migrating Java versions without impacting Spark, or if we should continue
>> to cooperate. Arrow Java doesn't see quite so much activity these days, so
>> it's not quite critical, but it's possible that these dependency issues
>> will start to affect us more soon. And looking forward, Java is working on
>> APIs that should also allow us to ditch the --add-opens flag requirement
>> too.
>>
>> [1]: https://lists.apache.org/thread/phpgpydtt3yrgnncdyv4qdq1gf02s0yj
>> [2]:
>> https://github.com/grpc/proposal/blob/master/P5-jdk-version-support.md
>> [3]: https://github.com/grpc/grpc-java/issues/9386
>>
>


Re: JDK version support policy?

2023-06-06 Thread Sean Owen
I haven't followed this discussion closely, but I think we could/should
drop Java 8 in Spark 4.0, which is up next after 3.5?

On Tue, Jun 6, 2023 at 2:44 PM David Li  wrote:

> Hello Spark developers,
>
> I'm from the Apache Arrow project. We've discussed Java version support
> [1], and crucially, whether to continue supporting Java 8 or not. As Spark
> is a big user of Arrow in Java, I was curious what Spark's policy here was.
>
> If Spark intends to stay on Java 8, for instance, we may also want to stay
> on Java 8 or otherwise provide some supported version of Arrow for Java 8.
>
> We've seen dependencies dropping or planning to drop support. gRPC may
> drop Java 8 at any time [2], possibly this September [3], which may affect
> Spark (due to Spark Connect). And today we saw that Arrow had issues
> running tests with Mockito on Java 20, but we couldn't update Mockito since
> it had dropped Java 8 support. (We pinned the JDK version in that CI
> pipeline for now.)
>
> So at least, I am curious if Arrow could start the long process of
> migrating Java versions without impacting Spark, or if we should continue
> to cooperate. Arrow Java doesn't see quite so much activity these days, so
> it's not quite critical, but it's possible that these dependency issues
> will start to affect us more soon. And looking forward, Java is working on
> APIs that should also allow us to ditch the --add-opens flag requirement
> too.
>
> [1]: https://lists.apache.org/thread/phpgpydtt3yrgnncdyv4qdq1gf02s0yj
> [2]:
> https://github.com/grpc/proposal/blob/master/P5-jdk-version-support.md
> [3]: https://github.com/grpc/grpc-java/issues/9386
>


Re: ASF policy violation and Scala version issues

2023-06-06 Thread Dongjoon Hyun
It goes to "legal-discuss@".

https://lists.apache.org/thread/mzhggd0rpz8t4d7vdsbhkp38mvd3lty4

I hope we can conclude the legal part clearly and promptly, one way or another,
so that we can follow the conclusion with confidence.

Dongjoon

On 2023/06/06 20:06:42 Dongjoon Hyun wrote:
> Thank you, Sean, Mich, Holden, again.
> 
> For this specific part, let's ask the ASF board via bo...@apache.org to
> find the right answer, because this is a controversial legal issue.
> 
> > I think you'd just prefer Databricks make a different choice, which is
> legitimate, but, an issue to take up with Databricks, not here.
> 
> Dongjoon.
> 




Re: ASF policy violation and Scala version issues

2023-06-06 Thread Dongjoon Hyun
Thank you, Sean, Mich, Holden, again.

For this specific part, let's ask the ASF board via bo...@apache.org to
find the right answer, because this is a controversial legal issue.

> I think you'd just prefer Databricks make a different choice, which is
legitimate, but, an issue to take up with Databricks, not here.

Dongjoon.


JDK version support policy?

2023-06-06 Thread David Li
Hello Spark developers,

I'm from the Apache Arrow project. We've discussed Java version support [1], 
and crucially, whether to continue supporting Java 8 or not. As Spark is a big 
user of Arrow in Java, I was curious what Spark's policy here was.

If Spark intends to stay on Java 8, for instance, we may also want to stay on 
Java 8 or otherwise provide some supported version of Arrow for Java 8.

We've seen dependencies dropping or planning to drop support. gRPC may drop 
Java 8 at any time [2], possibly this September [3], which may affect Spark 
(due to Spark Connect). And today we saw that Arrow had issues running tests 
with Mockito on Java 20, but we couldn't update Mockito since it had dropped 
Java 8 support. (We pinned the JDK version in that CI pipeline for now.)

So, at the least, I am curious whether Arrow could start the long process of
migrating Java versions without impacting Spark, or whether we should continue
to cooperate. Arrow Java doesn't see quite so much activity these days, so it's
not critical yet, but it's possible that these dependency issues will start to
affect us more soon. And looking forward, Java is working on APIs that should
also allow us to ditch the --add-opens flag requirement.

[1]: https://lists.apache.org/thread/phpgpydtt3yrgnncdyv4qdq1gf02s0yj
[2]: https://github.com/grpc/proposal/blob/master/P5-jdk-version-support.md
[3]: https://github.com/grpc/grpc-java/issues/9386
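
On the --add-opens point, here is a rough sketch (an illustration under the
assumption of a small standalone Scala object; the flag prefix is just what is
commonly passed for module access) of how a test harness might log the running
JVM version and any --add-opens flags it was started with, which helps when
diagnosing module-access failures such as the Mockito/JDK issue above:

    // Rough sketch: report the JVM version and any --add-opens startup flags.
    import java.lang.management.ManagementFactory

    object JvmFlagReport {
      def main(args: Array[String]): Unit = {
        println(s"java.version = ${System.getProperty("java.version")}")
        val jvmArgs = ManagementFactory.getRuntimeMXBean.getInputArguments // java.util.List[String]
        jvmArgs.forEach { arg =>
          if (arg.startsWith("--add-opens")) println(s"module flag: $arg")
        }
      }
    }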

Re: ASF policy violation and Scala version issues

2023-06-06 Thread Holden Karau
So I think if the Spark PMC wants to ask Databricks for something, that could
be reasonable (although I'm a little fuzzy as to the ask), but that
conversation might belong on private@ (I could be wrong, of course).

On Tue, Jun 6, 2023 at 3:29 AM Mich Talebzadeh 
wrote:

> I concur with you Sean.
>
> If I understand correctly the point raised by the thread owner, in the
> heterogeneous environments we work in, it is up to the practitioner to
> ensure version compatibility among the OS version, the Spark version,
> and the target artefact in question. For example, if I try to connect
> to Google BigQuery from Spark 3.4.0, my OS (or, for that matter, the Docker
> image) needs to run Java 8 regardless of Spark's Java version; otherwise it
> will fail.
>
> I think these details should be left to the trenches, because these
> arguments about versioning become tangential in the big picture. Case in
> point: my current OS Scala version is 2.13.8, but it works fine with Spark
> built on 2.12.17.
>
> HTH
>
> Mich Talebzadeh,
> Lead Solutions Architect/Engineering Lead
> Palantir Technologies Limited
> London
> United Kingdom
>
>
>view my Linkedin profile
> 
>
>
>  https://en.everybodywiki.com/Mich_Talebzadeh
>
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
>
> On Tue, 6 Jun 2023 at 01:37, Sean Owen  wrote:
>
>> I think the issue is whether a distribution of Spark is so materially
>> different from OSS that it causes problems for the larger community of
>> users. There's a legitimate question of whether such a thing can be called
>> "Apache Spark + changes", as describing it that way becomes meaningfully
>> inaccurate. And if it's inaccurate, then it's a trademark usage issue, and
>> a matter for the PMC to act on. I certainly recall this type of problem
>> from the early days of Hadoop - the project itself had 2 or 3 live branches
>> in development (was it 0.20.x vs 0.23.x vs 1.x? YARN vs no YARN?) picked up
>> by different vendors and it was unclear what "Apache Hadoop" meant in a
>> vendor distro. Or frankly, upstream.
>>
>> In comparison, variation in Scala maintenance release seems trivial. I'm
>> not clear from the thread what actual issue this causes to users. Is there
>> more to it - does this go hand in hand with JDK version and Ammonite, or
> are those separate? What's an example of the practical user issue? Like, I
>> compile vs Spark 3.4.0 and because of Scala version differences it doesn't
>> run on some vendor distro? That's not great, but seems like a vendor
>> problem. Unless you tell me we are getting tons of bug reports to OSS Spark
>> as a result or something.
>>
>> Is the implication that something in OSS Spark is being blocked to prefer
>> some set of vendor choices? because the changes you're pointing to seem to
>> be going into Apache Spark, actually. It'd be more useful to be specific
>> and name names at this point, seems fine.
>>
>> The rest of this is just a discussion about Databricks choices. (If it's
>> not clear, I'm at Databricks but do not work on the Spark distro). We can
>> discuss but it seems off-topic _if_ it can't be connected to a problem for
>> OSS Spark. Anyway:
>>
>> If it helps, _some_ important patches are described at
>> https://docs.databricks.com/release-notes/runtime/maintenance-updates.html
>> ; I don't think this is exactly hidden.
>>
>> Out of curiosity, how would you describe this software in the UI instead?
>> "3.4.0" is shorthand, because this is a little dropdown menu; the terminal
>> output is likewise not a place to list all patches. You would propose
>> requesting calling this "3.4.0 + patches"? That's the best I can think of,
>> but I don't think it addresses what you're getting at anyway. I think you'd
>> just prefer Databricks make a different choice, which is legitimate, but,
>> an issue to take up with Databricks, not here.
>>
>>
>> On Mon, Jun 5, 2023 at 6:58 PM Dongjoon Hyun 
>> wrote:
>>
>>> Hi, Sean.
>>>
>>> "+ patches" or "powered by Apache Spark 3.4.0" is not a problem as you
>>> mentioned. For the record, I also didn't bring up any old story here.
>>>
>>> > "Apache Spark 3.4.0 + patches"
>>>
>>> However, "including Apache Spark 3.4.0" still causes confusion even in a
>>> different way because of those missing patches, SPARK-40436 (Upgrade Scala
>>> to 2.12.17) and SPARK-39414 (Upgrade Scala to 2.12.16). Technically,
>>> Databricks Runtime doesn't include Apache Spark 3.4.0 while it claims it to
>>> the users.
>>>
>>> [image: image.png]
>>>
>>> It's a sad story from the Apache Spark Scala perspective because the
>>> users cannot even try to use the correct Scala 2.12.17 version in the
>>> runtime.

Re: ASF policy violation and Scala version issues

2023-06-06 Thread Mich Talebzadeh
I concur with you Sean.

If I understand correctly the point raised by the thread owner, in the
heterogeneous environments we work in, it is up to the practitioner to
ensure version compatibility among the OS version, the Spark version,
and the target artefact in question. For example, if I try to connect
to Google BigQuery from Spark 3.4.0, my OS (or, for that matter, the Docker
image) needs to run Java 8 regardless of Spark's Java version; otherwise it
will fail.

I think these details should be left to the trenches, because these
arguments about versioning become tangential in the big picture. Case in
point: my current OS Scala version is 2.13.8, but it works fine with Spark
built on 2.12.17.

HTH

Mich Talebzadeh,
Lead Solutions Architect/Engineering Lead
Palantir Technologies Limited
London
United Kingdom


   view my Linkedin profile



 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Tue, 6 Jun 2023 at 01:37, Sean Owen  wrote:

> I think the issue is whether a distribution of Spark is so materially
> different from OSS that it causes problems for the larger community of
> users. There's a legitimate question of whether such a thing can be called
> "Apache Spark + changes", as describing it that way becomes meaningfully
> inaccurate. And if it's inaccurate, then it's a trademark usage issue, and
> a matter for the PMC to act on. I certainly recall this type of problem
> from the early days of Hadoop - the project itself had 2 or 3 live branches
> in development (was it 0.20.x vs 0.23.x vs 1.x? YARN vs no YARN?) picked up
> by different vendors and it was unclear what "Apache Hadoop" meant in a
> vendor distro. Or frankly, upstream.
>
> In comparison, variation in Scala maintenance release seems trivial. I'm
> not clear from the thread what actual issue this causes to users. Is there
> more to it - does this go hand in hand with JDK version and Ammonite, or
> are those separate? What's an example of the practical user issue? Like, I
> compile vs Spark 3.4.0 and because of Scala version differences it doesn't
> run on some vendor distro? That's not great, but seems like a vendor
> problem. Unless you tell me we are getting tons of bug reports to OSS Spark
> as a result or something.
>
> Is the implication that something in OSS Spark is being blocked to prefer
> some set of vendor choices? because the changes you're pointing to seem to
> be going into Apache Spark, actually. It'd be more useful to be specific
> and name names at this point, seems fine.
>
> The rest of this is just a discussion about Databricks choices. (If it's
> not clear, I'm at Databricks but do not work on the Spark distro). We can
> discuss but it seems off-topic _if_ it can't be connected to a problem for
> OSS Spark. Anyway:
>
> If it helps, _some_ important patches are described at
> https://docs.databricks.com/release-notes/runtime/maintenance-updates.html
> ; I don't think this is exactly hidden.
>
> Out of curiosity, how would you describe this software in the UI instead?
> "3.4.0" is shorthand, because this is a little dropdown menu; the terminal
> output is likewise not a place to list all patches. You would propose
> requesting calling this "3.4.0 + patches"? That's the best I can think of,
> but I don't think it addresses what you're getting at anyway. I think you'd
> just prefer Databricks make a different choice, which is legitimate, but,
> an issue to take up with Databricks, not here.
>
>
> On Mon, Jun 5, 2023 at 6:58 PM Dongjoon Hyun 
> wrote:
>
>> Hi, Sean.
>>
>> "+ patches" or "powered by Apache Spark 3.4.0" is not a problem as you
>> mentioned. For the record, I also didn't bring up any old story here.
>>
>> > "Apache Spark 3.4.0 + patches"
>>
>> However, "including Apache Spark 3.4.0" still causes confusion even in a
>> different way because of those missing patches, SPARK-40436 (Upgrade Scala
>> to 2.12.17) and SPARK-39414 (Upgrade Scala to 2.12.16). Technically,
>> Databricks Runtime doesn't include Apache Spark 3.4.0 while it claims it to
>> the users.
>>
>> [image: image.png]
>>
>> It's a sad story from the Apache Spark Scala perspective because the
>> users cannot even try to use the correct Scala 2.12.17 version in the
>> runtime.
>>
>> All the items I've shared are connected by a single theme: they hurt Apache
>> Spark Scala users, from (1) building Spark, to (2) creating a fragmented
>> Scala Spark runtime environment, and (3) hiding user-facing documentation.
>>
>> Of course, I don't think these were designed in a coordinated way
>> intentionally; it just happened at the same time.
>>
>> Based on your comments, let me ask you two