1/ Regarding naming - I believe releasing "Apache Foo X.Y + patches" is
acceptable if it is substantially Apache Foo X.Y. This is common practice
for downstream vendors, and it's fair nominative use. The governing
principle is consumer confusion: is anyone substantially misled? Here I
don't think so.
We have, for example, decided in the past that it would not be OK to
release a product called "Apache Spark 4.0" now, as there is no such
release, even building from master. Ideally, a vendor should document its
changes somewhere. I'm sure this one is about Databricks, but I'm also sure
Cloudera, Hortonworks, etc. had Spark releases with patches, too.

2a/ That issue seems to be about just flipping which code sample is shown
by default. It seemed widely agreed that this would help slightly more
users than it harms. I agree with the change and don't see a need to
escalate. The question of further Python parity is a big one, but it is
separate.

2b/ If a single dependency blocks important updates, yeah, it's fair to
remove it, IMHO. I wouldn't remove it in 3.5 unless the other updates are
critical, and it's not clear they are. In 4.0, yes.
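
For context on why this one dependency is so sticky: Ammonite depends on
compiler internals, so it is published against the full Scala patch version
rather than just the binary version. A minimal sbt-style sketch of the
difference (coordinates and version numbers are illustrative):

    // An ordinary library is cross-published per Scala *binary* version,
    // resolving to e.g. scala-collection-compat_2.12 for any 2.12.x.
    libraryDependencies +=
      "org.scala-lang.modules" %% "scala-collection-compat" % "2.11.0"

    // Ammonite is cross-published per *full* Scala version, resolving to
    // e.g. ammonite_2.12.17. A bump to Scala 2.12.18 fails to resolve
    // until Ammonite publishes a matching artifact.
    libraryDependencies +=
      "com.lihaoyi" % "ammonite" % "2.5.8" cross CrossVersion.full

So even a patch-level Scala bump waits on an Ammonite release, which is the
blocking mechanism described in (b) below.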

2c/ Scala 2.13 is already supported in 3.x and does not require 4.0. This
was about which Scala version the default convenience binaries are built
with. Sticking to 2.12 in 3.x doesn't seem like an issue to me; it may even
be desirable.

2d/ Same as 2b.

3/ I don't think 1/ is an incident. Yes to moving toward 4.0 after 3.5,
IMHO, and to removing Ammonite in 4.0 if no resolution is forthcoming.

On Mon, Jun 5, 2023 at 2:46 AM Dongjoon Hyun <dongjoon.h...@gmail.com>
wrote:

> Hi, All and Matei (as the Chair of Apache Spark PMC).
>
> Sorry for the long email; I want to share two topics and corresponding
> action items.
> You can go to "Section 3: Action Items" directly for the conclusion.
>
>
> ### 1. ASF Policy Violation ###
>
> ASF has a rule for "MAY I CALL MY MODIFIED CODE 'APACHE'?"
>
>     https://www.apache.org/foundation/license-faq.html#Name-changes
>
> For example, when we call something `Apache Spark 3.4.0`, it is supposed
> to be the same as one of our official distributions.
>
>     https://downloads.apache.org/spark/spark-3.4.0/
>
> Specifically, in terms of the Scala version, we believe it should have
> Scala 2.12.17 because of 'SPARK-40436 Upgrade Scala to 2.12.17'.
>
> There is a company shipping something non-Apache, effectively "Apache
> Spark 3.4.0 minus SPARK-40436", under the name "Apache Spark 3.4.0":
>
>     - The company website shows "X.Y (includes Apache Spark 3.4.0, Scala
> 2.12)"
>     - The runtime logs "23/06/05 04:23:27 INFO SparkContext: Running Spark
> version 3.4.0"
>     - UI shows Apache Spark logo and `3.4.0`.
>     - However, the Scala version is 2.12.15 (verifiable at runtime; see
>       the sketch below).
>
> [Attachments: Screenshot 2023-06-04 at 9.37.16 PM.png, Screenshot
> 2023-06-04 at 10.14.45 PM.png]
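>
> For reference, a minimal Scala sketch (runnable in spark-shell or any
> job) to check both versions of a running session; the expected Scala
> version comes from the official 3.4.0 release:
>
>     // Print the Spark and Scala versions of the current session.
>     import org.apache.spark.sql.SparkSession
>
>     val spark = SparkSession.builder().getOrCreate()
>     println(s"Spark version: ${spark.version}")
>     println(s"Scala version: ${scala.util.Properties.versionNumberString}")
>     // An official Apache Spark 3.4.0 build reports Scala 2.12.17
>     // (SPARK-40436); 2.12.15 here indicates a modified build.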
>
> Lastly, this is not an isolated instance. For example, the same company
> also claims "Apache Spark 3.3.2" with a mismatched Scala version.
>
>
> ### 2. Scala Issues ###
>
> In addition to (1), although we proceeded with good intentions and great
> care, including dev mailing list discussions, there are several areas of
> concern that need more of our attention and care.
>
> a) Scala Spark users will experience a UX inconvenience starting with
> Spark 3.5.
>
>     SPARK-42493 Make Python the first tab for code examples
>
>     For the record, we discussed it here.
>     - https://lists.apache.org/thread/1p8s09ysrh4jqsfd47qdtrl7rm4rrs05
>       "[DISCUSS] Show Python code examples first in Spark documentation"
>
> b) The Scala version upgrades are currently blocked by the Ammonite
> library's dev cycle.
>
>     Although we discussed it here with good intentions,
>     the current master branch cannot use the latest Scala versions.
>
>     - https://lists.apache.org/thread/4nk5ddtmlobdt8g3z8xbqjclzkhlsdfk
>     "Ammonite as REPL for Spark Connect"
>      SPARK-42884 Add Ammonite REPL integration
>
>     Specifically, the following are blocked and I'm monitoring the
> Ammonite repository.
>     - SPARK-40497 Upgrade Scala to 2.13.11
>     - SPARK-43832 Upgrade Scala to 2.12.18
>     - According to https://github.com/com-lihaoyi/Ammonite/issues ,
>       Scala 3.3.0 LTS support also looks infeasible.
>
>     Although we may be able to wait for a while, there are two
>     fundamental solutions to unblock this situation from a long-term
>     maintenance perspective:
>     - Replace it with a scala-shell-based implementation (a rough sketch
>       follows this list).
>     - Move `connector/connect/client/jvm/pom.xml` out of the Spark repo,
>       perhaps into a new repo like the Rust and Go clients.
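>
>     A rough sketch of the first option: embed the stock Scala REPL
>     instead of Ammonite. The API shown is the Scala 2.13 one and is
>     illustrative only; it differs across 2.12/2.13/3.x:
>
>         // Embed the standard Scala 2.13 REPL (no Ammonite dependency).
>         import scala.tools.nsc.GenericRunnerSettings
>         import scala.tools.nsc.interpreter.shell.{ILoop, ShellConfig}
>
>         object ConnectRepl {
>           def main(args: Array[String]): Unit = {
>             val settings =
>               new GenericRunnerSettings(msg => Console.err.println(msg))
>             settings.usejavacp.value = true  // use the JVM classpath
>             new ILoop(ShellConfig(settings)).run(settings)
>           }
>         }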
>
> c) Scala 2.13 and above need Apache Spark 4.0.
>
>     In "Apache Spark 3.5.0 Expectations?" and "Apache Spark 4.0
> Timeframe?" threads,
>     we discussed Spark 3.5.0 scope and decided to revert
>     "SPARK-43836 Make Scala 2.13 as default in Spark 3.5".
>     Apache Spark 4.0.0 is the only way to support Scala 2.13 or higher.
>
>     - https://lists.apache.org/thread/3x6dh17bmy20n3frtt3crgxjydnxh2o0
> ("Apache Spark 3.5.0 Expectations?")
>     - https://lists.apache.org/thread/xhkgj60j361gdpywoxxz7qspp2w80ry6
> ("Apache Spark 4.0 Timeframe?")
>
>      A candidate (mentioned) timeframe was "Spark 4.0.0: 2024.06", with
> Scala 3.3.0 LTS.
>      - https://scala-lang.org/blog/2023/05/30/scala-3.3.0-released.html
>
> d) Java 21 LTS is Apache Spark 3.5.0's stretch goal.
>
>     SPARK-43831 Build and Run Spark on Java 21
>
>     However, this needs SPARK-40497 (Scala 2.13.11) and SPARK-43832
> (Scala 2.12.18),
>     which are blocked by the Ammonite library, as mentioned in (b).
>
>
> ### 3. Action Items ###
>
> To provide clarity to the Apache Spark Scala community:
>
> - We should communicate with the company and help it fix the misleading
>   messages and remove the Scala-version fragmentation within each Spark
>   version.
>
> - Apache Spark PMC should include this incident report and the result
>   in the next Apache Spark Quarterly Report (August).
>
> - I will start a vote on the Apache Spark 4.0.0 timeframe next week,
>   after receiving more feedback.
>   Since 4.0.0 is not limited to the Scala issues, we will vote on the
>   timeline only.
>
> - Lastly, we need to re-evaluate the risk of the `Ammonite` library
>   before the Apache Spark 3.5.0 release.
>   If it blocks the Scala upgrades and Java 21 support, we had better
>   avoid it at all costs.
>
>
> WDYT?
>
> Thanks,
> Dongjoon.
>
