1/ Regarding naming - I believe releasing "Apache Foo X.Y + patches" is acceptable if it is substantially Apache Foo X.Y. This is common practice for downstream vendors, and it's fair nominative use. The principle here is consumer confusion: is anyone substantially misled? Here I don't think so. I know that we have in the past decided it would not be OK, for example, to release a product called "Apache Spark 4.0" now, as there is no such release, even building from master. Ideally, a vendor should document its changes somewhere. I'm sure this one is about Databricks, but I'm also sure Cloudera, Hortonworks, etc. had Spark releases with patches, too.
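For what it's worth, the Scala-version mismatch described below is easy to verify from the outside: the Scala library on a distribution's classpath reports its own version at runtime. A minimal sketch in plain Scala (no Spark needed; `CheckScalaVersion` is just an illustrative name, and in a spark-shell session you would compare the result against `spark.version`):

```scala
// Minimal sketch: report the Scala library version actually on the classpath.
// A genuine Apache Spark 3.4.0 distribution should ship Scala 2.12.17
// (per SPARK-40436); a different patch version suggests a modified build.
object CheckScalaVersion {
  // e.g. "2.12.17" on a stock Apache Spark 3.4.0 classpath
  def scalaVersion: String = scala.util.Properties.versionNumberString

  def main(args: Array[String]): Unit = {
    println(s"Scala library version: $scalaVersion")
    // In a spark-shell session, also compare:
    //   spark.version  // e.g. "3.4.0"
  }
}
```

Running this on the vendor's runtime versus on a stock download from downloads.apache.org makes the "3.4.0 with Scala 2.12.15" discrepancy directly observable.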
2a/ That issue seems to be about just flipping which code sample is shown by default. It seemed widely agreed that this would slightly help more users than it harms. I agree with the change and don't see a need to escalate. The question of further Python parity is a big one, but it is separate.

2b/ If a single dependency blocks important updates, yeah, it's fair to remove it, IMHO. I wouldn't remove it in 3.5 unless the other updates are critical, and it's not clear they are. In 4.0, yes.

2c/ Scala 2.13 is already supported in 3.x and does not require 4.0. This was about which Scala version the default convenience binaries use. Sticking to 2.12 in 3.x doesn't seem like an issue; it may even be desirable.

2d/ Same as 2b.

3/ I don't think 1/ is an incident. Yes to moving towards 4.0 after 3.5, IMHO, and to removing Ammonite in 4.0 if there is no resolution forthcoming.

On Mon, Jun 5, 2023 at 2:46 AM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:

> Hi, All and Matei (as the Chair of Apache Spark PMC).
>
> Sorry for the long email; I want to share two topics and corresponding
> action items.
> You can go to "Section 3: Action Items" directly for the conclusion.
>
>
> ### 1. ASF Policy Violation ###
>
> ASF has a rule for "MAY I CALL MY MODIFIED CODE 'APACHE'?"
>
> https://www.apache.org/foundation/license-faq.html#Name-changes
>
> For example, when we call something `Apache Spark 3.4.0`, it's supposed
> to be the same as one of our official distributions:
>
> https://downloads.apache.org/spark/spark-3.4.0/
>
> Specifically, in terms of the Scala version, we believe it should have
> Scala 2.12.17 because of 'SPARK-40436 Upgrade Scala to 2.12.17'.
>
> There is a company claiming something non-Apache, effectively "Apache
> Spark 3.4.0 minus SPARK-40436", under the name "Apache Spark 3.4.0":
>
> - The company website shows "X.Y (includes Apache Spark 3.4.0, Scala
>   2.12)"
> - The runtime logs "23/06/05 04:23:27 INFO SparkContext: Running Spark
>   version 3.4.0"
> - The UI shows the Apache Spark logo and `3.4.0`.
> - However, the Scala version is '2.12.15'.
>
> [image: Screenshot 2023-06-04 at 9.37.16 PM.png]
> [image: Screenshot 2023-06-04 at 10.14.45 PM.png]
>
> Lastly, this is not a single instance. For example, the same company also
> claims "Apache Spark 3.3.2" with a mismatched Scala version.
>
>
> ### 2. Scala Issues ###
>
> In addition to (1), although we proceeded with good intentions and great
> care, including dev mailing list discussion, there are several concerning
> areas which need more of our attention.
>
> a) Scala Spark users will experience a UX inconvenience from Spark 3.5.
>
>    SPARK-42493 Make Python the first tab for code examples
>
> For the record, we discussed it here:
> - https://lists.apache.org/thread/1p8s09ysrh4jqsfd47qdtrl7rm4rrs05
>   "[DISCUSS] Show Python code examples first in Spark documentation"
>
> b) The Scala version upgrade is currently blocked by the Ammonite
>    library's dev cycle.
>
> Although we discussed it here and it had good intentions,
> the current master branch cannot use the latest Scala.
>
> - https://lists.apache.org/thread/4nk5ddtmlobdt8g3z8xbqjclzkhlsdfk
>   "Ammonite as REPL for Spark Connect"
>   SPARK-42884 Add Ammonite REPL integration
>
> Specifically, the following are blocked, and I'm monitoring the
> Ammonite repository.
> - SPARK-40497 Upgrade Scala to 2.13.11
> - SPARK-43832 Upgrade Scala to 2.12.18
> - According to https://github.com/com-lihaoyi/Ammonite/issues ,
>   Scala 3.3.0 LTS support also looks infeasible.
>
> Although we may be able to wait for a while, there are two fundamental
> solutions to unblock this situation from a long-term maintenance
> perspective:
> - Replace it with a scala-shell-based implementation
> - Move `connector/connect/client/jvm/pom.xml` out of the Spark repo.
>   Maybe we can put it into a new repo like the Rust and Go clients.
>
> c) Scala 2.13 and above needs Apache Spark 4.0.
>
> In the "Apache Spark 3.5.0 Expectations?" and "Apache Spark 4.0
> Timeframe?"
> threads,
> we discussed the Spark 3.5.0 scope and decided to revert
> "SPARK-43836 Make Scala 2.13 as default in Spark 3.5".
> Apache Spark 4.0.0 is the only way to support Scala 2.13 or higher.
>
> - https://lists.apache.org/thread/3x6dh17bmy20n3frtt3crgxjydnxh2o0
>   ("Apache Spark 3.5.0 Expectations?")
> - https://lists.apache.org/thread/xhkgj60j361gdpywoxxz7qspp2w80ry6
>   ("Apache Spark 4.0 Timeframe?")
>
> A candidate (or mentioned) timeframe was "Spark 4.0.0: 2024.06" and
> Scala 3.3.0 LTS.
> - https://scala-lang.org/blog/2023/05/30/scala-3.3.0-released.html
>
> d) Java 21 LTS is a stretch goal for Apache Spark 3.5.0.
>
>    SPARK-43831 Build and Run Spark on Java 21
>
> However, this needs SPARK-40497 (Scala 2.13.11) and SPARK-43832
> (Scala 2.12.18), which are blocked by the Ammonite library as
> mentioned in (b).
>
>
> ### 3. Action Items ###
>
> To provide clarity to the Apache Spark Scala community:
>
> - We should communicate with and help the company to fix the misleading
>   messages and remove the Scala-version segmentation per Spark version.
>
> - The Apache Spark PMC should include this incident report and the
>   outcome in the next Apache Spark quarterly report (August).
>
> - I will start a vote on the Apache Spark 4.0.0 timeframe next week,
>   after receiving more feedback.
>   Since 4.0.0 is not limited to the Scala issues, we will vote on the
>   timeline only.
>
> - Lastly, we need to re-evaluate the risk of the `Ammonite` library
>   before the Apache Spark 3.5.0 release.
>   If it blocks the Scala upgrade and Java 21 support, we had better
>   avoid it at all costs.
>
>
> WDYT?
>
> Thanks,
> Dongjoon.
>