Dear all,

I am proposing to drop support for Spark 2.4 and Scala 2.11 in the next
Sedona release. The version number will be 1.3.0 if we drop this support;
otherwise it will be 1.2.1.

Here is the status of Spark 2.4 and of Sedona for Spark 2.4:
1. The Spark community announced the Spark 2.4 EOL on March 3, 2021:
https://www.mail-archive.com/dev@spark.apache.org/msg27476.html
2. Spark 3.0 was released on 06-16-2020.
3. Spark 3.3.0 was released a few days ago. Starting from Spark 3.2,
Spark releases binaries for both Scala 2.12 and 2.13.
4. Only a few Sedona users are on Spark 2.4. According to the Maven Central
statistics (Scala/Java API only), only around 1K out of 100K downloads
(about 1%) are Sedona artifacts for Spark 2.4 (core-2.4_2.11,
core-2.4_2.12, python-adapter-2.4_2.11, python-adapter-2.4_2.12).

Benefits of dropping support:
1. Reduce the complexity of maintaining the source code for different Spark
versions. Currently, several files have two versions, one for Spark 2.4 and
one for 3.x, controlled by "anchor" keywords. I wrote a Python script that
has to pre-process the source code every time:
https://github.com/apache/incubator-sedona/blob/master/spark-version-converter.py
2. Reduce the overhead of releasing binary packages. Currently, the main
POM.xml is quite complex because it has to compile against different Spark
versions. As a result, we have not been able to release Sedona for Scala
2.13.
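
For readers unfamiliar with the approach in item 1, anchor-based
pre-processing can be pictured roughly as follows. This is only an
illustrative sketch and NOT the actual spark-version-converter.py: the
marker syntax ("// ANCHOR-BEGIN <label>" / "// ANCHOR-END"), the "//- "
disabling prefix, the version labels, and the function names are all
hypothetical, chosen just to show why every file with two variants needs an
extra conversion step before it can be compiled.

```python
import re

# Hypothetical marker syntax, assumed for illustration only.
BEGIN = re.compile(r"^\s*// ANCHOR-BEGIN (\S+)\s*$")
END = re.compile(r"^\s*// ANCHOR-END\s*$")
DISABLED = "//- "  # hypothetical prefix marking a line as switched off


def disable(line: str) -> str:
    """Comment out a line unless it is already disabled."""
    return line if line.startswith(DISABLED) else DISABLED + line


def enable(line: str) -> str:
    """Strip the disabling prefix if present."""
    return line[len(DISABLED):] if line.startswith(DISABLED) else line


def convert(source: str, target: str) -> str:
    """Enable blocks labelled `target` and disable all other labelled blocks."""
    out = []
    active = None  # label of the anchored block we are inside, if any
    for line in source.splitlines():
        begin = BEGIN.match(line)
        if begin:
            active = begin.group(1)
            out.append(line)
        elif END.match(line):
            active = None
            out.append(line)
        elif active is None:
            out.append(line)  # shared code, left untouched
        elif active == target:
            out.append(enable(line))
        else:
            out.append(disable(line))
    return "\n".join(out)


example = "\n".join([
    "import a",
    "// ANCHOR-BEGIN spark-2.4",
    "val x = old()",
    "// ANCHOR-END",
    "// ANCHOR-BEGIN spark-3",
    "//- val x = new()",
    "// ANCHOR-END",
])
converted = convert(example, "spark-3")
```

In this sketch, converting to "spark-3" switches the Spark 2.4 block off and
the Spark 3 block on, and running the conversion twice leaves the file
unchanged. With a single supported Spark line, neither the markers nor the
conversion step would be needed.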

Plan for Sedona on Spark 3.x:
1. The Sedona source code already supports Scala 2.13, but there is no
Sedona binary release for it yet. We will release Sedona for both Scala
2.12 and 2.13, but not for Scala 2.11.
2. Sedona already releases binaries for Spark 3.0, 3.1, and 3.2.
3. The two latest Sedona PRs add support for Spark 3.3:
https://github.com/apache/incubator-sedona/pull/636
https://github.com/apache/incubator-sedona/pull/635

What do you think of this proposal? If you disagree, when do you think is
the best time to drop support for Spark 2.4 and Scala 2.11?

I will leave this discussion open for at least 3 days. If there are no
objections, I will remove Spark 2.4 from POM.xml and GitHub Actions but
leave the Spark 2.4 support in the source code, so anyone who wants to use
Sedona on Spark 2.4 can still compile the source code themselves.

Thanks,
Jia
