I'm OK with this. It simplifies maintenance a bit, and specifically may allow us to finally move off of the ancient version of Guava (?)

On Mon, Oct 3, 2022 at 10:16 PM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:

> Hi, All.
>
> I'm wondering if the following Apache Spark Hadoop2 Binary Distribution
> is still used by someone in the community or not. If it's not used or not
> useful, we may remove it from Apache Spark 3.4.0 release.
>
> https://downloads.apache.org/spark/spark-3.3.0/spark-3.3.0-bin-hadoop2.tgz
>
> Here is the background of this question.
> Since Apache Spark 2.2.0 (SPARK-19493, SPARK-19550), the Apache
> Spark community has been building and releasing with Java 8 only.
> I believe that the user applications also use Java8+ in these days.
> Recently, I received the following message from the Hadoop PMC.
>
> > "if you really want to claim hadoop 2.x compatibility, then you have to
> > be building against java 7". Otherwise a lot of people with hadoop 2.x
> > clusters won't be able to run your code. If your projects are java8+
> > only, then they are implicitly hadoop 3.1+, no matter what you use
> > in your build. Hence: no need for branch-2 branches except
> > to complicate your build/test/release processes [1]
>
> If Hadoop2 binary distribution is no longer used as of today,
> or incomplete somewhere due to Java 8 building, the following three
> existing alternative Hadoop 3 binary distributions could be
> the better official solution for old Hadoop 2 clusters.
>
> 1) Scala 2.12 and without-hadoop distribution
> 2) Scala 2.12 and Hadoop 3 distribution
> 3) Scala 2.13 and Hadoop 3 distribution
>
> In short, is there anyone who is using Apache Spark 3.3.0 Hadoop2 Binary
> distribution?
>
> Dongjoon
>
> [1]
> https://issues.apache.org/jira/browse/ORC-1251?focusedCommentId=17608247&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17608247