Hi, Dongjoon

Our company (Baidu) is still using the combination of Spark 3.3 + Hadoop 2.7.4 in the production environment. Hadoop 2.7.4 is an internally maintained version compiled with Java 8. Although we are using Hadoop 2, I still support this proposal because it is positive and exciting.

Regards,
YangJie

From: Dongjoon Hyun <dongjoon.h...@gmail.com>
Date: Tuesday, October 4, 2022, 11:16
To: dev <dev@spark.apache.org>
Subject: Dropping Apache Spark Hadoop2 Binary Distribution?

Hi, All.

I'm wondering whether the following Apache Spark Hadoop2 Binary Distribution is still used by anyone in the community. If it is not used or not useful, we may remove it from the Apache Spark 3.4.0 release.

https://downloads.apache.org/spark/spark-3.3.0/spark-3.3.0-bin-hadoop2.tgz

Here is the background of this question. Since Apache Spark 2.2.0 (SPARK-19493, SPARK-19550), the Apache Spark community has been building and releasing with Java 8 only. I believe that user applications also use Java 8+ these days. Recently, I received the following message from the Hadoop PMC.

> "if you really want to claim hadoop 2.x compatibility, then you have to
> be building against java 7". Otherwise a lot of people with hadoop 2.x
> clusters won't be able to run your code. If your projects are java8+
> only, then they are implicitly hadoop 3.1+, no matter what you use
> in your build. Hence: no need for branch-2 branches except
> to complicate your build/test/release processes [1]

If the Hadoop2 binary distribution is no longer used as of today, or is incomplete somewhere because it is built with Java 8, the following three existing alternative binary distributions could be the better official solution for old Hadoop 2 clusters.

1) Scala 2.12 and without-hadoop distribution
2) Scala 2.12 and Hadoop 3 distribution
3) Scala 2.13 and Hadoop 3 distribution

In short, is there anyone who is using the Apache Spark 3.3.0 Hadoop2 Binary distribution?

Dongjoon

[1] https://issues.apache.org/jira/browse/ORC-1251?focusedCommentId=17608247&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17608247