Dongjoon and Luca, it's great to learn that there is a way to run different JVM versions for the Spark and Hadoop binaries. I had concerns about Java compatibility issues without this solution. Thank you!
Luca, thank you for providing a how-to guide for this. It's really helpful!

On Sat, Dec 9, 2023 at 1:39 AM Luca Canali <luca.can...@cern.ch> wrote:

> Jason, in case you need a pointer on how to run Spark with a version of
> Java different from the version used by the Hadoop processes, as indicated
> by Dongjoon, this is an example of what we do on our Hadoop clusters:
> https://github.com/LucaCanali/Miscellaneous/blob/master/Spark_Notes/Spark_Set_Java_Home_Howto.md
>
> Best,
> Luca
>
> *From:* Dongjoon Hyun <dongjoon.h...@gmail.com>
> *Sent:* Saturday, December 9, 2023 09:39
> *To:* Jason Xu <jasonxu.sp...@gmail.com>
> *Cc:* dev@spark.apache.org
> *Subject:* Re: Spark on Yarn with Java 17
>
> Please simply try Apache Spark 3.3+ (SPARK-33772) with Java 17 on your
> cluster, Jason.
>
> I believe you can set up your Spark 3.3+ jobs to run with Java 17 while
> your cluster (DataNode/NameNode/ResourceManager/NodeManager) is still
> sitting on Java 8.
>
> Dongjoon.
>
> On Fri, Dec 8, 2023 at 11:12 PM Jason Xu <jasonxu.sp...@gmail.com> wrote:
>
> Dongjoon, thank you for the fast response!
>
> > Apache Spark 4.0.0 depends on only Apache Hadoop client library.
>
> To better understand your answer, does that mean a Spark application built
> with Java 17 can successfully run on a Hadoop cluster on version 3.3 and
> Java 8 runtime?
>
> On Fri, Dec 8, 2023 at 4:33 PM Dongjoon Hyun <dongj...@apache.org> wrote:
>
> Hi, Jason.
>
> Apache Spark 4.0.0 depends on only the Apache Hadoop client library.
>
> You can track all `Apache Spark 4` activities, including the Hadoop
> dependency, here:
>
> https://issues.apache.org/jira/browse/SPARK-44111
> (Prepare Apache Spark 4.0.0)
>
> According to the release history, the originally suggested timeline was
> June 2024:
> - Spark 1: 2014.05 (1.0.0) ~ 2016.11 (1.6.3)
> - Spark 2: 2016.07 (2.0.0) ~ 2021.05 (2.4.8)
> - Spark 3: 2020.06 (3.0.0) ~ 2026.xx (3.5.x)
> - Spark 4: 2024.06 (4.0.0, NEW)
>
> Thanks,
> Dongjoon.
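The approach described above, pointing a Spark application at its own JVM while the Hadoop daemons stay on Java 8, can be sketched with Spark's per-application environment settings. This is a minimal sketch, not the content of the linked how-to: the `/usr/lib/jvm/java-17` path, the application class, and the jar name are placeholders for site-specific values; `spark.yarn.appMasterEnv.JAVA_HOME` and `spark.executorEnv.JAVA_HOME` are the standard Spark configuration keys for setting environment variables on the YARN application master and the executors.

```shell
# Sketch: run a Spark 3.3+ job under Java 17 on a YARN cluster whose
# daemons (NameNode/DataNode/ResourceManager/NodeManager) still run Java 8.
# The Java path below is an assumption; substitute your Java 17 install.

# JVM used by the spark-submit launcher and, in client mode, the driver
export JAVA_HOME=/usr/lib/jvm/java-17

# Point the YARN application master and the executors at the same JVM.
# (com.example.MyApp and my-app.jar are hypothetical placeholders.)
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.yarn.appMasterEnv.JAVA_HOME=/usr/lib/jvm/java-17 \
  --conf spark.executorEnv.JAVA_HOME=/usr/lib/jvm/java-17 \
  --class com.example.MyApp \
  my-app.jar
```

This only works when the Java 17 installation is present at the same path on every NodeManager host (or is shipped to the containers, as the linked how-to discusses); the Hadoop services themselves never load it.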
> > On 2023/12/08 23:50:15 Jason Xu wrote:
> > Hi Spark devs,
> >
> > According to the Spark 3.5 release notes, Spark 4 will no longer support
> > Java 8 and 11 (link:
> > https://spark.apache.org/releases/spark-release-3-5-0.html#upcoming-removal).
> >
> > My company is using Spark on Yarn with Java 8 now. When considering a
> > future upgrade to Spark 4, one issue we face is that the latest version
> > of Hadoop (3.3) does not yet support Java 17. There is an open ticket
> > (HADOOP-17177 <https://issues.apache.org/jira/browse/HADOOP-17177>) for
> > this issue, which has been open for over two years.
> >
> > My question is: does the release of Spark 4 depend on the availability
> > of Java 17 support in Hadoop? Additionally, do we have a rough estimate
> > for the release of Spark 4? Thanks!
> >
> > Cheers,
> > Jason Xu
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org