+1 on Sean's opinion

On Wed, Apr 7, 2021 at 2:17 PM Sean Owen <sro...@gmail.com> wrote:
> You shouldn't be modifying your cluster install. You may at this point
> have conflicting, excess JARs in there somewhere. I'd start it over if you
> can.
>
> On Wed, Apr 7, 2021 at 7:15 AM Gabor Somogyi <gabor.g.somo...@gmail.com>
> wrote:
>
>> Not sure what you mean by not working. You've added 3.1.1 to packages, which
>> uses:
>> * 2.6.0 kafka-clients:
>> https://github.com/apache/spark/blob/1d550c4e90275ab418b9161925049239227f3dc9/pom.xml#L136
>> * 2.6.2 commons pool:
>> https://github.com/apache/spark/blob/1d550c4e90275ab418b9161925049239227f3dc9/pom.xml#L183
>>
>> I think it is worth doing an end-to-end dep-tree analysis of what is really
>> happening on the cluster...
>>
>> G
>>
>>
>> On Wed, Apr 7, 2021 at 11:11 AM Mich Talebzadeh <mich.talebza...@gmail.com>
>> wrote:
>>
>>> Hi Gabor et al.,
>>>
>>> To be honest I am not convinced this package --packages
>>> org.apache.spark:spark-sql-kafka-0-10_2.12:3.1.1 is really working!
>>>
>>> I know for definite that spark-sql-kafka-0-10_2.12-3.1.0.jar works fine.
>>> I reported the package working before because under $SPARK_HOME/jars on all
>>> nodes there was a copy of the 3.0.1 jar file. Also in $SPARK_HOME/conf we had
>>> the following entries:
>>>
>>> spark.yarn.archive=hdfs://rhes75:9000/jars/spark-libs.jar
>>> spark.driver.extraClassPath $SPARK_HOME/jars/*.jar
>>> spark.executor.extraClassPath $SPARK_HOME/jars/*.jar
>>>
>>> So the jar file was picked up first anyway.
>>>
>>> The concern I have is that the package uses older versions of jar
>>> files, namely the following in .ivy2/jars:
>>>
>>> -rw-r--r-- 1 hduser hadoop 6407352 Dec 19 13:14 com.github.luben_zstd-jni-1.4.8-1.jar
>>> -rw-r--r-- 1 hduser hadoop  129174 Apr  6  2019 org.apache.commons_commons-pool2-2.6.2.jar
>>> -rw-r--r-- 1 hduser hadoop 3754508 Jul 28  2020 org.apache.kafka_kafka-clients-2.6.0.jar
>>> -rw-r--r-- 1 hduser hadoop  387494 Feb 22 03:57 org.apache.spark_spark-sql-kafka-0-10_2.12-3.1.1.jar
>>> -rw-r--r-- 1 hduser hadoop   55766 Feb 22 03:58 org.apache.spark_spark-token-provider-kafka-0-10_2.12-3.1.1.jar
>>> -rw-r--r-- 1 hduser hadoop  649950 Jan 18  2020 org.lz4_lz4-java-1.7.1.jar
>>> -rw-r--r-- 1 hduser hadoop   41472 Dec 16  2019 org.slf4j_slf4j-api-1.7.30.jar
>>> -rw-r--r-- 1 hduser hadoop    2777 Oct 22  2014 org.spark-project.spark_unused-1.0.0.jar
>>> -rw-r--r-- 1 hduser hadoop 1969177 Nov 28 18:10 org.xerial.snappy_snappy-java-1.1.8.2.jar
>>>
>>>
>>> So I am not sure. Hence I want someone to verify this independently in
>>> anger.
>>>
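
For anyone wanting a quick first pass before a full dep-tree analysis, here is a rough sketch (not something taken from Spark itself) that lists artifacts present in more than one version across a set of jar directories, for example $SPARK_HOME/jars and ~/.ivy2/jars. The function names and the file-name parsing are my own assumptions: it guesses that the version starts at the last "-<digits>." in the file name and that an Ivy-style "<group>_" prefix can be dropped when the text before the first "_" contains a dot. Unusual jar names will be misparsed, so treat the output as a hint only.

```python
import re
from collections import defaultdict
from pathlib import Path

# Heuristic (an assumption, not a standard): the version starts at the
# last "-<digits>." found in the file name without its ".jar" suffix.
_JAR_RE = re.compile(r"^(?P<name>.+)-(?P<version>\d+\..*)$")


def parse_jar(filename):
    """Split a jar file name into (artifact, version); None if it does not fit."""
    stem = filename[:-4] if filename.endswith(".jar") else filename
    match = _JAR_RE.match(stem)
    if match is None:
        return None
    name, version = match.group("name"), match.group("version")
    # Ivy caches jars as "<group>_<artifact>-<version>.jar". Drop the group
    # when the text before the first "_" looks like a dotted group id.
    group, sep, rest = name.partition("_")
    if sep and "." in group:
        name = rest
    return name, version


def find_conflicts(filenames):
    """Return {artifact: versions} for artifacts seen with more than one version.

    Versions are sorted as plain strings, which is only for stable output.
    """
    versions = defaultdict(set)
    for filename in filenames:
        parsed = parse_jar(filename)
        if parsed is not None:
            artifact, version = parsed
            versions[artifact].add(version)
    return {a: sorted(v) for a, v in versions.items() if len(v) > 1}


def jar_names(*directories):
    """Collect jar file names from the given directories (non-recursive)."""
    return [p.name for d in directories for p in Path(d).expanduser().glob("*.jar")]
```

Something like find_conflicts(jar_names("/opt/spark/jars", "~/.ivy2/jars")) would then report, say, spark-sql-kafka-0-10_2.12 if both a 3.0.1 and a 3.1.1 copy are lying around. The /opt/spark/jars path is just a placeholder for wherever $SPARK_HOME/jars lives, since environment variables are not expanded here. It only compares file names, so it says nothing about which copy actually wins on the classpath.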