scaladevspark opened a new issue, #4628: URL: https://github.com/apache/iceberg/issues/4628
Trying to run a Scala Spark Structured Streaming app with Iceberg via the Spark operator on Spark 3.1.1. Spark submit fails with:

```
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/sql/catalyst/plans/logical/SetWriteDistributionAndOrdering
```

I am unable to find a jar to include that provides this class. There seems to be a parser context with this name, but not the class itself: https://iceberg.apache.org/javadoc/0.12.1/org/apache/spark/sql/catalyst/parser/extensions/IcebergSqlExtensionsParser.SetWriteDistributionAndOrderingContext.html

My dependencies:

```scala
"org.apache.spark" %% "spark-core" % "3.1.1" % Provided,
"org.apache.spark" %% "spark-sql" % "3.1.1" % Provided,
"org.apache.hadoop" % "hadoop-aws" % "3.2.0",
"org.apache.hadoop" % "hadoop-common" % "3.2.0",
"org.apache.hadoop" % "hadoop-client" % "3.2.0",
"org.apache.hadoop" % "hadoop-mapreduce-client-core" % "3.2.0",
"org.apache.hadoop" % "hadoop-minikdc" % "3.2.0",
"com.amazonaws" % "aws-java-sdk-bundle" % "1.11.375",
"com.typesafe" % "config" % "1.3.1",
"joda-time" % "joda-time" % "2.9.9",
"org.apache.iceberg" % "iceberg-spark3-extensions" % "0.13.1",
"org.apache.iceberg" % "iceberg-common" % "0.13.1",
"org.apache.iceberg" % "iceberg-spark3-runtime" % "0.13.1"
```

Thanks for any help!

Full stacktrace:

```
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/sql/catalyst/plans/logical/SetWriteDistributionAndOrdering
	at org.apache.spark.sql.execution.datasources.v2.ExtendedDataSourceV2Strategy.apply(ExtendedDataSourceV2Strategy.scala:77)
	at org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$1(QueryPlanner.scala:63)
	at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:484)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:490)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:489)
	at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)
	at org.apache.spark.sql.execution.SparkStrategies.plan(SparkStrategies.scala:67)
	at org.apache.spark.sql.execution.QueryExecution$.createSparkPlan(QueryExecution.scala:391)
	at org.apache.spark.sql.execution.QueryExecution.$anonfun$sparkPlan$1(QueryExecution.scala:104)
	at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
	at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:143)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:772)
	at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:143)
	at org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:104)
	at org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:97)
	at org.apache.spark.sql.execution.QueryExecution.$anonfun$executedPlan$1(QueryExecution.scala:117)
	at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
	at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:143)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:772)
	at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:143)
	at org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:117)
	at org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:110)
	at org.apache.spark.sql.execution.QueryExecution.$anonfun$simpleString$2(QueryExecution.scala:161)
	at org.apache.spark.sql.execution.QueryExecution.org$apache$spark$sql$execution$QueryExecution$$explainString(QueryExecution.scala:206)
	at org.apache.spark.sql.execution.QueryExecution.explainString(QueryExecution.scala:175)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:98)
	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:772)
	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
	at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3685)
```
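`SetWriteDistributionAndOrdering` is one of the logical plan classes that ships in the Iceberg Spark extensions module, so a `NoClassDefFoundError` for it usually points at a mismatch between the Iceberg artifacts on the classpath rather than a missing third-party jar. As a sketch only, assuming the Spark-3.1-specific runtime artifact introduced in Iceberg 0.13 is the right match for Spark 3.1.1 (the shaded runtime jar bundles the extensions classes, so the separate `iceberg-spark3-extensions` dependency may be unnecessary), the Iceberg part of the dependency block could be reduced to:

```scala
// Hypothetical build.sbt fragment, assuming Iceberg 0.13.1's
// iceberg-spark-runtime-3.1 artifact is the correct match for Spark 3.1.1.
// The shaded runtime jar already bundles the SQL-extensions classes
// (including SetWriteDistributionAndOrdering).
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "3.1.1" % Provided,
  "org.apache.spark" %% "spark-sql"  % "3.1.1" % Provided,
  // The Scala suffix is part of the published artifactId, hence % rather than %%:
  "org.apache.iceberg" % "iceberg-spark-runtime-3.1_2.12" % "0.13.1"
)
```

The runtime jar also has to be on the driver classpath at submit time (fat jar, `--packages`, or baked into the operator's image), since the extensions are installed when the session is created.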
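The failure surfaces in `ExtendedDataSourceV2Strategy`, which only runs when the Iceberg session extensions are enabled, so the session configuration is worth double-checking as well. A minimal sketch of enabling them from Scala, using the configuration keys from the Iceberg docs; the catalog name `local` and the warehouse path are illustrative placeholders, not taken from the issue:

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch: enable the Iceberg SQL extensions and register a
// Hadoop-type catalog. "local" and the warehouse URI are placeholders.
val spark = SparkSession.builder()
  .appName("iceberg-streaming")
  .config("spark.sql.extensions",
    "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
  .config("spark.sql.catalog.local", "org.apache.iceberg.spark.SparkCatalog")
  .config("spark.sql.catalog.local.type", "hadoop")
  .config("spark.sql.catalog.local.warehouse", "s3a://bucket/warehouse")
  .getOrCreate()
```

Since both `IcebergSparkSessionExtensions` and the class in the stack trace live in the same jar, a setup where the extensions load but planning then fails with `NoClassDefFoundError` suggests two different Iceberg versions on the classpath (for example one in the operator's image and another in the application jar).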
