scaladevspark opened a new issue, #4628: URL: https://github.com/apache/iceberg/issues/4628
Trying to run a Scala Spark Structured Streaming app with Iceberg via the Spark operator on Spark 3.1.1. Spark submit fails with:

```
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/sql/catalyst/plans/logical/SetWriteDistributionAndOrdering
```

I am unable to find a jar to include that provides this class. There seems to be a parser context with this name, but not the class itself: https://iceberg.apache.org/javadoc/0.12.1/org/apache/spark/sql/catalyst/parser/extensions/IcebergSqlExtensionsParser.SetWriteDistributionAndOrderingContext.html

My dependencies:

```scala
"org.apache.spark" %% "spark-core" % "3.1.1" % Provided,
"org.apache.spark" %% "spark-sql" % "3.1.1" % Provided,
"org.apache.hadoop" % "hadoop-aws" % "3.2.0",
"org.apache.hadoop" % "hadoop-common" % "3.2.0",
"org.apache.hadoop" % "hadoop-client" % "3.2.0",
"org.apache.hadoop" % "hadoop-mapreduce-client-core" % "3.2.0",
"org.apache.hadoop" % "hadoop-minikdc" % "3.2.0",
"com.amazonaws" % "aws-java-sdk-bundle" % "1.11.375",
"com.typesafe" % "config" % "1.3.1",
"joda-time" % "joda-time" % "2.9.9",
"org.apache.iceberg" % "iceberg-spark3-extensions" % "0.13.1",
"org.apache.iceberg" % "iceberg-common" % "0.13.1",
"org.apache.iceberg" % "iceberg-spark3-runtime" % "0.13.1"
```

Thanks for any help!

Full stacktrace:

```
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/sql/catalyst/plans/logical/SetWriteDistributionAndOrdering
	at org.apache.spark.sql.execution.datasources.v2.ExtendedDataSourceV2Strategy.apply(ExtendedDataSourceV2Strategy.scala:77)
	at org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$1(QueryPlanner.scala:63)
	at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:484)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:490)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:489)
	at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)
	at org.apache.spark.sql.execution.SparkStrategies.plan(SparkStrategies.scala:67)
	at org.apache.spark.sql.execution.QueryExecution$.createSparkPlan(QueryExecution.scala:391)
	at org.apache.spark.sql.execution.QueryExecution.$anonfun$sparkPlan$1(QueryExecution.scala:104)
	at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
	at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:143)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:772)
	at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:143)
	at org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:104)
	at org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:97)
	at org.apache.spark.sql.execution.QueryExecution.$anonfun$executedPlan$1(QueryExecution.scala:117)
	at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:111)
	at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:143)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:772)
	at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:143)
	at org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:117)
	at org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:110)
	at org.apache.spark.sql.execution.QueryExecution.$anonfun$simpleString$2(QueryExecution.scala:161)
	at org.apache.spark.sql.execution.QueryExecution.org$apache$spark$sql$execution$QueryExecution$$explainString(QueryExecution.scala:206)
	at org.apache.spark.sql.execution.QueryExecution.explainString(QueryExecution.scala:175)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:98)
	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:772)
	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
	at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3685)
```
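`SetWriteDistributionAndOrdering` is one of the logical plan classes that ships in the Iceberg Spark extensions module, so a `NoClassDefFoundError` for it usually points at a mismatch between the Iceberg artifacts on the classpath rather than a missing third-party jar. As a sketch only, assuming the Spark-3.1-specific runtime artifact introduced in Iceberg 0.13 is the right match for Spark 3.1.1 (the shaded runtime jar bundles the extensions classes, so the separate `iceberg-spark3-extensions` dependency may be unnecessary), the Iceberg part of the dependency block could be reduced to:

```scala
// Hypothetical build.sbt fragment, assuming Iceberg 0.13.1's
// iceberg-spark-runtime-3.1 artifact is the correct match for Spark 3.1.1.
// The shaded runtime jar already bundles the SQL-extensions classes
// (including SetWriteDistributionAndOrdering).
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "3.1.1" % Provided,
  "org.apache.spark" %% "spark-sql"  % "3.1.1" % Provided,
  // The Scala suffix is part of the published artifactId, hence % rather than %%:
  "org.apache.iceberg" % "iceberg-spark-runtime-3.1_2.12" % "0.13.1"
)
```

The runtime jar also has to be on the driver classpath at submit time (fat jar, `--packages`, or baked into the operator's image), since the extensions are installed when the session is created.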
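The failure surfaces in `ExtendedDataSourceV2Strategy`, which only runs when the Iceberg session extensions are enabled, so the session configuration is worth double-checking as well. A minimal sketch of enabling them from Scala, using the configuration keys from the Iceberg docs; the catalog name `local` and the warehouse path are illustrative placeholders, not taken from the issue:

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch: enable the Iceberg SQL extensions and register a
// Hadoop-type catalog. "local" and the warehouse URI are placeholders.
val spark = SparkSession.builder()
  .appName("iceberg-streaming")
  .config("spark.sql.extensions",
    "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
  .config("spark.sql.catalog.local", "org.apache.iceberg.spark.SparkCatalog")
  .config("spark.sql.catalog.local.type", "hadoop")
  .config("spark.sql.catalog.local.warehouse", "s3a://bucket/warehouse")
  .getOrCreate()
```

Since both `IcebergSparkSessionExtensions` and the class in the stack trace live in the same jar, a setup where the extensions load but planning then fails with `NoClassDefFoundError` suggests two different Iceberg versions on the classpath (for example one in the operator's image and another in the application jar).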
