[GitHub] [hudi] navbalaraman opened a new issue, #7060: Error when upgrading to hudi 0.12.0 from 0.9.0

GitBox Tue, 25 Oct 2022 14:10:39 -0700


navbalaraman opened a new issue, #7060:
URL: https://github.com/apache/hudi/issues/7060


   We are using Spark 3.1.2 with Hudi 0.9.0 in our application, with AWS S3 and 
the AWS Glue Catalog to store and expose the ingested data. After a source 
data change, some new records now arrive with null values in certain columns. 
These columns already exist in the table schema, because it was built from 
earlier records that had values in them. Based on some of the reported issues 
(e.g. [HUDI-4276](https://github.com/apache/hudi/pull/6017/commits)), 
we concluded that this could be resolved by upgrading to Hudi 0.12.0. 
   When upgrading Hudi we get the error below. Can you please explain what is 
causing it? (Only the pom versions were changed; no code changes were made.)
   
   Error:
   org.apache.spark.sql.adapter.Spark3_1Adapter
   java.lang.ClassNotFoundException: org.apache.spark.sql.adapter.Spark3_1Adapter
        at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:355)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
        at org.apache.hudi.SparkAdapterSupport.sparkAdapter(SparkAdapterSupport.scala:39)
        at org.apache.hudi.SparkAdapterSupport.sparkAdapter$(SparkAdapterSupport.scala:29)
        at org.apache.hudi.HoodieSparkUtils$.sparkAdapter$lzycompute(HoodieSparkUtils.scala:65)
        at org.apache.hudi.HoodieSparkUtils$.sparkAdapter(HoodieSparkUtils.scala:65)
        at org.apache.hudi.AvroConversionUtils$.convertStructTypeToAvroSchema(AvroConversionUtils.scala:150)
        at org.apache.hudi.HoodieSparkSqlWriter$.bulkInsertAsRow(HoodieSparkSqlWriter.scala:540)
        at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:178)
        at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:183)
        at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:46)
        at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
        at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
        at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:90)
        at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:180)
        at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:218)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
        at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:215)
        at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:176)
        at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:132)
        at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:131)
        at org.apache.spark.sql.DataFrameWriter.$anonfun$runCommand$1(DataFrameWriter.scala:989)
        at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103)
        at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163)
        at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90)
        at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
        at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
        at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:989)
        at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:438)
        at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:415)
        at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:293)
   
   
   **pom.xml** 
   `<dependency>
     <groupId>org.scala-lang</groupId>
     <artifactId>scala-library</artifactId>
     <version>2.12.12</version>
   </dependency>
   <dependency>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-core_2.12</artifactId>
     <version>3.1.2</version>
     <scope>provided</scope>
   </dependency>
   <dependency>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-sql_2.12</artifactId>
     <version>3.1.2</version>
     <scope>provided</scope>
   </dependency>
   <dependency>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-hive_2.12</artifactId>
     <version>3.1.2</version>
     <scope>provided</scope>
   </dependency>
   <dependency>
     <groupId>org.apache.hudi</groupId>
     <artifactId>hudi-spark3-bundle_2.12</artifactId>
     <version>0.12.0</version>
   </dependency>
   <dependency>
     <groupId>org.mongodb.spark</groupId>
     <artifactId>mongo-spark-connector_2.12</artifactId>
     <version>3.0.1</version>
   </dependency>
   `
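
   A quick way to narrow this down might be to check whether the adapter class named in the trace can be loaded from the application's runtime classpath at all. The sketch below is only a diagnostic aid, not part of Hudi: the `AdapterCheck` class and its `isLoadable` helper are hypothetical names, and it assumes it is run with the same classpath as the failing job (for example from the same shaded jar).

```java
public class AdapterCheck {

    /**
     * Returns true if the named class can be found by this class's
     * class loader. The class is located but not initialized.
     */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, AdapterCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Not on the classpath, or present but not linkable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the class name taken from the stack trace above;
        // other class names can be passed as arguments.
        String[] names = args.length > 0
                ? args
                : new String[] {"org.apache.spark.sql.adapter.Spark3_1Adapter"};
        for (String name : names) {
            System.out.println(name + " loadable: " + isLoadable(name));
        }
    }
}
```

   If this reports that the class is not loadable, one possibility (an assumption, not confirmed in this report) is that the `hudi-spark3-bundle_2.12` 0.12.0 artifact targets a newer Spark 3.x line and does not ship the Spark 3.1 adapter. Hudi also publishes Spark-version-specific bundles such as `hudi-spark3.1-bundle_2.12`, which may be worth trying with Spark 3.1.2.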
     


