Estkli-Wandeln opened a new issue, #9576:
URL: https://github.com/apache/hudi/issues/9576
I'm getting the following error:
```
3/08/29 08:32:51 ERROR Client: Application diagnostics message: User class
threw exception: java.lang.NoSuchMethodError:
org.apache.spark.sql.execution.datasources.PartitionedFile.<init>(Lorg/apache/spark/sql/catalyst/InternalRow;Ljava/lang/String;JJ[Ljava/lang/String;)V
at
org.apache.hudi.MergeOnReadSnapshotRelation.$anonfun$buildSplits$2(MergeOnReadSnapshotRelation.scala:127)
at scala.Option.map(Option.scala:230)
at
org.apache.hudi.MergeOnReadSnapshotRelation.$anonfun$buildSplits$1(MergeOnReadSnapshotRelation.scala:125)
```
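For context on the mechanism (this is an illustration with stdlib classes, not Hudi's actual code): `NoSuchMethodError` is the link-time analogue of `NoSuchMethodException` — the calling code was compiled against a constructor signature that the class on the runtime classpath no longer has. The stack trace shows Hudi 0.11.1 linking against a `PartitionedFile(InternalRow, String, long, long, String[])` constructor that the cluster's Spark build does not provide.

```java
// Illustrative sketch only: probes a class for a constructor with the given
// parameter types, which is the same lookup the JVM performs when linking
// Hudi's call to the PartitionedFile constructor.
public class ConstructorProbe {
    static String probe(Class<?> cls, Class<?>... paramTypes) {
        try {
            cls.getConstructor(paramTypes);
            return "constructor found";
        } catch (NoSuchMethodException e) {
            return "constructor missing";
        }
    }

    public static void main(String[] args) {
        // String(String) exists; String(int) does not -- just as the
        // five-argument PartitionedFile constructor that the Hudi 0.11.1
        // bundle was compiled against is absent from the cluster's Spark.
        System.out.println(probe(String.class, String.class)); // constructor found
        System.out.println(probe(String.class, int.class));    // constructor missing
    }
}
```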
I'm using EMR 6.7.0 with these libraries in my application jar:
```
implementation "org.apache.hudi:hudi-spark3.2-bundle_2.12:0.11.1"
implementation "org.scala-lang:scala-library:2.12.8"
implementation "org.apache.spark:spark-core_2.12:3.2.1"
implementation "org.apache.spark:spark-sql_2.12:3.2.1"
implementation "org.apache.hadoop:hadoop-aws:3.2.1"
```
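One thing worth checking: EMR already provides Spark, Scala, and Hadoop on the cluster, so bundling your own copies into a fat jar can shadow Amazon's classes and cause exactly this kind of signature mismatch. A hedged sketch (assuming a Gradle build producing a fat jar for `spark-submit`) that keeps cluster-provided artifacts off the runtime classpath:

```groovy
// Sketch: only the Hudi bundle is shipped in the jar; everything the
// cluster already provides is declared compileOnly so it is compiled
// against but not packaged.
dependencies {
    implementation "org.apache.hudi:hudi-spark3.2-bundle_2.12:0.11.1"
    compileOnly   "org.scala-lang:scala-library:2.12.8"
    compileOnly   "org.apache.spark:spark-core_2.12:3.2.1"
    compileOnly   "org.apache.spark:spark-sql_2.12:3.2.1"
    compileOnly   "org.apache.hadoop:hadoop-aws:3.2.1"
}
```

Note that even with this change, an OSS Hudi bundle may still fail if EMR's patched Spark changed the `PartitionedFile` constructor; EMR releases also ship their own Hudi jars (typically under `/usr/lib/hudi`) built against the cluster's Spark, which may be the safer choice.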
Interestingly, the only things I changed are the input the program reads (and the code for reading MoR, of course ;) ); the program writes MoR in both cases.
- If it reads CoW tables from S3, it works.
- If it reads MoR tables from S3, it throws the exception above.
Any clue? I've seen people suggesting an upgrade to EMR 6.9.0
(https://github.com/apache/hudi/issues/8903#issuecomment-1624977292), but I'd
like to know whether the issue can be resolved on EMR 6.7.0 so that I don't
have to upgrade all the libraries in my project :/
**Environment Description**
* Hudi version : 0.11.1
* Spark version : 3.2.1
* Hadoop version : 3.2.1
* Storage (HDFS/S3/GCS..) : S3
* Running on Docker? (yes/no) : no
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]