Estkli-Wandeln opened a new issue, #9576:
URL: https://github.com/apache/hudi/issues/9576
I'm getting the following error:
```
3/08/29 08:32:51 ERROR Client: Application diagnostics message: User class
threw exception: java.lang.NoSuchMethodError:
org.apache.spark.sql.execution.datasources.PartitionedFile.<init>(Lorg/apache/spark/sql/catalyst/InternalRow;Ljava/lang/String;JJ[Ljava/lang/String;)V
at
org.apache.hudi.MergeOnReadSnapshotRelation.$anonfun$buildSplits$2(MergeOnReadSnapshotRelation.scala:127)
at scala.Option.map(Option.scala:230)
at
org.apache.hudi.MergeOnReadSnapshotRelation.$anonfun$buildSplits$1(MergeOnReadSnapshotRelation.scala:125)
```
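For context on the mechanism (this is an illustration with stdlib classes, not Hudi's actual code): `NoSuchMethodError` is the link-time analogue of `NoSuchMethodException` — the calling code was compiled against a constructor signature that the class on the runtime classpath no longer has. The stack trace shows Hudi 0.11.1 linking against a `PartitionedFile(InternalRow, String, long, long, String[])` constructor that the cluster's Spark build does not provide.

```java
// Illustrative sketch only: probes a class for a constructor with the given
// parameter types, which is the same lookup the JVM performs when linking
// Hudi's call to the PartitionedFile constructor.
public class ConstructorProbe {
    static String probe(Class<?> cls, Class<?>... paramTypes) {
        try {
            cls.getConstructor(paramTypes);
            return "constructor found";
        } catch (NoSuchMethodException e) {
            return "constructor missing";
        }
    }

    public static void main(String[] args) {
        // String(String) exists; String(int) does not -- just as the
        // five-argument PartitionedFile constructor that the Hudi 0.11.1
        // bundle was compiled against is absent from the cluster's Spark.
        System.out.println(probe(String.class, String.class)); // constructor found
        System.out.println(probe(String.class, int.class));    // constructor missing
    }
}
```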
I'm using EMR 6.7.0 with these libraries in my application jar:
```
implementation "org.apache.hudi:hudi-spark3.2-bundle_2.12:0.11.1"
implementation "org.scala-lang:scala-library:2.12.8"
implementation "org.apache.spark:spark-core_2.12:3.2.1"
implementation "org.apache.spark:spark-sql_2.12:3.2.1"
implementation "org.apache.hadoop:hadoop-aws:3.2.1"
```
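One thing worth checking: EMR already provides Spark, Scala, and Hadoop on the cluster, so bundling your own copies into a fat jar can shadow Amazon's classes and cause exactly this kind of signature mismatch. A hedged sketch (assuming a Gradle build producing a fat jar for `spark-submit`) that keeps cluster-provided artifacts off the runtime classpath:

```groovy
// Sketch: only the Hudi bundle is shipped in the jar; everything the
// cluster already provides is declared compileOnly so it is compiled
// against but not packaged.
dependencies {
    implementation "org.apache.hudi:hudi-spark3.2-bundle_2.12:0.11.1"
    compileOnly   "org.scala-lang:scala-library:2.12.8"
    compileOnly   "org.apache.spark:spark-core_2.12:3.2.1"
    compileOnly   "org.apache.spark:spark-sql_2.12:3.2.1"
    compileOnly   "org.apache.hadoop:hadoop-aws:3.2.1"
}
```

Note that even with this change, an OSS Hudi bundle may still fail if EMR's patched Spark changed the `PartitionedFile` constructor; EMR releases also ship their own Hudi jars (typically under `/usr/lib/hudi`) built against the cluster's Spark, which may be the safer choice.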
Interestingly, the only things I changed are the input the program reads (and the code for reading MoR, of course ;) ); the program writes MoR in both cases.
- If it reads CoW tables from S3, it works.
- If it reads MoR tables from S3, it throws the exception above.
Any clue? I've seen people suggesting an upgrade to EMR 6.9.0
(https://github.com/apache/hudi/issues/8903#issuecomment-1624977292), but I'd
like to know whether the issue can be resolved on EMR 6.7.0 so that I don't
have to upgrade all the libraries in my project :/
**Environment Description**
* Hudi version : 0.11.1
* Spark version : 3.2.1
* Hadoop version : 3.2.1
* Storage (HDFS/S3/GCS..) : S3
* Running on Docker? (yes/no) : no
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]