LINGQ1991 commented on issue #8903:
URL: https://github.com/apache/hudi/issues/8903#issuecomment-1612912367

   > @ad1happy2go I use emr-6.5.0. It fails with `java.lang.NoSuchMethodError: org.apache.spark.sql.execution.datasources.PartitionedFile`.
   > 
   > But after I repackaged with OSS Spark and the Hudi bundle, it works now.
   > 
   > ```xml
   > <plugin>
   >     <groupId>org.apache.maven.plugins</groupId>
   >     <artifactId>maven-shade-plugin</artifactId>
   >     <version>3.2.1</version>
   >     <configuration>
   >         <finalName>hudi-${spark.version}-plugin</finalName>
   >         <createDependencyReducedPom>false</createDependencyReducedPom>
   >     </configuration>
   >     <executions>
   >         <execution>
   >             <phase>package</phase>
   >             <goals>
   >                 <goal>shade</goal>
   >             </goals>
   >             <configuration>
   >                 <relocations>
   >                     <relocation>
   >                         <pattern>org.apache.spark.sql.execution.datasources.PartitionedFile</pattern>
   >                         <shadedPattern>org.local.spark.sql.execution.datasources.PartitionedFile</shadedPattern>
   >                     </relocation>
   >                     <relocation>
   >                         <pattern>org.apache.curator</pattern>
   >                         <shadedPattern>org.local.curator</shadedPattern>
   >                     </relocation>
   >                 </relocations>
   >                 <transformers>
   >                     <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer"/>
   >                 </transformers>
   >                 <filters>
   >                     <filter>
   >                         <artifact>*:*</artifact>
   >                         <excludes>
   >                             <exclude>module-info.class</exclude>
   >                             <exclude>org/apache/spark/unused/**</exclude>
   >                         </excludes>
   >                     </filter>
   >                     <filter>
   >                         <artifact>*:*</artifact>
   >                         <excludes>
   >                             <exclude>META-INF/*.SF</exclude>
   >                             <exclude>META-INF/*.DSA</exclude>
   >                             <exclude>META-INF/*.RSA</exclude>
   >                         </excludes>
   >                     </filter>
   >                 </filters>
   >             </configuration>
   >         </execution>
   >     </executions>
   > </plugin>
   > ```
   
   I packaged with the Hudi bundle, but the following error occurred:
   ```
   Caused by: java.lang.ClassCastException: org.apache.hudi.spark.org.apache.spark.sql.execution.datasources.PartitionedFile cannot be cast to org.apache.spark.sql.execution.datasources.PartitionedFile
       at org.apache.hudi.HoodieMergeOnReadRDD.read(HoodieMergeOnReadRDD.scala:113)
       at org.apache.hudi.HoodieMergeOnReadRDD.compute(HoodieMergeOnReadRDD.scala:65)
       at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
       at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
       at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
       at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
       at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
       at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
       at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
       at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
       at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
       at org.apache.spark.scheduler.Task.run(Task.scala:131)
       at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497)
       at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439)
       at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
       at java.lang.Thread.run(Thread.java:750)
   ```
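   
   The `ClassCastException` happens because a relocation rewrote the class name to `org.apache.hudi.spark.org.apache.spark.sql.execution.datasources.PartitionedFile`, while Spark at runtime still hands over instances of the original `org.apache.spark.sql.execution.datasources.PartitionedFile`. The JVM identifies a class by its fully qualified name plus its defining class loader, so two structurally identical classes are unrelated types and cannot be cast to each other. A minimal, self-contained sketch (all class names here are hypothetical, not from Hudi or Spark) that reproduces the same failure mode by defining one class through two loaders:
   
   ```java
   import java.io.InputStream;
   
   public class CastDemo {
       // Stand-in for a class like PartitionedFile that ends up defined twice.
       public static class Payload {
           public Payload() {}
       }
   
       // A loader with no application parent, so it re-defines Payload itself
       // instead of delegating to the loader that already defined it.
       static final class IsolatingLoader extends ClassLoader {
           IsolatingLoader() {
               super(null); // bootstrap parent only
           }
   
           @Override
           protected Class<?> findClass(String name) throws ClassNotFoundException {
               String path = name.replace('.', '/') + ".class";
               try (InputStream in =
                       CastDemo.class.getClassLoader().getResourceAsStream(path)) {
                   byte[] bytes = in.readAllBytes();
                   return defineClass(name, bytes, 0, bytes.length);
               } catch (Exception e) {
                   throw new ClassNotFoundException(name, e);
               }
           }
       }
   
       // Returns true if casting the foreign Payload to our Payload throws.
       public static boolean crossLoaderCastFails() throws Exception {
           Class<?> foreign = new IsolatingLoader().loadClass(Payload.class.getName());
           Object o = foreign.getDeclaredConstructor().newInstance();
           try {
               Payload p = (Payload) o; // same name, same bytes, different loader
               return false;
           } catch (ClassCastException e) {
               return true; // analogous to the Hudi/Spark cast failure above
           }
       }
   
       public static void main(String[] args) throws Exception {
           System.out.println("cross-loader cast fails: " + crossLoaderCastFails());
       }
   }
   ```
   
   If that matches your setup, the usual remedy is to stop relocating classes that must cross the boundary between the shaded jar and the Spark runtime (such as the `PartitionedFile` `<relocation>` in the quoted shade config); relocation is only safe for dependencies that stay entirely inside the bundle.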
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
