hudi-bot opened a new issue, #15770:
URL: https://github.com/apache/hudi/issues/15770

   I am trying to use the hudi-spark3.3 bundle on an EMR cluster with OSS Spark.
   
   For Spark, I used this distribution:
   
[https://archive.apache.org/dist/spark/spark-3.3.0/spark-3.3.0-bin-hadoop3.tgz] 
   
   Launching spark-shell fails with the following error:
   {code:java}
   ./bin/spark-shell --driver-memory 4g --executor-memory 6g \
     --master yarn --deploy-mode client \
     --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
     --conf 'spark.sql.catalog.spark_catalog=org.apache.spark.sql.hudi.catalog.HoodieCatalog' \
     --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension' \
     --jars /home/hadoop/hudi-spark3.3-bundle_2.12-0.13.0-rc2.jar \
     --conf spark.driver.extraJavaOptions="-Dlog4j.configuration=file:/home/hadoop/log4j.properties" \
     --conf spark.executor.extraJavaOptions="-Dlog4j.configuration=file:/home/hadoop/log4j.properties"
   23/02/08 21:20:34 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
   java.lang.NoClassDefFoundError: org/apache/hadoop/shaded/javax/ws/rs/core/NoContentException
     at org.apache.hadoop.yarn.util.timeline.TimelineUtils.<clinit>(TimelineUtils.java:60)
     at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceInit(YarnClientImpl.java:200)
     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
     at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:191)
     at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:62)
     at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:222)
     at org.apache.spark.SparkContext.<init>(SparkContext.scala:585)
     at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2704)
     at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:953)
     at scala.Option.getOrElse(Option.scala:189)
     at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:947)
     at org.apache.spark.repl.Main$.createSparkSession(Main.scala:106)
     ... 55 elided
   Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.shaded.javax.ws.rs.core.NoContentException
     at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
     at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
     at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
     ... 67 more
   <console>:14: error: not found: value spark
          import spark.implicits._
                 ^
   <console>:14: error: not found: value spark
          import spark.sql
                 ^
   Welcome to
         ____              __
        / __/__  ___ _____/ /__
       _\ \/ _ \/ _ `/ __/  '_/
      /___/ .__/\_,_/_/ /_/\_\   version 3.3.0
         /_/
            
   Using Scala version 2.12.15 (OpenJDK 64-Bit Server VM, Java 1.8.0_352)
   Type in expressions to have them evaluated.
   Type :help for more information.
   
   
   scala>  {code}
   From Stack Overflow, I found that the error can be bypassed by disabling the timeline service in YARN, which is consistent with the TimelineUtils frame at the top of the stack trace:
[https://stackoverflow.com/questions/74451254/caused-by-java-lang-classnotfoundexception-org-apache-hadoop-shaded-javax-ws-r]
   
   I had to set
   
   yarn.timeline-service.enabled = false in /etc/hadoop/conf/yarn-site.xml
   
    
   
   After this change, I no longer see the error.
   
   I tried the following Hudi versions and saw the same behavior with each:
   
   hudi-0.12.0, 0.12.2 and 0.13.0 rc2
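
   The yarn-site.xml change applies to the whole cluster. A per-application alternative may be to pass the same property through Spark's spark.hadoop.* prefix, which Spark copies into the Hadoop configuration it builds for the application. The sketch below is an assumption and was not verified in this report; the variable names TIMELINE_CONF and HUDI_JAR are illustrative only, and the remaining --conf flags from the original command would be appended unchanged.

```shell
# Sketch only: disable the YARN timeline service client for one Spark
# application instead of editing /etc/hadoop/conf/yarn-site.xml.
# Assumption: the spark.hadoop.* property reaches the YARN client
# configuration before YarnClientImpl initializes (not verified here).
TIMELINE_CONF='spark.hadoop.yarn.timeline-service.enabled=false'

# Bundle path copied from the command in this report.
HUDI_JAR=/home/hadoop/hudi-spark3.3-bundle_2.12-0.13.0-rc2.jar

# Dry run: print the command for review.
# Remove the leading "echo" to actually launch spark-shell.
echo ./bin/spark-shell --master yarn --deploy-mode client \
  --conf "$TIMELINE_CONF" \
  --jars "$HUDI_JAR"
```

   If this works, it would keep the timeline service enabled for other jobs on the cluster, at the cost of having to repeat the flag on every launch.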
   
    
   
   ## JIRA info
   
   - Link: https://issues.apache.org/jira/browse/HUDI-5732
   - Type: Bug


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
