curio77 opened a new issue, #11797: URL: https://github.com/apache/hudi/issues/11797
**Describe the problem you faced**

Trying to follow the official Docker demo tutorial [here](https://hudi.apache.org/docs/docker_demo/), at step 2, I get an error executing a command inside one of the containers.

**To Reproduce**

Steps to reproduce the behavior:

1. Follow the tutorial up to _Step 2_: clone the Hudi repo at tag `release-0.15.0`, use a v1.8 JDK (I've used OpenJDK v1.8.0_422), and build with just `mvn clean package -Pintegration-tests -DskipTests`. The build completes without errors.
2. Run the first command of _Step 2_ inside the `adhoc-2` Docker container. This throws a Scala exception.

**Expected behavior**

I expect the command not to throw an exception.

**Environment Description**

* Hudi version : 0.15.0
* Spark version : 3.5
* Hive version : unsure
* Hadoop version : unsure
* Storage (HDFS/S3/GCS..) : irrelevant
* Running on Docker? (yes/no) : yes

**Additional context**

**Stacktrace**

```
Exception in thread "main" java.lang.NoSuchMethodError: scala.Function1.$init$(Lscala/Function1;)V
    at org.apache.spark.sql.hudi.HoodieSparkSessionExtension.<init>(HoodieSparkSessionExtension.scala:28)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at java.lang.Class.newInstance(Class.java:442)
    at org.apache.spark.sql.SparkSession$Builder.liftedTree1$1(SparkSession.scala:945)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:943)
    at org.apache.spark.sql.SQLContext$.getOrCreate(SQLContext.scala:1066)
    at org.apache.spark.sql.SQLContext.getOrCreate(SQLContext.scala)
    at org.apache.hudi.client.common.HoodieSparkEngineContext.<init>(HoodieSparkEngineContext.java:72)
    at org.apache.hudi.utilities.streamer.HoodieStreamer.<init>(HoodieStreamer.java:166)
    at org.apache.hudi.utilities.streamer.HoodieStreamer.<init>(HoodieStreamer.java:150)
    at org.apache.hudi.utilities.streamer.HoodieStreamer.<init>(HoodieStreamer.java:136)
    at org.apache.hudi.utilities.streamer.HoodieStreamer.main(HoodieStreamer.java:606)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
```

**Full output:**

```
root@adhoc-2:/opt# spark-submit \
> --class org.apache.hudi.utilities.streamer.HoodieStreamer $HUDI_UTILITIES_BUNDLE \
> --table-type COPY_ON_WRITE \
> --source-class org.apache.hudi.utilities.sources.JsonKafkaSource \
> --source-ordering-field ts \
> --target-base-path /user/hive/warehouse/stock_ticks_cow \
> --target-table stock_ticks_cow --props /var/demo/config/kafka-source.properties \
> --schemaprovider-class org.apache.hudi.utilities.schema.FilebasedSchemaProvider
24/08/19 14:50:38 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
24/08/19 14:50:38 WARN streamer.SchedulerConfGenerator: Job Scheduling Configs will not be in effect as spark.scheduler.mode is not set to FAIR at instantiation time. Continuing without scheduling configs
24/08/19 14:50:38 INFO spark.SparkContext: Running Spark version 2.4.4
24/08/19 14:50:38 INFO spark.SparkContext: Submitted application: streamer-stock_ticks_cow
24/08/19 14:50:38 INFO spark.SecurityManager: Changing view acls to: root
24/08/19 14:50:38 INFO spark.SecurityManager: Changing modify acls to: root
24/08/19 14:50:38 INFO spark.SecurityManager: Changing view acls groups to:
24/08/19 14:50:38 INFO spark.SecurityManager: Changing modify acls groups to:
24/08/19 14:50:38 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
24/08/19 14:50:38 INFO Configuration.deprecation: mapred.output.compression.codec is deprecated. Instead, use mapreduce.output.fileoutputformat.compress.codec
24/08/19 14:50:38 INFO Configuration.deprecation: mapred.output.compress is deprecated. Instead, use mapreduce.output.fileoutputformat.compress
24/08/19 14:50:38 INFO Configuration.deprecation: mapred.output.compression.type is deprecated. Instead, use mapreduce.output.fileoutputformat.compress.type
24/08/19 14:50:39 INFO util.Utils: Successfully started service 'sparkDriver' on port 35325.
24/08/19 14:50:39 INFO spark.SparkEnv: Registering MapOutputTracker
24/08/19 14:50:39 INFO spark.SparkEnv: Registering BlockManagerMaster
24/08/19 14:50:39 INFO storage.BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
24/08/19 14:50:39 INFO storage.BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
24/08/19 14:50:39 INFO storage.DiskBlockManager: Created local directory at /tmp/blockmgr-6e3a4f57-0460-41ca-a384-2c35b2906e4c
24/08/19 14:50:39 INFO memory.MemoryStore: MemoryStore started with capacity 366.3 MB
24/08/19 14:50:39 INFO spark.SparkEnv: Registering OutputCommitCoordinator
24/08/19 14:50:39 INFO util.log: Logging initialized @1449ms
24/08/19 14:50:39 INFO server.Server: jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
24/08/19 14:50:39 INFO server.Server: Started @1492ms
24/08/19 14:50:39 INFO server.AbstractConnector: Started ServerConnector@44ea608c{HTTP/1.1,[http/1.1]}{0.0.0.0:8090}
24/08/19 14:50:39 INFO util.Utils: Successfully started service 'SparkUI' on port 8090.
24/08/19 14:50:39 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3f3ddbd9{/jobs,null,AVAILABLE,@Spark}
24/08/19 14:50:39 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@62b3df3a{/jobs/json,null,AVAILABLE,@Spark}
24/08/19 14:50:39 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@420745d7{/jobs/job,null,AVAILABLE,@Spark}
24/08/19 14:50:39 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5fa47fea{/jobs/job/json,null,AVAILABLE,@Spark}
24/08/19 14:50:39 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2392212b{/stages,null,AVAILABLE,@Spark}
24/08/19 14:50:39 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5b43e173{/stages/json,null,AVAILABLE,@Spark}
24/08/19 14:50:39 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@28f8e165{/stages/stage,null,AVAILABLE,@Spark}
24/08/19 14:50:39 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@22fa55b2{/stages/stage/json,null,AVAILABLE,@Spark}
24/08/19 14:50:39 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4d666b41{/stages/pool,null,AVAILABLE,@Spark}
24/08/19 14:50:39 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6594402a{/stages/pool/json,null,AVAILABLE,@Spark}
24/08/19 14:50:39 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@30f4b1a6{/storage,null,AVAILABLE,@Spark}
24/08/19 14:50:39 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@405325cf{/storage/json,null,AVAILABLE,@Spark}
24/08/19 14:50:39 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3e1162e7{/storage/rdd,null,AVAILABLE,@Spark}
24/08/19 14:50:39 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@79c3f01f{/storage/rdd/json,null,AVAILABLE,@Spark}
24/08/19 14:50:39 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6c2f1700{/environment,null,AVAILABLE,@Spark}
24/08/19 14:50:39 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@350b3a17{/environment/json,null,AVAILABLE,@Spark}
24/08/19 14:50:39 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@38600b{/executors,null,AVAILABLE,@Spark}
24/08/19 14:50:39 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@669d2b1b{/executors/json,null,AVAILABLE,@Spark}
24/08/19 14:50:39 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@721eb7df{/executors/threadDump,null,AVAILABLE,@Spark}
24/08/19 14:50:39 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1ea9f009{/executors/threadDump/json,null,AVAILABLE,@Spark}
24/08/19 14:50:39 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@5d52e3ef{/static,null,AVAILABLE,@Spark}
24/08/19 14:50:39 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2c0f7678{/,null,AVAILABLE,@Spark}
24/08/19 14:50:39 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@44d70181{/api,null,AVAILABLE,@Spark}
24/08/19 14:50:39 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@88a8218{/jobs/job/kill,null,AVAILABLE,@Spark}
24/08/19 14:50:39 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@50b1f030{/stages/stage/kill,null,AVAILABLE,@Spark}
24/08/19 14:50:39 INFO ui.SparkUI: Bound SparkUI to 0.0.0.0, and started at http://adhoc-2:8090
24/08/19 14:50:39 INFO spark.SparkContext: Added JAR file:/var/hoodie/ws/docker/hoodie/hadoop/hive_base/target/hoodie-utilities.jar at spark://adhoc-2:35325/jars/hoodie-utilities.jar with timestamp 1724079039233
24/08/19 14:50:39 INFO executor.Executor: Starting executor ID driver on host localhost
24/08/19 14:50:39 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 38309.
24/08/19 14:50:39 INFO netty.NettyBlockTransferService: Server created on adhoc-2:38309
24/08/19 14:50:39 INFO storage.BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
24/08/19 14:50:39 INFO storage.BlockManagerMaster: Registering BlockManager BlockManagerId(driver, adhoc-2, 38309, None)
24/08/19 14:50:39 INFO storage.BlockManagerMasterEndpoint: Registering block manager adhoc-2:38309 with 366.3 MB RAM, BlockManagerId(driver, adhoc-2, 38309, None)
24/08/19 14:50:39 INFO storage.BlockManagerMaster: Registered BlockManager BlockManagerId(driver, adhoc-2, 38309, None)
24/08/19 14:50:39 INFO storage.BlockManager: Initialized BlockManager: BlockManagerId(driver, adhoc-2, 38309, None)
24/08/19 14:50:39 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@14fc5d40{/metrics/json,null,AVAILABLE,@Spark}
24/08/19 14:50:39 WARN config.DFSPropertiesConfiguration: Cannot find HUDI_CONF_DIR, please set it as the dir of hudi-defaults.conf
24/08/19 14:50:39 WARN config.DFSPropertiesConfiguration: Properties file file:/etc/hudi/conf/hudi-defaults.conf not found. Ignoring to load props file
24/08/19 14:50:39 INFO server.AbstractConnector: Stopped Spark@44ea608c{HTTP/1.1,[http/1.1]}{0.0.0.0:8090}
24/08/19 14:50:39 INFO ui.SparkUI: Stopped Spark web UI at http://adhoc-2:8090
24/08/19 14:50:39 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
24/08/19 14:50:39 INFO memory.MemoryStore: MemoryStore cleared
24/08/19 14:50:39 INFO storage.BlockManager: BlockManager stopped
24/08/19 14:50:39 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
24/08/19 14:50:39 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
24/08/19 14:50:39 INFO spark.SparkContext: Successfully stopped SparkContext
Exception in thread "main" java.lang.NoSuchMethodError: scala.Function1.$init$(Lscala/Function1;)V
    at org.apache.spark.sql.hudi.HoodieSparkSessionExtension.<init>(HoodieSparkSessionExtension.scala:28)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at java.lang.Class.newInstance(Class.java:442)
    at org.apache.spark.sql.SparkSession$Builder.liftedTree1$1(SparkSession.scala:945)
    at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:943)
    at org.apache.spark.sql.SQLContext$.getOrCreate(SQLContext.scala:1066)
    at org.apache.spark.sql.SQLContext.getOrCreate(SQLContext.scala)
    at org.apache.hudi.client.common.HoodieSparkEngineContext.<init>(HoodieSparkEngineContext.java:72)
    at org.apache.hudi.utilities.streamer.HoodieStreamer.<init>(HoodieStreamer.java:166)
    at org.apache.hudi.utilities.streamer.HoodieStreamer.<init>(HoodieStreamer.java:150)
    at org.apache.hudi.utilities.streamer.HoodieStreamer.<init>(HoodieStreamer.java:136)
    at org.apache.hudi.utilities.streamer.HoodieStreamer.main(HoodieStreamer.java:606)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
24/08/19 14:50:39 INFO util.ShutdownHookManager: Shutdown hook called
24/08/19 14:50:39 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-f23d0542-c7eb-4a7e-9f50-e0f6fcfd5722
24/08/19 14:50:39 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-7911bc3e-ab03-479f-b199-af17b548d6e7
```
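Possibly relevant (my own guess, not verified): `scala.Function1.$init$` only exists in the Scala 2.12+ standard library, and the log above reports `Running Spark version 2.4.4`, which shipped with Scala 2.11, so this looks like a bundle compiled against Scala 2.12 being loaded on a Scala 2.11 runtime. A sketch of how the container-side Scala version could be checked; `parse_scala_version` is a helper name I made up, and the sample line below just stands in for real `spark-submit --version` output:

```shell
#!/bin/sh
# Pull the Scala binary version (e.g. "2.11") out of `spark-submit --version`
# output, which includes a line like:
#   Using Scala version 2.11.12, OpenJDK 64-Bit Server VM, 1.8.0_422
parse_scala_version() {
  sed -n 's/.*Scala version \([0-9]*\.[0-9]*\).*/\1/p' | head -n 1
}

# Inside the adhoc-2 container the real input would be:
#   spark-submit --version 2>&1 | parse_scala_version
# Here a sample line stands in for that output:
runtime=$(printf 'Using Scala version 2.11.12, OpenJDK 64-Bit Server VM, 1.8.0_422\n' \
  | parse_scala_version)
echo "container Spark runs Scala $runtime"
```

If the container's Spark really runs Scala 2.11 while the bundle built by `mvn clean package -Pintegration-tests -DskipTests` targets Scala 2.12 (the default for recent Spark profiles, as far as I can tell), that would explain the missing `scala.Function1.$init$`. Rebuilding with build switches matching the container's Spark/Scala might help; the Hudi README has historically documented `-Dspark2.4 -Dscala-2.11`, though I haven't verified those profiles still exist at `release-0.15.0`.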
