hikiyoung opened a new issue #1499: [SUPPORT] DeltaStreamer - NoClassDefFoundError for HiveDriver
URL: https://github.com/apache/incubator-hudi/issues/1499
 
 
   **Describe the problem you faced**
   
   Running DeltaStreamer with --enable-hive-sync fails with the following error:

   java.lang.NoSuchMethodError: org.apache.hadoop.hive.ql.metadata.Hive.get(Lorg/apache/hudi/org/apache/hadoop_hive/conf/HiveConf;)Lorg/apache/hadoop/hive/ql/metadata/Hive;

   Do I need to change something in the default build process so that this class is included?
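For reference, the method descriptor in the error message can be decoded to see which parameter and return types the caller expects. The following is a minimal sketch that only parses the descriptor string copied from the error; `decode_descriptor` is a hypothetical helper written for illustration and is not part of Hudi or Hive.

```python
import re

# Method descriptor copied from the NoSuchMethodError message in this report.
DESCRIPTOR = (
    "(Lorg/apache/hudi/org/apache/hadoop_hive/conf/HiveConf;)"
    "Lorg/apache/hadoop/hive/ql/metadata/Hive;"
)

# JVM primitive type codes used in descriptors.
PRIMITIVES = {
    "B": "byte", "C": "char", "D": "double", "F": "float",
    "I": "int", "J": "long", "S": "short", "Z": "boolean", "V": "void",
}

# One type: optional array markers, then a primitive code or L<class name>;
TYPE_RE = re.compile(r"(\[*)(?:([BCDFIJSZV])|L([^;]+);)")


def _decode_types(text):
    """Decode a run of concatenated JVM type codes into readable names."""
    types = []
    pos = 0
    while pos < len(text):
        match = TYPE_RE.match(text, pos)
        if match is None:
            raise ValueError("cannot parse type at offset %d in %r" % (pos, text))
        arrays, primitive, class_name = match.groups()
        name = PRIMITIVES[primitive] if primitive else class_name.replace("/", ".")
        types.append(name + "[]" * len(arrays))
        pos = match.end()
    return types


def decode_descriptor(descriptor):
    """Hypothetical helper: return (parameter types, return type)."""
    match = re.fullmatch(r"\((.*)\)(.+)", descriptor)
    if match is None:
        raise ValueError("not a method descriptor: %r" % descriptor)
    params = _decode_types(match.group(1))
    (return_type,) = _decode_types(match.group(2))
    return params, return_type


params, return_type = decode_descriptor(DESCRIPTOR)
print("parameters:", params)
print("returns:", return_type)
```

Decoded this way, the expected parameter type is `org.apache.hudi.org.apache.hadoop_hive.conf.HiveConf`, a HiveConf under an `org.apache.hudi` prefix, while the `Hive` class itself is referenced under its normal package. That may indicate the calling code was built against a relocated HiveConf that the `Hive` class found at runtime does not accept, but this is only a reading of the descriptor, not a confirmed diagnosis.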
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1. Properties file
   ```
   include=base.properties
   hoodie.datasource.write.recordkey.field=ORDERNUMBER
   hoodie.datasource.write.partitionpath.field=PARTITIONPATH
   hoodie.datasource.hive_sync.assume_date_partitioning=false
   
   hoodie.deltastreamer.schemaprovider.source.schema.file=file:///home/hadoop/hudi/config/orders_hudi_schema.avro
   hoodie.deltastreamer.schemaprovider.target.schema.file=file:///home/hadoop/hudi/config/orders_hudi_schema.avro
   
   hoodie.deltastreamer.source.kafka.topic=orders_hudi_v1
   bootstrap.servers=kafka-broker-1:9092
   auto.offset.reset=smallest
   
   hoodie.datasource.hive_sync.database=hudi
   hoodie.datasource.hive_sync.table=orders_hudi_cow
   hoodie.datasource.hive_sync.jdbcurl=jdbc:hive2://localhost:10000
   hoodie.datasource.hive_sync.username=hive
   hoodie.datasource.hive_sync.password=hive
   hoodie.datasource.hive_sync.partition_fields=PARTITIONPATH
   ionValueExtractor
   ```
   2. Launch script with HoodieDeltaStreamer
   ```
   TARGET_DATABASE="hudi"
   TARGET_TABLE="orders_hudi"
   HUDI_UTILITIES_BUNDLE="file:///usr/lib/hudi/hudi-utilities-bundle.jar"
   TARGET_BASE_PATH="s3://data-store/$TARGET_DATABASE/$TARGET_TABLE"
   PROPS="file:///home/hadoop/hudi/config/kafka-source.properties"
   CHECKPOINT_BASE_PATH="s3://data-store/checkpoint/$TARGET_DATABASE/$TARGET_TABLE"
   
   spark-submit \
     --conf 'spark.jars=/usr/lib/hudi/hudi-hadoop-mr-bundle.jar,/usr/lib/hudi/hudi-hive-bundle.jar,/usr/lib/hudi/hudi-presto-bundle.jar,/usr/lib/hudi/hudi-spark-bundle.jar,/usr/lib/hudi/hudi-timeline-server-bundle.jar' \
     --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
     --master yarn \
     --deploy-mode client \
     --jars /usr/lib/spark/jars/httpclient-4.5.9.jar,/usr/lib/hudi/hudi-spark-bundle.jar,/usr/lib/spark/external/lib/spark-avro.jar \
     --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer $HUDI_UTILITIES_BUNDLE \
     --storage-type MERGE_ON_READ \
     --source-class org.apache.hudi.utilities.sources.JsonKafkaSource \
     --target-base-path $TARGET_BASE_PATH \
     --target-table "$TARGET_DATABASE.$TARGET_TABLE" \
     --source-ordering-field UPDATEDATE \
     --enable-hive-sync \
     --continuous \
     --props $PROPS \
     --schemaprovider-class org.apache.hudi.utilities.schema.FilebasedSchemaProvider
   ```
   
   
   **Expected behavior**
   
   Sync to hive
   
   **Environment Description**
   EMR 2.59.0
   
   * Hudi version : 0.5.0-inc
   
   * Spark version : 2.4.4
   
   * Hive version : 2.3.6
   
   * Hadoop version : 2.8.5
   
   * Storage (HDFS/S3/GCS..) : S3
   
   * Running on Docker? (yes/no) : no
   
   
   
   **Stacktrace**
   
   ```
   20/04/08 17:54:22 INFO YarnScheduler: Removed TaskSet 39.0, whose tasks have all completed, from pool
   20/04/08 17:54:22 INFO DAGScheduler: ResultStage 39 (collect at HoodieRealtimeTableCompactor.java:200) finished in 3.432 s
   20/04/08 17:54:22 INFO DAGScheduler: Job 13 finished: collect at HoodieRealtimeTableCompactor.java:200, took 3.436397 s
   20/04/08 17:54:22 INFO MultipartUploadOutputStream: close closed:false s3://data-store/hudi/orders_hudi_cow/.hoodie/.aux/20200408175418.compaction.requested
   20/04/08 17:54:22 INFO MultipartUploadOutputStream: close closed:false s3://data-store/hudi/orders_hudi_cow/.hoodie/.aux/20200408175418.compaction.requested
   20/04/08 17:54:22 INFO MultipartUploadOutputStream: close closed:false s3://data-store/hudi/orders_hudi_cow/.hoodie/20200408175418.compaction.requested
   20/04/08 17:54:22 INFO MultipartUploadOutputStream: close closed:false s3://data-store/hudi/orders_hudi_cow/.hoodie/20200408175418.compaction.requested
   20/04/08 17:54:22 INFO S3NativeFileSystem: Opening 's3://data-store/hudi/orders_hudi_cow/.hoodie/hoodie.properties' for reading
   20/04/08 17:49:45 INFO Utils: Supplied authorities: localhost:10000
   20/04/08 17:49:45 INFO Utils: Resolved authority: localhost:10000
   20/04/08 17:49:45 INFO HiveConnection: Will try to open client transport with JDBC Uri: jdbc:hive2://localhost:10000
   20/04/08 17:49:46 WARN AbstractDeltaStreamerService: Gracefully shutting down compactor
   
   
   
   20/04/08 17:50:27 ERROR AbstractDeltaStreamerService: Service shutdown with error
   java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: org.apache.hadoop.hive.ql.metadata.Hive.get(Lorg/apache/hudi/org/apache/hadoop_hive/conf/HiveConf;)Lorg/apache/hadoop/hive/ql/metadata/Hive;
           at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
           at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
           at org.apache.hudi.utilities.deltastreamer.AbstractDeltaStreamerService.waitForShutdown(AbstractDeltaStreamerService.java:70)
           at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.sync(HoodieDeltaStreamer.java:116)
           at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.main(HoodieDeltaStreamer.java:292)
           at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
           at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
           at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
           at java.lang.reflect.Method.invoke(Method.java:498)
           at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
           at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:853)
           at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
           at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
           at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
           at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:928)
           at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:937)
           at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
   Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.hive.ql.metadata.Hive.get(Lorg/apache/hudi/org/apache/hadoop_hive/conf/HiveConf;)Lorg/apache/hadoop/hive/ql/metadata/Hive;
           at org.apache.hudi.hive.HoodieHiveClient.<init>(HoodieHiveClient.java:111)
           at org.apache.hudi.hive.HiveSyncTool.<init>(HiveSyncTool.java:60)
           at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncHive(DeltaSync.java:440)
           at org.apache.hudi.utilities.deltastreamer.DeltaSync.writeToSink(DeltaSync.java:382)
           at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:226)
           at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$0(HoodieDeltaStreamer.java:390)
           at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
           at java.lang.Thread.run(Thread.java:748)
   20/04/08 17:50:27 ERROR AbstractDeltaStreamerService: Monitor noticed one or more threads failed. Requesting graceful shutdown of other threads
   java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: org.apache.hadoop.hive.ql.metadata.Hive.get(Lorg/apache/hudi/org/apache/hadoop_hive/conf/HiveConf;)Lorg/apache/hadoop/hive/ql/metadata/Hive;
           at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
           at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
           at org.apache.hudi.utilities.deltastreamer.AbstractDeltaStreamerService.lambda$monitorThreads$0(AbstractDeltaStreamerService.java:134)
           at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
           at java.util.concurrent.FutureTask.run(FutureTask.java:266)
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
           at java.lang.Thread.run(Thread.java:748)
   Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.hive.ql.metadata.Hive.get(Lorg/apache/hudi/org/apache/hadoop_hive/conf/HiveConf;)Lorg/apache/hadoop/hive/ql/metadata/Hive;
           at org.apache.hudi.hive.HoodieHiveClient.<init>(HoodieHiveClient.java:111)
           at org.apache.hudi.hive.HiveSyncTool.<init>(HiveSyncTool.java:60)
           at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncHive(DeltaSync.java:440)
           at org.apache.hudi.utilities.deltastreamer.DeltaSync.writeToSink(DeltaSync.java:382)
           at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:226)
           at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$0(HoodieDeltaStreamer.java:390)
           at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
           ... 3 more
   20/04/08 17:50:27 INFO Javalin: Stopping Javalin ...
   20/04/08 17:50:27 INFO SparkUI: Stopped Spark web UI at http://xxx.yyy.compute.internal:4040
   20/04/08 17:50:27 INFO Javalin: Javalin has stopped
   20/04/08 17:50:27 INFO YarnClientSchedulerBackend: Interrupting monitor thread
   20/04/08 17:50:27 INFO YarnClientSchedulerBackend: Shutting down all executors
   20/04/08 17:50:27 INFO YarnSchedulerBackend$YarnDriverEndpoint: Asking each executor to shut down
   20/04/08 17:50:27 INFO SchedulerExtensionServices: Stopping SchedulerExtensionServices
   (serviceOption=None,
    services=List(),
    started=false)
   20/04/08 17:50:27 INFO YarnClientSchedulerBackend: Stopped
   20/04/08 17:50:27 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
   20/04/08 17:50:27 INFO MemoryStore: MemoryStore cleared
   20/04/08 17:50:27 INFO BlockManager: BlockManager stopped
   20/04/08 17:50:27 INFO BlockManagerMaster: BlockManagerMaster stopped
   20/04/08 17:50:27 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
   20/04/08 17:50:27 INFO SparkContext: Successfully stopped SparkContext
   Exception in thread "main" java.util.concurrent.ExecutionException: java.lang.NoSuchMethodError: org.apache.hadoop.hive.ql.metadata.Hive.get(Lorg/apache/hudi/org/apache/hadoop_hive/conf/HiveConf;)Lorg/apache/hadoop/hive/ql/metadata/Hive;
           at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
           at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
           at org.apache.hudi.utilities.deltastreamer.AbstractDeltaStreamerService.waitForShutdown(AbstractDeltaStreamerService.java:70)
           at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.sync(HoodieDeltaStreamer.java:116)
           at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.main(HoodieDeltaStreamer.java:292)
           at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
           at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
           at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
           at java.lang.reflect.Method.invoke(Method.java:498)
           at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
           at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:853)
           at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
           at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
           at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
           at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:928)
           at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:937)
           at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
   Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.hive.ql.metadata.Hive.get(Lorg/apache/hudi/org/apache/hadoop_hive/conf/HiveConf;)Lorg/apache/hadoop/hive/ql/metadata/Hive;
           at org.apache.hudi.hive.HoodieHiveClient.<init>(HoodieHiveClient.java:111)
           at org.apache.hudi.hive.HiveSyncTool.<init>(HiveSyncTool.java:60)
           at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncHive(DeltaSync.java:440)
           at org.apache.hudi.utilities.deltastreamer.DeltaSync.writeToSink(DeltaSync.java:382)
           at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:226)
           at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$0(HoodieDeltaStreamer.java:390)
           at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
           at java.lang.Thread.run(Thread.java:748)
   ```
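
One way to narrow this down is to check which jars provide `org.apache.hadoop.hive.ql.metadata.Hive`, and whether more than one does. The following is a rough sketch only: `find_class_in_jars` is a hypothetical helper written for this report, and the two directories are assumptions taken from the paths in the launch script, so they may need adjusting for other layouts.

```python
import glob
import os
import zipfile


def find_class_in_jars(class_name, jar_dirs):
    """Hypothetical helper: list the jars under jar_dirs that contain class_name."""
    # A class a.b.C is stored in a jar as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar_dir in jar_dirs:
        for jar_path in sorted(glob.glob(os.path.join(jar_dir, "*.jar"))):
            try:
                with zipfile.ZipFile(jar_path) as jar:
                    if entry in jar.namelist():
                        hits.append(jar_path)
            except (zipfile.BadZipFile, OSError):
                # Skip unreadable or non-zip files instead of aborting the scan.
                continue
    return hits


# Assumed directories, taken from the spark-submit command in this report.
JAR_DIRS = ["/usr/lib/hudi", "/usr/lib/spark/jars"]

for path in find_class_in_jars("org.apache.hadoop.hive.ql.metadata.Hive", JAR_DIRS):
    print(path)
```

If several jars are printed, the copy that is loaded first may not be the one the Hudi bundle was built against. The script only reports where the class file lives; it does not inspect method signatures.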
   
   
