eigakow opened a new issue #1398: [SUPPORT] DeltaStreamer - NoClassDefFoundError for HiveDriver
URL: https://github.com/apache/incubator-hudi/issues/1398
 
 
   **Describe the problem you faced**
   
   Running DeltaStreamer with `--enable-hive-sync` throws `java.lang.NoClassDefFoundError: org/apache/hive/jdbc/HiveDriver`.
   Should I change something in the default build process so that this class is included?
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1.  Properties file:
   ```
   hoodie.datasource.write.recordkey.field=ts
   hoodie.datasource.write.partitionpath.field=ts
   
   hoodie.deltastreamer.schemaprovider.source.schema.file=file:///home/director/me/hudi-0.5.1-incubating/schema.avro
   hoodie.deltastreamer.schemaprovider.target.schema.file=file:///home/director/me/hudi-0.5.1-incubating/schema.avro
   source-class=FR24JsonKafkaSource
   
   bootstrap.servers=streaming-kafka-broker-1:9092,streaming-kafka-broker-2:9092,streaming-kafka-broker-3:9092
   group.id=hudi_testing
   hoodie.deltastreamer.source.kafka.topic=fr-bru
   enable.auto.commit=false
   schemaprovider-class=org.apache.hudi.utilities.schema.FilebasedSchemaProvider
   auto.offset.reset=earliest
   
   hoodie.datasource.hive_sync.database=fr24raw
   hoodie.datasource.hive_sync.table=test_hudi
   
   hoodie.datasource.hive_sync.jdbcurl=jdbc:hive2://master-1.bigdatapoc.local:10000/default;principal=hive/master-1.bigdatapoc.local@BIGDATAPOC.LOCAL
   hoodie.datasource.hive_sync.assume_date_partitioning=true
   hoodie.datasource.hive_sync.useJdbc=false
   ```
   2. Launch spark-submit with HoodieDeltaStreamer
   ```
   spark-submit --master yarn \
     --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
     --jars $(pwd)/../my-app-1-jar-with-dependencies.jar \
     $(pwd)/packaging/hudi-utilities-bundle/target/hudi-utilities-bundle_2.11-0.5.1-incubating.jar \
     --props hdfs:///tmp/hudi-fr24.properties \
     --target-base-path adl://XXX.azuredatalakestore.net/test-hudi \
     --table-type MERGE_ON_READ \
     --target-table test_hudi \
     --source-class FR24JsonKafkaSource \
     --schemaprovider-class org.apache.hudi.utilities.schema.FilebasedSchemaProvider \
     --enable-hive-sync --continuous --source-limit 100
   ```
   **Expected behavior**
   
   The sync to Hive completes without errors.
   
   **Environment Description**
   
   * Hudi version : hudi-0.5.1-incubating
   
   * Spark version : 2.4.0-cdh6.1.0
   
   * Hive version : 2.1.1-cdh6.1.0
   
   * Hadoop version : 3.0.0-cdh6.1.0
   
   * Storage (HDFS/S3/GCS..) : ADLS
   
   * Running on Docker? (yes/no) : no
   
   
   **Stacktrace**
   
   ```
   20/03/11 16:04:47 INFO cluster.YarnScheduler: Removed TaskSet 37.0, whose tasks have all completed, from pool
   20/03/11 16:04:47 INFO scheduler.DAGScheduler: ResultStage 37 (collect at HoodieMergeOnReadTableCompactor.java:208) finished in 0.679 s
   20/03/11 16:04:47 INFO scheduler.DAGScheduler: Job 12 finished: collect at HoodieMergeOnReadTableCompactor.java:208, took 0.680344 s
   20/03/11 16:04:47 INFO compact.HoodieMergeOnReadTableCompactor: Total of 0 compactions are retrieved
   20/03/11 16:04:47 INFO compact.HoodieMergeOnReadTableCompactor: Total number of latest files slices 4
   20/03/11 16:04:47 INFO compact.HoodieMergeOnReadTableCompactor: Total number of log files 0
   20/03/11 16:04:47 INFO compact.HoodieMergeOnReadTableCompactor: Total number of file slices 4
   20/03/11 16:04:47 WARN compact.HoodieMergeOnReadTableCompactor: After filtering, Nothing to compact for adl://ecintpocdl.azuredatalakestore.net/FlightRadar24/test-hudi3
   20/03/11 16:04:47 INFO deltastreamer.DeltaSync: Syncing target hoodie table with hive table(test_hudi). Hive metastore URL :jdbc:hive2://master-1.bigdatapoc.local:10000/default;principal=hive/master-1.bigdatapoc.local@BIGDATAPOC.LOCAL, basePath :adl://XXX.azuredatalakestore.net/test-hudi
   20/03/11 16:04:47 INFO deltastreamer.HoodieDeltaStreamer: Delta Sync shutdown. Error ?false
   20/03/11 16:04:47 WARN deltastreamer.HoodieDeltaStreamer: Gracefully shutting down compactor
   20/03/11 16:05:00 INFO deltastreamer.HoodieDeltaStreamer: Compactor shutting down properly!!
   20/03/11 16:05:00 ERROR deltastreamer.AbstractDeltaStreamerService: Service shutdown with error
   java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: org/apache/hive/jdbc/HiveDriver
           at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
           at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
           at org.apache.hudi.utilities.deltastreamer.AbstractDeltaStreamerService.waitForShutdown(AbstractDeltaStreamerService.java:72)
           at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.sync(HoodieDeltaStreamer.java:117)
           at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.main(HoodieDeltaStreamer.java:295)
           at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
           at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
           at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
           at java.lang.reflect.Method.invoke(Method.java:498)
           at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
           at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:849)
           at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
           at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
           at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
           at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
           at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
           at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
   Caused by: java.lang.NoClassDefFoundError: org/apache/hive/jdbc/HiveDriver
           at org.apache.hudi.hive.HoodieHiveClient.<clinit>(HoodieHiveClient.java:80)
           at org.apache.hudi.hive.HiveSyncTool.<init>(HiveSyncTool.java:66)
           at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncHive(DeltaSync.java:481)
           at org.apache.hudi.utilities.deltastreamer.DeltaSync.writeToSink(DeltaSync.java:423)
           at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:238)
           at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$0(HoodieDeltaStreamer.java:393)
           at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
           at java.lang.Thread.run(Thread.java:748)
   Caused by: java.lang.ClassNotFoundException: org.apache.hive.jdbc.HiveDriver
           at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
           at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
           at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
           ... 10 more
   ```
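
The trailing `ClassNotFoundException` suggests that none of the jars visible to the driver provides `org.apache.hive.jdbc.HiveDriver`. Below is a minimal sketch for checking which local jars contain that class, assuming the jars passed to `spark-submit` are readable from the machine running the check. The helper names `class_entry` and `jars_containing_class` are hypothetical and are not part of Hudi or Spark.

```python
import zipfile
from pathlib import Path


def class_entry(class_name):
    """Convert a dotted class name to its entry path inside a jar."""
    return class_name.replace(".", "/") + ".class"


def jars_containing_class(jar_paths, class_name):
    """Return the jar paths (as strings) whose archive holds the given class.

    Hypothetical helper: a jar is a zip archive, so this only inspects
    the archive listing and does not load any classes.
    """
    entry = class_entry(class_name)
    found = []
    for jar in jar_paths:
        path = Path(jar)
        # Skip paths that do not exist or are not zip archives.
        if not path.is_file() or not zipfile.is_zipfile(str(path)):
            continue
        with zipfile.ZipFile(str(path)) as archive:
            if entry in archive.namelist():
                found.append(str(path))
    return found
```

Calling `jars_containing_class` with the path of the `hudi-utilities-bundle` jar and `"org.apache.hive.jdbc.HiveDriver"` returns an empty list if the class is not bundled, which would be consistent with the stack trace above. The check only covers the jars passed in, not the rest of the cluster classpath.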
   
   
