harishraju-govindaraju opened a new issue #4641: URL: https://github.com/apache/hudi/issues/4641
**Describe the problem you faced**

I started an EMR cluster and am trying to run the Hudi DeltaStreamer. However, I get an error:

```
Failed to load org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.
java.lang.ClassNotFoundException: org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer
```

I was following the steps in this documentation: https://hudi.apache.org/blog/2021/08/23/s3-events-source

**To Reproduce**

Steps to reproduce the behavior:

```sh
# To start S3EventsSource
spark-submit \
  --jars "/home/hadoop/hudi-utilities-bundle_2.11-0.9.0.jar,/usr/lib/spark/external/lib/spark-avro.jar,/home/hadoop/aws-java-sdk-sqs-1.12.22.jar" \
  --master yarn --deploy-mode client \
  --class "org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer" /home/hadoop/hudi-packages/hudi-utilities-bundle_2.11-0.9.0-SNAPSHOT.jar \
  --table-type COPY_ON_WRITE --source-ordering-field eventTime \
  --target-base-path s3://s3-eip-dev-uea1-hudipoc-001/hudi-trusted/metadata/ \
  --target-table s3_meta_table --continuous \
  --min-sync-interval-seconds 10 \
  --hoodie-conf hoodie.datasource.write.recordkey.field="s3.object.key,eventName" \
  --hoodie-conf hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.ComplexKeyGenerator \
  --hoodie-conf hoodie.datasource.write.partitionpath.field=s3.bucket.name --enable-hive-sync \
  --hoodie-conf hoodie.datasource.hive_sync.partition_extractor_class=org.apache.hudi.hive.MultiPartKeysValueExtractor \
  --hoodie-conf hoodie.datasource.write.hive_style_partitioning=true \
  --hoodie-conf hoodie.datasource.hive_sync.database=default \
  --hoodie-conf hoodie.datasource.hive_sync.table=s3_meta_table \
  --hoodie-conf hoodie.datasource.hive_sync.partition_fields=bucket \
  --source-class org.apache.hudi.utilities.sources.S3EventsSource \
  --hoodie-conf hoodie.deltastreamer.source.queue.url=https://sqs.us-east-1.amazonaws.com/118897059965/sqshudi \
  --hoodie-conf hoodie.deltastreamer.s3.source.queue.region=us-east-1
```

**Expected behavior**

I expect this spark-submit command to run successfully.

**Environment Description**

* Hudi version :
* Hive version : 2.3.7
* EMR version : 5.33.1 (Hive 2.3.7, Spark 2.4.7, Flink 1.12.1)
* Storage (HDFS/S3/GCS..) : S3
* Running on Docker? (yes/no) : no

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
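A `ClassNotFoundException` for `HoodieDeltaStreamer` usually means the bundle jar passed to `--class` does not actually contain that class (wrong jar, wrong path, or a bundle built for a different profile). A minimal sketch of a local check, assuming the jar path from the command above (`check_class` is a hypothetical helper name, not a Hudi tool):

```shell
# Hypothetical diagnostic: list a jar's entries and look for the
# DeltaStreamer class file. Prints "found" or "missing".
check_class() {
  local jar="$1"
  if unzip -l "$jar" 2>/dev/null \
      | grep -q 'org/apache/hudi/utilities/deltastreamer/HoodieDeltaStreamer.class'; then
    echo "found"
  else
    echo "missing"
  fi
}

# Path assumed from the spark-submit command in this report.
check_class /home/hadoop/hudi-packages/hudi-utilities-bundle_2.11-0.9.0-SNAPSHOT.jar
```

If this prints "missing", the jar on that path is not a complete utilities bundle and spark-submit will fail exactly as reported, regardless of the rest of the configuration.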
