Hi All,

Hope you are doing well.
I am trying to run the Hudi utilities with DeltaStreamer (HoodieDeltaStreamer).
Below is the command line I am using:

spark2-submit --master yarn --deploy-mode cluster \
  --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
  /tmp/hudi-utilities-bundle-0.5.1-SNAPSHOT.jar \
  --props /user/oozie/dataops/hoodie/config.properties \
  --schemaprovider-class org.apache.hudi.utilities.schema.SchemaRegistryProvider \
  --source-class org.apache.hudi.utilities.sources.AvroKafkaSource \
  --source-ordering-field LastModified_dtmStamp \
  --target-base-path /tmp/hudi-deltastreamer-op_TEST \
  --target-table testTableHoodie \
  --op UPSERT \
  --enable-hive-sync \
  --storage-type MERGE_ON_READ

I have also attached the config file.

Unfortunately, while writing the Parquet files, the job fails with the exception
"java.lang.NoClassDefFoundError:
org/apache/parquet/hadoop/metadata/CompressionCodecName".
The full error trace is attached for your reference.
There are also a few configuration warnings, but I am not sure whether they
are related to the problem.

I have also tried setting the classpath explicitly, but I am not sure what I
am missing here.
It would be great if anybody could help me with this.
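
In case it helps with diagnosing this, one way to check whether the missing
class is packaged inside a given jar is sketched below. This is only a rough
check and not part of Hudi: the helper name jar_contains_class is
hypothetical, and it relies on a jar being an ordinary zip archive. A
relocated (shaded) copy of the class would sit under a different package and
would not match.

```python
import zipfile


def jar_contains_class(jar_path: str, class_name: str) -> bool:
    """Return True if the jar holds an entry for the given class.

    Hypothetical helper for illustration; accepts either dotted
    (org.apache.Foo) or slash-separated (org/apache/Foo) class names.
    """
    # Jar entries use slash-separated paths ending in ".class".
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


if __name__ == "__main__":
    import sys

    # Usage: python check_jar.py <path-to-jar> <class-name>
    jar, cls = sys.argv[1], sys.argv[2]
    print(f"{cls} in {jar}: {jar_contains_class(jar, cls)}")
```

Running it against the bundle jar from the command above, with
org/apache/parquet/hadoop/metadata/CompressionCodecName as the class name,
would show whether the class is present at that exact path in the bundle.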

Hadoop version: 2.6.0-cdh5.14.2
Spark version: 2.3.0.cloudera2


*Regards,*
*Shahida R. Khan*
*+91 9167538366*
