rajgowtham24 opened a new issue #1835: URL: https://github.com/apache/hudi/issues/1835
Hi all,

I'm new to Hudi and want to use DeltaStreamer with JSON sources that are available in my S3 bucket. Below is the input and the command I'm using.

Source File (JSON format)

    {"empno":"8006","ename":"stuart","job":"salesman","hiredate":"2020-01-01 00:00:00"}

Code

    spark-submit --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer `ls /usr/lib/hudi/hudi-utilities-bundle_2.11-0.5.2-incubating.jar` \
      --table-type COPY_ON_WRITE \
      --source-class org.apache.hudi.utilities.sources.JsonDFSSource \
      --target-base-path s3://gowtham_km/hudi/target \
      --target-table emp \
      --hoodie-conf hoodie.datasource.write.recordkey.field=empno,hoodie.deltastreamer.source.dfs.root=s3://gowtham_km/hudi/source \
      --transformer-class org.apache.hudi.utilities.transform.AWSDmsTransformer \
      --payload-class org.apache.hudi.payload.AWSDmsAvroPayload \
      --props file:/usr/lib/hudi/hudi_utilities/delta-streamer-config/dfs-source.properties \
      --schemaprovider-class org.apache.hudi.utilities.schema.SchemaProvider

Error

    Exception in thread "main" java.io.IOException: Could not load schema provider class org.apache.hudi.utilities.schema.SchemaProvider
        at org.apache.hudi.utilities.UtilHelpers.createSchemaProvider(UtilHelpers.java:101)
        at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.<init>(HoodieDeltaStreamer.java:364)
        at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.<init>(HoodieDeltaStreamer.java:95)
        at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.<init>(HoodieDeltaStreamer.java:89)
        at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.main(HoodieDeltaStreamer.java:294)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:853)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:928)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:937)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    Caused by: org.apache.hudi.exception.HoodieException: Unable to instantiate class
        at org.apache.hudi.common.util.ReflectionUtils.loadClass(ReflectionUtils.java:80)
        at org.apache.hudi.common.util.ReflectionUtils.loadClass(ReflectionUtils.java:89)
        at org.apache.hudi.utilities.UtilHelpers.createSchemaProvider(UtilHelpers.java:99)
        ... 16 more
    Caused by: java.lang.NoSuchMethodException: org.apache.hudi.utilities.schema.SchemaProvider.<init>(org.apache.hudi.common.util.TypedProperties, org.apache.spark.api.java.JavaSparkContext)
        at java.lang.Class.getConstructor0(Class.java:3110)
        at java.lang.Class.getConstructor(Class.java:1853)
        at org.apache.hudi.common.util.ReflectionUtils.loadClass(ReflectionUtils.java:78)
        ... 18 more
    20/07/15 15:51:36 INFO ShutdownHookManager: Shutdown hook called
    20/07/15 15:51:36 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-51f5cf1d-db65-4c2b-853e-8e64c0666648
    20/07/15 15:51:36 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-c2584053-3620-48dc-9380-43318af38392

Expectation

To start learning, I would like to load the JSON file into the target table. Later I will add the continuous option so that new files are loaded into the target table automatically.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
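
Note on the innermost cause in the trace: the NoSuchMethodException is thrown by java.lang.Class.getConstructor, which only finds public constructors with exactly the requested parameter types. The sketch below reproduces that mechanism in isolation. It does not use Hudi: BaseProvider and FileProvider are hypothetical stand-ins, and whether org.apache.hudi.utilities.schema.SchemaProvider is a base class with a non-public constructor is an assumption this message does not confirm.

```java
import java.lang.reflect.Constructor;

public class SchemaProviderReflectionSketch {

    // Hypothetical stand-in for a base class: abstract, with a non-public constructor.
    public abstract static class BaseProvider {
        protected final String props;

        protected BaseProvider(String props) {
            this.props = props;
        }

        abstract String schema();
    }

    // Hypothetical concrete subclass exposing a public constructor with the same parameter.
    public static class FileProvider extends BaseProvider {
        public FileProvider(String props) {
            super(props);
        }

        @Override
        String schema() {
            return "schema from " + props;
        }
    }

    // Looks up a public (String) constructor reflectively and tries to instantiate the class.
    // Returns a short description of the outcome instead of throwing.
    public static String tryLoad(Class<?> clazz) {
        try {
            Constructor<?> ctor = clazz.getConstructor(String.class);
            Object instance = ctor.newInstance("demo.properties");
            return "loaded " + instance.getClass().getSimpleName();
        } catch (NoSuchMethodException e) {
            // getConstructor ignores protected and private constructors.
            return "NoSuchMethodException";
        } catch (ReflectiveOperationException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryLoad(BaseProvider.class));
        System.out.println(tryLoad(FileProvider.class));
    }
}
```

If the same mechanism applies here, passing a concrete schema provider implementation to --schemaprovider-class, rather than the base class, would avoid this particular exception.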