rajgowtham24 opened a new issue #1835:
URL: https://github.com/apache/hudi/issues/1835


   Hi all,
   
   I'm new to Hudi and want to use Delta Streamer to ingest a JSON source file that is available in my S3 bucket.
   
   Below is the command I'm using to run it.
   
   Source file (JSON format)
   
   {"empno":"8006","ename":"stuart","job":"salesman","hiredate":"2020-01-01 00:00:00"}
   
   Code
   spark-submit --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer `ls /usr/lib/hudi/hudi-utilities-bundle_2.11-0.5.2-incubating.jar` 
   --table-type COPY_ON_WRITE 
   --source-class org.apache.hudi.utilities.sources.JsonDFSSource 
   --target-base-path s3://gowtham_km/hudi/target> --target-table emp
   --hoodie-conf hoodie.datasource.write.recordkey.field=empno,hoodie.deltastreamer.source.dfs.root=s3://gowtham_km/hudi/source>
   --transformer-class org.apache.hudi.utilities.transform.AWSDmsTransformer 
   --payload-class org.apache.hudi.payload.AWSDmsAvroPayload 
   --props file:/usr/lib/hudi/hudi_utilities/delta-streamer-config/dfs-source.properties  
   --schemaprovider-class org.apache.hudi.utilities.schema.SchemaProvider
   
   Error
   Exception in thread "main" java.io.IOException: Could not load schema provider class org.apache.hudi.utilities.schema.SchemaProvider
           at org.apache.hudi.utilities.UtilHelpers.createSchemaProvider(UtilHelpers.java:101)
           at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.<init>(HoodieDeltaStreamer.java:364)
           at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.<init>(HoodieDeltaStreamer.java:95)
           at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.<init>(HoodieDeltaStreamer.java:89)
           at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.main(HoodieDeltaStreamer.java:294)
           at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
           at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
           at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
           at java.lang.reflect.Method.invoke(Method.java:498)
           at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
           at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:853)
           at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
           at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
           at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
           at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:928)
           at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:937)
           at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
   Caused by: org.apache.hudi.exception.HoodieException: Unable to instantiate class
           at org.apache.hudi.common.util.ReflectionUtils.loadClass(ReflectionUtils.java:80)
           at org.apache.hudi.common.util.ReflectionUtils.loadClass(ReflectionUtils.java:89)
           at org.apache.hudi.utilities.UtilHelpers.createSchemaProvider(UtilHelpers.java:99)
           ... 16 more
   Caused by: java.lang.NoSuchMethodException: org.apache.hudi.utilities.schema.SchemaProvider.<init>(org.apache.hudi.common.util.TypedProperties, org.apache.spark.api.java.JavaSparkContext)
           at java.lang.Class.getConstructor0(Class.java:3110)
           at java.lang.Class.getConstructor(Class.java:1853)
           at org.apache.hudi.common.util.ReflectionUtils.loadClass(ReflectionUtils.java:78)
           ... 18 more
   20/07/15 15:51:36 INFO ShutdownHookManager: Shutdown hook called
   20/07/15 15:51:36 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-51f5cf1d-db65-4c2b-853e-8e64c0666648
   20/07/15 15:51:36 INFO ShutdownHookManager: Deleting directory /mnt/tmp/spark-c2584053-3620-48dc-9380-43318af38392
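   
   Note on the last "Caused by": the lookup that fails is Class.getConstructor, asked for a constructor taking (TypedProperties, JavaSparkContext). Class.getConstructor only returns public constructors, so a NoSuchMethodException like this is what would appear if the class named in --schemaprovider-class does not expose a public constructor with exactly those parameter types. Below is a minimal plain-Java sketch of that behaviour. BaseProvider, ConcreteProvider and hasPublicConstructor are hypothetical names, String and Integer stand in for the two Hudi types, and the sketch assumes nothing about how Hudi's own classes are declared.

```java
import java.lang.reflect.Constructor;

public class ConstructorLookupSketch {

    // Hypothetical base class: its only constructor is protected, not public.
    abstract static class BaseProvider {
        protected BaseProvider(String props, Integer context) { }
    }

    // Hypothetical concrete class: same parameter types, but a public constructor.
    public static class ConcreteProvider extends BaseProvider {
        public ConcreteProvider(String props, Integer context) {
            super(props, context);
        }
    }

    // Returns true if clazz has a public constructor taking (String, Integer).
    // Class.getConstructor throws NoSuchMethodException when no such public
    // constructor exists, which is the exception type seen in the trace above.
    static boolean hasPublicConstructor(Class<?> clazz) {
        try {
            Constructor<?> ctor = clazz.getConstructor(String.class, Integer.class);
            return ctor != null;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The protected constructor is invisible to getConstructor.
        System.out.println(hasPublicConstructor(BaseProvider.class));     // prints false
        // The public constructor is found.
        System.out.println(hasPublicConstructor(ConcreteProvider.class)); // prints true
    }
}
```

   If this is what is happening here, the class given to --schemaprovider-class would need to be one that has a public constructor with that signature; which class that should be for a JSON source is the open question of this issue.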
   
   
   Expectation
   To get started, I would like to load the JSON file into the target table. Later I will add the continuous option so that new files are loaded into the target table automatically.



