SureshK-T2S edited a comment on issue #2406:
URL: https://github.com/apache/hudi/issues/2406#issuecomment-774548997


   Thanks for your response. Until now, when using HoodieDeltaStreamer, I have 
not had to specify a schema provider class with the ParquetDFS source.
   
   Looking at the schema providers 
[here](https://javadoc.io/doc/org.apache.hudi/hudi-utilities_2.11/latest/index.html), 
I thought NullTargetSchemaRegistryProvider would be a good fit, but I got the 
following error:
   
   ```
   java.io.IOException: Could not load schema provider class org.apache.hudi.utilities.schema.NullTargetSchemaRegistryProvider
        at org.apache.hudi.utilities.UtilHelpers.createSchemaProvider(UtilHelpers.java:107)
        at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.<init>(HoodieDeltaStreamer.java:550)
        at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.<init>(HoodieDeltaStreamer.java:129)
        at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.<init>(HoodieDeltaStreamer.java:104)
        at org.apache.hudi.utilities.deltastreamer.HoodieMultiTableDeltaStreamer.sync(HoodieMultiTableDeltaStreamer.java:354)
        at org.apache.hudi.utilities.deltastreamer.HoodieMultiTableDeltaStreamer.main(HoodieMultiTableDeltaStreamer.java:201)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:853)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:928)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:937)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
   Caused by: org.apache.hudi.exception.HoodieException: Unable to instantiate class 
        at org.apache.hudi.common.util.ReflectionUtils.loadClass(ReflectionUtils.java:89)
        at org.apache.hudi.common.util.ReflectionUtils.loadClass(ReflectionUtils.java:98)
        at org.apache.hudi.utilities.UtilHelpers.createSchemaProvider(UtilHelpers.java:105)
        ... 17 more
   Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hudi.common.util.ReflectionUtils.loadClass(ReflectionUtils.java:87)
        ... 19 more
   Caused by: org.apache.hudi.exception.HoodieNotSupportedException: Required property hoodie.deltastreamer.schemaprovider.registry.url is missing
        at org.apache.hudi.DataSourceUtils.lambda$checkRequiredProperties$0(DataSourceUtils.java:144)
        at java.util.Collections$SingletonList.forEach(Collections.java:4824)
        at org.apache.hudi.DataSourceUtils.checkRequiredProperties(DataSourceUtils.java:142)
        at org.apache.hudi.utilities.schema.SchemaRegistryProvider.<init>(SchemaRegistryProvider.java:63)
        at org.apache.hudi.utilities.schema.NullTargetSchemaRegistryProvider.<init>(NullTargetSchemaRegistryProvider.java:33)
        ... 24 more
   ```
   I tried adding hoodie.deltastreamer.schemaprovider.registry.url to the props 
with a blank value, but that gave me a malformed URL error.
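
For illustration, here is a minimal sketch of why I think both attempts fail. It is not Hudi code: `RegistryUrlCheckSketch`, `checkRequiredProperties` and `tryConfigure` are hypothetical stand-ins I wrote, and only the property name comes from the trace. I am assuming the provider first checks that the property is present and that the value is later parsed as a URL; the trace does not show where the parsing actually happens.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Collections;
import java.util.List;
import java.util.Properties;

public class RegistryUrlCheckSketch {
    // Property name copied from the error message in the stack trace.
    static final String REGISTRY_URL_PROP =
        "hoodie.deltastreamer.schemaprovider.registry.url";

    // Hypothetical stand-in for a required-property check: fails if a key is absent.
    static void checkRequiredProperties(Properties props, List<String> required) {
        for (String key : required) {
            if (!props.containsKey(key)) {
                throw new IllegalStateException("Required property " + key + " is missing");
            }
        }
    }

    // Returns "missing", "malformed" or "ok" depending on how far the setup gets.
    static String tryConfigure(Properties props) {
        try {
            checkRequiredProperties(props, Collections.singletonList(REGISTRY_URL_PROP));
        } catch (IllegalStateException e) {
            return "missing";
        }
        try {
            // Assumption: the value is eventually parsed as a URL.
            new URL(props.getProperty(REGISTRY_URL_PROP));
            return "ok";
        } catch (MalformedURLException e) {
            return "malformed";
        }
    }

    public static void main(String[] args) {
        Properties absent = new Properties();

        Properties blank = new Properties();
        blank.setProperty(REGISTRY_URL_PROP, "");

        Properties valid = new Properties();
        valid.setProperty(REGISTRY_URL_PROP, "http://localhost:8081"); // placeholder URL

        System.out.println(tryConfigure(absent)); // missing
        System.out.println(tryConfigure(blank));  // malformed
        System.out.println(tryConfigure(valid));  // ok
    }
}
```

If that assumption holds, a blank value only gets past the presence check and then fails URL parsing, so this provider appears to need a real registry URL in either case.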
   
   Please let me know if I should be using a different schema provider class or 
a different approach.

