Hi,

That's surprising. Do you have --source-class
org.apache.hudi.utilities.sources.ParquetDFSSource?
I ask since, for Row-based sources, the schema provider is auto-configured, as
shown in the blog post.
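
For reference, here is a rough sketch of a DeltaStreamer invocation that reads the DMS parquet files through a Row-based source, so no --schemaprovider-class is passed. The jar name, S3 paths, and field names are hypothetical placeholders, and the flag names should be checked against your Hudi version (older releases name --table-type differently). The snippet only assembles and prints the command so it can be reviewed before running.

```shell
#!/bin/sh
# Hypothetical placeholders: replace with your own jar, paths, and field names.
HUDI_JAR="hudi-utilities-bundle.jar"          # assumed bundle jar name
CDC_ROOT="s3://my-bucket/dms/orders"          # assumed DMS output location
TARGET_PATH="s3://my-bucket/hudi/orders"      # assumed Hudi table location

# Assemble the spark-submit command as a single string.
# No --schemaprovider-class is given: with ParquetDFSSource the schema is
# expected to be derived from the parquet files themselves.
CMD="spark-submit \
--class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
$HUDI_JAR \
--table-type COPY_ON_WRITE \
--source-class org.apache.hudi.utilities.sources.ParquetDFSSource \
--source-ordering-field updated_at \
--target-base-path $TARGET_PATH \
--target-table orders \
--transformer-class org.apache.hudi.utilities.transform.AWSDmsTransformer \
--payload-class org.apache.hudi.payload.AWSDmsAvroPayload \
--hoodie-conf hoodie.deltastreamer.source.dfs.root=$CDC_ROOT \
--hoodie-conf hoodie.datasource.write.recordkey.field=order_id \
--hoodie-conf hoodie.datasource.write.partitionpath.field=customer_name"

# Print the command for inspection; run it with: eval "$CMD"
printf '%s\n' "$CMD"
```

If your command differs mainly in the --source-class value, that would be the first thing to compare.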

Thanks
Vinoth

On Tue, Mar 24, 2020 at 11:07 AM Joaquim S <joaqs...@gmail.com> wrote:

> Team,
>
> When following the blog "Change Capture Using AWS Database Migration
> Service and Hudi" with my own data set, the initial load works perfectly.
> When issuing the command with the DMS CDC files on S3, I get the following
> error:
>
> 20/03/24 17:56:28 ERROR HoodieDeltaStreamer: Got error running delta sync once. Shutting down
> org.apache.hudi.exception.HoodieException: Please provide a valid schema provider class!
>     at org.apache.hudi.utilities.sources.InputBatch.getSchemaProvider(InputBatch.java:53)
>     at org.apache.hudi.utilities.deltastreamer.DeltaSync.readFromSource(DeltaSync.java:312)
>     at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:226)
>
> I tried using --schemaprovider-class
> org.apache.hudi.utilities.schema.FilebasedSchemaProvider.Source and providing
> the schema. The error no longer occurs, but nothing is written to Hudi.
>
> I am not performing any transformations (other than the DMS transform) and
> am using the default record key strategy.
>
> If the team has any pointers, please let me know.
>
> Thank you!
>
