Hi,

That's surprising. Do you have --source-class org.apache.hudi.utilities.sources.ParquetDFSSource? I ask since, for Row-based sources, the schema provider is auto-configured, as shown on the blog page.
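
In case it helps to compare, here is a rough sketch of a CDC run with that source class and no explicit schema provider. The bucket paths, table name, jar location, and field names are hypothetical placeholders, not your values. The flag set is what I would expect for a 0.5.x build on Spark 2.4 / Scala 2.11, so please check it against your version.

```shell
# Hypothetical placeholders: replace with your own locations and fields.
HUDI_JAR=/path/to/hudi-utilities-bundle.jar   # assumed jar location
SOURCE_ROOT=s3://my-bucket/dms-cdc/orders     # assumed DMS CDC output path
TARGET_PATH=s3://my-bucket/hudi/orders        # assumed Hudi table path
TARGET_TABLE=hudi_orders                      # assumed table name
RECORD_KEY=order_id                           # assumed record key column
PARTITION_FIELD=customer_name                 # assumed partition column
ORDERING_FIELD=updated_at                     # assumed ordering column

# No --schemaprovider-class is passed: with the Row-based ParquetDFSSource
# the schema is expected to be picked up from the incoming data.
# The spark-avro package coordinates assume Spark 2.4.4 on Scala 2.11.
spark-submit \
  --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
  --packages org.apache.spark:spark-avro_2.11:2.4.4 \
  "$HUDI_JAR" \
  --table-type COPY_ON_WRITE \
  --source-class org.apache.hudi.utilities.sources.ParquetDFSSource \
  --source-ordering-field "$ORDERING_FIELD" \
  --target-base-path "$TARGET_PATH" \
  --target-table "$TARGET_TABLE" \
  --transformer-class org.apache.hudi.utilities.transform.AWSDmsTransformer \
  --payload-class org.apache.hudi.payload.AWSDmsAvroPayload \
  --hoodie-conf "hoodie.datasource.write.recordkey.field=$RECORD_KEY" \
  --hoodie-conf "hoodie.datasource.write.partitionpath.field=$PARTITION_FIELD" \
  --hoodie-conf "hoodie.deltastreamer.source.dfs.root=$SOURCE_ROOT"
```

If your command differs mainly in the --source-class value, that would be the first thing to look at.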

Thanks
Vinoth

On Tue, Mar 24, 2020 at 11:07 AM Joaquim S <joaqs...@gmail.com> wrote:

> Team,
>
> When following the blog "Change Capture Using AWS Database Migration
> Service and Hudi" with my own data set, the initial load works perfectly.
> When issuing the command with the DMS CDC files on S3, I get the following
> error:
>
> 20/03/24 17:56:28 ERROR HoodieDeltaStreamer: Got error running delta sync once. Shutting down
> org.apache.hudi.exception.HoodieException: Please provide a valid schema provider class!
>     at org.apache.hudi.utilities.sources.InputBatch.getSchemaProvider(InputBatch.java:53)
>     at org.apache.hudi.utilities.deltastreamer.DeltaSync.readFromSource(DeltaSync.java:312)
>     at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:226)
>
> I tried using the --schemaprovider-class
> org.apache.hudi.utilities.schema.FilebasedSchemaProvider.Source and provided
> the schema. The error does not occur, but there are no writes to Hudi.
>
> I am not performing any transformations (other than the DMS transform) and
> am using the default record key strategy.
>
> If the team has any pointers, please let me know.
>
> Thank you!
>