Vinoth Chandar created HUDI-735:
-----------------------------------
Summary: Improve deltastreamer error message when case mismatch of
commandline arguments.
Key: HUDI-735
URL: https://issues.apache.org/jira/browse/HUDI-735
Project: Apache Hudi (incubating)
Issue Type: Improvement
Components: DeltaStreamer, Usability, Utilities
Reporter: Vinoth Chandar
Team,
When following the blog "Change Capture Using AWS Database Migration
Service and Hudi" with my own data set, the initial load works perfectly.
When issuing the command with the DMS CDC files on S3, I get the following
error:
20/03/24 17:56:28 ERROR HoodieDeltaStreamer: Got error running delta sync once. Shutting down
org.apache.hudi.exception.HoodieException: Please provide a valid schema provider class!
    at org.apache.hudi.utilities.sources.InputBatch.getSchemaProvider(InputBatch.java:53)
    at org.apache.hudi.utilities.deltastreamer.DeltaSync.readFromSource(DeltaSync.java:312)
    at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:226)
I tried passing --schemaprovider-class
org.apache.hudi.utilities.schema.FilebasedSchemaProvider.Source and providing
the schema. The error no longer occurs, but nothing is written to Hudi.
I am not performing any transformations (other than the DMS transform) and
am using the default record key strategy.
If the team has any pointers, please let me know.
Thank you!
---
Thank you Vinoth. I was able to find the issue. All my column names were in
upper case. I switched the column names and table names to lower case and
it works perfectly.
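
The fix described above (lower-casing names so the schema and the data agree) can be sketched roughly as follows. This is only an illustration, not part of Hudi: the helper names find_case_mismatches and lowercase_field_names and the example schema are hypothetical, and the sketch assumes a flat Avro record schema already loaded as a Python dict. It handles field names only, not table or record names.

```python
def lowercase_field_names(schema: dict) -> dict:
    """Return a copy of a flat Avro record schema with every field name lower-cased."""
    fields = [dict(field, name=field["name"].lower()) for field in schema.get("fields", [])]
    return dict(schema, fields=fields)


def find_case_mismatches(schema: dict, data_columns: list) -> list:
    """List (schema_name, data_name) pairs that differ only by letter case."""
    by_lower = {col.lower(): col for col in data_columns}
    mismatches = []
    for field in schema.get("fields", []):
        name = field["name"]
        match = by_lower.get(name.lower())
        if match is not None and match != name:
            mismatches.append((name, match))
    return mismatches


# Hypothetical schema, for illustration only.
example_schema = {
    "type": "record",
    "name": "ORDERS",
    "fields": [
        {"name": "ORDER_ID", "type": "long"},
        {"name": "Op", "type": "string"},
    ],
}
```

Running find_case_mismatches against the column names actually present in the source files before starting a sync would flag names that differ only by case, which is the situation reported here.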