[
https://issues.apache.org/jira/browse/HUDI-5001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated HUDI-5001:
---------------------------------
Labels: pull-request-available (was: )
> Sanitize avro column names for RowSource
> ----------------------------------------
>
> Key: HUDI-5001
> URL: https://issues.apache.org/jira/browse/HUDI-5001
> Project: Apache Hudi
> Issue Type: New Feature
> Reporter: Vamshi Gudavarthi
> Assignee: Vamshi Gudavarthi
> Priority: Major
> Labels: pull-request-available
>
> This issue is scoped to row sources. If the column names in a row source
> contain characters that are invalid in Avro names (ref
> [here|https://avro.apache.org/docs/1.10.2/spec.html#names]), data
> ingestion to Hudi fails. With this change, a configuration setting lets us
> sanitize the column names in both the schema and the actual data, so that
> ingestion does not fail. On the schema provider side, the change is limited
> to filebasedschemaregistry, since other schema registries might not allow an
> invalid schema to be registered in the first place.
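
The sanitization described above can be sketched roughly as follows. This is a minimal illustration in Python, not the Hudi implementation: the function names `sanitize_name` and `sanitize_record` and the `mask` parameter are hypothetical, and the replacement string `__` is an assumed default. The only fact taken from the Avro spec is the name rule itself: a name must start with `[A-Za-z_]` and continue with `[A-Za-z0-9_]`.

```python
import re

# Avro name rule: first character [A-Za-z_], remaining characters [A-Za-z0-9_].
_INVALID_FIRST = re.compile(r"[^A-Za-z_]")
_INVALID_REST = re.compile(r"[^A-Za-z0-9_]")


def sanitize_name(name, mask="__"):
    """Replace every character that is invalid in an Avro name with `mask`.

    `mask` is a hypothetical parameter; it must itself consist of valid
    Avro name characters, otherwise the result would still be invalid.
    """
    if not name:
        # An empty name is not a valid Avro name either; fall back to the mask.
        return mask
    first = _INVALID_FIRST.sub(mask, name[0])
    rest = _INVALID_REST.sub(mask, name[1:])
    return first + rest


def sanitize_record(record, mask="__"):
    """Apply the same renaming to the keys of one flat row (a dict).

    The schema and the data must be renamed with the same rule so that
    field names still line up after sanitization.
    """
    return {sanitize_name(key, mask): value for key, value in record.items()}


if __name__ == "__main__":
    row = {"9col-a": 1, "user.id": 42, "valid_name": "x"}
    print(sanitize_record(row))
```

Two caveats with this sketch: distinct source names such as `a-b` and `a.b` collapse to the same sanitized name, and nested fields are not traversed. A real implementation would need to decide how to handle both.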
--
This message was sent by Atlassian Jira
(v8.20.10#820010)