[ https://issues.apache.org/jira/browse/HUDI-5001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HUDI-5001:
---------------------------------
    Labels: pull-request-available  (was: )

> Sanitize avro column names for RowSource
> ----------------------------------------
>
>                 Key: HUDI-5001
>                 URL: https://issues.apache.org/jira/browse/HUDI-5001
>             Project: Apache Hudi
>          Issue Type: New Feature
>            Reporter: Vamshi Gudavarthi
>            Assignee: Vamshi Gudavarthi
>            Priority: Major
>              Labels: pull-request-available
>
> This issue is scoped to row sources. If the column names in a row source 
> contain characters that are invalid in Avro names (see 
> [here|https://avro.apache.org/docs/1.10.2/spec.html#names]), a 
> configuration setting lets us sanitize the column names in both the schema 
> and the actual data, so that data ingestion into Hudi does not fail. Schema 
> provider support is limited to filebasedschemaregistry, since other schema 
> registries might not allow an invalid schema to be registered in the first 
> place.
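
For illustration only, the sanitization described above could be sketched as follows. This is not the Hudi implementation: the function names `sanitize_name` and `sanitize_record` and the `"__"` replacement mask are assumptions made for this example. The only rule taken from the Avro spec is that a name must start with `[A-Za-z_]` and continue with `[A-Za-z0-9_]`.

```python
import re

# Avro name rule (from the Avro spec): the first character must match
# [A-Za-z_], every following character must match [A-Za-z0-9_].
_INVALID_FIRST = re.compile(r"[^A-Za-z_]")
_INVALID_REST = re.compile(r"[^A-Za-z0-9_]")


def sanitize_name(name, mask="__"):
    """Replace every character that is invalid in an Avro name with `mask`.

    `mask` is a hypothetical default chosen for this sketch.
    """
    if not name:
        raise ValueError("column name must not be empty")
    first = _INVALID_FIRST.sub(mask, name[0])
    rest = _INVALID_REST.sub(mask, name[1:])
    return first + rest


def sanitize_record(value, mask="__"):
    """Recursively rename the keys of nested dicts/lists so that the data
    uses the same sanitized names as the schema. Values are left untouched."""
    if isinstance(value, dict):
        return {
            sanitize_name(key, mask): sanitize_record(item, mask)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [sanitize_record(item, mask) for item in value]
    return value


if __name__ == "__main__":
    row = {"user-id": 7, "9lives": True, "address": {"zip code": "12345"}}
    print(sanitize_record(row))
```

Note that two distinct source names can map to the same sanitized name (for example `a-b` and `a.b`); this sketch does not detect such collisions, so a real implementation would need to decide how to handle them.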



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
