[ 
https://issues.apache.org/jira/browse/HUDI-5001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vamshi Gudavarthi updated HUDI-5001:
------------------------------------
    Description: This issue is limited to row sources. If column names in a 
row source contain characters that are invalid in Avro names (see 
[here|https://avro.apache.org/docs/1.10.2/spec.html#names]), a configuration 
option can be set to sanitize the column names in both the schema and the 
actual data, so that data ingestion into Hudi does not fail. For schema 
providers, the scope is limited to filebasedschemaregistry, since other schema 
registries might not allow an invalid schema to be registered in the first place.

> Sanitize avro column names for RowSource
> ----------------------------------------
>
>                 Key: HUDI-5001
>                 URL: https://issues.apache.org/jira/browse/HUDI-5001
>             Project: Apache Hudi
>          Issue Type: New Feature
>            Reporter: Vamshi Gudavarthi
>            Assignee: Vamshi Gudavarthi
>            Priority: Major
>
> This issue is limited to row sources. If column names in a row source 
> contain characters that are invalid in Avro names (see 
> [here|https://avro.apache.org/docs/1.10.2/spec.html#names]), a configuration 
> option can be set to sanitize the column names in both the schema and the 
> actual data, so that data ingestion into Hudi does not fail. For schema 
> providers, the scope is limited to filebasedschemaregistry, since other 
> schema registries might not allow an invalid schema to be registered in the 
> first place.
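
To illustrate the idea described above, here is a minimal sketch of name 
sanitization, assuming the Avro naming rule from the linked spec (a name 
starts with [A-Za-z_] and continues with [A-Za-z0-9_]). The function names 
and the "__" replacement mask are hypothetical choices for illustration and 
are not taken from the Hudi code base or its configuration.

```python
import re

# Avro name rule from the linked spec: [A-Za-z_][A-Za-z0-9_]*
_VALID_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
_VALID_FIRST = re.compile(r"[A-Za-z_]")
_INVALID_REST = re.compile(r"[^A-Za-z0-9_]")


def sanitize_name(name, mask="__"):
    """Replace every character that Avro disallows in `name` with `mask`.

    `mask` is a hypothetical parameter; it must itself be a valid Avro name
    so that the result is always valid.
    """
    if not name:
        raise ValueError("column name must not be empty")
    if not _VALID_NAME.fullmatch(mask):
        raise ValueError("mask must itself be a valid Avro name")
    # The first character has a stricter rule (no digits) than the rest.
    first = name[0] if _VALID_FIRST.fullmatch(name[0]) else mask
    rest = _INVALID_REST.sub(mask, name[1:])
    return first + rest


def sanitize_record(value, mask="__"):
    """Apply the same renaming to the keys of a (possibly nested) record,
    so the data stays consistent with the sanitized schema field names."""
    if isinstance(value, dict):
        return {sanitize_name(k, mask): sanitize_record(v, mask)
                for k, v in value.items()}
    if isinstance(value, list):
        return [sanitize_record(v, mask) for v in value]
    return value


if __name__ == "__main__":
    schema_fields = ["order-id", "9col", "valid_1"]
    print([sanitize_name(f) for f in schema_fields])
    print(sanitize_record({"order-id": 7, "items": [{"unit price": 1.5}]}))
```

Note that two different invalid names can map to the same sanitized name 
(for example "a-b" and "a.b" both become "a__b" in this sketch), so a real 
implementation would need to decide how to handle such collisions.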



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
