[
https://issues.apache.org/jira/browse/HUDI-3596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan closed HUDI-3596.
-------------------------------------
Assignee: sivabalan narayanan
Resolution: Fixed
> Understand why NullSchemaProvider is used for an empty batch in InputBatch
> instead of an empty schema
> -----------------------------------------------------------------------------------------------
>
> Key: HUDI-3596
> URL: https://issues.apache.org/jira/browse/HUDI-3596
> Project: Apache Hudi
> Issue Type: Task
> Components: deltastreamer
> Reporter: sivabalan narayanan
> Assignee: sivabalan narayanan
> Priority: Major
> Fix For: 0.11.0
>
>
> In
> [InputBatch|https://github.com/apache/hudi/blob/548000b0d635067acf7574c4cf5122759e79b52b/hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/InputBatch.java#L57]
> we use NullSchemaProvider when the batch is empty. This returns a null schema
> (Schema.create(Schema.Type.NULL)). We need to understand why we can't return
> an empty schema here (Schema.create(Collections.emptyList())). That's the
> schema for an empty Spark dataframe, and we should stick to that.
>
>
>
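To make the distinction in the description concrete, here is a minimal, dependency-free sketch of the two choices. It does not use the actual Hudi or Avro classes: SimpleSchema, SchemaProvider, and this InputBatch are hypothetical stand-ins written for illustration, and the fallback parameter is an assumption, not the real method signature. The point is only that a NULL-typed schema is not a record, while an empty schema is a record with zero fields, so downstream code that expects a record behaves differently in each case.

```java
import java.util.Collections;
import java.util.List;
import java.util.Optional;

public class EmptyBatchSchemaSketch {

    /** Hypothetical stand-in for an Avro schema: a type name plus field names. */
    static final class SimpleSchema {
        final String type;
        final List<String> fields;

        private SimpleSchema(String type, List<String> fields) {
            this.type = type;
            this.fields = fields;
        }

        /** Analogue of a NULL-typed schema: not a record at all. */
        static SimpleSchema nullType() {
            return new SimpleSchema("null", Collections.emptyList());
        }

        /** Analogue of an empty schema: a record type with zero fields. */
        static SimpleSchema emptyRecord() {
            return new SimpleSchema("record", Collections.emptyList());
        }

        boolean isRecord() {
            return "record".equals(type);
        }
    }

    /** Hypothetical provider interface, reduced to a single method. */
    interface SchemaProvider {
        SimpleSchema getSourceSchema();
    }

    /** Simplified batch holder (assumption): the payload may be absent. */
    static final class InputBatch<T> {
        private final Optional<T> batch;
        private final SchemaProvider schemaProvider;

        InputBatch(Optional<T> batch, SchemaProvider schemaProvider) {
            this.batch = batch;
            this.schemaProvider = schemaProvider;
        }

        /** Returns the supplied provider, or the fallback when none was supplied. */
        SchemaProvider getSchemaProvider(SchemaProvider fallback) {
            if (batch.isPresent() && schemaProvider == null) {
                throw new IllegalStateException("non-empty batch needs a schema provider");
            }
            return schemaProvider != null ? schemaProvider : fallback;
        }
    }

    public static void main(String[] args) {
        InputBatch<String> empty = new InputBatch<>(Optional.empty(), null);

        // Option 1: fall back to a NULL-typed schema.
        SimpleSchema asNull = empty.getSchemaProvider(SimpleSchema::nullType).getSourceSchema();
        // Option 2: fall back to an empty record schema.
        SimpleSchema asEmpty = empty.getSchemaProvider(SimpleSchema::emptyRecord).getSourceSchema();

        System.out.println(asNull.type + " isRecord=" + asNull.isRecord());
        System.out.println(asEmpty.type + " isRecord=" + asEmpty.isRecord());
    }
}
```

In this sketch both fallbacks carry no fields, but only the empty record can be treated as a record by callers, which is the property the description appeals to when it compares it to the schema of an empty Spark dataframe.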