Wenning Ding created HUDI-1376:
----------------------------------
Summary: Remove the schema of metadata columns in the commit files
Key: HUDI-1376
URL: https://issues.apache.org/jira/browse/HUDI-1376
Project: Apache Hudi
Issue Type: Bug
Reporter: Wenning Ding
When a Hudi table is updated through the Spark datasource, the schema of the
input dataframe is used as the schema stored in the commit file. As a result,
when the upserted rows contain metadata columns, the upsert commit file also
stores the schema of those metadata columns, which is unnecessary in common
cases. It also causes an issue for bootstrap tables.
Since the schema of the metadata columns is always the same, we should remove
it from the schema stored in the commit file for any insert/upsert/... action.
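As a rough illustration of the proposed change, the sketch below drops the
metadata fields from an Avro record schema before it would be written to the
commit file. It works on a plain JSON schema in Python and is not Hudi's actual
code path: strip_meta_columns and the sample "trip" schema are hypothetical
names made up for this example, and the set of column names assumes the five
standard _hoodie_* metadata columns.

```python
import json

# Assumption: the five standard metadata columns Hudi adds to each record.
HOODIE_META_COLUMNS = {
    "_hoodie_commit_time",
    "_hoodie_commit_seqno",
    "_hoodie_record_key",
    "_hoodie_partition_path",
    "_hoodie_file_name",
}


def strip_meta_columns(avro_schema_json: str) -> str:
    """Return the Avro record schema with the Hudi metadata fields removed.

    Hypothetical helper: it only filters top-level fields by name and keeps
    the remaining fields in their original order.
    """
    schema = json.loads(avro_schema_json)
    schema["fields"] = [
        field for field in schema["fields"]
        if field["name"] not in HOODIE_META_COLUMNS
    ]
    return json.dumps(schema)


# Hypothetical schema of an input dataframe that was read back from a Hudi
# table and therefore still carries the metadata columns.
input_schema = json.dumps({
    "type": "record",
    "name": "trip",
    "fields": [
        {"name": "_hoodie_commit_time", "type": ["null", "string"]},
        {"name": "_hoodie_commit_seqno", "type": ["null", "string"]},
        {"name": "_hoodie_record_key", "type": ["null", "string"]},
        {"name": "_hoodie_partition_path", "type": ["null", "string"]},
        {"name": "_hoodie_file_name", "type": ["null", "string"]},
        {"name": "id", "type": "string"},
        {"name": "price", "type": "double"},
    ],
})

commit_schema = json.loads(strip_meta_columns(input_schema))
print([field["name"] for field in commit_schema["fields"]])  # ['id', 'price']
```

Filtering by name keeps the user-defined part of the schema unchanged, so the
same step could in principle be applied to every write action regardless of
whether the input rows carried metadata columns.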