sivabalan narayanan created HUDI-9555:
-----------------------------------------
Summary: Fix secondary write to metadata table when streaming
writes to metadata table is enabled
Key: HUDI-9555
URL: https://issues.apache.org/jira/browse/HUDI-9555
Project: Apache Hudi
Issue Type: Improvement
Components: metadata
Reporter: sivabalan narayanan
When streaming writes to the metadata table are enabled, we call upsertPrepped twice
on the metadata table before finally completing the commit.
The first call is for the streaming writes and the second is for the batch write,
which includes the FILES partition.
These two writes need to be coordinated so that the inflight commit meta file is
created in the metadata table timeline only once.
When the streaming DAG patch was first put out, we ensured this was taken care of,
but along the way, while addressing feedback, we refactored the second API to
reuse legacy APIs. This led to the inflight commit file being created twice for a
given delta commit in the metadata table.
On a local FS this is not an issue, but in an AWS environment it led to failures.
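To illustrate the coordination described above, here is a minimal sketch in which the inflight marker for an instant is created only by the first of the two writes. It is not Hudi's actual code: MetadataTimelineSketch, MetadataWriterSketch, startInflightOnce and the "record_index" partition name are hypothetical stand-ins, and the timeline is modeled as an in-memory set rather than files on storage.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the metadata table timeline; not Hudi's actual API.
class MetadataTimelineSketch {
    private final Set<String> inflightInstants = new HashSet<>();
    private int inflightFilesCreated = 0;

    // Creates the inflight marker only if this instant does not have one yet.
    // Returns true when this call created the marker, false when it already existed.
    synchronized boolean startInflightOnce(String instantTime) {
        if (!inflightInstants.add(instantTime)) {
            return false; // already inflight: skip the second creation
        }
        inflightFilesCreated++;
        return true;
    }

    synchronized int inflightFilesCreated() {
        return inflightFilesCreated;
    }
}

// Hypothetical writer that mimics the two upsertPrepped calls made per commit.
class MetadataWriterSketch {
    private final MetadataTimelineSketch timeline;
    private final List<String> writtenPartitions = new ArrayList<>();

    MetadataWriterSketch(MetadataTimelineSketch timeline) {
        this.timeline = timeline;
    }

    // Both the streaming write and the batch write go through the same guard,
    // so whichever runs first creates the inflight marker and the other reuses it.
    void upsertPrepped(String instantTime, String partition) {
        timeline.startInflightOnce(instantTime);
        writtenPartitions.add(partition);
    }

    List<String> writtenPartitions() {
        return writtenPartitions;
    }
}

class InflightOnceDemo {
    public static void main(String[] args) {
        MetadataTimelineSketch timeline = new MetadataTimelineSketch();
        MetadataWriterSketch writer = new MetadataWriterSketch(timeline);
        writer.upsertPrepped("001", "record_index"); // streaming write (partition name assumed)
        writer.upsertPrepped("001", "files");        // batch write including FILES
        System.out.println("inflight files created: " + timeline.inflightFilesCreated());
    }
}
```

In this sketch both writes succeed for the same instant while the inflight marker is created a single time. The actual fix may coordinate the two calls differently.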
--
This message was sent by Atlassian Jira
(v8.20.10#820010)