[
https://issues.apache.org/jira/browse/HUDI-7507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ethan Guo updated HUDI-7507:
----------------------------
Component/s: table-service
> ongoing concurrent writers with smaller timestamp can cause issues with
> table services
> ---------------------------------------------------------------------------------------
>
> Key: HUDI-7507
> URL: https://issues.apache.org/jira/browse/HUDI-7507
> Project: Apache Hudi
> Issue Type: Improvement
> Components: table-service
> Reporter: Krishen Bhan
> Priority: Major
> Attachments: Flowchart (1).png, Flowchart.png
>
>
> Although HUDI operations hold a table lock when creating a .requested
> instant, HUDI writers do not generate a timestamp and create the
> .requested plan in the same transaction, so the following scenario can occur:
> # Job 1 starts and chooses timestamp (x); Job 2 starts and chooses timestamp
> (x-1)
> # Job 1 schedules and creates requested file with instant timestamp (x)
> # Job 2 schedules and creates requested file with instant timestamp (x-1)
> # Both jobs continue running
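The interleaving above can be reproduced with a small model. This is a minimal Python sketch, not HUDI code: the `ToyTimeline` class, the integer timestamps, and the values 100 and 99 (standing in for x and x-1) are assumptions made for illustration. It suggests that locking only around creation of the requested file does not stop a lower timestamp from landing on the timeline after a higher one.

```python
import threading


class ToyTimeline:
    """Hypothetical stand-in for a table timeline (not a HUDI class)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.requested = []  # instant times, in the order their files were created

    def create_requested(self, instant_time):
        # The lock only covers creating the requested file.
        # The timestamp was already chosen by the caller, outside the lock.
        with self._lock:
            self.requested.append(instant_time)


timeline = ToyTimeline()

# Step 1: both jobs pick their timestamps before taking any lock.
job1_ts = 100  # x
job2_ts = 99   # x-1

# Steps 2 and 3: each job creates its requested file under the lock.
timeline.create_requested(job1_ts)
timeline.create_requested(job2_ts)

# The creation order no longer matches timestamp order.
out_of_order = timeline.requested != sorted(timeline.requested)
print(timeline.requested, out_of_order)
```

In this model, anything that read the timeline between the two calls would have seen instant 100 with no sign of instant 99.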
> If one job is writing a commit and the other is a table service, this can
> cause issues:
> * If Job 2 is an ingestion commit and Job 1 is compaction/log compaction, then
> Job 1 can run before Job 2 and create a compaction plan covering all instant
> times up to (x) that doesn’t include instant time (x-1). Later, Job 2
> will create instant time (x-1), but the timeline will be in a corrupted state
> since the compaction plan was supposed to include (x-1).
> * There is a similar issue with clean. If Job 2 is a long-running commit
> (that was stuck/delayed for a while before creating its .requested plan) and
> Job 1 is a clean, then Job 1 can perform a clean that updates the
> earliest-commit-to-retain without waiting for the inflight instant by Job 2
> at (x-1) to complete. This causes Job 2 to be "skipped" by clean.
> [Edit] I added a diagram to visualize the issue, specifically the second
> scenario with clean
> !Flowchart (1).png!
>
> One way this can be resolved is by combining the operations of generating
> the instant time and creating the requested file in the same HUDI table
> transaction. Specifically, the following steps would be executed whenever any
> instant (commit, table service, etc.) is scheduled:
> # Acquire table lock
> # Look at the latest instant C on the active timeline (completed or not).
> Generate a timestamp after C
> # Create the plan and requested file using this new timestamp (that is
> greater than C)
> # Release table lock
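The four steps above can be sketched as follows. This is a toy Python model under stated assumptions, not the HUDI implementation: `ToyTimeline`, `schedule`, and `plan_clean` are hypothetical names, and timestamps are plain integers where "a timestamp after C" is simply C + 1.

```python
import threading


class ToyTimeline:
    """Hypothetical timeline that allocates instant times under the table lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.instants = []  # instant times of all requested files

    def schedule(self, build_plan):
        with self._lock:                                  # 1. acquire table lock
            latest = max(self.instants, default=0)        # 2. latest instant C
            instant_time = latest + 1                     #    timestamp after C
            # 3. build the plan and create the requested file while still locked
            plan = build_plan(instant_time, tuple(self.instants))
            self.instants.append(instant_time)
            return instant_time, plan
        # 4. the lock is released when the `with` block exits


def plan_clean(instant_time, visible_instants):
    # Placeholder plan: records which instants were visible when planning.
    return {"instant": instant_time, "covers": visible_instants}


timeline = ToyTimeline()
workers = [threading.Thread(target=timeline.schedule, args=(plan_clean,))
           for _ in range(8)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

last_instant, last_plan = timeline.schedule(plan_clean)
```

Because `build_plan` runs inside the lock, every plan sees all lower instants, but a slow plan blocks every other writer for its whole duration.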
> Unfortunately, this has the following drawbacks:
> * Every operation must now hold the table lock while computing its plan, even
> if it's an expensive operation that will take a while
> * Users of HUDI cannot easily set their own instant time for an operation,
> and this restriction would break any public APIs that allow this
> An alternate approach (suggested by [~pwason]) was to instead have all
> operations, including table services, perform conflict resolution checks
> before committing. For example, clean and compaction would generate their
> plans as usual, but inside the transaction that writes the .requested file,
> right before creating the file, they would check whether another instant with
> a lower timestamp has appeared in the timeline. If so, they would fail/abort
> without creating the plan. Commit operations would also be updated/verified
> to have a similar check: before creating a .requested file (during a
> transaction), the commit operation would check whether a table service plan
> (clean/compact) with a greater instant time has been created and, if so,
> abort/fail. This avoids the drawbacks of the first approach, but will lead to
> more transient failures that users have to handle.
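The two checks described above could look roughly like this. It is a hedged Python sketch rather than HUDI code: `ToyTimeline`, `ConflictError`, `try_create`, and the `seen_when_planning` snapshot are hypothetical. It also assumes that "has appeared" means a lower-timestamp instant that was not on the timeline when the table service generated its plan.

```python
import threading

TABLE_SERVICES = {"clean", "compaction"}


class ConflictError(Exception):
    """Raised when an operation must abort instead of creating its requested file."""


class ToyTimeline:
    """Hypothetical timeline mapping instant time -> action."""

    def __init__(self):
        self._lock = threading.Lock()
        self.instants = {}

    def snapshot(self):
        with self._lock:
            return set(self.instants)

    def create_requested(self, instant_time, action, seen_when_planning):
        with self._lock:  # the check and the file creation share one transaction
            if action in TABLE_SERVICES:
                # Table service: abort if a lower instant appeared after planning.
                late = sorted(t for t in self.instants
                              if t < instant_time and t not in seen_when_planning)
                if late:
                    raise ConflictError(f"{action}@{instant_time}: late instants {late}")
            else:
                # Commit: abort if a table service with a greater time exists.
                newer = sorted(t for t, a in self.instants.items()
                               if t > instant_time and a in TABLE_SERVICES)
                if newer:
                    raise ConflictError(f"{action}@{instant_time}: newer services {newer}")
            self.instants[instant_time] = action


def try_create(timeline, instant_time, action, seen):
    try:
        timeline.create_requested(instant_time, action, seen)
        return "created"
    except ConflictError:
        return "aborted"


# Order A: clean (x=100) creates its requested file first, then commit (x-1=99).
a = ToyTimeline()
order_a = [try_create(a, 100, "clean", a.snapshot()),
           try_create(a, 99, "commit", a.snapshot())]

# Order B: clean plans on an empty timeline, commit 99 lands, then clean requests.
b = ToyTimeline()
clean_seen = b.snapshot()
order_b = [try_create(b, 99, "commit", b.snapshot()),
           try_create(b, 100, "clean", clean_seen)]
```

In both orderings exactly one of the two jobs aborts. That matches the trade-off stated above: no long-held lock, but a transient failure the caller has to retry.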
--
This message was sent by Atlassian Jira
(v8.20.10#820010)