Krishen Bhan created HUDI-7507:
----------------------------------

             Summary:  ongoing concurrent writers with smaller timestamp can 
cause issues with table services
                 Key: HUDI-7507
                 URL: https://issues.apache.org/jira/browse/HUDI-7507
             Project: Apache Hudi
          Issue Type: Improvement
            Reporter: Krishen Bhan


Although HUDI operations hold a table lock when creating a .requested instant, 
HUDI writers do not generate a timestamp and create a .requested plan in 
the same transaction. As a result, the following scenario is possible:
 # Job 1 starts and chooses timestamp (x); Job 2 starts and chooses timestamp 
(x-1)
 # Job 1 schedules and creates requested file with instant timestamp (x)
 # Job 2 schedules and creates requested file with instant timestamp (x-1)
 # Both jobs continue running

If one job is writing a commit and the other is a table service, this can cause 
issues:
 * If Job 2 is an ingestion commit and Job 1 is a compaction/log compaction, 
then Job 1 can run before Job 2 and create a compaction plan for all instant 
times (up to (x)) that does not include instant time (x-1). Later Job 2 will 
create instant time (x-1), but the timeline will be in a corrupted state since 
the compaction plan was supposed to include (x-1).
 * There is a similar issue with clean. If Job 2 is a long-running commit (that 
was stuck/delayed for a while before creating its .requested plan) and Job 1 is 
a clean, then Job 1 can perform a clean that updates the 
earliest-commit-to-retain without waiting for the inflight instant by Job 2 at 
(x-1) to complete. This causes Job 2 to be "skipped" by clean.
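
The compaction case above can be reproduced with a toy model. This is only a sketch: the Timeline class, the integer instant times and the variable names are assumptions made for illustration and are not HUDI APIs. It shows that locking the creation of the .requested file alone does not help when the timestamp was chosen earlier, outside the lock.

```python
import threading


class Timeline:
    """Hypothetical in-memory stand-in for a table timeline (not a HUDI class)."""

    def __init__(self):
        self.lock = threading.Lock()
        self.requested = {}  # instant time -> action name

    def create_requested(self, instant_time, action):
        # Only the file creation is protected by the table lock.
        with self.lock:
            self.requested[instant_time] = action


timeline = Timeline()
timeline.create_requested(8, "commit")  # an older instant already on the timeline

x = 10
job1_time = x      # Job 1 (compaction) chooses its timestamp, outside any lock
job2_time = x - 1  # Job 2 (ingestion) chooses a smaller timestamp, also unlocked

# Job 1 plans over every instant it can see up to its own time, then schedules.
plan_covers = sorted(t for t in timeline.requested if t <= job1_time)
timeline.create_requested(job1_time, "compaction")

# Job 2 creates its .requested file only now, below the compaction instant.
timeline.create_requested(job2_time, "commit")

# Commits that sit below the compaction instant but are absent from its plan.
missed = sorted(
    t for t, action in timeline.requested.items()
    if action == "commit" and t < job1_time and t not in plan_covers
)
print("plan covers:", plan_covers, "missed:", missed)
```

In this model the plan covers only instant 8, while instant 9 appears afterwards with a smaller timestamp than the compaction.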

One way this can be resolved is by combining the operations of generating 
instant time and creating a requested file in the same HUDI table transaction. 
Specifically, executing the following steps whenever any instant (commit, table 
service, etc) is scheduled
 # Acquire table lock
 # Look at the latest instant C on the active timeline (completed or not). 
Generate a timestamp after C
 # Create the plan and requested file using this new timestamp (that is 
greater than C)
 # Release table lock
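
The four steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the HUDI implementation: Timeline, schedule and build_plan are hypothetical names, instant times are integers, and a threading.Lock stands in for the table lock.

```python
import threading


class Timeline:
    """Hypothetical in-memory timeline; a threading.Lock plays the table lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.instants = []  # list of (instant_time, action, state)

    def schedule(self, action, build_plan):
        with self._lock:  # 1. acquire table lock
            # 2. latest instant C on the timeline, completed or not
            latest = max((t for t, _, _ in self.instants), default=0)
            instant_time = latest + 1  # timestamp strictly after C
            # 3. create the plan and the requested entry with the new timestamp
            plan = build_plan(instant_time)
            self.instants.append((instant_time, action, "requested"))
        # 4. lock released on leaving the with-block
        return instant_time, plan


timeline = Timeline()
results = []


def worker(action):
    # build_plan runs while the lock is held, which is the first drawback
    # of this approach when planning is expensive.
    results.append(timeline.schedule(action, lambda t: f"{action}-plan@{t}"))


actions = ["commit", "compaction", "clean", "commit"] * 2
threads = [threading.Thread(target=worker, args=(a,)) for a in actions]
for th in threads:
    th.start()
for th in threads:
    th.join()

print(sorted(t for t, _ in results))
```

Because the timestamp is chosen and the requested entry is written under the same lock, no writer can end up with a timestamp below one that is already on the timeline.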

Unfortunately this has the following drawbacks:
 * Every operation must now hold the table lock when computing its plan, even 
if it's an expensive operation and will take a while
 * Users of HUDI cannot easily set their own instant time of an operation, and 
this restriction would break any public APIs that allow this

An alternate approach (suggested by [~pwason]) is to instead have all 
operations, including table services, perform conflict resolution checks before 
committing. For example, clean and compaction would generate their plan as 
usual. But inside the transaction that writes the .requested file, right before 
creating the file, they should check whether another instant with a lower 
timestamp has appeared in the timeline. If so, they should fail/abort without 
creating the plan. Commit operations would also be updated/verified to have a 
similar check: before creating a .requested file (during a transaction), the 
commit operation will check whether a table service plan (clean/compact) with a 
greater instant time has been created and, if so, abort/fail. This avoids the 
drawbacks of the first approach, but will lead to more transient failures that 
users have to handle.
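
The two checks described above might look roughly like this. It is a sketch only: Timeline, ConflictError, snapshot and the seen parameter are hypothetical names, and the interpretation that a table service compares the timeline against what it saw while planning is an assumption, not something the issue specifies.

```python
import threading

TABLE_SERVICES = {"clean", "compaction"}


class ConflictError(Exception):
    """Raised when an operation must abort instead of writing .requested."""


class Timeline:
    """Hypothetical in-memory timeline (not a HUDI class)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.instants = {}  # instant time -> action name

    def snapshot(self):
        # Instants visible to a table service while it builds its plan.
        with self._lock:
            return frozenset(self.instants)

    def create_requested(self, instant_time, action, seen=frozenset()):
        with self._lock:  # the transaction that writes the .requested file
            if action in TABLE_SERVICES:
                # Abort if a lower instant appeared after planning.
                lower = [t for t in self.instants
                         if t < instant_time and t not in seen]
                if lower:
                    raise ConflictError(f"new lower instants {lower}")
            else:
                # Abort if a table service plan with a greater time exists.
                later = [t for t, a in self.instants.items()
                         if t > instant_time and a in TABLE_SERVICES]
                if later:
                    raise ConflictError(f"table service at {later}")
            self.instants[instant_time] = action


# Case A: compaction at 10 is scheduled first, so the commit at 9 aborts.
a = Timeline()
a.create_requested(8, "commit")
a.create_requested(10, "compaction", seen=a.snapshot())
try:
    a.create_requested(9, "commit")
    commit_aborted = False
except ConflictError:
    commit_aborted = True

# Case B: the commit at 9 lands while compaction is planning, so it aborts.
b = Timeline()
b.create_requested(8, "commit")
seen = b.snapshot()
b.create_requested(9, "commit")
try:
    b.create_requested(10, "compaction", seen=seen)
    compaction_aborted = False
except ConflictError:
    compaction_aborted = True

print(commit_aborted, compaction_aborted)
```

In both orderings exactly one of the two jobs fails and has to be retried, which is the transient-failure cost mentioned above.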



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
