Davis Zhang created HUDI-8887:
---------------------------------

             Summary: FileSystem lock provider can let more than one thread 
hold the lock at the same time
                 Key: HUDI-8887
                 URL: https://issues.apache.org/jira/browse/HUDI-8887
             Project: Apache Hudi
          Issue Type: Bug
            Reporter: Davis Zhang


Related: https://issues.apache.org/jira/browse/HUDI-7483

When the test errors out with the file system lock provider (LP), the instant 
files clearly conflict.

The following unit test verifies this:

 
{code:java}
// Copy and paste into
// hudi-oss/hudi-client/hudi-client-common/src/test/java/org/apache/hudi/client/transaction/TestSimpleConcurrentFileWritesConflictResolutionStrategy.java

@Test
public void test22() throws Exception {
  // Step 1: create a requested compaction instant and an inflight delta commit instant
  HoodieInstant compact = new HoodieInstant(HoodieInstant.State.REQUESTED,
      HoodieTimeline.COMPACTION_ACTION, "20250117104717388",
      InstantComparatorV2.REQUESTED_TIME_BASED_COMPARATOR);
  HoodieInstant ds = new HoodieInstant(HoodieInstant.State.INFLIGHT,
      HoodieTimeline.DELTA_COMMIT_ACTION, "20250117104722625",
      InstantComparatorV2.REQUESTED_TIME_BASED_COMPARATOR);
  ConcurrentOperation cc = new ConcurrentOperation(compact, metaClient);
  ConcurrentOperation cds = new ConcurrentOperation(ds, metaClient);
  SimpleConcurrentFileWritesConflictResolutionStrategy strategy =
      new SimpleConcurrentFileWritesConflictResolutionStrategy();
  // Step 2: the strategy must report a conflict between the two operations
  Assertions.assertTrue(strategy.hasConflict(cc, cds));
}
{code}
First pause the test at its first line, then copy the .hoodie folder from 
7483.zip/repro/.hoodie to the base path of the meta client used by this test. 
Let the test run and it passes.

 

This means that, given the compaction plan and the delta commit instant, the 
conflict resolution strategy does its job. The only remaining explanation is 
that the lock does not uphold the exclusive-owner invariant, which allows the 
OCC validation phase to run concurrently.
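
One way to check the exclusive-owner invariant directly is to count how many 
threads are inside the critical section at the same time. The following is a 
minimal sketch, not Hudi code: the class LockInvariantProbe and the method 
maxConcurrentHolders are hypothetical names, and java.util.concurrent's 
ReentrantLock stands in for the lock under test. Probing the file system lock 
provider this way would need an adapter to the Lock interface, which is not 
shown here.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical helper for illustration; not part of Hudi.
public class LockInvariantProbe {

  /** Returns the largest number of threads seen inside the critical section at once. */
  public static int maxConcurrentHolders(Lock lock, int threads, int iterations)
      throws InterruptedException {
    AtomicInteger holders = new AtomicInteger(); // threads currently holding the lock
    AtomicInteger maxSeen = new AtomicInteger(); // highest value of holders observed
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (int t = 0; t < threads; t++) {
      pool.submit(() -> {
        for (int i = 0; i < iterations; i++) {
          lock.lock();
          try {
            int now = holders.incrementAndGet();
            maxSeen.accumulateAndGet(now, Math::max);
            Thread.yield(); // give other threads a chance to enter if the lock is broken
          } finally {
            holders.decrementAndGet();
            lock.unlock();
          }
        }
      });
    }
    pool.shutdown();
    if (!pool.awaitTermination(60, TimeUnit.SECONDS)) {
      throw new IllegalStateException("probe threads did not finish in time");
    }
    return maxSeen.get();
  }

  public static void main(String[] args) throws InterruptedException {
    // ReentrantLock is a stand-in for the lock under test.
    int max = maxConcurrentHolders(new ReentrantLock(), 8, 1000);
    System.out.println("max concurrent holders: " + max);
  }
}
```

A lock that upholds the invariant yields a maximum of 1, as ReentrantLock does 
here. A value above 1 would show two owners at once, which matches the failure 
mode suspected above. A result of 1 from a single run does not prove the lock 
correct, since a race may simply not have been hit.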

 

Given that the in-process lock provider test dimension has been very stable for 
a long time, we strongly suspect that the file system lock does not work 
correctly on the OS/Docker container we are using. Disable this test dimension 
to avoid false alarms in the Java CI.



