sivabalan narayanan created HUDI-5588:
-----------------------------------------

             Summary: Fix metadata table validator to deduce valid partitions when the first commit that added the partition failed
                 Key: HUDI-5588
                 URL: https://issues.apache.org/jira/browse/HUDI-5588
             Project: Apache Hudi
          Issue Type: Bug
          Components: tests-ci
            Reporter: sivabalan narayanan


Metadata validation sometimes fails due to an issue in the test code.


FS-based listing shows 0 partitions, while the MDT listing shows all 100 
partitions. It's an issue with the validator code.
 
Actual timeline:
ls -ltr tbl1/hoodie_table/.hoodie/
total 720
drwxr-xr-x 2 nsb staff     64 Jan 17 18:45 archived
drwxr-xr-x 4 nsb staff    128 Jan 17 18:45 metadata
-rw-r--r-- 1 nsb staff    808 Jan 17 18:45 hoodie.properties
-rw-r--r-- 1 nsb staff   1230 Jan 17 18:45 20230117214546000.rollback.requested
-rw-r--r-- 1 nsb staff      0 Jan 17 18:45 20230117214546000.rollback.inflight
-rw-r--r-- 1 nsb staff   1414 Jan 17 18:46 20230117214546000.rollback
-rw-r--r-- 1 nsb staff   1230 Jan 17 18:47 20230117214701512.rollback.requested
-rw-r--r-- 1 nsb staff      0 Jan 17 18:47 20230117214701512.rollback.inflight
-rw-r--r-- 1 nsb staff   1414 Jan 17 18:47 20230117214701512.rollback
-rw-r--r-- 1 nsb staff  15492 Jan 17 18:48 20230117214831503.rollback.requested
-rw-r--r-- 1 nsb staff      0 Jan 17 18:48 20230117214831503.rollback.inflight
-rw-r--r-- 1 nsb staff      0 Jan 17 18:48 20230117214848714.deltacommit.requested
-rw-r--r-- 1 nsb staff  16359 Jan 17 18:48 20230117214831503.rollback
-rw-r--r-- 1 nsb staff  69698 Jan 17 18:49 20230117214848714.deltacommit.inflight
-rw-r--r-- 1 nsb staff      0 Jan 17 18:50 20230117215006714.deltacommit.requested
-rw-r--r-- 1 nsb staff  94423 Jan 17 18:50 20230117214848714.deltacommit
-rw-r--r-- 1 nsb staff 142198 Jan 17 18:50 20230117215006714.deltacommit.inflight
 
 
At least one commit succeeded: 20230117214848714.deltacommit.
 
However, our validator code reads the commit time at which each partition was 
created and treats the partition as valid only if that particular commit succeeded.
{code:java}
List<String> allPartitionPathsFromFS =
    FSUtils.getAllPartitionPaths(engineContext, basePath, false, cfg.assumeDatePartitioning);
HoodieTimeline completedTimeline =
    metaClient.getActiveTimeline().filterCompletedInstants();

// ignore partitions created by uncommitted ingestion.
allPartitionPathsFromFS = allPartitionPathsFromFS.stream().parallel().filter(part -> {
  HoodiePartitionMetadata hoodiePartitionMetadata =
      new HoodiePartitionMetadata(metaClient.getFs(), FSUtils.getPartitionPath(basePath, part));

  Option<String> instantOption = hoodiePartitionMetadata.readPartitionCreatedCommitTime();
  if (instantOption.isPresent()) {
    String instantTime = instantOption.get();
    return completedTimeline.containsOrBeforeTimelineStarts(instantTime);
  } else {
    return false;
  }
}).collect(Collectors.toList());
{code}
 

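We need to fix this so that the validator still treats a partition as valid when 
the commit that first created it failed but a later commit succeeded.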
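
One possible direction, as a rough sketch only: keep a partition if the instant that created it completed, or if any completed instant wrote to it. The class, method, and map shapes below are hypothetical stand-ins and not Hudi APIs; the sketch also leaves out the "before timeline starts" case that the existing check handles, and how the real validator would learn which partitions a completed commit touched is an assumption left open here.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical, simplified model of the partition filter; not Hudi code.
final class PartitionFilterSketch {

  /**
   * @param createdInstantByPartition partition path -> instant time recorded
   *        when the partition was first created (may belong to a failed commit)
   * @param partitionsWrittenByCompletedInstant completed instant time ->
   *        partition paths that instant wrote to
   * @return sorted partition paths that should count as valid
   */
  static List<String> validPartitions(
      Map<String, String> createdInstantByPartition,
      Map<String, Set<String>> partitionsWrittenByCompletedInstant) {
    Set<String> completedInstants = partitionsWrittenByCompletedInstant.keySet();

    // Every partition that at least one completed instant wrote to.
    Set<String> touchedByCompleted = partitionsWrittenByCompletedInstant.values().stream()
        .flatMap(Set::stream)
        .collect(Collectors.toSet());

    return createdInstantByPartition.entrySet().stream()
        // Valid if the creating instant completed (the existing check) ...
        .filter(e -> completedInstants.contains(e.getValue())
            // ... or if some later completed instant wrote to the partition.
            || touchedByCompleted.contains(e.getKey()))
        .map(Map.Entry::getKey)
        .sorted()
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Made-up data: instant "100" failed after creating p1 and p2,
    // "200" completed and wrote to both, "300" is still inflight.
    Map<String, String> created = Map.of("p1", "100", "p2", "100", "p3", "300");
    Map<String, Set<String>> completed = Map.of("200", Set.of("p1", "p2"));
    System.out.println(validPartitions(created, completed));
  }
}
```

With the made-up data in `main`, p1 and p2 are kept even though the instant that created them never completed, while p3 is dropped because only an uncommitted instant has touched it.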
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
