zhangyue19921010 commented on a change in pull request #4994:
URL: https://github.com/apache/hudi/pull/4994#discussion_r825658841
##########
File path:
hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieMetadataTableValidator.java
##########
@@ -345,16 +358,59 @@ public void doMetadataTableValidation() {
boolean finalResult = true;
metaClient.reloadActiveTimeline();
String basePath = metaClient.getBasePath();
+ List<String> baseFilesUnderDeletion = Collections.emptyList();
+
+ if (cfg.skipUnderDeletionDataFiles) {
+ HoodieTimeline pendingCleaningTimeline = metaClient.getActiveTimeline()
+ .getCleanerTimeline()
+ .filter(instant -> instant.getState() != HoodieInstant.State.COMPLETED);
Review comment:
Nice call. Yep, we only need to take care of inflight cleaning.
##########
File path:
hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieMetadataTableValidator.java
##########
@@ -345,16 +358,59 @@ public void doMetadataTableValidation() {
boolean finalResult = true;
metaClient.reloadActiveTimeline();
String basePath = metaClient.getBasePath();
+ List<String> baseFilesUnderDeletion = Collections.emptyList();
+
+ if (cfg.skipUnderDeletionDataFiles) {
+ HoodieTimeline pendingCleaningTimeline = metaClient.getActiveTimeline()
+ .getCleanerTimeline()
+ .filter(instant -> instant.getState() != HoodieInstant.State.COMPLETED);
+
+ baseFilesUnderDeletion = pendingCleaningTimeline.getInstants().flatMap(instant -> {
+ try {
+ if (instant.isInflight()) {
+ // convert inflight instant to requested and get clean plan
+ instant = new HoodieInstant(HoodieInstant.State.REQUESTED, instant.getAction(), instant.getTimestamp());
+ }
+ HoodieCleanerPlan cleanerPlan = CleanerUtils.getCleanerPlan(metaClient, instant);
+
+ return cleanerPlan.getFilePathsToBeDeletedPerPartition().values().stream().flatMap(cleanerFIleInfoList -> {
Review comment:
Changed.