prashantwason commented on a change in pull request #4336:
URL: https://github.com/apache/hudi/pull/4336#discussion_r772843537
##########
File path:
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java
##########
@@ -706,7 +706,20 @@ protected void compactIfNecessary(AbstractHoodieWriteClient writeClient, String
 }
 }
- protected void doClean(AbstractHoodieWriteClient writeClient, String instantTime) {
+ protected void cleanIfNecessary(AbstractHoodieWriteClient writeClient, String instantTime) {
+ Option<HoodieInstant> lastCompletedCompactionInstant = metadataMetaClient.reloadActiveTimeline()
+ .getCommitTimeline().filterCompletedInstants().lastInstant();
+ if (lastCompletedCompactionInstant.isPresent()
+ && metadataMetaClient.getActiveTimeline().filterCompletedInstants()
+ .findInstantsAfter(lastCompletedCompactionInstant.get().getTimestamp()).countInstants() < 3) {
+ // do not clean the log files immediately after compaction to give some buffer time for metadata table reader,
Review comment:
So this problem should also exist in the MOR table data path? Is there
any solution there?
##########
File path:
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/metadata/SparkHoodieBackedTableMetadataWriter.java
##########
@@ -154,7 +154,7 @@ protected void commit(HoodieData<HoodieRecord> hoodieDataRecords, String partiti
metadataMetaClient.reloadActiveTimeline();
Review comment:
reloadActiveTimeline() is already called here, so it is not necessary in cleanIfNecessary().
##########
File path:
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java
##########
@@ -706,7 +706,20 @@ protected void compactIfNecessary(AbstractHoodieWriteClient writeClient, String
 }
 }
- protected void doClean(AbstractHoodieWriteClient writeClient, String instantTime) {
+ protected void cleanIfNecessary(AbstractHoodieWriteClient writeClient, String instantTime) {
+ Option<HoodieInstant> lastCompletedCompactionInstant = metadataMetaClient.reloadActiveTimeline()
Review comment:
Is reloadActiveTimeline() necessary here?
##########
File path:
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java
##########
@@ -706,7 +706,20 @@ protected void compactIfNecessary(AbstractHoodieWriteClient writeClient, String
 }
 }
- protected void doClean(AbstractHoodieWriteClient writeClient, String instantTime) {
+ protected void cleanIfNecessary(AbstractHoodieWriteClient writeClient, String instantTime) {
+ Option<HoodieInstant> lastCompletedCompactionInstant = metadataMetaClient.reloadActiveTimeline()
Review comment:
Also, can you check whether there is already a metadata table function to get
the last compaction timestamp?
I guess other code paths need this as well, so it would be a good idea to
create a utility function if one does not exist.
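To illustrate the kind of utility meant here, a rough sketch follows. It is plain Java with stand-in types, not Hudi's actual classes: `MetadataTimelineSketch`, `Instant`, `lastCompactionTimestamp`, and `shouldSkipClean` are all hypothetical names, and it assumes (as the diff appears to) that a completed compaction shows up as a "commit" action and that instant timestamps sort lexicographically.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch only: models the timeline as a list of completed instants.
class MetadataTimelineSketch {

  // Stand-in for a completed timeline instant; NOT Hudi's HoodieInstant.
  public static final class Instant {
    final String timestamp; // assumed to sort lexicographically, e.g. "20211220103000"
    final String action;    // assumption: "commit" marks a completed compaction

    public Instant(String timestamp, String action) {
      this.timestamp = timestamp;
      this.action = action;
    }
  }

  // Returns the timestamp of the latest completed compaction, if any.
  public static Optional<String> lastCompactionTimestamp(List<Instant> completed) {
    String last = null;
    for (Instant instant : completed) {
      if ("commit".equals(instant.action)
          && (last == null || instant.timestamp.compareTo(last) > 0)) {
        last = instant.timestamp;
      }
    }
    return Optional.ofNullable(last);
  }

  // Mirrors the check in the diff: skip cleaning while fewer than
  // minInstantsAfterCompaction completed instants follow the last compaction.
  public static boolean shouldSkipClean(List<Instant> completed, int minInstantsAfterCompaction) {
    Optional<String> last = lastCompactionTimestamp(completed);
    if (!last.isPresent()) {
      return false; // no compaction yet, so nothing to wait for
    }
    long instantsAfter = completed.stream()
        .filter(instant -> instant.timestamp.compareTo(last.get()) > 0)
        .count();
    return instantsAfter < minInstantsAfterCompaction;
  }

  public static void main(String[] args) {
    List<Instant> timeline = Arrays.asList(
        new Instant("001", "deltacommit"),
        new Instant("002", "commit"),
        new Instant("003", "deltacommit"));
    System.out.println("skip clean: " + shouldSkipClean(timeline, 3));
  }
}
```

With a helper along these lines, cleanIfNecessary() and any other caller that needs the last compaction time could share one lookup instead of each rebuilding the timeline filter chain.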