wangxianghu commented on a change in pull request #2260:
URL: https://github.com/apache/hudi/pull/2260#discussion_r556391080
##########
File path:
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/compact/SparkScheduleCompactionActionExecutor.java
##########
@@ -60,34 +63,64 @@ protected HoodieCompactionPlan scheduleCompaction() {
     LOG.info("Checking if compaction needs to be run on " + config.getBasePath());
     Option<HoodieInstant> lastCompaction = table.getActiveTimeline().getCommitTimeline()
         .filterCompletedInstants().lastInstant();
-    String lastCompactionTs = "0";
+    HoodieTimeline deltaCommits = table.getActiveTimeline().getDeltaCommitTimeline();
+    String lastCompactionTs;
+    int deltaCommitsSinceLastCompaction;
     if (lastCompaction.isPresent()) {
       lastCompactionTs = lastCompaction.get().getTimestamp();
+      deltaCommitsSinceLastCompaction = deltaCommits.findInstantsAfter(lastCompactionTs, Integer.MAX_VALUE).countInstants();
+    } else {
+      lastCompactionTs = deltaCommits.firstInstant().get().getTimestamp();
+      deltaCommitsSinceLastCompaction = deltaCommits.findInstantsAfterOrEquals(lastCompactionTs, Integer.MAX_VALUE).countInstants();
     }
-
-    int deltaCommitsSinceLastCompaction = table.getActiveTimeline().getDeltaCommitTimeline()
-        .findInstantsAfter(lastCompactionTs, Integer.MAX_VALUE).countInstants();
-    if (config.getInlineCompactDeltaCommitMax() > deltaCommitsSinceLastCompaction) {
-      LOG.info("Not scheduling compaction as only " + deltaCommitsSinceLastCompaction
-          + " delta commits was found since last compaction " + lastCompactionTs + ". Waiting for "
-          + config.getInlineCompactDeltaCommitMax());
-      return new HoodieCompactionPlan();
+    // judge if we need to compact according to num delta commits and time elapsed
+    boolean compactable = getCompactType(deltaCommitsSinceLastCompaction, lastCompactionTs);
Review comment:
How about extracting all of this logic into a single method,
`needCompact(Table table, CompactType compactType)`, and initializing
the required variables only when they are needed?
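For illustration, a rough sketch of the shape such a method might take. It uses simplified stand-in types rather than the real Hudi classes: `TableState`, the `CompactType` values, and the extra `nowSeconds` parameter are all assumptions made to keep the example self-contained, not names taken from the PR.

```java
import java.util.Arrays;
import java.util.List;

class NeedCompactSketch {

  // Hypothetical trigger types; the enum in the PR may use other names.
  enum CompactType { NUM, TIME, NUM_AND_TIME, NUM_OR_TIME }

  // Hypothetical stand-in for what the check reads from the table and config.
  static final class TableState {
    final List<Long> deltaCommitTimes; // epoch seconds, ascending
    final Long lastCompactionTime;     // null when no compaction has completed
    final int maxDeltaCommits;
    final long maxElapsedSeconds;

    TableState(List<Long> deltaCommitTimes, Long lastCompactionTime,
               int maxDeltaCommits, long maxElapsedSeconds) {
      this.deltaCommitTimes = deltaCommitTimes;
      this.lastCompactionTime = lastCompactionTime;
      this.maxDeltaCommits = maxDeltaCommits;
      this.maxElapsedSeconds = maxElapsedSeconds;
    }
  }

  // All bookkeeping lives here, so the caller only asks one yes/no question.
  static boolean needCompact(TableState table, CompactType type, long nowSeconds) {
    if (table.deltaCommitTimes.isEmpty()) {
      return false; // no delta commits, nothing to compact
    }
    long since;
    int count = 0;
    if (table.lastCompactionTime != null) {
      // Count only delta commits strictly after the last compaction.
      since = table.lastCompactionTime;
      for (long t : table.deltaCommitTimes) {
        if (t > since) {
          count++;
        }
      }
    } else {
      // Never compacted: measure from the first delta commit, inclusive.
      since = table.deltaCommitTimes.get(0);
      count = table.deltaCommitTimes.size();
    }
    boolean numReached = count >= table.maxDeltaCommits;
    boolean timeReached = nowSeconds - since >= table.maxElapsedSeconds;
    switch (type) {
      case NUM:
        return numReached;
      case TIME:
        return timeReached;
      case NUM_AND_TIME:
        return numReached && timeReached;
      case NUM_OR_TIME:
        return numReached || timeReached;
      default:
        throw new IllegalArgumentException("Unsupported compact type: " + type);
    }
  }

  public static void main(String[] args) {
    TableState t = new TableState(Arrays.asList(100L, 200L, 300L), null, 3, 1000L);
    System.out.println(needCompact(t, CompactType.NUM, 350L));
    System.out.println(needCompact(t, CompactType.TIME, 350L));
  }
}
```

With this shape, `scheduleCompaction()` would reduce to a single `needCompact(...)` call followed by the plan generation, and the empty-timeline case is handled in one place instead of relying on `firstInstant().get()` succeeding.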