wangxianghu commented on a change in pull request #2260:
URL: https://github.com/apache/hudi/pull/2260#discussion_r548796771
##########
File path:
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/compact/SparkScheduleCompactionActionExecutor.java
##########
@@ -65,10 +66,12 @@ protected HoodieCompactionPlan scheduleCompaction() {
int deltaCommitsSinceLastCompaction =
table.getActiveTimeline().getDeltaCommitTimeline()
.findInstantsAfter(lastCompactionTs,
Integer.MAX_VALUE).countInstants();
- if (config.getInlineCompactDeltaCommitMax() >
deltaCommitsSinceLastCompaction) {
+ if (config.getInlineCompactDeltaCommitMax() >
deltaCommitsSinceLastCompaction
+ && timeCompaction(instantTime, initialTime,
deltaCommitsSinceLastCompaction)) {
Review comment:
how about triggering a compaction when any one of the conditions is met?
##########
File path:
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieCompactionConfig.java
##########
@@ -109,6 +110,7 @@
private static final String DEFAULT_INLINE_COMPACT = "false";
private static final String DEFAULT_INCREMENTAL_CLEANER = "true";
private static final String DEFAULT_INLINE_COMPACT_NUM_DELTA_COMMITS = "5";
+ private static final String DEFAULT_INLINE_COMPACT_ELAPSED_TIME =
String.valueOf(60 * 60 * 24);
Review comment:
Do you mean one day? might be too long, who about an hour
##########
File path:
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/action/compact/SparkScheduleCompactionActionExecutor.java
##########
@@ -85,4 +88,15 @@ protected HoodieCompactionPlan scheduleCompaction() {
}
}
+ protected boolean timeCompaction(String instantTime, String initialTime, int
deltaCommitsSinceLastCompaction) {
+ if (Long.parseLong(initialTime) +
config.getInlineCompactDeltaElapsedTimeMax() > Long.parseLong(instantTime)) {
Review comment:
The timestamp is of `yyyyMMddHHmmss` format, we can not simply sum them
by `+`
##########
File path:
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/AbstractHoodieWriteClient.java
##########
@@ -129,6 +130,7 @@ public AbstractHoodieWriteClient(HoodieEngineContext
context, HoodieWriteConfig
this.metrics = new HoodieMetrics(config, config.getTableName());
this.rollbackPending = rollbackPending;
this.index = createIndex(writeConfig);
+ this.initialTime = HoodieActiveTimeline.createNewInstantTime();
Review comment:
How about mark this `initialTime` as lastCompaction time, then we can
gei it from timeline. and when we get a new commit, we can check the interval
between these two timestamps to decide whether execute compact or not.
in this way:
1. there is no need to update it additionally
2. it is more preciseness.(Time elapsed since the last compact)
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]