yihua commented on code in PR #10874:
URL: https://github.com/apache/hudi/pull/10874#discussion_r1537052000
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java:
##########
@@ -1134,8 +1126,9 @@ protected void validateRollback(
String commitToRollbackInstantTime,
HoodieInstant compactionInstant,
HoodieTimeline deltacommitsSinceCompaction) {
-    // The commit being rolled back should not be earlier than the latest compaction on the MDT. Compaction on MDT only occurs when all actions
-    // are completed on the dataset. Hence, this case implies a rollback of completed commit which should actually be handled using restore.
+    // The commit being rolled back should not be earlier than the latest compaction on the MDT because the latest file slice does not change after all.
+    // Compaction on MDT only occurs when all actions are completed on the dataset.
Review Comment:
In this way, MDT compaction can happen at any time, and any deletions will be reflected in a new instant in the MDT timeline.
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java:
##########
@@ -1407,43 +1382,7 @@ protected void cleanIfNecessary(BaseHoodieWriteClient writeClient, String instan
     writeClient.lazyRollbackFailedIndexing();
   }
-  /**
-   * Validates the timeline for both main and metadata tables to ensure compaction on MDT can be scheduled.
-   */
-  protected boolean validateTimelineBeforeSchedulingCompaction(Option<String> inFlightInstantTimestamp, String latestDeltaCommitTimeInMetadataTable) {
-    // we need to find if there are any inflights in data table timeline before or equal to the latest delta commit in metadata table.
-    // Whenever you want to change this logic, please ensure all below scenarios are considered.
-    // a. There could be a chance that latest delta commit in MDT is committed in MDT, but failed in DT. And so findInstantsBeforeOrEquals() should be employed
-    // b. There could be DT inflights after latest delta commit in MDT and we are ok with it. bcoz, the contract is, the latest compaction instant time in MDT represents
-    //    any instants before that is already synced with metadata table.
-    // c. Do consider out of order commits. For eg, c4 from DT could complete before c3. and we can't trigger compaction in MDT with c4 as base instant time, until every
-    //    instant before c4 is synced with metadata table.
-    List<HoodieInstant> pendingInstants = dataMetaClient.reloadActiveTimeline().filterInflightsAndRequested()
-        .findInstantsBeforeOrEquals(latestDeltaCommitTimeInMetadataTable).getInstants();
-
-    if (!pendingInstants.isEmpty()) {
-      checkNumDeltaCommits(metadataMetaClient, dataWriteConfig.getMetadataConfig().getMaxNumDeltacommitsWhenPending());
-      LOG.info(String.format(
-          "Cannot compact metadata table as there are %d inflight instants in data table before latest deltacommit in metadata table: %s. Inflight instants in data table: %s",
-          pendingInstants.size(), latestDeltaCommitTimeInMetadataTable, Arrays.toString(pendingInstants.toArray())));
-      return false;
-    }
-
-    // Check if there are any pending compaction or log compaction instants in the timeline.
-    // If pending compact/logCompaction operations are found abort scheduling new compaction/logCompaction operations.
-    Option<HoodieInstant> pendingLogCompactionInstant =
Review Comment:
I’m wondering if the condition of scheduling compaction based on log
compaction should be put into the compaction logic in the write/table service
client, instead of the MDT writer. Is this specific to MDT? The logic should
also apply to DT around completion time.
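For reference, the out-of-order-commit scenario (c) described in the deleted guard's comment can be illustrated with a small self-contained model. The class and method names below are simplified stand-ins of my own, not Hudi's actual `HoodieTimeline` API:

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Simplified model of the deleted guard: before scheduling MDT compaction
// with base instant t, every data-table instant at or before t must be
// completed. All names here are illustrative, not Hudi's.
public class PendingInstantCheck {

  // dtTimeline: instant time -> completed? (false = inflight/requested)
  static List<String> pendingInstantsBeforeOrEquals(Map<String, Boolean> dtTimeline,
                                                    String latestMdtDeltaCommit) {
    return dtTimeline.entrySet().stream()
        .filter(e -> e.getKey().compareTo(latestMdtDeltaCommit) <= 0)
        .filter(e -> !e.getValue())
        .map(Map.Entry::getKey)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Scenario c: c4 completed before c3, so c3 is still pending even though
    // the latest MDT deltacommit is c4; compaction must not use c4 as base yet.
    Map<String, Boolean> timeline = new TreeMap<>(Map.of(
        "c1", true, "c2", true, "c3", false, "c4", true));
    System.out.println(pendingInstantsBeforeOrEquals(timeline, "c4")); // prints "[c3]"
  }
}
```

A non-empty result here corresponds to the `!pendingInstants.isEmpty()` branch of the deleted method, which aborted scheduling.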
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java:
##########
@@ -1410,35 +1404,19 @@ protected void cleanIfNecessary(BaseHoodieWriteClient writeClient, String instan
   /**
    * Validates the timeline for both main and metadata tables to ensure compaction on MDT can be scheduled.
    */
-  protected boolean validateTimelineBeforeSchedulingCompaction(Option<String> inFlightInstantTimestamp, String latestDeltaCommitTimeInMetadataTable) {
-    // we need to find if there are any inflights in data table timeline before or equal to the latest delta commit in metadata table.
-    // Whenever you want to change this logic, please ensure all below scenarios are considered.
-    // a. There could be a chance that latest delta commit in MDT is committed in MDT, but failed in DT. And so findInstantsBeforeOrEquals() should be employed
-    // b. There could be DT inflights after latest delta commit in MDT and we are ok with it. bcoz, the contract is, the latest compaction instant time in MDT represents
-    //    any instants before that is already synced with metadata table.
-    // c. Do consider out of order commits. For eg, c4 from DT could complete before c3. and we can't trigger compaction in MDT with c4 as base instant time, until every
-    //    instant before c4 is synced with metadata table.
-    List<HoodieInstant> pendingInstants = dataMetaClient.reloadActiveTimeline().filterInflightsAndRequested()
-        .findInstantsBeforeOrEquals(latestDeltaCommitTimeInMetadataTable).getInstants();
-
-    if (!pendingInstants.isEmpty()) {
-      checkNumDeltaCommits(metadataMetaClient, dataWriteConfig.getMetadataConfig().getMaxNumDeltacommitsWhenPending());
-      LOG.info(String.format(
-          "Cannot compact metadata table as there are %d inflight instants in data table before latest deltacommit in metadata table: %s. Inflight instants in data table: %s",
-          pendingInstants.size(), latestDeltaCommitTimeInMetadataTable, Arrays.toString(pendingInstants.toArray())));
-      return false;
-    }
-
-    // Check if there are any pending compaction or log compaction instants in the timeline.
-    // If pending compact/logCompaction operations are found abort scheduling new compaction/logCompaction operations.
-    Option<HoodieInstant> pendingLogCompactionInstant =
-        metadataMetaClient.getActiveTimeline().filterPendingLogCompactionTimeline().firstInstant();
-    Option<HoodieInstant> pendingCompactionInstant =
-        metadataMetaClient.getActiveTimeline().filterPendingCompactionTimeline().firstInstant();
-    if (pendingLogCompactionInstant.isPresent() || pendingCompactionInstant.isPresent()) {
-      LOG.warn(String.format("Not scheduling compaction or logCompaction, since a pending compaction instant %s or logCompaction %s instant is present",
-          pendingCompactionInstant, pendingLogCompactionInstant));
-      return false;
+  protected boolean validateCompactionScheduling() {
+    // Under the log compaction scope, the sequence of the log-compaction and compaction needs to be ensured because metadata items such as RLI
+    // only has proc-time ordering semantics. For "ensured", it means the completion sequence of the log-compaction/compaction is the same as the start sequence.
+    if (metadataWriteConfig.isLogCompactionEnabled()) {
Review Comment:
There can be a pending log compaction after log compaction is disabled, so
only checking the write config may not be enough?
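To make the concern concrete, here is a minimal self-contained sketch (hypothetical names of my own, not Hudi's API) contrasting a guard that only trusts the config flag with one that also inspects pending instants on the timeline:

```java
import java.util.List;

// Sketch of the reviewer's concern: a log compaction scheduled while the
// feature was enabled can still be pending after the write config disables
// it, so a guard gated only on the config flag can wrongly allow scheduling.
public class LogCompactionGuard {

  // Guard that only trusts the config flag (the questioned approach):
  // it considers scheduling safe whenever the feature is turned off.
  static boolean configOnlyGuard(boolean logCompactionEnabled) {
    return !logCompactionEnabled;
  }

  // Guard that also inspects pending instants on the MDT timeline; the
  // config flag is deliberately not consulted for this check, because a
  // pending instant may predate the flag being flipped off.
  static boolean timelineAwareGuard(boolean logCompactionEnabled,
                                    List<String> pendingActions) {
    return !pendingActions.contains("logcompaction");
  }

  public static void main(String[] args) {
    List<String> pending = List.of("deltacommit", "logcompaction");
    System.out.println(configOnlyGuard(false));             // prints "true" (unsafe: pending instant ignored)
    System.out.println(timelineAwareGuard(false, pending)); // prints "false" (scheduling correctly blocked)
  }
}
```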
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java:
##########
@@ -1134,8 +1126,9 @@ protected void validateRollback(
String commitToRollbackInstantTime,
HoodieInstant compactionInstant,
HoodieTimeline deltacommitsSinceCompaction) {
-    // The commit being rolled back should not be earlier than the latest compaction on the MDT. Compaction on MDT only occurs when all actions
-    // are completed on the dataset. Hence, this case implies a rollback of completed commit which should actually be handled using restore.
+    // The commit being rolled back should not be earlier than the latest compaction on the MDT because the latest file slice does not change after all.
+    // Compaction on MDT only occurs when all actions are completed on the dataset.
Review Comment:
> “Compaction on MDT only occurs when all actions are completed on the dataset.”

Can this be relaxed once all DT table service instants use a new instant time in the MDT, so that MDT compaction only considers the instants corresponding to completed instants on the DT timeline?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]