nsivabalan commented on code in PR #7921:
URL: https://github.com/apache/hudi/pull/7921#discussion_r1103473377
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieTableServiceClient.java:
##########
@@ -659,8 +662,46 @@ protected Map<String, Option<HoodiePendingRollbackInfo>>
getPendingRollbackInfos
return infoMap;
}
+ /**
+ * Rolls back the failed delta commits corresponding to the indexing action.
+ * Such delta commits are identified based on the suffix
`METADATA_INDEXER_TIME_SUFFIX` ("004").
+ * <p>
+ * TODO(HUDI-5733): This should be cleaned up once the proper fix of
rollbacks
+ * in the metadata table is landed.
+ *
+ * @return {@code true} if rollback happens; {@code false} otherwise.
+ */
+ protected boolean lazyRollbackFailedIndexing() {
Review Comment:
we should call out in release docs that if someone has failed indexing from
previous version, they should drop that fully before moving to 0.13.0.
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieTableServiceClient.java:
##########
@@ -659,8 +662,46 @@ protected Map<String, Option<HoodiePendingRollbackInfo>>
getPendingRollbackInfos
return infoMap;
}
+ /**
+ * Rolls back the failed delta commits corresponding to the indexing action.
+ * Such delta commits are identified based on the suffix
`METADATA_INDEXER_TIME_SUFFIX` ("004").
+ * <p>
+ * TODO(HUDI-5733): This should be cleaned up once the proper fix of
rollbacks
+ * in the metadata table is landed.
+ *
+ * @return {@code true} if rollback happens; {@code false} otherwise.
+ */
+ protected boolean lazyRollbackFailedIndexing() {
+ HoodieTable table = createTable(config, hadoopConf);
+ List<String> instantsToRollback =
getFailedIndexingToRollback(table.getMetaClient());
+ Map<String, Option<HoodiePendingRollbackInfo>> pendingRollbacks =
getPendingRollbackInfos(table.getMetaClient());
+ instantsToRollback.forEach(entry -> pendingRollbacks.putIfAbsent(entry,
Option.empty()));
+ rollbackFailedWrites(pendingRollbacks);
+ return !pendingRollbacks.isEmpty();
+ }
+
+ protected List<String> getFailedIndexingToRollback(HoodieTableMetaClient
metaClient) {
+ Stream<HoodieInstant> inflightInstantsStream =
metaClient.getCommitsTimeline()
+ .filter(instant -> !instant.isCompleted()
+ && isDeltaCommitFromIndexing(instant.getTimestamp()))
+ .getInstantsAsStream();
+ return inflightInstantsStream.filter(instant -> {
+ try {
+ return heartbeatClient.isHeartbeatExpired(instant.getTimestamp());
+ } catch (IOException io) {
+ throw new HoodieException("Failed to check heartbeat for instant " +
instant, io);
+ }
+ }).map(HoodieInstant::getTimestamp).collect(Collectors.toList());
+ }
+
+ protected boolean isDeltaCommitFromIndexing(String instantTime) {
Review Comment:
can these de private?
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieTableServiceClient.java:
##########
@@ -659,8 +662,46 @@ protected Map<String, Option<HoodiePendingRollbackInfo>>
getPendingRollbackInfos
return infoMap;
}
+ /**
+ * Rolls back the failed delta commits corresponding to the indexing action.
+ * Such delta commits are identified based on the suffix
`METADATA_INDEXER_TIME_SUFFIX` ("004").
+ * <p>
+ * TODO(HUDI-5733): This should be cleaned up once the proper fix of
rollbacks
+ * in the metadata table is landed.
+ *
+ * @return {@code true} if rollback happens; {@code false} otherwise.
+ */
+ protected boolean lazyRollbackFailedIndexing() {
Review Comment:
minor. should we name the method "rollbackFailedIndexingInstants"
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadataWriter.java:
##########
@@ -1125,7 +1130,10 @@ private void initialCommit(String createInstantTime,
List<MetadataPartitionType>
LOG.info("Committing " + partitions.size() + " partitions and " +
totalDataFilesCount + " files to metadata");
}
- commit(createInstantTime, partitionToRecordsMap, false);
+ String commitInstantTime =
Review Comment:
should we be doing this in L1088 ? bcoz, this also gets into metadata
records.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]