yihua commented on a change in pull request #5090:
URL: https://github.com/apache/hudi/pull/5090#discussion_r832525843
##########
File path:
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/ConcurrentOperation.java
##########
@@ -116,14 +116,27 @@ private void init(HoodieInstant instant) {
this.metadataWrapper.getMetadataFromTimeline().getHoodieReplaceCommitMetadata().getPartitionToWriteStats()).keySet();
this.operationType =
WriteOperationType.fromValue(this.metadataWrapper.getMetadataFromTimeline().getHoodieReplaceCommitMetadata().getOperationType());
} else {
- HoodieRequestedReplaceMetadata requestedReplaceMetadata =
this.metadataWrapper.getMetadataFromTimeline().getHoodieRequestedReplaceMetadata();
- this.mutatedFileIds = requestedReplaceMetadata
- .getClusteringPlan().getInputGroups()
- .stream()
- .flatMap(ig -> ig.getSlices().stream())
- .map(file -> file.getFileId())
- .collect(Collectors.toSet());
- this.operationType = WriteOperationType.CLUSTER;
+ // we need to different handling for requested and inflight
replacecommit because
+ // for requested replacecommit, clustering will generate a plan
and HoodieRequestedReplaceMetadata will not be empty, but
insert_overwrite/insert_overwrite_table could have empty content
+ // for inflight replacecommit, clustering will have no content in
metadata, but insert_overwrite/insert_overwrite_table will have some commit
metadata
+ if (instant.isRequested()) {
+ HoodieRequestedReplaceMetadata requestedReplaceMetadata =
this.metadataWrapper.getMetadataFromTimeline().getHoodieRequestedReplaceMetadata();
+ if (requestedReplaceMetadata != null) {
+ this.mutatedFileIds = requestedReplaceMetadata
+ .getClusteringPlan().getInputGroups()
+ .stream()
+ .flatMap(ig -> ig.getSlices().stream())
+ .map(file -> file.getFileId())
+ .collect(Collectors.toSet());
+ this.operationType = WriteOperationType.CLUSTER;
+ }
+ } else {
Review comment:
If the replacecommit from clustering is inflight, should we still read
the requested HoodieRequestedReplaceMetadata? Otherwise, we miss the file IDs
stored in its clustering plan.
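
To illustrate the question, here is a rough sketch of the fallback I have in mind. It is not based on the actual Hudi classes: `State`, `ClusteringPlan`, and `mutatedFileIds` are hypothetical stand-ins, and I am assuming the requested plan can still be looked up while the instant is inflight.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical, simplified model; none of these types are Hudi's real API.
class ReplaceCommitFileIds {
  enum State { REQUESTED, INFLIGHT }

  /** Stand-in for a clustering plan: each input group is a list of file IDs. */
  static final class ClusteringPlan {
    final List<List<String>> inputGroups;

    ClusteringPlan(List<List<String>> inputGroups) {
      this.inputGroups = inputGroups;
    }
  }

  /**
   * Resolves the file IDs touched by a pending replacecommit.
   *
   * @param state              state of the instant (kept for readability; the
   *                           plan lookup applies to both states in this sketch)
   * @param requestedPlan      plan stored with the requested instant; assumed
   *                           present for clustering, empty for insert_overwrite
   * @param commitMetaFileIds  file IDs from the inflight commit metadata;
   *                           assumed empty for clustering
   */
  static Set<String> mutatedFileIds(State state,
                                    Optional<ClusteringPlan> requestedPlan,
                                    Set<String> commitMetaFileIds) {
    // Clustering: the plan is the only place the file IDs live, so use it
    // for both REQUESTED and INFLIGHT instants.
    if (requestedPlan.isPresent()) {
      return requestedPlan.get().inputGroups.stream()
          .flatMap(List::stream)
          .collect(Collectors.toSet());
    }
    // insert_overwrite / insert_overwrite_table: fall back to whatever the
    // inflight commit metadata recorded (may be empty while REQUESTED).
    return commitMetaFileIds;
  }
}
```

With this shape, an inflight clustering instant still reports the file IDs from its plan, while an inflight insert_overwrite falls through to the commit metadata.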
##########
File path:
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/transaction/ConcurrentOperation.java
##########
@@ -116,14 +116,27 @@ private void init(HoodieInstant instant) {
this.metadataWrapper.getMetadataFromTimeline().getHoodieReplaceCommitMetadata().getPartitionToWriteStats()).keySet();
this.operationType =
WriteOperationType.fromValue(this.metadataWrapper.getMetadataFromTimeline().getHoodieReplaceCommitMetadata().getOperationType());
} else {
- HoodieRequestedReplaceMetadata requestedReplaceMetadata =
this.metadataWrapper.getMetadataFromTimeline().getHoodieRequestedReplaceMetadata();
- this.mutatedFileIds = requestedReplaceMetadata
- .getClusteringPlan().getInputGroups()
- .stream()
- .flatMap(ig -> ig.getSlices().stream())
- .map(file -> file.getFileId())
- .collect(Collectors.toSet());
- this.operationType = WriteOperationType.CLUSTER;
+ // we need to different handling for requested and inflight
replacecommit because
Review comment:
nit: `we need to different handling` -> `we need to have different
handling`