yihua commented on code in PR #13556:
URL: https://github.com/apache/hudi/pull/13556#discussion_r2217501838
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/compact/HoodieCompactor.java:
##########
@@ -186,7 +187,7 @@ private FileSlice
getFileSliceFromOperation(CompactionOperation operation, Strin
return new FileSlice(
operation.getFileGroupId(),
operation.getBaseInstantTime(),
- baseFileOpt.isPresent() ? baseFileOpt.get() : null,
+ operationType == WriteOperationType.LOG_COMPACT ||
baseFileOpt.isEmpty() ? null : baseFileOpt.get(),
Review Comment:
On the same topic of constructing `FileGroupReaderBasedAppendHandle`, the
file slice is derived from the operation, which is also passed to
`FileGroupReaderBasedAppendHandle`. So should
`FileGroupReaderBasedAppendHandle` take the `operation` only and construct the
file slice within `FileGroupReaderBasedAppendHandle` instead of taking an
additional argument `fileSlice`? Similar for `FileGroupReaderBasedMergeHandle`.
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/FileGroupReaderBasedAppendHandle.java:
##########
@@ -114,6 +115,7 @@ public List<WriteStatus> close() {
if (writeStatus.getStat().getRuntimeStats() != null) {
writeStatus.getStat().getRuntimeStats().setTotalScanTime(readStats.getTotalLogReadTimeMs());
}
+ writeStatus.getStat().setPrevCommit(operation.getBaseInstantTime());
Review Comment:
Actually the `prevCommit` represents the base instant time or "version" of
the file slice, so this is correct.
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/compact/HoodieCompactor.java:
##########
@@ -186,7 +187,7 @@ private FileSlice
getFileSliceFromOperation(CompactionOperation operation, Strin
return new FileSlice(
operation.getFileGroupId(),
operation.getBaseInstantTime(),
- baseFileOpt.isPresent() ? baseFileOpt.get() : null,
+ operationType == WriteOperationType.LOG_COMPACT ||
baseFileOpt.isEmpty() ? null : baseFileOpt.get(),
Review Comment:
Should the filtering or drop of base file happen inside
`FileGroupReaderBasedAppendHandle` before being passed to the FGReader for
compacting log files only so that the `fileSlice` in
`FileGroupReaderBasedAppendHandle` still represents the latest file slice, to
avoid confusion and possible breaking change during consolidation of append
handles in the future?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]