umehrot2 commented on a change in pull request #1459: [HUDI-418] [HUDI-421]
Bootstrap Index using HFile and File System View Changes with unit-test
URL: https://github.com/apache/incubator-hudi/pull/1459#discussion_r404458181
##########
File path:
hudi-common/src/main/java/org/apache/hudi/common/table/view/AbstractTableFileSystemView.java
##########
@@ -395,9 +448,10 @@ protected FileSlice filterBaseFileAfterPendingCompaction(FileSlice fileSlice) {
   public final Stream<HoodieBaseFile> getLatestBaseFilesInRange(List<String> commitsToReturn) {
     try {
       readLock.lock();
-      return fetchAllStoredFileGroups().map(fileGroup -> Option.fromJavaOptional(
+      return fetchAllStoredFileGroups().map(fileGroup -> Pair.of(fileGroup.getFileGroupId(), Option.fromJavaOptional(
           fileGroup.getAllBaseFiles().filter(baseFile -> commitsToReturn.contains(baseFile.getCommitTime())
-              && !isBaseFileDueToPendingCompaction(baseFile)).findFirst())).filter(Option::isPresent).map(Option::get);
+              && !isBaseFileDueToPendingCompaction(baseFile)).findFirst()))).filter(p -> p.getValue().isPresent())
+          .map(p -> addExternalBaseFileIfPresent(p.getKey(), p.getValue().get()));
Review comment:
I think it would be better to push the code that adds the external base file,
if present, down into `fetchAllStoredFileGroups()`. That would avoid the many
changes we are making to this class to add external files, and we would not
have to remember to do this in any future methods we add. Also, in general I
think it's good if the underlying implementations return file groups formed
with all the information (including external files) instead of this class
managing it.
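
To illustrate the suggestion, here is a minimal sketch of attaching external files once, at the fetch level, so that callers such as `getLatestBaseFilesInRange` keep their original shape. It does not use the actual Hudi classes: `PushDownSketch`, `BaseFile`, `FileGroup`, `View`, `withExternalFiles`, and the `externalFilesByGroupId` map are hypothetical, simplified stand-ins, and the real lookup of external files may work quite differently.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PushDownSketch {

  // Hypothetical, simplified stand-in for a base file.
  static final class BaseFile {
    final String commitTime;
    final String externalPath; // null when no external file is attached

    BaseFile(String commitTime, String externalPath) {
      this.commitTime = commitTime;
      this.externalPath = externalPath;
    }

    BaseFile withExternalPath(String path) {
      return new BaseFile(commitTime, path);
    }
  }

  // Hypothetical, simplified stand-in for a file group.
  static final class FileGroup {
    final String fileGroupId;
    final List<BaseFile> baseFiles;

    FileGroup(String fileGroupId, List<BaseFile> baseFiles) {
      this.fileGroupId = fileGroupId;
      this.baseFiles = baseFiles;
    }
  }

  static final class View {
    private final List<FileGroup> stored;
    // Assumption: external files can be looked up by file group id.
    private final Map<String, String> externalFilesByGroupId;

    View(List<FileGroup> stored, Map<String, String> externalFilesByGroupId) {
      this.stored = stored;
      this.externalFilesByGroupId = externalFilesByGroupId;
    }

    // The only place that knows about external files: every file group
    // leaves this method already carrying its external file, if any.
    Stream<FileGroup> fetchAllStoredFileGroups() {
      return stored.stream().map(this::withExternalFiles);
    }

    private FileGroup withExternalFiles(FileGroup group) {
      String external = externalFilesByGroupId.get(group.fileGroupId);
      if (external == null) {
        return group;
      }
      List<BaseFile> decorated = group.baseFiles.stream()
          .map(f -> f.withExternalPath(external))
          .collect(Collectors.toList());
      return new FileGroup(group.fileGroupId, decorated);
    }

    // The caller needs no Pair and no extra mapping step.
    Stream<BaseFile> getLatestBaseFilesInRange(List<String> commitsToReturn) {
      return fetchAllStoredFileGroups()
          .map(g -> g.baseFiles.stream()
              .filter(f -> commitsToReturn.contains(f.commitTime))
              .findFirst())
          .filter(Optional::isPresent)
          .map(Optional::get);
    }
  }
}
```

With this shape, any method added later that goes through `fetchAllStoredFileGroups()` would pick up external files without further changes.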