[email protected] has posted comments on this change. ( http://gerrit.cloudera.org:8080/19043 )
Change subject: IMPALA-11591: Avoid calling planFiles() on Iceberg tables ...................................................................... Patch Set 2: (7 comments) Thanks for this improving, it`s will be very helpful! I went through the code and left a few comments. http://gerrit.cloudera.org:8080/#/c/19043/2/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java File fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java: http://gerrit.cloudera.org:8080/#/c/19043/2/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@567 PS2, Line 567: for (ContentFile contentFile : allFiles.first) { : String pathHash = IcebergUtil.getFilePathHash(contentFile); : HdfsPartition.FileDescriptor fd = getOrCreateIcebergFd( : table, hdfsFileDescMap, contentFile); : fileStore.addDataFileDescriptor(pathHash, fd); : } : for (ContentFile contentFile : allFiles.second) { : String pathHash = IcebergUtil.getFilePathHash(contentFile); : HdfsPartition.FileDescriptor fd = getOrCreateIcebergFd( : table, hdfsFileDescMap, contentFile); : fileStore.addDeleteFileDescriptor(pathHash, fd); : } nit: The logic of these two for loops is similar, so would it be better if we can extract a common method? http://gerrit.cloudera.org:8080/#/c/19043/2/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java File fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java: http://gerrit.cloudera.org:8080/#/c/19043/2/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java@44 PS2, Line 44: private List<FileDescriptor> dataFiles_ = new ArrayList<>(); : private List<FileDescriptor> deleteFiles_ = new ArrayList<>(); nit: We can add a comment so that others can understand them quickly. http://gerrit.cloudera.org:8080/#/c/19043/2/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java@48 PS2, Line 48: private ConcurrentMap We already initialized the member variable at declaration moment and will not change it, would it be better to mark it as a final member variable? This will better protect the member from being modified. http://gerrit.cloudera.org:8080/#/c/19043/2/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java@106 PS2, Line 106: tFileStore.getPath_hash_to_data_file() The Field Requiredness of 'TIcebergContentFileStore.path_hash_to_data_file ' is 'optional', so it may be null. Should we use "isset" flags indicate whether the optional Field isset or not? http://gerrit.cloudera.org:8080/#/c/19043/2/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java@108 PS2, Line 108: tFileStore.getPath_hash_to_delete_file() ditto http://gerrit.cloudera.org:8080/#/c/19043/2/fe/src/main/java/org/apache/impala/catalog/IcebergPositionDeleteTable.java File fe/src/main/java/org/apache/impala/catalog/IcebergPositionDeleteTable.java: http://gerrit.cloudera.org:8080/#/c/19043/2/fe/src/main/java/org/apache/impala/catalog/IcebergPositionDeleteTable.java@119 PS2, Line 119: Preconditions.checkState(false, "This should never be called."); : return null; This is OK, but I think 'throw new NotImplementedException("msg");' is much cleaner, it`s just one line code. http://gerrit.cloudera.org:8080/#/c/19043/2/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java File fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java: http://gerrit.cloudera.org:8080/#/c/19043/2/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@471 PS2, Line 471: getDefaultPartitionSpec Whether time-travel should be considered? If this gets the most recent 'PartitionSpec' using 'getDefaultPartitionSpec' and queries a previous moment, the result is incorrect. -- To view, visit http://gerrit.cloudera.org:8080/19043 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iadb883a28602bb68cf4f61e57cdd691605045ac5 Gerrit-Change-Number: 19043 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-Nagy <[email protected]> Gerrit-Reviewer: Anonymous Coward <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Tamas Mate <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Tue, 27 Sep 2022 04:17:36 +0000 Gerrit-HasComments: Yes
