[email protected] has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19043 )

Change subject: IMPALA-11591: Avoid calling planFiles() on Iceberg tables
......................................................................


Patch Set 2:

(7 comments)

Thanks for this improving, it`s will be very helpful!
I went through the code and left a few comments.

http://gerrit.cloudera.org:8080/#/c/19043/2/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
File fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java:

http://gerrit.cloudera.org:8080/#/c/19043/2/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@567
PS2, Line 567: for (ContentFile contentFile : allFiles.first) {
             :         String pathHash = 
IcebergUtil.getFilePathHash(contentFile);
             :         HdfsPartition.FileDescriptor fd = getOrCreateIcebergFd(
             :             table, hdfsFileDescMap, contentFile);
             :         fileStore.addDataFileDescriptor(pathHash, fd);
             :       }
             :       for (ContentFile contentFile : allFiles.second) {
             :         String pathHash = 
IcebergUtil.getFilePathHash(contentFile);
             :         HdfsPartition.FileDescriptor fd = getOrCreateIcebergFd(
             :             table, hdfsFileDescMap, contentFile);
             :         fileStore.addDeleteFileDescriptor(pathHash, fd);
             :       }
nit: The logic of these two for loops is similar, so would it be better if we 
can extract a common method?


http://gerrit.cloudera.org:8080/#/c/19043/2/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java
File fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java:

http://gerrit.cloudera.org:8080/#/c/19043/2/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java@44
PS2, Line 44:  private List<FileDescriptor> dataFiles_ = new ArrayList<>();
            :   private List<FileDescriptor> deleteFiles_ = new ArrayList<>();
nit: We can add a comment so that others can understand them quickly.


http://gerrit.cloudera.org:8080/#/c/19043/2/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java@48
PS2, Line 48: private ConcurrentMap
We already initialized the member variable at declaration moment and will not 
change it, would it be better to mark it as a final member variable? This will 
better protect the member from being modified.


http://gerrit.cloudera.org:8080/#/c/19043/2/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java@106
PS2, Line 106: tFileStore.getPath_hash_to_data_file()
The Field Requiredness of 'TIcebergContentFileStore.path_hash_to_data_file ' is 
'optional', so it may be null. Should we use "isset" flags indicate whether the 
optional Field isset or not?


http://gerrit.cloudera.org:8080/#/c/19043/2/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java@108
PS2, Line 108: tFileStore.getPath_hash_to_delete_file()
ditto


http://gerrit.cloudera.org:8080/#/c/19043/2/fe/src/main/java/org/apache/impala/catalog/IcebergPositionDeleteTable.java
File fe/src/main/java/org/apache/impala/catalog/IcebergPositionDeleteTable.java:

http://gerrit.cloudera.org:8080/#/c/19043/2/fe/src/main/java/org/apache/impala/catalog/IcebergPositionDeleteTable.java@119
PS2, Line 119:     Preconditions.checkState(false, "This should never be 
called.");
             :     return null;
This is OK, but I think 'throw new NotImplementedException("msg");' is much 
cleaner, it`s just one line code.


http://gerrit.cloudera.org:8080/#/c/19043/2/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
File fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java:

http://gerrit.cloudera.org:8080/#/c/19043/2/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@471
PS2, Line 471: getDefaultPartitionSpec
Whether time-travel should be considered? If this gets the most recent 
'PartitionSpec' using 'getDefaultPartitionSpec' and queries a previous moment, 
the result is incorrect.



--
To view, visit http://gerrit.cloudera.org:8080/19043
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iadb883a28602bb68cf4f61e57cdd691605045ac5
Gerrit-Change-Number: 19043
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Tamas Mate <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Tue, 27 Sep 2022 04:17:36 +0000
Gerrit-HasComments: Yes

Reply via email to