Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/22458 )
Change subject: IMPALA-13737: Directly load file metadata via IcebergFileMetadataLoader ...................................................................... Patch Set 3: (11 comments) Thanks for the comments! http://gerrit.cloudera.org:8080/#/c/22458/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/22458/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@246 PS2, Line 246: Whether to skip file metadata loading. Iceberg tables take care of file metadata : // loading themselv > If this property is determined by whether it is an Iceberg table, we could Changing the constructor would be more invasive and might break unexpected code (FENG). Added a TODO to remove this flag once we resolved IMPALA-13734. http://gerrit.cloudera.org:8080/#/c/22458/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@246 PS2, Line 246: Whether to skip file metadata loading. Iceberg tables take care of file metadata : // loading themselv > Could you explain what determines whether this flag is set to true or false Correct. Done. http://gerrit.cloudera.org:8080/#/c/22458/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@247 PS2, Line 247: is par > nit: metadata Done http://gerrit.cloudera.org:8080/#/c/22458/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@248 PS2, Line 248: ct. > If it's only used for Iceberg tables, it could be called 'skipIcebergFileMe Done http://gerrit.cloudera.org:8080/#/c/22458/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@340 PS2, Line 340: private long lastVersionSeenByTopicUpdate_ = -1; : > nit: could be in one line Done http://gerrit.cloudera.org:8080/#/c/22458/2/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java File fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java: http://gerrit.cloudera.org:8080/#/c/22458/2/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java@195 PS2, Line 195: Preconditions.checkNotNull(iceApiTable); > Why was the 'Preconditions.checkNotNull(iceApiTable)' check removed? Re-added the NULL check. http://gerrit.cloudera.org:8080/#/c/22458/2/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java@340 PS2, Line 340: : public static IcebergContentFileStore fromThrift(TIcebergContentFileStore tFileStore, : List<TNetworkAddress> networkAddresses, > This TODO was left here because it was moved out of the scope of IMPALA-112 Removed this comment as with this change the file descriptors are not duplicated anymore. http://gerrit.cloudera.org:8080/#/c/22458/2/fe/src/main/java/org/apache/impala/catalog/IcebergFileMetadataLoader.java File fe/src/main/java/org/apache/impala/catalog/IcebergFileMetadataLoader.java: http://gerrit.cloudera.org:8080/#/c/22458/2/fe/src/main/java/org/apache/impala/catalog/IcebergFileMetadataLoader.java@135 PS2, Line 135: tru > Delete 'not'. Nice catch! http://gerrit.cloudera.org:8080/#/c/22458/2/fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java File fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java: http://gerrit.cloudera.org:8080/#/c/22458/2/fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java@102 PS2, Line 102: Preconditions.checkState(!HdfsFileFormat.ICEBERG.equals(format)); > If it is not used for Iceberg tables, we could add a DCHECK that the format Added a precondition check. http://gerrit.cloudera.org:8080/#/c/22458/2/fe/src/main/java/org/apache/impala/util/IcebergUtil.java File fe/src/main/java/org/apache/impala/util/IcebergUtil.java: http://gerrit.cloudera.org:8080/#/c/22458/2/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@1043 PS2, Line 1043: > Unrelated nit: "cf". Good catch, done. http://gerrit.cloudera.org:8080/#/c/22458/2/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@1126 PS2, Line 1126: // This needs to be consistent with getPartitionValue(). > Couldn't the table have a different null partition key value set, is it alw It doesn't really matter, as NULLs are tracked by the Iceberg manifest files. The important thing is that this needs to be consistent with getPartitionValue(). Added a comment about this. -- To view, visit http://gerrit.cloudera.org:8080/22458 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iaf7e23ec21b65036b47edadcb4cbe4b64be3baee Gerrit-Change-Number: 22458 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy <[email protected]> Gerrit-Reviewer: Daniel Becker <[email protected]> Gerrit-Reviewer: Gabor Kaszab <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Tue, 11 Feb 2025 15:20:01 +0000 Gerrit-HasComments: Yes
