Mihaly Szjatinya has posted comments on this change. ( http://gerrit.cloudera.org:8080/24041 )
Change subject: IMPALA-14583: Support partial RPC dispatch for Iceberg tables ...................................................................... Patch Set 4: (11 comments) Thanks for reivew! Also fixed testLoadIcebergFileDescriptors in this patch. http://gerrit.cloudera.org:8080/#/c/24041/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/24041/3//COMMIT_MSG@16 PS3, Line 16: This patch implements : offset-based pagination to allow catalogd to truncate the response at : any point in the file list, not just at partition boundaries. > Do we also change the corresponding implementation for HdfsTable? No, it's only for Iceberg. http://gerrit.cloudera.org:8080/#/c/24041/3/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java File fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java: http://gerrit.cloudera.org:8080/#/c/24041/3/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java@154 PS3, Line 154: > nit: using Collections.emptyMap() to avoid creating lots of empty maps. I t Ack http://gerrit.cloudera.org:8080/#/c/24041/3/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java@162 PS3, Line 162: if (startOffset == 0 && endOffset == fileDescList_.size()) return toThrift(); : : if (reverseMap_ == null) { : > nit: can be simplified as 'reverseMap' has got some backlash already. What about this: change it to a field with lazy instantiation. This way it's populated only for container with boundary (thanks to above's fast path), and only once. Applied stream nit. http://gerrit.cloudera.org:8080/#/c/24041/3/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java@586 PS3, Line 586: : /** : * Count total files in a TIcebergContentFileStore. : */ : public static long countContentFiles(TIcebergContentFileStore contentFiles) { : if (contentFiles == null) return 0; : : return contentFiles.getPath_hash_to_data_file_without_deletesSize() : + contentFiles.getPath_hash_to_data_file_with_deletesSize() : + contentFiles.getPath_hash_to_position_delete_fileSize() : + contentFiles.getPath_hash_to_equality_delete_fileSize(); : } : : /** > nit: this can be simplified as Ack http://gerrit.cloudera.org:8080/#/c/24041/3/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java File fe/src/main/java/org/apache/impala/catalog/IcebergTable.java: http://gerrit.cloudera.org:8080/#/c/24041/3/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@892 PS3, Line 892: if (fileOffset == 0) { : partialContentFiles.setTotal_file_count(totalFiles); : } > nit: To be consistent, this can be moved into IcebergContentFileStore.toThr Pls check https://gerrit.cloudera.org/c/24041/3..4/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java#1282 In latest patch, I had to unset it separately at coordinator's site to be consistent with toThrift (fixing the regressed testLoadIcebergFileDescriptors). Maybe in that light we want to keep it separate? http://gerrit.cloudera.org:8080/#/c/24041/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/24041/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@104 PS3, Line 104: import org.apache.impala.thrift.TNetworkAddress; > nit: unused import Ack http://gerrit.cloudera.org:8080/#/c/24041/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@1241 PS3, Line 1241: long filesReceived = IcebergContentFileStore.countContentFiles( > nit: this comment corresponds to L1246. Maybe move it down. Hmm, it seems accurate. Patch mismatch? http://gerrit.cloudera.org:8080/#/c/24041/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@1286 PS3, Line 1286: } : : @Ov > nit: stale comment? Ack http://gerrit.cloudera.org:8080/#/c/24041/3/tests/custom_cluster/test_local_catalog.py File tests/custom_cluster/test_local_catalog.py: http://gerrit.cloudera.org:8080/#/c/24041/3/tests/custom_cluster/test_local_catalog.py@726 PS3, Line 726: > nit: please fix the idention, e.g. moving all parameters to the next line a Ack http://gerrit.cloudera.org:8080/#/c/24041/3/tests/custom_cluster/test_local_catalog.py@745 PS3, Line 745: self.assert_impalad_log_contains( > nit: add a blank line after this Ack http://gerrit.cloudera.org:8080/#/c/24041/3/tests/custom_cluster/test_local_catalog.py@821 PS3, Line 821: > nit: we can use parse_query_id() directly Thanks -- To view, visit http://gerrit.cloudera.org:8080/24041 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f2c058b7cc8efc15bac9fe0e91baadbb7b92cbb Gerrit-Change-Number: 24041 Gerrit-PatchSet: 4 Gerrit-Owner: Mihaly Szjatinya <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Mihaly Szjatinya <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Wed, 11 Mar 2026 17:21:07 +0000 Gerrit-HasComments: Yes
