Mihaly Szjatinya has posted comments on this change. ( http://gerrit.cloudera.org:8080/24041 )
Change subject: IMPALA-14583: Support partial RPC dispatch for Iceberg tables ...................................................................... Patch Set 3: (8 comments) Thanks! See the answers in threads. Also fixed some regression tests. http://gerrit.cloudera.org:8080/#/c/24041/1/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java File fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java: http://gerrit.cloudera.org:8080/#/c/24041/1/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java@149 PS1, Line 149: * @param startOffset Index to start from (0-based) > nit: create ret only after fast return path Ack http://gerrit.cloudera.org:8080/#/c/24041/1/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java@159 PS1, Line 159: ()) return toThrift(); > Maybe this could be thought through in a follow up patch. Yeah it's a doubly linked list. http://gerrit.cloudera.org:8080/#/c/24041/1/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java@175 PS1, Line 175: } > Isn't ret.size() always endOffset - startOffset? if yes, it would be enough You're right, it's unnecessary. http://gerrit.cloudera.org:8080/#/c/24041/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/24041/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@1230 PS2, Line 1230: ithPagin > With pagination? Ack http://gerrit.cloudera.org:8080/#/c/24041/2/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@1248 PS2, Line 1248: p.table_info.iceberg_table.content_files.isSetTotal_file_count(), > It could be treated as an exception if total_file_count is not returned on Done http://gerrit.cloudera.org:8080/#/c/24041/2/tests/custom_cluster/test_local_catalog.py File tests/custom_cluster/test_local_catalog.py: http://gerrit.cloudera.org:8080/#/c/24041/2/tests/custom_cluster/test_local_catalog.py@791 PS2, Line 791: "functional.alltypestiny".format(tbl)) : exception = self.execute_query_expect_failure( > How much time does this take? My guess is 150 DMLs would be pretty slow - t Was almost a minute, and now less than 25s. Is that ok ? http://gerrit.cloudera.org:8080/#/c/24041/2/tests/custom_cluster/test_local_catalog.py@798 PS2, Line 798: assert err in str(exception) : self.assert_catalogd_log_contains("ERROR", err) > the query id is not really needed as the database itself will be unique It's needed becase the latest insert statements themselves trigger metadata fetch with the same logs. http://gerrit.cloudera.org:8080/#/c/24041/2/tests/custom_cluster/test_local_catalog.py@801 PS2, Line 801: @pytest.mark.execute_serially : @CustomClusterTestSuite.with_args( : impalad_args="--use_local_catalog=true", : catalogd_args="--catalog_topic_mode=minimal --catalog_partial_fetch_max_files=20") : def test_incomplete_iceberg_file_list(self, unique_database): : """Test that coordinator can fetch missing Iceberg files when catalogd truncates : the file list in the response""" : # Create an Iceberg table with many data files : tbl = unique_database + ".iceberg_tbl" : self.execute_query( : "create table {0} (id int, `data` string) stored as iceberg".format(tbl)) : # Insert multiple times to create multiple data files : for i in range(30): : self.execute_query("insert into {0} values ({1}, 'data{1}')".format(tbl, i)) : : res = self.execute_query_expect_success(self.client, "show files in " + tbl) : # Should have 30 files : assert len(res.data) == 30 : : # Extract query_id from SHOW FILES to precisely check only its log messages : query_id = re.search(r"Query \(id=(.*)\):", res.runtime > Could have a utility function for this to use in both tests Done -- To view, visit http://gerrit.cloudera.org:8080/24041 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7f2c058b7cc8efc15bac9fe0e91baadbb7b92cbb Gerrit-Change-Number: 24041 Gerrit-PatchSet: 3 Gerrit-Owner: Mihaly Szjatinya <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Mihaly Szjatinya <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Mon, 09 Mar 2026 18:02:06 +0000 Gerrit-HasComments: Yes
