Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/23493 )
Change subject: IMPALA-14472: Add create/read support for ARRAY column of Kudu ...................................................................... Patch Set 4: (6 comments) Some high level comments, I am still in the process of understanding how Kudu handles arrays. http://gerrit.cloudera.org:8080/#/c/23493/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/23493/4//COMMIT_MSG@12 PS4, Line 12: Is there a documentation about Kudu array feature and mem layout? It would be nice to reference it in commit message/related code. http://gerrit.cloudera.org:8080/#/c/23493/4/be/src/exec/kudu/kudu-scanner.cc File be/src/exec/kudu/kudu-scanner.cc: http://gerrit.cloudera.org:8080/#/c/23493/4/be/src/exec/kudu/kudu-scanner.cc@469 PS4, Line 469: result.ptr = item_tuple_mem_pool->Allocate(total_tuple_byte_size); Is the Kudu array mem layout different than our tuple layout? Is it not possible to avoid copying in some cases? http://gerrit.cloudera.org:8080/#/c/23493/4/be/src/exec/kudu/kudu-scanner.cc@543 PS4, Line 543: // Evaluate the conjuncts that haven't been pushed down to Kudu. Conjunct evaluation out of scope for this commit: It would make sense to do "late materialization light" and if the conjuncts do not reference collection/timestamp/varchar tuples then do the conversions above only for tuples that survived the conjunct. This is especially important for arrays where memory gets allocated for each row. http://gerrit.cloudera.org:8080/#/c/23493/4/be/src/exec/kudu/kudu-util-ir.cc File be/src/exec/kudu/kudu-util-ir.cc: http://gerrit.cloudera.org:8080/#/c/23493/4/be/src/exec/kudu/kudu-util-ir.cc@165 PS4, Line 165: // KuduScanner::PAD_UNIXTIME_MICROS_TO_16_BYTES does not apply to array elements. Do you know if this is is by design or just a feature gap in Kudu? http://gerrit.cloudera.org:8080/#/c/23493/4/tests/query_test/test_kudu.py File tests/query_test/test_kudu.py: http://gerrit.cloudera.org:8080/#/c/23493/4/tests/query_test/test_kudu.py@1868 PS4, Line 1868: functional_kudu Why functional_kudu, can't the test use a unique_database? This could cause interference with other tests if they do SHOW TABLES in functional_kudu http://gerrit.cloudera.org:8080/#/c/23493/4/tests/query_test/test_kudu.py@1966 PS4, Line 1966: expected_tuples = list(zip( : range(len(sorted_tuples)), : *[EXPECTED_COLUMNS[item_type] for item_type in types] : )) Wouldn't it be easier to use a .test file? SUPPORTED_ITEM_TYPES seems constant. -- To view, visit http://gerrit.cloudera.org:8080/23493 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9282aac821bd30668189f84b2ed8fff7047e7310 Gerrit-Change-Number: 23493 Gerrit-PatchSet: 4 Gerrit-Owner: Riza Suminto <[email protected]> Gerrit-Reviewer: Abhishek Chennaka <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Reviewer: Xuebin Su <[email protected]> Gerrit-Comment-Date: Tue, 07 Oct 2025 17:21:45 +0000 Gerrit-HasComments: Yes
