Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23493 )

Change subject: IMPALA-14472: Add create/read support for ARRAY column of Kudu
......................................................................


Patch Set 4:

(6 comments)

Some high level comments, I am still in the process of understanding how Kudu 
handles arrays.

http://gerrit.cloudera.org:8080/#/c/23493/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/23493/4//COMMIT_MSG@12
PS4, Line 12:
Is there a documentation about Kudu array feature and mem layout? It would be 
nice to reference it in commit message/related code.


http://gerrit.cloudera.org:8080/#/c/23493/4/be/src/exec/kudu/kudu-scanner.cc
File be/src/exec/kudu/kudu-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/23493/4/be/src/exec/kudu/kudu-scanner.cc@469
PS4, Line 469:     result.ptr = 
item_tuple_mem_pool->Allocate(total_tuple_byte_size);
Is the Kudu array mem layout different than our tuple layout? Is it not 
possible to avoid copying in some cases?


http://gerrit.cloudera.org:8080/#/c/23493/4/be/src/exec/kudu/kudu-scanner.cc@543
PS4, Line 543:     // Evaluate the conjuncts that haven't been pushed down to 
Kudu. Conjunct evaluation
out of scope for this commit:
It would make sense to do "late materialization light" and if the conjuncts do 
not reference collection/timestamp/varchar tuples then do the conversions above 
only for tuples that survived the conjunct. This is especially important for 
arrays where memory gets allocated for each row.


http://gerrit.cloudera.org:8080/#/c/23493/4/be/src/exec/kudu/kudu-util-ir.cc
File be/src/exec/kudu/kudu-util-ir.cc:

http://gerrit.cloudera.org:8080/#/c/23493/4/be/src/exec/kudu/kudu-util-ir.cc@165
PS4, Line 165:   // KuduScanner::PAD_UNIXTIME_MICROS_TO_16_BYTES does not apply 
to array elements.
Do you know if this is is by design or just a feature gap in Kudu?


http://gerrit.cloudera.org:8080/#/c/23493/4/tests/query_test/test_kudu.py
File tests/query_test/test_kudu.py:

http://gerrit.cloudera.org:8080/#/c/23493/4/tests/query_test/test_kudu.py@1868
PS4, Line 1868: functional_kudu
Why functional_kudu, can't the test use a unique_database? This could cause 
interference with other tests if they do SHOW TABLES in functional_kudu


http://gerrit.cloudera.org:8080/#/c/23493/4/tests/query_test/test_kudu.py@1966
PS4, Line 1966:     expected_tuples = list(zip(
              :         range(len(sorted_tuples)),
              :         *[EXPECTED_COLUMNS[item_type] for item_type in types]
              :     ))
Wouldn't it be easier to use a .test file? SUPPORTED_ITEM_TYPES seems constant.



--
To view, visit http://gerrit.cloudera.org:8080/23493
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9282aac821bd30668189f84b2ed8fff7047e7310
Gerrit-Change-Number: 23493
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Xuebin Su <[email protected]>
Gerrit-Comment-Date: Tue, 07 Oct 2025 17:21:45 +0000
Gerrit-HasComments: Yes

Reply via email to