Jason Fehr has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23709 )

Change subject: IMPALA-13552: Add support for size()/length() for array/map 
types
......................................................................


Patch Set 2:

(14 comments)

Overall a very solid change.  A few minor items to look into.

http://gerrit.cloudera.org:8080/#/c/23709/2/be/src/exprs/collection-functions-test.cc
File be/src/exprs/collection-functions-test.cc:

http://gerrit.cloudera.org:8080/#/c/23709/2/be/src/exprs/collection-functions-test.cc@34
PS2, Line 34:     param_desc.type = FunctionContext::Type::TYPE_ARRAY;
Do these tests also need to run against TYPE_MAP?


http://gerrit.cloudera.org:8080/#/c/23709/2/be/src/exprs/collection-functions-test.cc@126
PS2, Line 126:   EXPECT_EQ(5, result3.val);
Define a constant instead of using the magic number 5.


http://gerrit.cloudera.org:8080/#/c/23709/2/be/src/exprs/collection-functions-test.cc@128
PS2, Line 128:   // All should return the same count regardless of the "type"
             :   EXPECT_EQ(result1.val, result2.val);
             :   EXPECT_EQ(result2.val, result3.val);
Can these asserts be removed since the value stored in `.val` has already been 
asserted in previous steps?


http://gerrit.cloudera.org:8080/#/c/23709/2/be/src/exprs/collection-functions.h
File be/src/exprs/collection-functions.h:

http://gerrit.cloudera.org:8080/#/c/23709/2/be/src/exprs/collection-functions.h@17
PS2, Line 17:
            : #ifndef IMPALA_EXPRS_COLLECTION_FUNCTIONS_H
            : #define IMPALA_EXPRS_COLLECTION_FUNCTIONS_H
Nit: use #pragma once
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65868536#:~:text=Header%20include%20guards%3A%20for%20new%20code%2C%20we%20prefer%C2%A0%23pragma%20once%20because%20it%20is%20cleaner%20than%20the%20classic%20%23include%20guards


http://gerrit.cloudera.org:8080/#/c/23709/2/fe/src/main/java/org/apache/impala/analysis/CollectionFunctionsBuiltins.java
File 
fe/src/main/java/org/apache/impala/analysis/CollectionFunctionsBuiltins.java:

http://gerrit.cloudera.org:8080/#/c/23709/2/fe/src/main/java/org/apache/impala/analysis/CollectionFunctionsBuiltins.java@43
PS2, Line 43:     // Add more collection function registrations here as they're 
implemented:
            :     // registerMapKeysFunctions(db);
            :     // registerMapValuesFunctions(db);
            :     // registerArrayContainsFunctions(db);
Nit: No need for these comments, please remove.


http://gerrit.cloudera.org:8080/#/c/23709/2/fe/src/main/java/org/apache/impala/analysis/CollectionFunctionsBuiltins.java@80
PS2, Line 80: varArgs, userVisible,
Nit: these two need to their order switched.


http://gerrit.cloudera.org:8080/#/c/23709/2/fe/src/main/java/org/apache/impala/util/FunctionUtils.java
File fe/src/main/java/org/apache/impala/util/FunctionUtils.java:

http://gerrit.cloudera.org:8080/#/c/23709/2/fe/src/main/java/org/apache/impala/util/FunctionUtils.java@174
PS2, Line 174:  - use the original comparison
Nit: no need for this portion of the comment, please remove.


http://gerrit.cloudera.org:8080/#/c/23709/2/testdata/workloads/functional-query/queries/QueryTest/collection-functions.test
File 
testdata/workloads/functional-query/queries/QueryTest/collection-functions.test:

http://gerrit.cloudera.org:8080/#/c/23709/2/testdata/workloads/functional-query/queries/QueryTest/collection-functions.test@1
PS2, Line 1: ====
Can tests be added to ensure size() works on nested arrays/maps (e.g. an array 
of arrays or an array of maps)?


http://gerrit.cloudera.org:8080/#/c/23709/2/testdata/workloads/functional-query/queries/QueryTest/collection-functions.test@77
PS2, Line 77: ---- QUERY
            : # Test size() with empty array
            : select size(int_array) from complextypestbl where id = 3
            : ---- RESULTS
            : 0
            : ---- TYPES
            : int
            : ====
            : ---- QUERY
            : # Test size() with NULL array
            : select size(int_array) from complextypestbl where id = 4
            : ---- RESULTS
            : NULL
            : ---- TYPES
            : int
            : ====
            : ---- QUERY
            : # Test size() with empty map
            : select size(int_map) from complextypestbl where id = 4
            : ---- RESULTS
            : 0
            : ---- TYPES
            : int
            : ====
            : ---- QUERY
            : # Test size() with NULL map
            : select size(int_map) from complextypestbl where id = 6
            : ---- RESULTS
            : NULL
            : ---- TYPES
            : int
These tests do not seem to be testing anything new.  The previous tests cover 
these case (unless the where clause changes things).

If these test does not cover anything new, please remove them.


http://gerrit.cloudera.org:8080/#/c/23709/2/testdata/workloads/functional-query/queries/QueryTest/collection-functions.test@175
PS2, Line 175: ---- QUERY
             : # Test size() with nested arrays - check inner array
             : select id, size(int_array_array) from complextypestbl where id = 
1
             : ---- RESULTS
             : 1,2
             : ---- TYPES
             : bigint,int
This test does not seem to be testing anything new.  The test on line 33 covers 
this case (unless the where clause changes things).

If this test does not cover anything new, please remove it.


http://gerrit.cloudera.org:8080/#/c/23709/2/testdata/workloads/functional-query/queries/QueryTest/collection-functions.test@183
PS2, Line 183: ---- QUERY
             : # Test size() with ORDER BY
             : select id, size(int_array) as arr_size from complextypestbl 
order by arr_size desc nulls last, id
             : ---- RESULTS
             : 2,6
             : 1,3
             : 8,1
             : 3,0
             : 4,NULL
             : 5,NULL
             : 6,NULL
             : 7,NULL
             : ---- TYPES
             : bigint,int
This test is asserting the case where the return from a UDF is used in the 
order by clause.  This case should already be tested elsewhere which means this 
test can be removed.


http://gerrit.cloudera.org:8080/#/c/23709/2/testdata/workloads/functional-query/queries/QueryTest/collection-functions.test@219
PS2, Line 219: ---- QUERY
             : # Test size() with arithmetic operations
             : select id, size(int_array) * 2 as doubled_size from 
complextypestbl where size(int_array) is not null order by id
             : ---- RESULTS
             : 1,6
             : 2,12
             : 3,0
             : 8,2
             : ---- TYPES
             : bigint,bigint
This test is asserting the case where the return from a UDF can be used in 
arithmetic operations.  This case should already be tested elsewhere which 
means this test can be removed.


http://gerrit.cloudera.org:8080/#/c/23709/2/testdata/workloads/functional-query/queries/QueryTest/collection-functions.test@280
PS2, Line 280: ---- QUERY
             : # Test size() with JOIN
             : select t1.id, size(t1.int_array) as t1_size, size(t2.int_array) 
as t2_size
             : from complextypestbl t1 join complextypestbl t2 on t1.id = t2.id
             : where size(t1.int_array) is not null and size(t2.int_array) is 
not null
             : order by t1.id
             : ---- RESULTS
             : 1,3,3
             : 2,6,6
             : 3,0,0
             : 8,1,1
             : ---- TYPES
             : bigint,int,int
Is this test asserting a new test case?  Other tests already assert that size() 
can be used in a where clause.


http://gerrit.cloudera.org:8080/#/c/23709/2/testdata/workloads/functional-query/queries/QueryTest/collection-functions.test@295
PS2, Line 295: # Test size() with UNION ALL
             : select 'table1' as source, id, size(int_array) as arr_size from 
complextypestbl where id = 1
             : union all
             : select 'table2' as source, id, size(int_array) as arr_size from 
complextypestbl where id = 2
             : ---- RESULTS
             : 'table1',1,3
             : 'table2',2,6
             : ---- TYPES
             : string,bigint,int
This test is asserting the case where the return from a UDF can be used in a 
UNION.  This case should already be tested elsewhere which means this test can 
be removed.



--
To view, visit http://gerrit.cloudera.org:8080/23709
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I85b50386790657ea035addb35eed959d71810ab2
Gerrit-Change-Number: 23709
Gerrit-PatchSet: 2
Gerrit-Owner: Arnab Karmakar <[email protected]>
Gerrit-Reviewer: Arnab Karmakar <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Jason Fehr <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Peter Rozsa <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Xuebin Su <[email protected]>
Gerrit-Comment-Date: Thu, 12 Mar 2026 22:04:22 +0000
Gerrit-HasComments: Yes

Reply via email to