Jason Fehr has posted comments on this change. ( http://gerrit.cloudera.org:8080/23709 )
Change subject: IMPALA-13552: Add support for size()/length() for array/map types ...................................................................... Patch Set 2: (14 comments) Overall a very solid change. A few minor items to look into. http://gerrit.cloudera.org:8080/#/c/23709/2/be/src/exprs/collection-functions-test.cc File be/src/exprs/collection-functions-test.cc: http://gerrit.cloudera.org:8080/#/c/23709/2/be/src/exprs/collection-functions-test.cc@34 PS2, Line 34: param_desc.type = FunctionContext::Type::TYPE_ARRAY; Do these tests also need to run against TYPE_MAP? http://gerrit.cloudera.org:8080/#/c/23709/2/be/src/exprs/collection-functions-test.cc@126 PS2, Line 126: EXPECT_EQ(5, result3.val); Define a constant instead of using the magic number 5. http://gerrit.cloudera.org:8080/#/c/23709/2/be/src/exprs/collection-functions-test.cc@128 PS2, Line 128: // All should return the same count regardless of the "type" : EXPECT_EQ(result1.val, result2.val); : EXPECT_EQ(result2.val, result3.val); Can these asserts be removed since the value stored in `.val` has already been asserted in previous steps? http://gerrit.cloudera.org:8080/#/c/23709/2/be/src/exprs/collection-functions.h File be/src/exprs/collection-functions.h: http://gerrit.cloudera.org:8080/#/c/23709/2/be/src/exprs/collection-functions.h@17 PS2, Line 17: : #ifndef IMPALA_EXPRS_COLLECTION_FUNCTIONS_H : #define IMPALA_EXPRS_COLLECTION_FUNCTIONS_H Nit: use #pragma once https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65868536#:~:text=Header%20include%20guards%3A%20for%20new%20code%2C%20we%20prefer%C2%A0%23pragma%20once%20because%20it%20is%20cleaner%20than%20the%20classic%20%23include%20guards http://gerrit.cloudera.org:8080/#/c/23709/2/fe/src/main/java/org/apache/impala/analysis/CollectionFunctionsBuiltins.java File fe/src/main/java/org/apache/impala/analysis/CollectionFunctionsBuiltins.java: http://gerrit.cloudera.org:8080/#/c/23709/2/fe/src/main/java/org/apache/impala/analysis/CollectionFunctionsBuiltins.java@43 PS2, Line 43: // Add more collection function registrations here as they're implemented: : // registerMapKeysFunctions(db); : // registerMapValuesFunctions(db); : // registerArrayContainsFunctions(db); Nit: No need for these comments, please remove. http://gerrit.cloudera.org:8080/#/c/23709/2/fe/src/main/java/org/apache/impala/analysis/CollectionFunctionsBuiltins.java@80 PS2, Line 80: varArgs, userVisible, Nit: these two need to their order switched. http://gerrit.cloudera.org:8080/#/c/23709/2/fe/src/main/java/org/apache/impala/util/FunctionUtils.java File fe/src/main/java/org/apache/impala/util/FunctionUtils.java: http://gerrit.cloudera.org:8080/#/c/23709/2/fe/src/main/java/org/apache/impala/util/FunctionUtils.java@174 PS2, Line 174: - use the original comparison Nit: no need for this portion of the comment, please remove. http://gerrit.cloudera.org:8080/#/c/23709/2/testdata/workloads/functional-query/queries/QueryTest/collection-functions.test File testdata/workloads/functional-query/queries/QueryTest/collection-functions.test: http://gerrit.cloudera.org:8080/#/c/23709/2/testdata/workloads/functional-query/queries/QueryTest/collection-functions.test@1 PS2, Line 1: ==== Can tests be added to ensure size() works on nested arrays/maps (e.g. an array of arrays or an array of maps)? http://gerrit.cloudera.org:8080/#/c/23709/2/testdata/workloads/functional-query/queries/QueryTest/collection-functions.test@77 PS2, Line 77: ---- QUERY : # Test size() with empty array : select size(int_array) from complextypestbl where id = 3 : ---- RESULTS : 0 : ---- TYPES : int : ==== : ---- QUERY : # Test size() with NULL array : select size(int_array) from complextypestbl where id = 4 : ---- RESULTS : NULL : ---- TYPES : int : ==== : ---- QUERY : # Test size() with empty map : select size(int_map) from complextypestbl where id = 4 : ---- RESULTS : 0 : ---- TYPES : int : ==== : ---- QUERY : # Test size() with NULL map : select size(int_map) from complextypestbl where id = 6 : ---- RESULTS : NULL : ---- TYPES : int These tests do not seem to be testing anything new. The previous tests cover these case (unless the where clause changes things). If these test does not cover anything new, please remove them. http://gerrit.cloudera.org:8080/#/c/23709/2/testdata/workloads/functional-query/queries/QueryTest/collection-functions.test@175 PS2, Line 175: ---- QUERY : # Test size() with nested arrays - check inner array : select id, size(int_array_array) from complextypestbl where id = 1 : ---- RESULTS : 1,2 : ---- TYPES : bigint,int This test does not seem to be testing anything new. The test on line 33 covers this case (unless the where clause changes things). If this test does not cover anything new, please remove it. http://gerrit.cloudera.org:8080/#/c/23709/2/testdata/workloads/functional-query/queries/QueryTest/collection-functions.test@183 PS2, Line 183: ---- QUERY : # Test size() with ORDER BY : select id, size(int_array) as arr_size from complextypestbl order by arr_size desc nulls last, id : ---- RESULTS : 2,6 : 1,3 : 8,1 : 3,0 : 4,NULL : 5,NULL : 6,NULL : 7,NULL : ---- TYPES : bigint,int This test is asserting the case where the return from a UDF is used in the order by clause. This case should already be tested elsewhere which means this test can be removed. http://gerrit.cloudera.org:8080/#/c/23709/2/testdata/workloads/functional-query/queries/QueryTest/collection-functions.test@219 PS2, Line 219: ---- QUERY : # Test size() with arithmetic operations : select id, size(int_array) * 2 as doubled_size from complextypestbl where size(int_array) is not null order by id : ---- RESULTS : 1,6 : 2,12 : 3,0 : 8,2 : ---- TYPES : bigint,bigint This test is asserting the case where the return from a UDF can be used in arithmetic operations. This case should already be tested elsewhere which means this test can be removed. http://gerrit.cloudera.org:8080/#/c/23709/2/testdata/workloads/functional-query/queries/QueryTest/collection-functions.test@280 PS2, Line 280: ---- QUERY : # Test size() with JOIN : select t1.id, size(t1.int_array) as t1_size, size(t2.int_array) as t2_size : from complextypestbl t1 join complextypestbl t2 on t1.id = t2.id : where size(t1.int_array) is not null and size(t2.int_array) is not null : order by t1.id : ---- RESULTS : 1,3,3 : 2,6,6 : 3,0,0 : 8,1,1 : ---- TYPES : bigint,int,int Is this test asserting a new test case? Other tests already assert that size() can be used in a where clause. http://gerrit.cloudera.org:8080/#/c/23709/2/testdata/workloads/functional-query/queries/QueryTest/collection-functions.test@295 PS2, Line 295: # Test size() with UNION ALL : select 'table1' as source, id, size(int_array) as arr_size from complextypestbl where id = 1 : union all : select 'table2' as source, id, size(int_array) as arr_size from complextypestbl where id = 2 : ---- RESULTS : 'table1',1,3 : 'table2',2,6 : ---- TYPES : string,bigint,int This test is asserting the case where the return from a UDF can be used in a UNION. This case should already be tested elsewhere which means this test can be removed. -- To view, visit http://gerrit.cloudera.org:8080/23709 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I85b50386790657ea035addb35eed959d71810ab2 Gerrit-Change-Number: 23709 Gerrit-PatchSet: 2 Gerrit-Owner: Arnab Karmakar <[email protected]> Gerrit-Reviewer: Arnab Karmakar <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Jason Fehr <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Peter Rozsa <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Reviewer: Xuebin Su <[email protected]> Gerrit-Comment-Date: Thu, 12 Mar 2026 22:04:22 +0000 Gerrit-HasComments: Yes
