Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19252 )

Change subject: [IMPALA-11728] Set fallback database for functions
......................................................................


Patch Set 4:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/19252/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19252/4//COMMIT_MSG@7
PS4, Line 7: [IMPALA-11728]
we generally use the IMPALA-11728: ... format.


http://gerrit.cloudera.org:8080/#/c/19252/4/fe/src/main/java/org/apache/impala/analysis/FunctionName.java
File fe/src/main/java/org/apache/impala/analysis/FunctionName.java:

http://gerrit.cloudera.org:8080/#/c/19252/4/fe/src/main/java/org/apache/impala/analysis/FunctionName.java@126
PS4, Line 126: preferBuiltinsDb
I think that we should use fallback db only if preferBuiltinsDb is true - there 
are cases when we should ignore builtins, for example in CREATE FUNCTION / DROP 
FUNCTION, and my impression is that preferBuiltinsDb is used in these cases (I 
haven't checked all places where we call this function).


http://gerrit.cloudera.org:8080/#/c/19252/4/fe/src/main/java/org/apache/impala/analysis/FunctionName.java@128
PS4, Line 128:       } else if (fallbackDbContainsFn(analyzer)) {
I am not sure about the correct order here - checking the fallback db first 
would allow a nice way to override builtins.


http://gerrit.cloudera.org:8080/#/c/19252/4/fe/src/main/java/org/apache/impala/analysis/FunctionName.java@129
PS4, Line 129: analyzer.getQueryCtx()
             :                   .getClient_request()
             :                   .getQuery_options()
             :                   .getFallback_db_for_functions();
Can you create a function like String Analyzer.getFallbackDbForFunctions(), 
similarly to getDefaultDb?


http://gerrit.cloudera.org:8080/#/c/19252/4/tests/custom_cluster/test_fallback_db_for_functions.py
File tests/custom_cluster/test_fallback_db_for_functions.py:

http://gerrit.cloudera.org:8080/#/c/19252/4/tests/custom_cluster/test_fallback_db_for_functions.py@28
PS4, Line 28: CustomClusterTestSuite
Since there is a query option for fallback_db_for_functions, a simple (non 
CustomCluster) could be used, which would make test running faster.


http://gerrit.cloudera.org:8080/#/c/19252/4/tests/custom_cluster/test_fallback_db_for_functions.py@36
PS4, Line 36: fn
It would be nice to also test a function name that also exists as builtin 
function and check what is happening.


http://gerrit.cloudera.org:8080/#/c/19252/4/tests/custom_cluster/test_fallback_db_for_functions.py@39
PS4, Line 39: functional_parquet
It would be better to use a temporary db than persistent one like 
functional_parquet, as now the test is potentially dirty - for example if the 
assert at line 47 fails, the function will not be dropped, which can affect 
other tests.

I think that the simplest solution would be to move this test to an EE test, 
for example in 
https://github.com/apache/impala/blob/master/tests/query_test/test_udfs.py

In EE tests there is a fixture called unique_database, which can be used to get 
a temporary database that will be cleaned up after the test. You can find 
examples in test_udfs.py for using unique_database and for setting query 
options.



--
To view, visit http://gerrit.cloudera.org:8080/19252
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37b7e126718fea1c50723cacbaed898b20bb55e5
Gerrit-Change-Number: 19252
Gerrit-PatchSet: 4
Gerrit-Owner: Xiaoqing Gao <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Xiaoqing Gao <[email protected]>
Gerrit-Comment-Date: Mon, 21 Nov 2022 14:03:38 +0000
Gerrit-HasComments: Yes

Reply via email to