Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................


Patch Set 9:

(23 comments)

Great test section. I completed the review on the section and added some 
comments, most of them are minor.

Seems this patch can provide another simple and powerful tool for people to 
better optimize queries.

http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java:

http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5053
PS9, Line 5053: "Syntax error in line 1"
This error may not be user friendly for super long SQL query.

Could we report the error specifically as follows?

Table hint not recognized for <T>: a negative value (-1).


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5056
PS9, Line 5056: "Syntax error in line 1")
same comment as above.


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5060
PS9, Line 5060:  "Syntax error in line 1"
same comment as above.


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5064
PS9, Line 5064: table
nit. may add a qualifier "non-HDFS" before 'table' to make it clear.


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5071
PS9, Line 5071: "Syntax error in line 1"
See the comment on TABLE_NUM_ROWS for "syntax error in line 1".


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5085
PS9, Line 5085:   "Syntax error in line 1")
is this a mistake? 0.1 is a perfect selectivity value.


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5087
PS9, Line 5087: Syntax error in line 1"
Same as above. 0 is okay.


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5093
PS9, Line 5093: (
should be [0,1], right?


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5098
PS9, Line 5098: 0.0
See the comment about. Should be allowed.


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5102
PS9, Line 5102: l_shipdate <= (select '1998-09-02') " +
              :             "/* +SELECTIVITY(0.5)
>From the updated parser, it seems this hint should be accepted in that the 
>entire predicate is recognized first l_shipdate <= (select '1998-09-02'), 
>followed by the hint.


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/hdfs-cardinality-hint.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/hdfs-cardinality-hint.test:

http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/hdfs-cardinality-hint.test@90
PS9, Line 90: DISTRIBUTEDPLAN
same argument as for selectivity hints. We probably do not need to test 
PARALLEL and DISTRIBUTED plan.


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/hdfs-cardinality-hint.test@261
PS9, Line 261: 100000
Maybe in the future we could allow short-hand notation such as 100K, 10G etc.


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test:

http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@2
PS9, Line 2: default value is 0.1
Please add a comment after it to indicate the cardinality clause in the scan 
node 00 shows the actual value after the selectivity of 0.1 is applied: 6M X 
0.01 = 600.12K.


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@70
PS9, Line 70: little
nit. Since almost 98% values are less than '1998-09-02', we set


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@95
PS9, Line 95: 5.88M
nice.

I agree with QiangLong that we only need the serial plan in this test as the 
distributed plan is generated well after the cardinality is determined.


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@344
PS9, Line 344: There are no null value in 'l_shipdate' column, so this 
selectivity is 0
nit. Delete this sentence and add "Here we assume ...".


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@481
PS9, Line 481: predicate
nit. predicate's


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@643
PS9, Line 643: 5.31M
nice!


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@754
PS9, Line 754: This predicate cannot compute selectivity, default value is 0.1
nit. Delete this sentence and add "Here we assume ...".


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@780
PS9, Line 780: 3.57M
Can you double check? It seems to me 3M should be the right number.

See the test case for the following query

select count(1) from tpch.lineitem
where l_shipdate IS NULL /* +SELECTIVITY(0.5) */

at line 345.


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@822
PS9, Line 822: A simple example for selectivity hint to change join mode, just 
for testing!
nit. A simple example to show that selectivity hint can help change join mode 
and be used as an optimization tool.


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@826
PS9, Line 826: Default value is 0.1,
nit. Impala can not compute selectivity and assumes the default value of 0.1.  
If we assume the actual ...


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test@827
PS9, Line 827: will be
nit. becomes



--
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 9
Gerrit-Owner: wangsheng <[email protected]>
Gerrit-Reviewer: Amogh Margoor <[email protected]>
Gerrit-Reviewer: Fucun Chu <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: wangsheng <[email protected]>
Gerrit-Comment-Date: Wed, 07 Sep 2022 17:35:33 +0000
Gerrit-HasComments: Yes

Reply via email to