Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18829 )

Change subject: IMPALA-7942 (part 1): Add query hints for table cardinalities
......................................................................


Patch Set 5:

(5 comments)

Looks good!

Complete the code section and have a few comments.

I have not got a chance to review the test section. If it is similar to the one 
in the parent JIRA of this JIRA, would it be possible to address any concerns 
there in this JIRA?

http://gerrit.cloudera.org:8080/#/c/18829/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18829/5//COMMIT_MSG@9
PS5, Line 9: Currently, We need execute 'COMPUTE STATS' manually to compute
           : table stats info. Stats is very useful for query planning.
           : Without these stats, query plan maybe worse. In order to solve
           : this probelm, this patch adds a new query hint: 'TABLE_NUM_ROWS',
nit. Rewording.

Currently, we run 'COMPUTE STATS' command to compute table stats which is very 
useful for query planning. Without these stats, a query plan may not be 
optimal. However, these stats may not be available, up to date, or valid. To 
workaround this problem, this patch adds a new query hint: ......


http://gerrit.cloudera.org:8080/#/c/18829/5/fe/src/main/java/org/apache/impala/analysis/TableRef.java
File fe/src/main/java/org/apache/impala/analysis/TableRef.java:

http://gerrit.cloudera.org:8080/#/c/18829/5/fe/src/main/java/org/apache/impala/analysis/TableRef.java@176
PS5, Line 176: tableNumRowsHint_
This new data member in class TableRef implies that two table ref objects may 
be assigned with different num rows, even when both refer to the same SQL table.

select * from T /* table_num_row(100) */, T /* table_num_row(1000) */;

It sounds like we need to have a step to verify this and raise an error.


http://gerrit.cloudera.org:8080/#/c/18829/5/fe/src/main/java/org/apache/impala/analysis/TableRef.java@569
PS5, Line 569: supportTableHint
It is a good idea to specify the exact hint in question: supportTableRowHint().


http://gerrit.cloudera.org:8080/#/c/18829/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/18829/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1547
PS5, Line 1547:  * Currently, we provide a table hint 'TABLE_NUM_ROWS' to set 
table rows manually if
              :    * table has no stats or has corrupt stats. If the table has 
valid stats, this hint
              :    * will be ignored.
In my past experience with very large data warehouse, that the table num row 
hint can be applied to even a table with valid stats is very useful to judge 
plans under different table growth conditions.

For this patch, I feel it is okey to support it only for missing or non valid 
stats first, and extend it for valid stats later on.


http://gerrit.cloudera.org:8080/#/c/18829/5/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
File fe/src/main/java/org/apache/impala/planner/KuduScanNode.java:

http://gerrit.cloudera.org:8080/#/c/18829/5/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java@379
PS5, Line 379: no
nit. has no stats



--
To view, visit http://gerrit.cloudera.org:8080/18829
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9f0c773f4e67782a1428db64062f68afbd257af7
Gerrit-Change-Number: 18829
Gerrit-PatchSet: 5
Gerrit-Owner: wangsheng <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Fang-Yu Rao <[email protected]>
Gerrit-Reviewer: Fucun Chu <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: wangsheng <[email protected]>
Gerrit-Comment-Date: Thu, 08 Sep 2022 17:41:19 +0000
Gerrit-HasComments: Yes

Reply via email to