wangsheng has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18829 )

Change subject: IMPALA-7942 (part 1): Add query hints for table cardinalities
......................................................................


Patch Set 5:

(4 comments)

> (5 comments)
 >
 > Looks good!
 >
 > Complete the code section and have a few comments.
 >
 > I have not got a chance to review the test section. If it is
 > similar to the one in the parent JIRA of this JIRA, would it be
 > possible to address any concerns there in this JIRA?

Hi Qifan, thanks for continued interest in this feature. This patch is split 
from https://gerrit.cloudera.org/#/c/18023. We will focus on table cardinality 
hint in this patch, and reduce previous patch to focus on selectivity hint.

I've already minimize test cases in this patch. Remove unnecessary plan to 
keeping test file small. Such as PARALLEL and DISTRIBUTED plan, not exists in 
test file. Only 61 lines currently. If you are free, you can read it quickly 
and simply.

http://gerrit.cloudera.org:8080/#/c/18829/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18829/5//COMMIT_MSG@9
PS5, Line 9: Currently, We need execute 'COMPUTE STATS' manually to compute
           : table stats info. Stats is very useful for query planning.
           : Without these stats, query plan maybe worse. In order to solve
           : this probelm, this patch adds a new query hint: 'TABLE_NUM_ROWS',
> nit. Rewording.
Done
By the way, 'valid' -> 'invalid'?


http://gerrit.cloudera.org:8080/#/c/18829/5/fe/src/main/java/org/apache/impala/analysis/TableRef.java
File fe/src/main/java/org/apache/impala/analysis/TableRef.java:

http://gerrit.cloudera.org:8080/#/c/18829/5/fe/src/main/java/org/apache/impala/analysis/TableRef.java@176
PS5, Line 176: tableNumRowsHint_
> This new data member in class TableRef implies that two table ref objects m
Do you mean different 'table_num_row' value for same table in sql will cause 
different cardinality is a problem?
We should raise error for this situation?
Such as:
select count(1) from functional_parquet.alltypes t1/* +TABLE_NUM_ROWS(1000) 
*/,functional_parquet.alltypes t2/* +TABLE_NUM_ROWS(2000) */ where t1.id = 
t2.id;


http://gerrit.cloudera.org:8080/#/c/18829/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/18829/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1547
PS5, Line 1547:  * Currently, we provide a table hint 'TABLE_NUM_ROWS' to set 
table rows manually if
              :    * table has no stats or has corrupt stats. If the table has 
valid stats, this hint
              :    * will be ignored.
> In my past experience with very large data warehouse, that the table num ro
Thanks for adivce, Qifan. I agree that we can extend for valid stats later on 
if necessary.


http://gerrit.cloudera.org:8080/#/c/18829/5/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
File fe/src/main/java/org/apache/impala/planner/KuduScanNode.java:

http://gerrit.cloudera.org:8080/#/c/18829/5/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java@379
PS5, Line 379: no
> nit. has no stats
Done



--
To view, visit http://gerrit.cloudera.org:8080/18829
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9f0c773f4e67782a1428db64062f68afbd257af7
Gerrit-Change-Number: 18829
Gerrit-PatchSet: 5
Gerrit-Owner: wangsheng <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Fang-Yu Rao <[email protected]>
Gerrit-Reviewer: Fucun Chu <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: wangsheng <[email protected]>
Gerrit-Comment-Date: Tue, 13 Sep 2022 13:00:47 +0000
Gerrit-HasComments: Yes

Reply via email to