Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................


Patch Set 3:

(16 comments)

The idea behind this patch provides great value and I like it a lot.

My concern for this patch is that the hint can only be applied to simple 
predicate. I wonder if we can allow hints to complex predicates, such as

select * from t where (t.a > 1 or t.a < 1000) /* +selectivity (0.4) */

http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@9
PS3, Line 9: use
nit uses


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@10
PS3, Line 10: , other predicates
nit. remove?


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@11
PS3, Line 11: set these stats
            : manually in query to help us get better CBO.
nit. to reduce such errors.


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@20
PS3, Line 20: But
nit remove


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@21
PS3, Line 21: hint value only valid when table does not have stats or stats is
nit is valid


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@22
PS3, Line 22: corrupt. Otherwise, Impala will use table original stats.
I think we should raise a warning when the hint is not used.


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@24
PS3, Line 24: :
in non-compounding form.


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@24
PS3, Line 24: For 'SELECTIVITY' hint, we can use in these predicates:
nit. types of


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@49
PS3, Line 49: maybe does not take effect as you
            : expected.
> nit: might not take the expected effect
I wonder why we can not assign a selectivity to a complex predicate in this 
patch, as it is usually difficult to figure such selectivity out.


http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@51
PS3, Line 51: Another thing, for 'BetweenPredicate', Impala will transfom this
            : predicate to a 'CompoundPredicate' with two 'BinaryPredicate', if
            : set hint for 'BetweenPredicate' in query, we will split this hint
            : value for two 'BinaryPredicate' children.
I wonder if this is correct.

The selectivity at the BETWEEN level has nothing to do with the selectivity  at 
the two transformed binary predicates.

If we support selectivity for complex predicate, we do not have to "push" the 
selectivity down.


http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java:

http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@248
PS3, Line 248: selectivity_ > 0
nit. This prevents a 0 selectivity. Maybe use -2 to indicate non-computable 
selectivity?


http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/InPredicate.java
File fe/src/main/java/org/apache/impala/analysis/InPredicate.java:

http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/InPredicate.java@177
PS3, Line 177: elect
same comment as for BinaryPredicate.java


http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
File fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java:

http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java@143
PS3, Line 143: electivity
same comment as for BinaryPredicate.java


http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/TableRef.java
File fe/src/main/java/org/apache/impala/analysis/TableRef.java:

http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/TableRef.java@175
PS3, Line 175: hdfsNumRowsHint_
I wonder we can use a more generic name numRowsHint_ as the hint can be applied 
to other types of tables equally well.

We can apply the hint to HDFS only tables in this patch and a check is already 
in place in this file:

494     // BaseTableRef will always have their path resolved at this point.     
495     Preconditions.checkState(getResolvedPath() != null);
496     if (getResolvedPath().destTable() != null &&
497         !(getResolvedPath().destTable() instanceof FeFsTable)) {            
  
498       analyzer.addWarning("Table hints only supported for Hdfs tables");
499     }


http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/TableRef.java@543
PS3, Line 543: // This hint is only valid for hdfs table
nit. This comment probably should be moved to line 547.


http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@331
PS3, Line 331:  private long hdfsNumRowsHint_ = -1;
I think we probably should move it to ScanNode with the name numRowsHints_.



--
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 3
Gerrit-Owner: wangsheng <[email protected]>
Gerrit-Reviewer: Amogh Margoor <[email protected]>
Gerrit-Reviewer: Fucun Chu <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: wangsheng <[email protected]>
Gerrit-Comment-Date: Wed, 08 Dec 2021 17:44:16 +0000
Gerrit-HasComments: Yes

Reply via email to