wangsheng has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................

IMPALA-7942: Add query hints for cardinalities and selectivities

Currently, Impala uses only simple estimates to compute the selectivity
of some predicates, which can lead the cost-based optimizer (CBO) to
choose a worse query plan. Hence, we add new hints to reduce such
errors. In the future, we may be able to use histograms to get more
precise query plans.

This patch adds two query hints: 'TABLE_NUM_ROWS' and 'SELECTIVITY'.
'TABLE_NUM_ROWS' can be added after an HDFS table in a query like this:

  * select col from t /* +TABLE_NUM_ROWS(1000) */;

If set, Impala uses this value as the number of rows scanned from the
table, even if the table has stats.

The 'SELECTIVITY' hint can be used with these 'Predicate' types:
  * BinaryPredicate
  * InPredicate
  * IsNullPredicate
  * LikePredicate, including 'not like' syntax
  * BetweenPredicate, including 'not between and' syntax
The format is:

  select col from t where a=1 /* +SELECTIVITY(0.5) */;

This value replaces the originally computed selectivity. These formats
are not allowed:

  * select col from t where (a=1) /* +SELECTIVITY(0.5) */;
  * select col from t where (a=1 and b<2) /* +SELECTIVITY(0.5) */;
  * select col from t1 where exists (...) /* +SELECTIVITY(0.5) */;

Note that if you set the selectivity hint like this:

  * select col from t where (a=1 /* +SELECTIVITY(0.5) */ and b>2);

Impala sets the selectivity of the first binary predicate to 0.5, but
the second stays at -1 because Impala cannot compute it, so the
selectivity of the whole compound predicate is still unavailable.
Hence, for a compound predicate, we need to ensure that each child's
selectivity is either set by a hint or computable. Otherwise, the hint
might not take the expected effect.
Also, Impala transforms a 'BetweenPredicate' into a 'CompoundPredicate'
with two 'BinaryPredicate' children. If a hint is set on a
'BetweenPredicate' in the query, its value is split between the two
'BinaryPredicate' children.
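
The compound-predicate behavior described above can be sketched roughly
as follows. This is only an illustration, not the code in this patch:
the class and method names are hypothetical, and combining the children
of an AND by multiplication (an independence assumption) is assumed
here rather than taken from the change itself.

```java
public final class SelectivitySketch {
    // -1 marks a selectivity that is neither set by a hint nor computable.
    public static final double UNKNOWN = -1.0;

    /**
     * Selectivity of "left AND right" (hypothetical helper).
     * If either child is unknown, the compound selectivity is unknown too.
     * Otherwise the children are multiplied, assuming they are independent.
     */
    public static double andSelectivity(double left, double right) {
        if (left < 0 || right < 0) return UNKNOWN;
        return left * right;
    }

    public static void main(String[] args) {
        // a=1 /* +SELECTIVITY(0.5) */ and b>2, where b>2 is not computable.
        System.out.println(andSelectivity(0.5, UNKNOWN));
        // Both children hinted or computable.
        System.out.println(andSelectivity(0.5, 0.5));
    }
}
```

In this sketch, hinting only one child leaves the compound result at -1,
which is why every child needs a hinted or computable selectivity.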

Testing:
- Added new fe tests in 'PlannerTest'
- Added new fe tests in 'AnalyzeStmtsTest' for negative cases

Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Predicate.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/hdfs-cardinality-hint.test
A testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
14 files changed, 1,635 insertions(+), 20 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/18023/7
--
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 7
Gerrit-Owner: wangsheng <[email protected]>
Gerrit-Reviewer: Amogh Margoor <[email protected]>
Gerrit-Reviewer: Fucun Chu <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: wangsheng <[email protected]>
