wangsheng has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/18023 )
Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities
......................................................................

IMPALA-7942: Add query hints for cardinalities and selectivities

Currently, Impala uses only simple estimation to compute the selectivity
of some predicates, which can cause the cost-based optimizer (CBO) to
choose a worse query plan. Hence, we add new hints to reduce such errors.
In the future, we may be able to use histograms to get more precise
query plans.

This patch adds two query hints: 'HDFS_NUM_ROWS' and 'SELECTIVITY'.
We can add 'HDFS_NUM_ROWS' after an hdfs table in a query like this:
* select col from t /* +TABLE_NUM_ROWS(1000) */;
If set, Impala uses this value as the number of rows scanned from the
table, even if the table has stats.

The 'SELECTIVITY' hint can be used on these 'Predicate' types:
* BinaryPredicate
* InPredicate
* IsNullPredicate
* LikePredicate, including 'not like' syntax
* BetweenPredicate, including 'not between and' syntax
The format is:
  select col from t where a=1 /* +SELECTIVITY(0.5) */;
This value replaces the originally computed selectivity.

These formats are not allowed:
* select col from t where (a=1) /* +SELECTIVITY(0.5) */;
* select col from t where (a=1 and b<2) /* +SELECTIVITY(0.5) */;
* select col from t1 where exists (...) /* +SELECTIVITY(0.5) */;

Note that if you set the selectivity hint like this:
* select col from t where (a=1 /* +SELECTIVITY(0.5) */ and b>2);
Impala sets 0.5 for the first binary predicate, but the second is -1
because Impala cannot compute its selectivity. The selectivity of the
whole compound predicate is therefore still unavailable. Hence, for a
compound predicate, each child's selectivity must either be set by a
hint or be computable. Otherwise, the hint might not take the expected
effect.

Also, Impala transforms a 'BetweenPredicate' into a 'CompoundPredicate'
with two 'BinaryPredicate' children. If a hint is set on a
'BetweenPredicate' in the query, the hint value is split between the two
'BinaryPredicate' children.
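The compound-predicate behavior described above can be illustrated with a
small standalone sketch. It is not the code in this patch: the class
'SelectivityHintSketch' and its methods are hypothetical names, the AND rule
(product of the children, assuming independence) is an assumption, and the
square-root split for a hinted BETWEEN is only one plausible way to "split"
a hint so that the two children multiply back to the hinted value.

```java
/** Hypothetical illustration only; these are not Impala's actual classes. */
final class SelectivityHintSketch {
    /** Marker for "selectivity unavailable", the -1 mentioned above. */
    static final double UNKNOWN = -1.0;

    /** A hint, when present, replaces the estimated selectivity. */
    static double effectiveSelectivity(double estimated, Double hint) {
        return hint != null ? hint : estimated;
    }

    /**
     * Assumed AND rule: if any child is unknown, the whole conjunction is
     * unknown; otherwise multiply the children (independence assumption).
     */
    static double andSelectivity(double... children) {
        double result = 1.0;
        for (double s : children) {
            if (s < 0) {
                return UNKNOWN;
            }
            result *= s;
        }
        return result;
    }

    /**
     * Assumed split rule for a hinted BETWEEN: give each of the two binary
     * children sqrt(hint), so their product equals the hint.
     */
    static double splitBetweenHint(double hint) {
        return Math.sqrt(hint);
    }

    public static void main(String[] args) {
        // (a=1 /* +SELECTIVITY(0.5) */ and b>2): b>2 has no hint and no estimate.
        double first = effectiveSelectivity(UNKNOWN, 0.5);
        double second = effectiveSelectivity(UNKNOWN, null);
        System.out.println("one child hinted:  " + andSelectivity(first, second));

        // Both children hinted: the conjunction becomes computable.
        System.out.println("both children set: " + andSelectivity(0.5, 0.2));

        // A hinted BETWEEN rewritten into two binary predicates.
        double each = splitBetweenHint(0.25);
        System.out.println("between children:  " + andSelectivity(each, each));
    }
}
```

Under these assumptions, hinting only one child of a conjunction leaves the
result at -1, which matches the warning above that every child must be either
hinted or computable.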
Testing:
- Added new fe tests in 'PlannerTest'
- Added new fe tests in 'AnalyzeStmtsTest' for negative cases

Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Predicate.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/hdfs-cardinality-hint.test
A testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
14 files changed, 1,635 insertions(+), 20 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/18023/7
--
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 7
Gerrit-Owner: wangsheng <[email protected]>
Gerrit-Reviewer: Amogh Margoor <[email protected]>
Gerrit-Reviewer: Fucun Chu <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: wangsheng <[email protected]>
