Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/18023 )
Change subject: IMPALA-7942: Add query hints for cardinalities and selectivities ...................................................................... Patch Set 3: (16 comments) The idea behind this patch provides great value and I like it a lot. My concern for this patch is that the hint can only be applied to simple predicate. I wonder if we can allow hints to complex predicates, such as select * from t where (t.a > 1 or t.a < 1000) /* +selectivity (0.4) */ http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@9 PS3, Line 9: use nit uses http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@10 PS3, Line 10: , other predicates nit. remove? http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@11 PS3, Line 11: set these stats : manually in query to help us get better CBO. nit. to reduce such errors. http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@20 PS3, Line 20: But nit remove http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@21 PS3, Line 21: hint value only valid when table does not have stats or stats is nit is valid http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@22 PS3, Line 22: corrupt. Otherwise, Impala will use table original stats. I think we should raise a warning when the hint is not used. http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@24 PS3, Line 24: : in non-compounding form. http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@24 PS3, Line 24: For 'SELECTIVITY' hint, we can use in these predicates: nit. types of http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@49 PS3, Line 49: maybe does not take effect as you : expected. > nit: might not take the expected effect I wonder why we can not assign a selectivity to a complex predicate in this patch, as it is usually difficult to figure such selectivity out. http://gerrit.cloudera.org:8080/#/c/18023/3//COMMIT_MSG@51 PS3, Line 51: Another thing, for 'BetweenPredicate', Impala will transfom this : predicate to a 'CompoundPredicate' with two 'BinaryPredicate', if : set hint for 'BetweenPredicate' in query, we will split this hint : value for two 'BinaryPredicate' children. I wonder if this is correct. The selectivity at the BETWEEN level has nothing to do with the selectivity at the two transformed binary predicates. If we support selectivity for complex predicate, we do not have to "push" the selectivity down. http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java File fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java: http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java@248 PS3, Line 248: selectivity_ > 0 nit. This prevents a 0 selectivity. Maybe use -2 to indicate non-computable selectivity? http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/InPredicate.java File fe/src/main/java/org/apache/impala/analysis/InPredicate.java: http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/InPredicate.java@177 PS3, Line 177: elect same comment as for BinaryPredicate.java http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java File fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java: http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java@143 PS3, Line 143: electivity same comment as for BinaryPredicate.java http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/TableRef.java File fe/src/main/java/org/apache/impala/analysis/TableRef.java: http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/TableRef.java@175 PS3, Line 175: hdfsNumRowsHint_ I wonder we can use a more generic name numRowsHint_ as the hint can be applied to other types of tables equally well. We can apply the hint to HDFS only tables in this patch and a check is already in place in this file: 494 // BaseTableRef will always have their path resolved at this point. 495 Preconditions.checkState(getResolvedPath() != null); 496 if (getResolvedPath().destTable() != null && 497 !(getResolvedPath().destTable() instanceof FeFsTable)) { 498 analyzer.addWarning("Table hints only supported for Hdfs tables"); 499 } http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/analysis/TableRef.java@543 PS3, Line 543: // This hint is only valid for hdfs table nit. This comment probably should be moved to line 547. http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/18023/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@331 PS3, Line 331: private long hdfsNumRowsHint_ = -1; I think we probably should move it to ScanNode with the name numRowsHints_. -- To view, visit http://gerrit.cloudera.org:8080/18023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b Gerrit-Change-Number: 18023 Gerrit-PatchSet: 3 Gerrit-Owner: wangsheng <[email protected]> Gerrit-Reviewer: Amogh Margoor <[email protected]> Gerrit-Reviewer: Fucun Chu <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Qifan Chen <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Reviewer: wangsheng <[email protected]> Gerrit-Comment-Date: Wed, 08 Dec 2021 17:44:16 +0000 Gerrit-HasComments: Yes
