Thomas Tauber-Marshall has posted comments on this change. Change subject: IMPALA-2805: Order filters based on selectivity and cost ......................................................................
Patch Set 3: (6 comments) http://gerrit.cloudera.org:8080/#/c/2598/3/fe/src/main/java/com/cloudera/impala/analysis/AnalyticExpr.java File fe/src/main/java/com/cloudera/impala/analysis/AnalyticExpr.java: Line 543: // This is not currently used as an AnalyticExpr cannot be a child of a conjunct. > i would group that with the type assignment, because it's a standard compon This is tricky, since every Expr::analyze is a little different (eg. type_ is set in Predicate::analyze, not the individual *Predicate classes), but I moved things around to make it as consistent as possible. http://gerrit.cloudera.org:8080/#/c/2598/3/fe/src/main/java/com/cloudera/impala/analysis/CaseExpr.java File fe/src/main/java/com/cloudera/impala/analysis/CaseExpr.java: Line 330: evalCost_ = CASE_COST; > do we need that (on top of the costs of the branches)? No. Removed. http://gerrit.cloudera.org:8080/#/c/2598/3/fe/src/main/java/com/cloudera/impala/analysis/Expr.java File fe/src/main/java/com/cloudera/impala/analysis/Expr.java: Line 191: // its children. evalCost_ will be -1 if it has not been calculated yet. evalCost_ > you should point out when this gets set (see comment for selectivity_). Done Line 235: Preconditions.checkState(evalCost_ >= 0); > check for isAnalyzed_ instead As we briefly discussed, I was hesitant to do this because we have lots of special cases for analysis and its easy to imagine someone adding a new Expr and creating a situation where isAnalyzed_ is true even while evalCost_ hasn't really been set to something meaningful. On the other hand, you make a good point in the above comment that we use '-1' to mean unknown in other places and we should be consistent with that. A big part of the problem is that there is no easy way to see from a particular Expr if any of its subexprs don't have valid costs, since we would just add the '-1' into the total and still end up with a positive cost for the parent Expr. I think the solution is to have nodes check that any costs that they use (eg. the costs of their children) are valid (eg. by calling a hasCost function), and to not set their own cost if that's the case. That way, in orderConjuncts we can check that the costs of the Expr isn't -1, which guarantees that all of their children have valid costs. Line 1249: public static <T extends Expr> List<T> orderConjuncts(List<T> conjuncts) { > too generic a name; orderConjunctsByCost? Done http://gerrit.cloudera.org:8080/#/c/2598/3/fe/src/main/java/com/cloudera/impala/analysis/InPredicate.java File fe/src/main/java/com/cloudera/impala/analysis/InPredicate.java: Line 185: IN_COST; > is in_cost needed? (isn't that captured by the binary_predicate_cost?) No. Removed. -- To view, visit http://gerrit.cloudera.org:8080/2598 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I02279a26fbc6308ac5eb819d78345fc010469034 Gerrit-PatchSet: 3 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Thomas Tauber-Marshall <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Matthew Jacobs <[email protected]> Gerrit-Reviewer: Mostafa Mokhtar <[email protected]> Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]> Gerrit-HasComments: Yes
