Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-2805: Order filters based on selectivity and cost
......................................................................


Patch Set 3:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/2598/3/fe/src/main/java/com/cloudera/impala/analysis/AnalyticExpr.java
File fe/src/main/java/com/cloudera/impala/analysis/AnalyticExpr.java:

Line 543:     // This is not currently used as an AnalyticExpr cannot be a 
child of a conjunct.
> i would group that with the type assignment, because it's a standard compon
This is tricky, since every Expr::analyze is a little different (eg. type_ is 
set in Predicate::analyze, not the individual *Predicate classes), but I moved 
things around to make it as consistent as possible.


http://gerrit.cloudera.org:8080/#/c/2598/3/fe/src/main/java/com/cloudera/impala/analysis/CaseExpr.java
File fe/src/main/java/com/cloudera/impala/analysis/CaseExpr.java:

Line 330:     evalCost_ = CASE_COST;
> do we need that (on top of the costs of the branches)?
No. Removed.


http://gerrit.cloudera.org:8080/#/c/2598/3/fe/src/main/java/com/cloudera/impala/analysis/Expr.java
File fe/src/main/java/com/cloudera/impala/analysis/Expr.java:

Line 191:   // its children. evalCost_ will be -1 if it has not been calculated 
yet. evalCost_
> you should point out when this gets set (see comment for selectivity_).
Done


Line 235:     Preconditions.checkState(evalCost_ >= 0);
> check for isAnalyzed_ instead
As we briefly discussed, I was hesitant to do this because we have lots of 
special cases for analysis and its easy to imagine someone adding a new Expr 
and creating a situation where isAnalyzed_ is true even while evalCost_ hasn't 
really been set to something meaningful.

On the other hand, you make a good point in the above comment that we use '-1' 
to mean unknown in other places and we should be consistent with that.

A big part of the problem is that there is no easy way to see from a particular 
Expr if any of its subexprs don't have valid costs, since we would just add the 
'-1' into the total and still end up with a positive cost for the parent Expr.

I think the solution is to have nodes check that any costs that they use (eg. 
the costs of their children) are valid (eg. by calling a hasCost function), and 
to not set their own cost if that's the case. That way, in orderConjuncts we 
can check that the costs of the Expr isn't -1, which guarantees that all of 
their children have valid costs.


Line 1249:   public static <T extends Expr> List<T> orderConjuncts(List<T> 
conjuncts) {
> too generic a name; orderConjunctsByCost?
Done


http://gerrit.cloudera.org:8080/#/c/2598/3/fe/src/main/java/com/cloudera/impala/analysis/InPredicate.java
File fe/src/main/java/com/cloudera/impala/analysis/InPredicate.java:

Line 185:         IN_COST;
> is in_cost needed? (isn't that captured by the binary_predicate_cost?)
No. Removed.


-- 
To view, visit http://gerrit.cloudera.org:8080/2598
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I02279a26fbc6308ac5eb819d78345fc010469034
Gerrit-PatchSet: 3
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Matthew Jacobs <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-HasComments: Yes

Reply via email to