Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/24153 )

Change subject: IMPALA-14597: Initial HBO support
......................................................................


Patch Set 14:

(11 comments)

> Patch Set 13:
>
> (11 comments)
>
> This is only a half code review at this point.  I'd like to dive in a little 
> deeper, but figured I'd post what I have for now.

Thanks for the review! Addressed these comments first.

http://gerrit.cloudera.org:8080/#/c/24153/13/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/24153/13/fe/src/main/java/org/apache/impala/analysis/Expr.java@234
PS13, Line 234:           // TODO: support CompoundPredicate like "p = 1 or p = 
2".
> Reuse IS_NOT_EQ_BINARY_PREDICATE?
Done


http://gerrit.cloudera.org:8080/#/c/24153/13/fe/src/main/java/org/apache/impala/analysis/Expr.java@239
PS13, Line 239:         }
> nit optional: return arg instanceof InPredicate condenses 4 lines into 1.
Done. That looks better.


http://gerrit.cloudera.org:8080/#/c/24153/13/fe/src/main/java/org/apache/impala/analysis/SlotRef.java
File fe/src/main/java/org/apache/impala/analysis/SlotRef.java:

http://gerrit.cloudera.org:8080/#/c/24153/13/fe/src/main/java/org/apache/impala/analysis/SlotRef.java@319
PS13, Line 319:       if (rawPath_ == null) return label_;
> Optional for cutesiness: Iterable.getLast(rawPath_)
Done


http://gerrit.cloudera.org:8080/#/c/24153/13/fe/src/main/java/org/apache/impala/catalog/FeTable.java
File fe/src/main/java/org/apache/impala/catalog/FeTable.java:

http://gerrit.cloudera.org:8080/#/c/24153/13/fe/src/main/java/org/apache/impala/catalog/FeTable.java@175
PS13, Line 175:     return slotRefs.stream()
> Optional for cutesiness:
Done. Slightly modified this since the left-hand side of the lambda function 
should be a variable.


http://gerrit.cloudera.org:8080/#/c/24153/13/fe/src/main/java/org/apache/impala/planner/CanonicalizationStrategy.java
File fe/src/main/java/org/apache/impala/planner/CanonicalizationStrategy.java:

http://gerrit.cloudera.org:8080/#/c/24153/13/fe/src/main/java/org/apache/impala/planner/CanonicalizationStrategy.java@54
PS13, Line 54:   private final int errorLevel_;
> nit: errorLevel_
Done


http://gerrit.cloudera.org:8080/#/c/24153/13/fe/src/main/java/org/apache/impala/planner/ExprCanonicalizer.java
File fe/src/main/java/org/apache/impala/planner/ExprCanonicalizer.java:

http://gerrit.cloudera.org:8080/#/c/24153/13/fe/src/main/java/org/apache/impala/planner/ExprCanonicalizer.java@75
PS13, Line 75:     // is not EXPR_REWRITE (i.e. more aggressive than 
EXPR_REWRITE).
> If I'm understanding this code correctly, this is only called for the two s
Yeah, that's better.


http://gerrit.cloudera.org:8080/#/c/24153/13/fe/src/main/java/org/apache/impala/planner/ExprCanonicalizer.java@97
PS13, Line 97:     List<String> values = new ArrayList<>();
> There are 2 cases here.  One that works for BinaryPredicate and one that wo
These are something like ExprRewriteRules but return strings and probably need 
the table metadata for complex cases. I think it's OK to keep them outside 
Predicate classes. So far we just support two kinds of simple predicates. I'll 
split this into two methods to make it cleaner. In the future when we support 
more complex expressions, perhaps making this a visitor is better?


http://gerrit.cloudera.org:8080/#/c/24153/13/fe/src/main/java/org/apache/impala/planner/ExprCanonicalizer.java@128
PS13, Line 128:     if (!Expr.IS_NOT_EQ_BINARY_PREDICATE.apply(binPred)
> We have this if check in at least 2 places (that I've noticed so far on the
Expr.IS_EQ_BINARY_PREDICATE is the closest method but it doesn't allow 
NOT_DISTINCT. I changed this to use !Expr.IS_NOT_EQ_BINARY_PREDICATE.apply() 
here. It's a bit awkward since the usage is double negative. I'll check if 
IS_EQ_BINARY_PREDICATE should allow NOT_DISTINCT. Otherwise we can add a new 
function for this.


http://gerrit.cloudera.org:8080/#/c/24153/13/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/24153/13/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1745
PS13, Line 1745:       inputCardinality_ = totalFiles;
> nit:  return immediate if use_hbo_stats is false.
Done


http://gerrit.cloudera.org:8080/#/c/24153/13/fe/src/main/java/org/apache/impala/planner/PlanNode.java
File fe/src/main/java/org/apache/impala/planner/PlanNode.java:

http://gerrit.cloudera.org:8080/#/c/24153/13/fe/src/main/java/org/apache/impala/planner/PlanNode.java@1031
PS13, Line 1031:       if (hashString == null) break;
> I think it's better to have a test outside the for loop to do this. It flow
Eventually we will replace this with a Preconditions check when HBO supports 
all PlanNodes, all stats types and all canonicalization strategies.

In the current patch, if a PlanNode doesn't support HBO, 
generateHboHashString() returns null for all stats type and all 
canonicalization strategies. So we use "break" here. Maybe adding a 
supportsHBO() method returning a boolean value is more clear?


http://gerrit.cloudera.org:8080/#/c/24153/13/fe/src/main/java/org/apache/impala/planner/PlanNode.java@1034
PS13, Line 1034:       if (hashStrings.contains(hashString)) {
> I know there are only 2 elements, but this usage is better with a Set versu
The returned list will be iterated using the enum of CanonicalizationStrategy 
(corresponding to the index), so we do need it to be a list. This deduplication 
is a small optimization to reduce HBO stats read/write requests. I think so far 
it's OK since the list is either empty or has only one element here.



--
To view, visit http://gerrit.cloudera.org:8080/24153
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6ff60a8bd22c13c0ecad1198934cc96249b1015e
Gerrit-Change-Number: 24153
Gerrit-PatchSet: 14
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Steve Carlin <[email protected]>
Gerrit-Reviewer: Surya Hebbar <[email protected]>
Gerrit-Comment-Date: Mon, 01 Jun 2026 14:20:03 +0000
Gerrit-HasComments: Yes

Reply via email to