Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15403 )

Change subject: IMPALA-6505: Min-Max predicate push down in ORC scanner
......................................................................


Patch Set 11:

(3 comments)

Looks great. Just have several optimization-related comments.

http://gerrit.cloudera.org:8080/#/c/15403/11/be/src/exec/hdfs-orc-scanner.cc
File be/src/exec/hdfs-orc-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/15403/11/be/src/exec/hdfs-orc-scanner.cc@1053
PS11, Line 1053:   RETURN_IF_ERROR(ScalarExprEvaluator::Clone(&obj_pool_, 
state_,
               :       expr_perm_pool_.get(), context_->expr_results_pool(),
               :       scan_node_->min_max_conjunct_evals(), 
&min_max_conjunct_evals_));
Since HdfsOrcScanner operates at the thread level, the work here (all the way 
to populate sarg) can be moved to the scan node level. That is, we populate 
sarg once and allow all scanners to read it and create the push-down predicate 
within each.

To check literals, can we use min_max_conjuncts_ (std::vector<ScalarExpr*>) 
directly, without the need to populate min_max_conjunct_evals_?


http://gerrit.cloudera.org:8080/#/c/15403/11/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/15403/11/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@575
PS11, Line 575: } else if (BinaryPredicate.IS_EQ_PREDICATE.apply(binaryPred)) {
              :       addMinMaxOriginalConjunct(slotDesc.getParent(), 
binaryPred);
              :       // TODO: this could be optimized for boolean columns.
              :       buildStatsPredicate(analyzer, slotRef, binaryPred, 
BinaryPredicate.Operator.LE);
              :       buildStatsPredicate(analyzer, slotRef, binaryPred, 
BinaryPredicate.Operator.GE);
              :     }
Optimization: we may keep EQ operator as is for ORC, since ORC supports EQUALS 
(https://orc.apache.org/api/hive-storage-api/org/apache/hadoop/hive/ql/io/sarg/PredicateLeaf.Operator.html).


http://gerrit.cloudera.org:8080/#/c/15403/11/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@581
PS11, Line 581:   }
Future optimization: we may also push-down IsNullPredicate.



--
To view, visit http://gerrit.cloudera.org:8080/15403
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I136622413db21e0941d238ab6aeea901a6464845
Gerrit-Change-Number: 15403
Gerrit-PatchSet: 11
Gerrit-Owner: Norbert Luksa <[email protected]>
Gerrit-Reviewer: Anonymous Coward (520)
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Norbert Luksa <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Tue, 07 Sep 2021 16:51:29 +0000
Gerrit-HasComments: Yes

Reply via email to