Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15403 )

Change subject: IMPALA-6505: Min-Max predicate push down in ORC scanner
......................................................................


Patch Set 8:

(1 comment)

* Added FE changes to skip generating min-max predicates on CHAR/VARCHAR types 
in ORC.
* Bumped the ORC version to include the fix for ORC-971. Adjusted a test that 
is affected by this fix.

> > But I think we do need to improve observability on the final predicates 
> > that are pushed down. E.g. "x = 1" is currently transformed into "x <= 1" 
> > and "x >= 1" but it's not shown in the plan.
>
> It may be possible for FE to go over the min/max predicates and output, as 
> "orc pushdown predicates" in the explain string, those that can be translated 
> into ORC push-down predicates by BE.

Filed IMPALA-10406 to follow up on this.
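
To make the observability gap concrete, here is a minimal sketch of the expansion mentioned in the quoted comment, where "x = 1" becomes "x <= 1" and "x >= 1", plus the CHAR/VARCHAR skip from the first bullet above. It is an illustrative model only: Predicate, to_min_max_predicates and describe are hypothetical names, not Impala's FE classes, and the output format is not the actual explain string.

```python
from dataclasses import dataclass


# Hypothetical type for illustration; not an Impala FE class.
@dataclass(frozen=True)
class Predicate:
    column: str
    op: str        # one of "=", "<", "<=", ">", ">="
    value: object


# Types for which no min-max predicates are generated for ORC,
# per the first bullet of this patch set.
SKIPPED_ORC_TYPES = {"CHAR", "VARCHAR"}


def to_min_max_predicates(pred, column_type):
    """Return the min-max predicates derived from a single conjunct."""
    if column_type in SKIPPED_ORC_TYPES:
        return []
    if pred.op == "=":
        # Equality is expanded into an upper bound and a lower bound.
        return [Predicate(pred.column, "<=", pred.value),
                Predicate(pred.column, ">=", pred.value)]
    if pred.op in ("<", "<=", ">", ">="):
        return [pred]
    # Any other operator yields no min-max predicate in this sketch.
    return []


def describe(preds):
    """Render predicates as a comma-separated list (assumed format)."""
    return ", ".join(f"{p.column} {p.op} {p.value}" for p in preds)
```

Calling describe(to_min_max_predicates(Predicate("x", "=", 1), "INT")) lists both derived bounds, which is the information that is not visible in the plan today.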

http://gerrit.cloudera.org:8080/#/c/15403/7/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/15403/7/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@428
PS7, Line 428: computeMinMaxTupleAndConjuncts(analyzer)
> As a followup question about showing ORC stats predicates vs. ORC push-down
Yeah, that's a good direction. We could even skip materializing columns that 
are only used in pushed-down predicates (similar to IMPALA-10406). But currently 
the C++ ORC lib doesn't provide row-level filtering, i.e. the returned results 
can still contain unmatched rows. It just uses the pushed-down predicates as 
hints to skip reading unnecessary RowGroups. (Note: a RowGroup in ORC is a 
different concept from Parquet's. By default it has 10,000 rows.)
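
A rough model of why the scanner still has to evaluate the conjuncts itself, assuming min/max statistics per RowGroup. This is only a sketch: the function names are made up, the statistics are computed on the fly instead of being read from the ORC row index, and it does not reflect the ORC lib's actual API.

```python
ROWS_PER_GROUP = 10_000  # ORC's default RowGroup size, as noted above


def split_into_row_groups(values, rows_per_group=ROWS_PER_GROUP):
    """Split a column into consecutive RowGroups of fixed size."""
    return [values[i:i + rows_per_group]
            for i in range(0, len(values), rows_per_group)]


def may_contain(group, lo, hi):
    """True if the group's [min, max] range overlaps [lo, hi]."""
    return min(group) <= hi and max(group) >= lo


def scan_with_hints(values, lo, hi, rows_per_group=ROWS_PER_GROUP):
    """Return every row of each RowGroup that cannot be skipped.

    The predicate "lo <= x <= hi" is used only as a hint: whole groups
    are skipped, but rows inside a surviving group are not filtered.
    """
    out = []
    for group in split_into_row_groups(values, rows_per_group):
        if may_contain(group, lo, hi):
            out.extend(group)
    return out


def scan_and_filter(values, lo, hi, rows_per_group=ROWS_PER_GROUP):
    """Re-apply the predicate row by row on top of the hinted scan."""
    return [v for v in scan_with_hints(values, lo, hi, rows_per_group)
            if lo <= v <= hi]
```

For example, with values [1, 2, 3, 7, 8, 9, 3, 4, 5], groups of 3 rows and the predicate x = 3 (lo = hi = 3), the middle group is skipped but the hinted scan still returns 1, 2, 4 and 5. That is why a column used only in pushed-down predicates can't simply be dropped from materialization yet.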

BTW, I feel the ORC lib is not mature enough yet, e.g. it has bugs like 
ORC-971. We can leave this as a future work item.



--
To view, visit http://gerrit.cloudera.org:8080/15403
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I136622413db21e0941d238ab6aeea901a6464845
Gerrit-Change-Number: 15403
Gerrit-PatchSet: 8
Gerrit-Owner: Norbert Luksa <[email protected]>
Gerrit-Reviewer: Anonymous Coward (520)
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Norbert Luksa <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Fri, 27 Aug 2021 02:08:08 +0000
Gerrit-HasComments: Yes
