Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/15403 )
Change subject: IMPALA-6505: Min-Max predicate push down in ORC scanner
......................................................................

Patch Set 8:

(1 comment)

* Added FE changes to skip generating min-max predicates on CHAR/VARCHAR types of ORC.
* Bumped the ORC version to include the fix for ORC-971. Adjusted a test that is affected by this fix.

> > But I think we do need to improve observability on the final predicates
> > that are pushed down. E.g. "x = 1" is currently transformed into "x <= 1"
> > and "x >= 1" but it's not shown in the plan.
>
> It may be possible for FE to go over the min/max predicates and output, as
> "orc pushdown predicates" in the explain string, those that can be translated
> into ORC push-down predicates by BE.

Filed IMPALA-10406 to follow up on this.

http://gerrit.cloudera.org:8080/#/c/15403/7/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/15403/7/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@428
PS7, Line 428: computeMinMaxTupleAndConjuncts(analyzer)
> As a followup question about showing ORC stats predicates vs. ORC push-down

Yeah, that's a good direction. We could even skip materializing columns that are only used in pushed-down predicates (similar to IMPALA-10406). But currently the C++ ORC lib doesn't provide row-level filtering, i.e. the returned results can still contain unmatched rows. It only uses the pushed-down predicates as hints to skip reading unnecessary RowGroups. (Note: in ORC, a RowGroup is a different concept from Parquet's row group. By default it has 10,000 rows.)

BTW, I feel the ORC lib is not mature enough yet, e.g. it has bugs like ORC-971. We can put this in future items.
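To make the point about hints vs. row-level filtering concrete, here is a minimal sketch in Python. It is not the ORC lib's API or Impala's code; `RowGroupStats`, `eq_to_min_max`, `may_match` and `scan` are hypothetical names used only for illustration, and the RowGroups are tiny toy lists rather than 10,000-row groups. It shows "x = 1" being rewritten into "x <= 1" and "x >= 1", those predicates being checked against per-RowGroup min/max stats, and the surviving RowGroup still returning rows that do not match.

```python
from dataclasses import dataclass


@dataclass
class RowGroupStats:
    """Hypothetical min/max stats of column x for one RowGroup."""
    min_val: int
    max_val: int


def eq_to_min_max(value):
    """Rewrite `x = value` as the pair `x <= value` and `x >= value`."""
    return [("<=", value), (">=", value)]


def may_match(stats, op, value):
    """Return True if some row in the RowGroup could satisfy `x op value`."""
    if op == "<=":
        return stats.min_val <= value
    if op == ">=":
        return stats.max_val >= value
    raise ValueError(f"unsupported operator: {op}")


def scan(row_groups, predicates):
    """Yield all rows of every RowGroup that passes the stats check.

    The predicates are only hints for skipping RowGroups; rows inside a
    surviving RowGroup are returned without being filtered.
    """
    for stats, rows in row_groups:
        if all(may_match(stats, op, v) for op, v in predicates):
            yield from rows


groups = [
    (RowGroupStats(0, 0), [0, 0]),     # skipped: max < 1
    (RowGroupStats(0, 2), [0, 1, 2]),  # read: [min, max] covers 1
    (RowGroupStats(5, 9), [5, 9]),     # skipped: min > 1
]
preds = eq_to_min_max(1)
returned = list(scan(groups, preds))        # still contains 0 and 2
filtered = [x for x in returned if x == 1]  # the scanner must re-apply x = 1
```

In this sketch the second RowGroup is read in full, so the caller still has to evaluate the original conjunct on the returned rows. That is why the column cannot simply be dropped from materialization as long as the library only skips at RowGroup granularity.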
--
To view, visit http://gerrit.cloudera.org:8080/15403
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I136622413db21e0941d238ab6aeea901a6464845
Gerrit-Change-Number: 15403
Gerrit-PatchSet: 8
Gerrit-Owner: Norbert Luksa <[email protected]>
Gerrit-Reviewer: Anonymous Coward (520)
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Norbert Luksa <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Fri, 27 Aug 2021 02:08:08 +0000
Gerrit-HasComments: Yes
