Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8480 )

Change subject: IMPALA-4985: use parquet stats of nested types for dynamic pruning
......................................................................


Patch Set 5:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/8480/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/8480/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@439
PS5, Line 439:   private boolean isArrayPosReference(SlotRef slotRef) {
> Move to SlotRef?
ah, didn't understand this as "move this method to the SlotRef class". Done.


http://gerrit.cloudera.org:8080/#/c/8480/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@562
PS5, Line 562:     // Adds only predicates for collections that are guarded by an IsNotEmptyPredicate.
> guarded -> filtered
Done


http://gerrit.cloudera.org:8080/#/c/8480/5/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@563
PS5, Line 563:     // Its assumed that analysis adds these guards such that they are correct, but
> It is assumed
Done


http://gerrit.cloudera.org:8080/#/c/8480/5/testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
File testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test:

http://gerrit.cloudera.org:8080/#/c/8480/5/testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test@95
PS5, Line 95: where a.item.e < -10;
> Can you add a filter at all levels to make sure that all works together?
Done


http://gerrit.cloudera.org:8080/#/c/8480/5/testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test@96
PS5, Line 96: ---- PLAN
> Do we need to go to explain level 2 in all these tests?
that's the lowest level (at the moment) that prints out "parquet statistics predicates".


http://gerrit.cloudera.org:8080/#/c/8480/5/testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test@99
PS5, Line 99: PLAN-ROOT SINK
> Do we have tests for min-max filters on a top-level struct? I mean somethin
I have tests for collections nested in top-level structs. Added a test for scalar in a top-level struct. These are in the runtime filtering tests.


http://gerrit.cloudera.org:8080/#/c/8480/5/testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test@266
PS5, Line 266: # Test collections in a way that would incorrect to apply a min-max
> garbled sentence
Done


http://gerrit.cloudera.org:8080/#/c/8480/5/testdata/workloads/functional-query/queries/QueryTest/nested-types-parquet-stats.test
File testdata/workloads/functional-query/queries/QueryTest/nested-types-parquet-stats.test:

http://gerrit.cloudera.org:8080/#/c/8480/5/testdata/workloads/functional-query/queries/QueryTest/nested-types-parquet-stats.test@40
PS5, Line 40: where int_map.value < -1;
> Can you modify the tests to mix in more predicate variety? For example, use
Done


http://gerrit.cloudera.org:8080/#/c/8480/5/testdata/workloads/functional-query/queries/QueryTest/nested-types-parquet-stats.test@145
PS5, Line 145: # False pruning example. There is one table that's scanned (complextypestbl).
> Add 1-2 more tests along these lines with non-selective min-max filters tha
Done


--
To view, visit http://gerrit.cloudera.org:8080/8480
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0c99e20cb080b504442cd5376ea3e046016158fe
Gerrit-Change-Number: 8480
Gerrit-PatchSet: 5
Gerrit-Owner: Vuk Ercegovac <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Vuk Ercegovac <[email protected]>
Gerrit-Comment-Date: Tue, 21 Nov 2017 23:36:07 +0000
Gerrit-HasComments: Yes
