Alex Behm has posted comments on this change. Change subject: IMPALA-5602: Fix query optimization for kudu and datasource tables ......................................................................
Patch Set 5: (7 comments) http://gerrit.cloudera.org:8080/#/c/7560/5/fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java File fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java: Line 367: public boolean hasPushedConjuncts() { return !acceptedConjuncts_.isEmpty(); } Do we have the same issue with 'filters_' in the HBaseScanNode? http://gerrit.cloudera.org:8080/#/c/7560/5/fe/src/main/java/org/apache/impala/planner/ScanNode.java File fe/src/main/java/org/apache/impala/planner/ScanNode.java: Line 219: public boolean hasPushedConjuncts() { It seems strange and unnecessary to distinguish the "pushed" from "non-pushed" conjuncts, especially since the concept is only relevant for certain scans. I'd prefer a function hasConjuncts() or hasScanConjuncts() or similar. Alternatively, each ScanNode subclass could directly override getInputCardinality(). http://gerrit.cloudera.org:8080/#/c/7560/5/fe/src/test/java/org/apache/impala/planner/PlannerTest.java File fe/src/test/java/org/apache/impala/planner/PlannerTest.java: Line 317: TQueryOptions options = defaultQueryOptions(); What do we need this change for? This adds a lot expected test output. http://gerrit.cloudera.org:8080/#/c/7560/5/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test File testdata/workloads/functional-planner/queries/PlannerTest/kudu.test: Line 774: ---- DISTRIBUTEDPLAN We don't need explain level VERBOSE to test this. http://gerrit.cloudera.org:8080/#/c/7560/5/testdata/workloads/functional-query/queries/QueryTest/data-source-tables.test File testdata/workloads/functional-query/queries/QueryTest/data-source-tables.test: Line 132: ---- QUERY a PlannerTest is more suitable http://gerrit.cloudera.org:8080/#/c/7560/5/tests/query_test/test_kudu.py File tests/query_test/test_kudu.py: Line 1010: """IMPALA-5602: Test that 'small query' optimization is not used if table stats are Have you tried to see if we can modify FrontendTestBase.addTestTable() to serve this use case? It's unfortunate if we have to split up the tests like this. http://gerrit.cloudera.org:8080/#/c/7560/5/tests/query_test/test_queries.py File tests/query_test/test_queries.py: Line 164: # Reset this exec_option to check default behaviour of any planner optimization tests Why? This way we lose coverage of the other case. My understanding is that in exhaustive we run with and without this option which, so I think we should leave this as is. -- To view, visit http://gerrit.cloudera.org:8080/7560 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I93822d67ebda41d5d0456095c429e3915a3f40c4 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: Matthew Jacobs <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
