Bikramjeet Vig has posted comments on this change.

Change subject: IMPALA-5602: Fix query optimization for kudu and datasource tables
......................................................................

Patch Set 5:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/7560/5/fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
File fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java:

Line 367:   public boolean hasPushedConjuncts() { return !acceptedConjuncts_.isEmpty(); }
> Do we have the same issue with 'filters_' in the HBaseScanNode?

No. For HBaseScanNode, when filters are created they are not removed from the conjuncts member variable.


http://gerrit.cloudera.org:8080/#/c/7560/5/fe/src/main/java/org/apache/impala/planner/ScanNode.java
File fe/src/main/java/org/apache/impala/planner/ScanNode.java:

Line 219:   public boolean hasPushedConjuncts() {
> It seems strange and unnecessary to distinguish the "pushed" from "non-push

Makes sense, but as Matt pointed out in an earlier discussion, having similar names like hasConjuncts() or hasScanConjuncts() might make the interface to this class confusing when comparing hasConjuncts() with getConjuncts().isEmpty(). Would it make more sense to use a more specific name like the one he suggested earlier: getEffectiveScanConjuncts()?


http://gerrit.cloudera.org:8080/#/c/7560/5/fe/src/test/java/org/apache/impala/planner/PlannerTest.java
File fe/src/test/java/org/apache/impala/planner/PlannerTest.java:

Line 317:     TQueryOptions options = defaultQueryOptions();
> What do we need this change for? This adds a lot expected test output.

Done


http://gerrit.cloudera.org:8080/#/c/7560/5/testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
File testdata/workloads/functional-planner/queries/PlannerTest/kudu.test:

Line 774: ---- DISTRIBUTEDPLAN
> We don't need explain level VERBOSE to test this.

Done


http://gerrit.cloudera.org:8080/#/c/7560/5/testdata/workloads/functional-query/queries/QueryTest/data-source-tables.test
File testdata/workloads/functional-query/queries/QueryTest/data-source-tables.test:

Line 132: ---- QUERY
> a PlannerTest is more suitable

This code change has no effect on the generated plan, because for a data source table the plan uses a single node regardless of other optimizations. So, to make sure that the small query optimization is not used, I had to check the runtime profile instead and verify that the query options set by the small query optimization don't appear.


http://gerrit.cloudera.org:8080/#/c/7560/5/tests/query_test/test_kudu.py
File tests/query_test/test_kudu.py:

Line 1010:     """IMPALA-5602: Test that 'small query' optimization is not used if table stats are
> Have you tried to see if we can modify FrontendTestBase.addTestTable() to

Unfortunately, FrontendTestBase.addTestTable() can only be used to add HDFS tables.


http://gerrit.cloudera.org:8080/#/c/7560/5/tests/query_test/test_queries.py
File tests/query_test/test_queries.py:

Line 164:     # Reset this exec_option to check default behaviour of any planner optimization tests
> Why? This way we lose coverage of the other case. My understanding is that

Done


--
To view, visit http://gerrit.cloudera.org:8080/7560
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I93822d67ebda41d5d0456095c429e3915a3f40c4
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Matthew Jacobs <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes
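
Illustrative note on the DataSourceScanNode and HBaseScanNode replies above: the difference can be sketched with simplified stand-in classes. This is not the Impala planner code. Only the names hasPushedConjuncts(), getConjuncts(), acceptedConjuncts_ and filters_ come from the review; the String-based conjuncts, accept(), createFilter() and hasAnyPredicates() are hypothetical helpers, and the sketch assumes that an accepted data source conjunct is moved out of the conjuncts list.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the planner classes discussed in the review.
// Conjuncts are modeled as plain strings purely for illustration.
public class PushedConjunctsSketch {

  public static class ScanNode {
    protected final List<String> conjuncts_ = new ArrayList<>();

    public List<String> getConjuncts() { return conjuncts_; }

    // Base behaviour: nothing has been pushed to the storage layer.
    public boolean hasPushedConjuncts() { return false; }

    // Hypothetical helper: true if the scan has predicates of either kind.
    public boolean hasAnyPredicates() {
      return !conjuncts_.isEmpty() || hasPushedConjuncts();
    }
  }

  public static class DataSourceScanNode extends ScanNode {
    private final List<String> acceptedConjuncts_ = new ArrayList<>();

    // Hypothetical: an accepted conjunct is MOVED out of conjuncts_.
    public void accept(String conjunct) {
      if (conjuncts_.remove(conjunct)) acceptedConjuncts_.add(conjunct);
    }

    @Override
    public boolean hasPushedConjuncts() { return !acceptedConjuncts_.isEmpty(); }
  }

  public static class HBaseScanNode extends ScanNode {
    private final List<String> filters_ = new ArrayList<>();

    // Hypothetical: a filter is derived from a conjunct, which stays in conjuncts_.
    public void createFilter(String conjunct) {
      if (conjuncts_.contains(conjunct)) filters_.add(conjunct);
    }
  }

  public static void main(String[] args) {
    DataSourceScanNode ds = new DataSourceScanNode();
    ds.getConjuncts().add("id = 1");
    ds.accept("id = 1");
    // getConjuncts() alone now hides the predicate; hasPushedConjuncts() reveals it.
    System.out.println(ds.getConjuncts().isEmpty() + " " + ds.hasPushedConjuncts());

    HBaseScanNode hbase = new HBaseScanNode();
    hbase.getConjuncts().add("id = 1");
    hbase.createFilter("id = 1");
    // The conjunct is still visible through getConjuncts().
    System.out.println(hbase.getConjuncts().isEmpty());
  }
}
```

Under these assumptions, a check based only on getConjuncts().isEmpty() would treat the data source scan as having no predicates, while the HBase scan would still report its conjunct.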
