Bikramjeet Vig has posted comments on this change. Change subject: IMPALA-5602: Fix query optimization for kudu and datasource tables ......................................................................
Patch Set 6: (4 comments) http://gerrit.cloudera.org:8080/#/c/7560/5/fe/src/main/java/org/apache/impala/planner/ScanNode.java File fe/src/main/java/org/apache/impala/planner/ScanNode.java: Line 219: // Derived classes should override this method if they have any pushed conjuncts > Fair point. To me it's not confusing, but I'll defer to MJ. Instead of addi As discussed, using hasScanConjuncts to denote conjuncts being used by impala, and hasStorageLayerConjuncts to denote those pushed down to the storage layer. http://gerrit.cloudera.org:8080/#/c/7560/5/fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java File fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java: Line 57: valid_ = false; > MJ, does this change even make sense? Suppose we have a query with a limit as discussed, this code is correct as we need multiple impala instances to make kudu scans faster http://gerrit.cloudera.org:8080/#/c/7560/5/testdata/workloads/functional-query/queries/QueryTest/data-source-tables.test File testdata/workloads/functional-query/queries/QueryTest/data-source-tables.test: Line 132: ---- QUERY > I don't think that's true. This is what I get: Done http://gerrit.cloudera.org:8080/#/c/7560/5/tests/query_test/test_kudu.py File tests/query_test/test_kudu.py: Line 1010: """IMPALA-5602: Test that 'small query' optimization is not used if table stats are > I understand. My question is whether you have tried extending addTestTable( Still working on a way to make this work. What I have found till now: addTestTable must be able to create a table only in the catalog and not in the actual storage layer. The problem is that when a kuduScan node is created in the planning phase it connects to the kudu client to make sure that the table exists. Now to get around this, if we create an external table, we will still have to copy code out of KuduTable.load() in order to load the columns from the kudu client. KuduTable.load() cannot be used directly as it requires a handle to HMS instance. Either this, or we refactor the code in KuduTable.load() to extract the required code into a separate method and call that. -- To view, visit http://gerrit.cloudera.org:8080/7560 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I93822d67ebda41d5d0456095c429e3915a3f40c4 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Bikramjeet Vig <[email protected]> Gerrit-Reviewer: Matthew Jacobs <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
