Bharath Vissapragada has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12221 )
Change subject: [PROTOTYPE] IMPALA-5872: Test case builder for query planner
......................................................................

[PROTOTYPE] IMPALA-5872: Test case builder for query planner

This patch implements a new "test case" builder for simulating query plans from one cluster on a different cluster/minicluster with a different number of nodes. A "test case" in the context of this patch is a single file that includes all the information needed to reproduce the query plan of a given query statement.

The typical workflow is:

1) Collect the testcase of a given QueryStmt in cluster A.
2) Copy the testcase output file to cluster B.
3) Load the testcase on cluster B.
4) Run explain <query> to make sure the plan matches (including number of hosts).

Motivation:
----------
- Make query planner issues more debuggable
- Improve user experience while collecting query diagnostics
- Make it easy to test new planner features by testing them on customer use cases collected from much larger clusters.

Caveats:
------
- The tool does not collect actual data files for the tables. Only the metadata state is dumped.
- Currently only imports databases/tables/views. We can extend it to work for UDFs etc.
- It only works for QueryStmts (select/union queries)
- Once the metadata dump is loaded on a target cluster, the state is volatile. Hence it cannot survive a cluster restart / invalidate metadata
- Loading a testcase requires setting the query option (SET PLANNER_DEBUG_MODE=true) so that the planner knows to fake the number of hosts. Otherwise it takes into account the local cluster topology.

This patch adds two new SQL statements: (full end-to-end example in gerrit comments)

For exporting a testcase:
-------------------------
EXPORT TESTCASE INTO OUTFILE '<hdfs dir>' <query stmt>;
<outputs the testcase file path>

For loading a testcase:
----------------------
SET PLANNER_DEBUG_MODE=true;
LOAD TESTCASE FROM '<testcase output path>'

How it works?
------------
- During export on the source cluster, the command dumps all the thrift states of referenced objects in the query into a gzipped binary file.
- During load on a target cluster, it adds these objects to the catalog cache by faking them as DDLs.
- The planner also fakes the number of hosts by using the scan range information from the target cluster.

** The patch is just meant to be a prototype to gather some initial feedback. It needs much more polish to be review-ready (comments, logic refactor, unit/e-e tests)

Change-Id: Iec83eeb2dc5136768b70ed581fb8d3ed0335cb52
---
M be/src/service/client-request-state.cc
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/CatalogService.thrift
M common/thrift/Frontend.thrift
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M common/thrift/JniCatalog.thrift
M common/thrift/Types.thrift
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
A fe/src/main/java/org/apache/impala/analysis/ExportTestCaseStmt.java
A fe/src/main/java/org/apache/impala/analysis/LoadTestCaseStmt.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/main/java/org/apache/impala/common/JniUtil.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/main/jflex/sql-scanner.flex
24 files changed, 453 insertions(+), 28 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/12221/1
--
To view, visit http://gerrit.cloudera.org:8080/12221
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iec83eeb2dc5136768b70ed581fb8d3ed0335cb52
Gerrit-Change-Number: 12221
Gerrit-PatchSet: 1
Gerrit-Owner: Bharath Vissapragada <[email protected]>
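
As a rough illustration of the export/load round trip described under "How it works?", the sketch below gzips an already-serialized metadata payload and reads it back. It is not the patch's code: the class name TestCaseFileSketch, the helpers gzip/gunzip, and the placeholder payload are all hypothetical, and the real testcase file would hold Thrift-serialized catalog objects rather than a plain string.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch; names here are illustrative and not taken from the patch.
class TestCaseFileSketch {

  // Export side (assumed shape): compress an already-serialized payload.
  static byte[] gzip(byte[] payload) throws IOException {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    try (GZIPOutputStream out = new GZIPOutputStream(sink)) {
      out.write(payload);
    }
    // The gzip stream is finished once the try block closes it.
    return sink.toByteArray();
  }

  // Load side (assumed shape): decompress back to the serialized payload.
  static byte[] gunzip(byte[] compressed) throws IOException {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
      byte[] buffer = new byte[4096];
      int read;
      while ((read = in.read(buffer)) != -1) {
        sink.write(buffer, 0, read);
      }
    }
    return sink.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    // Placeholder standing in for serialized database/table/view metadata.
    byte[] payload = "placeholder-serialized-metadata".getBytes(StandardCharsets.UTF_8);
    byte[] testCaseFile = gzip(payload);
    byte[] restored = gunzip(testCaseFile);
    System.out.println("round trip intact: " + Arrays.equals(payload, restored));
  }
}
```

On the target cluster, the restored bytes would then be deserialized and added to the catalog cache; that step depends on the patch's Thrift definitions and is left out here.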
