Wenzhe Zhou has uploaded a new patch set (#2). (
http://gerrit.cloudera.org:8080/21304 )
Change subject: WIP IMPALA-12910: Support running TPCH/TPCDS queries for JDBC
tables
......................................................................
WIP IMPALA-12910: Support running TPCH/TPCDS queries for JDBC tables
This patch adds a script to create external JDBC tables for the TPCH
and TPCDS datasets, and adds unit tests that run TPCH and TPCDS queries
against external JDBC tables with Impala-Impala federation.
testdata/bin/create-tpc-jdbc-tables.py supports creating JDBC tables
for Impala-Impala, Postgres and MySQL.
The following sample commands create TPCDS JDBC tables for Impala-Impala
federation with the remote coordinator running at 10.19.10.86, and for a
Postgres server running at 10.19.10.86:
${IMPALA_HOME}/testdata/bin/create-tpc-jdbc-tables.py \
--jdbc_db_name=tpcds_jdbc --workload=tpcds \
--database_type=IMPALA --database_host=10.19.10.86 --clean
${IMPALA_HOME}/testdata/bin/create-tpc-jdbc-tables.py \
--jdbc_db_name=tpcds_jdbc --workload=tpcds \
--database_type=POSTGRES --database_host=10.19.10.86 \
--database_name=tpcds --clean
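For illustration only, the sample invocations above could be assembled programmatically, for example to cover both workloads in one loop. This is a minimal sketch, not part of the patch: the helper name build_command is hypothetical, and both the "<workload>_jdbc" database naming and the use of "tpch" as a --workload value are assumptions extrapolated from the TPCDS samples. Only flags that appear in the commands above are used, and nothing is executed.

```python
import os
import shlex


def build_command(workload, database_type, database_host, database_name=None):
    """Build the argument list for one create-tpc-jdbc-tables.py run.

    Hypothetical helper: it only mirrors the flags shown in the sample
    commands and does not run anything.
    """
    script = os.path.join(
        os.environ.get("IMPALA_HOME", "."),
        "testdata/bin/create-tpc-jdbc-tables.py")
    cmd = [
        script,
        "--jdbc_db_name=%s_jdbc" % workload,  # assumed naming convention
        "--workload=%s" % workload,
        "--database_type=%s" % database_type,
        "--database_host=%s" % database_host,
    ]
    # --database_name appears only in the Postgres sample, so it is optional.
    if database_name is not None:
        cmd.append("--database_name=%s" % database_name)
    cmd.append("--clean")
    return cmd


# Print one shell-quoted command line per workload for Impala-Impala federation.
for workload in ("tpch", "tpcds"):
    args = build_command(workload, "IMPALA", "10.19.10.86")
    print(" ".join(shlex.quote(a) for a in args))
```

The resulting list can be passed to subprocess.check_call() once the flag values have been confirmed against the script's own --help output.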
TODO:
- run TPCDS queries in exhaustive mode.
- set proper default values for maxTotal and maxWaitMillis of DBCP
configuration parameters.
Remaining Issues:
- tpcds-decimal_v2-q80a failed because the returned rows do not match
the expected results for some decimal values.
- The coordinator opens multiple JDBC connections in parallel for some
complex TPCDS queries with multiple DataSource scan nodes in the
query plan, so no connection is available from the connection
pool if maxTotal is less than the total number of scan nodes.
- maxWaitMillis is not working.
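The connection-pool issue above can be pictured with a toy model. The sketch below is not DBCP and not Impala code: ToyConnectionPool and open_all_scan_nodes are hypothetical names, and the pool is just a semaphore with a bounded wait standing in for maxTotal and maxWaitMillis. It assumes every scan node holds its connection until all scan nodes have opened theirs, which is the situation described for plans with several DataSource scan nodes.

```python
import threading


class ToyConnectionPool:
    """Toy stand-in for a bounded connection pool (not DBCP itself)."""

    def __init__(self, max_total, max_wait_millis):
        self._slots = threading.Semaphore(max_total)
        self._max_wait_s = max_wait_millis / 1000.0

    def borrow(self):
        # Wait up to max_wait_millis for a free slot, then give up.
        if not self._slots.acquire(timeout=self._max_wait_s):
            raise TimeoutError("no connection available from pool")
        return object()  # placeholder for a real JDBC connection

    def give_back(self, conn):
        self._slots.release()


def open_all_scan_nodes(pool, num_scan_nodes):
    """Borrow one connection per scan node and hold them all at once.

    Returns True if every scan node got a connection, False otherwise.
    """
    held = []
    try:
        for _ in range(num_scan_nodes):
            held.append(pool.borrow())
        return True
    except TimeoutError:
        return False
    finally:
        # Return whatever was borrowed so the pool can be reused.
        for conn in held:
            pool.give_back(conn)


# Three scan nodes fit in a pool of four, but not in a pool of two.
print(open_all_scan_nodes(ToyConnectionPool(4, 50), 3))
print(open_all_scan_nodes(ToyConnectionPool(2, 50), 3))
```

In this model the query can only proceed when maxTotal is at least the number of scan nodes holding connections at the same time, which is why a proper default for maxTotal is listed in the TODO.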
Testing:
- TODO Pass all TPCH/TPCDS queries.
Change-Id: I44e8c1bb020e90559c7f22483a7ab7a151b8f48a
---
M fe/src/main/java/org/apache/impala/extdatasource/jdbc/conf/JdbcStorageConfigManager.java
M fe/src/main/java/org/apache/impala/extdatasource/jdbc/dao/GenericJdbcDatabaseAccessor.java
M fe/src/main/java/org/apache/impala/extdatasource/jdbc/dao/JdbcRecordIterator.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M testdata/bin/create-load-data.sh
A testdata/bin/create-tpc-jdbc-tables.py
A testdata/datasets/tpcds/tpcds_jdbc_schema_template.sql
A testdata/datasets/tpch/tpch_jdbc_schema_template.sql
M tests/query_test/test_tpcds_queries.py
M tests/query_test/test_tpch_queries.py
10 files changed, 1,532 insertions(+), 7 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/21304/2
--
To view, visit http://gerrit.cloudera.org:8080/21304
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I44e8c1bb020e90559c7f22483a7ab7a151b8f48a
Gerrit-Change-Number: 21304
Gerrit-PatchSet: 2
Gerrit-Owner: Wenzhe Zhou <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>