Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/21304 )
Change subject: IMPALA-12910: Support running TPCH/TPCDS queries for JDBC tables ...................................................................... Patch Set 6: (10 comments) http://gerrit.cloudera.org:8080/#/c/21304/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21304/6//COMMIT_MSG@9 PS6, Line 9: This patch adds script to create external JDBC tables for the dataset of Please mention in the commit message how much additional disk space is needed for these tables in a development or test environment. http://gerrit.cloudera.org:8080/#/c/21304/6//COMMIT_MSG@12 PS6, Line 12: It fixes the race condition for the caching of SQL DataSource objects by suggestion: It fixes a race condition when caching SQL DataSource objects by using a new DataSourceObjectCache class... http://gerrit.cloudera.org:8080/#/c/21304/6//COMMIT_MSG@15 PS6, Line 15: java.sql.Connection.close() is not effectively to remove a closed suggestion: java.sql.Connection.close() fails to remove a closed connection from connection pool, which causes JDBC handler threads to wait for available connections from the connection pool for a long time. http://gerrit.cloudera.org:8080/#/c/21304/6/be/src/service/frontend.cc File be/src/service/frontend.cc: http://gerrit.cloudera.org:8080/#/c/21304/6/be/src/service/frontend.cc@99 PS6, Line 99: DEFINE_int32(dbcp_max_conn_pool_size, 8, These seem like they may need to be RDBMS-specific at some point. Maybe we could plan table properties to override them needed? Although that would reduce its effectiveness as a limit on user behavior. http://gerrit.cloudera.org:8080/#/c/21304/6/fe/src/main/java/org/apache/impala/extdatasource/jdbc/JdbcDataSource.java File fe/src/main/java/org/apache/impala/extdatasource/jdbc/JdbcDataSource.java: http://gerrit.cloudera.org:8080/#/c/21304/6/fe/src/main/java/org/apache/impala/extdatasource/jdbc/JdbcDataSource.java@214 PS6, Line 214: Connection connToBeClosed = null; This section is a little confusing, particularly what's going on when iterator_ is not null. Could you add a comment describing what's going on? http://gerrit.cloudera.org:8080/#/c/21304/6/fe/src/main/java/org/apache/impala/extdatasource/jdbc/conf/JdbcStorageConfigManager.java File fe/src/main/java/org/apache/impala/extdatasource/jdbc/conf/JdbcStorageConfigManager.java: http://gerrit.cloudera.org:8080/#/c/21304/6/fe/src/main/java/org/apache/impala/extdatasource/jdbc/conf/JdbcStorageConfigManager.java@102 PS6, Line 102: DatabaseType.valueOf(dbTypeName.toUpperCase()); Is this a fix for something? I don't think it's mentioned in the commit message. http://gerrit.cloudera.org:8080/#/c/21304/6/testdata/datasets/tpcds/tpcds_jdbc_schema_template.sql File testdata/datasets/tpcds/tpcds_jdbc_schema_template.sql: http://gerrit.cloudera.org:8080/#/c/21304/6/testdata/datasets/tpcds/tpcds_jdbc_schema_template.sql@22 PS6, Line 22: CREATE EXTERNAL TABLE IF NOT EXISTS {jdbc_db_name}.store_sales ( nit: Could we generate this file? I guess the TPCDS layout isn't likely to change often, so probably not a big deal to hardcode it. http://gerrit.cloudera.org:8080/#/c/21304/6/testdata/datasets/tpch/tpch_jdbc_schema_template.sql File testdata/datasets/tpch/tpch_jdbc_schema_template.sql: http://gerrit.cloudera.org:8080/#/c/21304/6/testdata/datasets/tpch/tpch_jdbc_schema_template.sql@22 PS6, Line 22: CREATE EXTERNAL TABLE IF NOT EXISTS {jdbc_db_name}.lineitem ( Same question about generating the file as tpcds. http://gerrit.cloudera.org:8080/#/c/21304/6/tests/query_test/test_tpcds_queries.py File tests/query_test/test_tpcds_queries.py: http://gerrit.cloudera.org:8080/#/c/21304/6/tests/query_test/test_tpcds_queries.py@776 PS6, Line 776: @SkipIfDockerizedCluster.runs_slowly Any idea why the Dockerized environment is particularly slow for this? http://gerrit.cloudera.org:8080/#/c/21304/6/tests/query_test/test_tpcds_queries.py@800 PS6, Line 800: self.run_test_case('tpcds-decimal_v2-q1', vector, use_db='tpcds_jdbc') Could this use the parameterized run like in test_tpch_queries.py? -- To view, visit http://gerrit.cloudera.org:8080/21304 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I44e8c1bb020e90559c7f22483a7ab7a151b8f48a Gerrit-Change-Number: 21304 Gerrit-PatchSet: 6 Gerrit-Owner: Wenzhe Zhou <[email protected]> Gerrit-Reviewer: Abhishek Rawat <[email protected]> Gerrit-Reviewer: Anonymous Coward <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: gaurav singh <[email protected]> Gerrit-Comment-Date: Fri, 19 Apr 2024 22:51:33 +0000 Gerrit-HasComments: Yes
