Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17842 )

Change subject: IMPALA-5741: Initial support for reading tiny RDBMS tables
......................................................................


Patch Set 23:

(16 comments)

http://gerrit.cloudera.org:8080/#/c/17842/22//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17842/22//COMMIT_MSG@11
PS22, Line 11: It has some limitations due to the restrictions of "external data
             : source":
> nit: I assume this patch is also still limited by IMPALA-7131 (does not wor
mentioned IMPALA-7131 and IMPALA-12375


http://gerrit.cloudera.org:8080/#/c/17842/22//COMMIT_MSG@33
PS22, Line 33: c drivers and the da
> nit: General comment about naming: instead of referring things with *data-s
renamed the scripts as suggested.


http://gerrit.cloudera.org:8080/#/c/17842/22//COMMIT_MSG@39
PS22, Line 39:
> nit: This can be removed? impala-shell will connect to first impalad.
removed


http://gerrit.cloudera.org:8080/#/c/17842/22/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/JdbcDataSource.java
File 
java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/JdbcDataSource.java:

http://gerrit.cloudera.org:8080/#/c/17842/22/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/JdbcDataSource.java@273
PS22, Line 273:     String project;
> Is quoting added somewhere? Postgres will convert unquoted capitals to lowe
The case sensitive column names must be provided in column mapping property in 
create table statement since Impala is case in-sensitive and column names are 
save as low case. The column names in column mapping property will be quoted 
with double quotes if the name is not quoted, and jdbc table name will be 
quoted if the column mapping is not empty.
See sample create table statement in 
testdata/bin/create-ext-data-source-table.sql


http://gerrit.cloudera.org:8080/#/c/17842/22/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/GenericJdbcDatabaseAccessor.java
File 
java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/GenericJdbcDatabaseAccessor.java:

http://gerrit.cloudera.org:8080/#/c/17842/22/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/GenericJdbcDatabaseAccessor.java@96
PS22, Line 96:       // TODO: If a target database cannot flatten this view 
query, try to text
> Are we sure all of the target databases will flatten this view query? Other
This function is replicated from Hive JDBC Storage Handler. I think it had been 
tested. We did not find issue in Postgres and MySql so far. Add TODO comment 
and will revisit this code when finding issue on a target database.


http://gerrit.cloudera.org:8080/#/c/17842/22/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/GenericJdbcDatabaseAccessor.java@285
PS22, Line 285:     // by calling BasicDataSource.setDriverClassLoader.
> Are these database-specific?
These are configurations for dbcp (Database Connection Pools) component, and 
not database-specific.


http://gerrit.cloudera.org:8080/#/c/17842/22/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/JdbcRecordIterator.java
File 
java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/JdbcRecordIterator.java:

http://gerrit.cloudera.org:8080/#/c/17842/22/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/JdbcRecordIterator.java@86
PS22, Line 86:     try {
> This should re-throw
fixed


http://gerrit.cloudera.org:8080/#/c/17842/22/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/JdbcRecordIterator.java@175
PS22, Line 175:   public void close() throws JdbcDatabaseAccessException {
> probably should re-throw
fixed


http://gerrit.cloudera.org:8080/#/c/17842/22/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/JethroDatabaseAccessor.java
File 
java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/JethroDatabaseAccessor.java:

http://gerrit.cloudera.org:8080/#/c/17842/22/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/JethroDatabaseAccessor.java@39
PS22, Line 39:     return "Select * from (" + sql + ") as \"tmp\" limit " + 
limit;
> This may not always flatten well. consider text replacement instead.
This class is replicated from Hive JDBC Storage Handler. I guess it should be 
working. I will leave it as is until we have chance to run test again 
JethroData database.


http://gerrit.cloudera.org:8080/#/c/17842/22/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/util/QueryConditionUtil.java
File 
java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/util/QueryConditionUtil.java:

http://gerrit.cloudera.org:8080/#/c/17842/22/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/util/QueryConditionUtil.java@54
PS22, Line 54:         joiner.add(String.format("%s %s %s", name, op, value));
> May need quoting here.
jdbc column names in columnMapping are quoted.


http://gerrit.cloudera.org:8080/#/c/17842/22/testdata/bin/copy-data-sources.sh
File testdata/bin/copy-data-sources.sh:

http://gerrit.cloudera.org:8080/#/c/17842/22/testdata/bin/copy-data-sources.sh@28
PS22, Line 28:
> nit: Contain this path into its own bash variable and use that for the foll
defined variables EXT_DATA_SOURCES_PATH and JDBC_DRIVERS_PATH


http://gerrit.cloudera.org:8080/#/c/17842/22/testdata/bin/copy-data-sources.sh@40
PS22, Line 40:
             :
             :
> nit: print something at the end of this script to indicate successful run?
Added printing messages as suggested


http://gerrit.cloudera.org:8080/#/c/17842/23/testdata/bin/copy-ext-data-sources.sh
File testdata/bin/copy-ext-data-sources.sh:

http://gerrit.cloudera.org:8080/#/c/17842/23/testdata/bin/copy-ext-data-sources.sh@39
PS23, Line 39: echo "Copied" 
${IMPALA_HOME}/java/ext-data-source/test/target/impala-data-source-test-*.jar \
> line too long (93 > 90)
fixed


http://gerrit.cloudera.org:8080/#/c/17842/23/testdata/bin/copy-ext-data-sources.sh@46
PS23, Line 46: echo "Copied" 
${IMPALA_HOME}/java/ext-data-source/jdbc/target/impala-data-source-jdbc-*.jar \
> line too long (93 > 90)
fixed


http://gerrit.cloudera.org:8080/#/c/17842/22/testdata/bin/create-data-source-table.sql
File testdata/bin/create-data-source-table.sql:

http://gerrit.cloudera.org:8080/#/c/17842/22/testdata/bin/create-data-source-table.sql@56
PS22, Line 56:
> Add tests that require quoting
Added a Postgres table with case sensitive column names and table name. Made 
alltypes_jdbc_datasource_2 point to that new Postgres table. Added 
column.mapping table property to map columns between Impala and Postgres. Added 
unit-test to read the new Postgres's table.


http://gerrit.cloudera.org:8080/#/c/17842/22/testdata/workloads/functional-query/queries/QueryTest/jdbc-data-source.test
File 
testdata/workloads/functional-query/queries/QueryTest/jdbc-data-source.test:

http://gerrit.cloudera.org:8080/#/c/17842/22/testdata/workloads/functional-query/queries/QueryTest/jdbc-data-source.test@28
PS22, Line 28: 
11,false,1,1,1,10,1.100000023841858,10.1,'01/02/09','1',2009-01-02 
00:11:00.450000000
             : 
12,true,2,2,2,20,2.200000047683716,20.2,'01/02/09','2',2009-01-02 
00:12:00.460000000
             : 
13,false,3,3,3,30,3.299999952316284,30.3,'01/02/09','3',2009-01-02 
00:13:00.480000000
             : 
14,true,4,4,4,40,4.400000095367432,40.4,'01/02/09','4',2009-01-02 
00:14:00.510000000
             : 20,true,0,0,0,0,0,0,'01/03/09','0',2009-01-03 00:20:00.900000000
> Tried to run the test myself and I get following error:
This is time zone issue. jdbc driver use system time zone by default, but 
Impala use UTC by default. Fixed the issue when reading the value of timestamp 
column from RecordSet by calling ResultSet.getTimestamp() with UTC time zone.



--
To view, visit http://gerrit.cloudera.org:8080/17842
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8244e978c7717c6f1452f66f1630b6441392e7d2
Gerrit-Change-Number: 17842
Gerrit-PatchSet: 23
Gerrit-Owner: Fucun Chu <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Fucun Chu <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>
Gerrit-Comment-Date: Wed, 04 Oct 2023 03:48:07 +0000
Gerrit-HasComments: Yes

Reply via email to