Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17842 )

Change subject: IMPALA-5741: Initial support for reading tiny RDBMS tables
......................................................................


Patch Set 22:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/17842/21//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17842/21//COMMIT_MSG@11
PS21, Line 11: It has some limitations due to the restrictions of "external data
> There also seems to be restrictions on column data types being received fro
Right. Updated commit messages to add restriction for column data type.


http://gerrit.cloudera.org:8080/#/c/17842/21/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/JdbcDataSource.java
File 
java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/JdbcDataSource.java:

http://gerrit.cloudera.org:8080/#/c/17842/21/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/JdbcDataSource.java@148
PS21, Line 148:     // 1. Check init string again because the call in prepare() 
was from
              :     // the frontend and used a different instance of this 
JdbcDataSource class.
              :     if 
(!convertInitStringToConfiguration(params.getInit_string())) {
              :       return new TOpenResult(
              :           new TStatus(TErrorCode.INTERNAL_ERROR,
              :
> This is the same check as one in prepare().
This logic is copied from our sample test code AllTypesDataSource.java 
(https://github.com/apache/impala/blob/master/java/ext-data-source/test/src/main/java/org/apache/impala/extdatasource/AllTypesDataSource.java#L179-L181).
 There is a comment in AllTypesDataSource.open(), which says "Need to check 
again because the call in Prepare() was from the frontend and used a different 
instance of this data source class."
The prepare() API is called by Impala frontend, open() API is called by Impala 
backend. Frontend and backend create separate  JdbcDataSource objects. We 
duplicate checking in prepare() and open() for safety. I think we should change 
external data source mechanism to share JdbcDataSource objects between frontend 
and backend for the long term, then we can remove these duplicated code.


http://gerrit.cloudera.org:8080/#/c/17842/21/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/conf/JdbcStorageConfigManager.java
File 
java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/conf/JdbcStorageConfigManager.java:

http://gerrit.cloudera.org:8080/#/c/17842/21/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/conf/JdbcStorageConfigManager.java@95
PS21, Line 95:   private static boolean isEmptyString(String value) {
> We don't seem to be calling this function anywhere?
This class is replicated from Hive JDBC Storage Handler. I think we should keep 
the change to a minimum so that we can easily port from Hive again if there is 
any improvement or bug fixing in Hive JDBC Storage Handler.


http://gerrit.cloudera.org:8080/#/c/17842/21/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/GenericJdbcDatabaseAccessor.java
File 
java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/GenericJdbcDatabaseAccessor.java:

http://gerrit.cloudera.org:8080/#/c/17842/21/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/GenericJdbcDatabaseAccessor.java@66
PS21, Line 66:   protected DataSource dbcpDataSource = null;
> `dbcpDataSource_` for instance variables
This class is replicated from Hive JDBC Storage Handler. I think we should keep 
the original naming convention.


http://gerrit.cloudera.org:8080/#/c/17842/21/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/GenericJdbcDatabaseAccessor.java@68
PS21, Line 68:   public static Cache<String, DataSource> dataSourceCache = 
CacheBuilder
> naming convention: `dataSourceCache_`
This class is replicated from Hive JDBC Storage Handler. I think we should keep 
the original naming convention.


http://gerrit.cloudera.org:8080/#/c/17842/21/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/JdbcRecordIterator.java
File 
java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/JdbcRecordIterator.java:

http://gerrit.cloudera.org:8080/#/c/17842/21/java/ext-data-source/jdbc/src/main/java/org/apache/impala/extdatasource/jdbc/dao/JdbcRecordIterator.java@53
PS21, Line 53:   private final Connection conn;
> proper naming convention for all instance variables.
This class is replicated from Hive JDBC Storage Handler. I think we should keep 
the original naming convention.


http://gerrit.cloudera.org:8080/#/c/17842/21/testdata/workloads/functional-query/queries/QueryTest/data-source-tables.test
File 
testdata/workloads/functional-query/queries/QueryTest/data-source-tables.test:

http://gerrit.cloudera.org:8080/#/c/17842/21/testdata/workloads/functional-query/queries/QueryTest/data-source-tables.test@132
PS21, Line 132:
move tests for jdbc data source to a new file.



--
To view, visit http://gerrit.cloudera.org:8080/17842
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8244e978c7717c6f1452f66f1630b6441392e7d2
Gerrit-Change-Number: 17842
Gerrit-PatchSet: 22
Gerrit-Owner: Fucun Chu <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Fucun Chu <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>
Gerrit-Comment-Date: Mon, 02 Oct 2023 05:56:29 +0000
Gerrit-HasComments: Yes

Reply via email to