[
https://issues.apache.org/jira/browse/HAWQ-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16007321#comment-16007321
]
Michael Andre Pearce (IG) commented on HAWQ-1445:
-------------------------------------------------
[~PivotalGuru] thanks, I've split these out to three subtasks, so it can be
worked on iteratively/independantly.
I notice also, there is raised:
https://issues.apache.org/jira/browse/HAWQ-1461
Id like to work on HAWQ-1461 first, then probably the config file one next.
btw Anyone know how we make that a sub-task of this, so we can keep all
JDBC-PXF improvements linked for the moment.
> PXF JDBC Security, Extra Props, and Max Queries
> -----------------------------------------------
>
> Key: HAWQ-1445
> URL: https://issues.apache.org/jira/browse/HAWQ-1445
> Project: Apache HAWQ
> Issue Type: Improvement
> Components: PXF
> Reporter: Jon Roberts
> Assignee: Ed Espino
>
> 1) Security
> The example has the username and password in the connection string.
> LOCATION ('pxf://localhost:51200/demodb.myclass'
> '?PROFILE=JDBC'
> '&JDBC_DRIVER=com.mysql.jdbc.Driver'
>
> '&DB_URL=jdbc:mysql://192.168.200.6:3306/demodb&USER=root&PASS=root'
> )
> This creates security issue because anyone that can connect to the database
> will be able to see the username and password of the JDBC connection.
> I suggest changing the URL to a connection profile that points to a file
> outside of the database. For Greenplum database and S3, the LOCATION syntax
> includes "config=/path/to/config_file". The config_file contains the S3
> credentials. This seems like a good pattern to follow here too.
> 2) Extra Properties
> Some JDBC drivers will need many additional properties beyond the URL and
> this requires setting it with a put to a Properties variable. An example of
> this is Oracle's defaultRowPrefetch property that needs to be updated from
> the default of 10 which is designed for OLTP to something larger like 2000
> which is more ideal for data extracts.
> Additionally, you will need the ability to set the isolation level which is
> done with setTransactionIsolation on the Connection. I don't believe you can
> set this on the connection URL either. Many SQL Server and DB2 database
> still don't use snapshot isolation and use dirty reads instead to prevent
> blocking locks. The configuration file I suggested above will need an "extra
> properties" variable that is a delimited list of key/value pairs so you can
> add multiple extra properties.
> 3) Max Queries
> The external table definition doesn't limit how many concurrent queries can
> be executed on the remote server. It would be pretty simple to create a
> single external table using PXF JDBC that would issue thousands of concurrent
> queries to a single source database when doing a single SELECT in HAWQ.
> Initially, we should add a max_queries variable to the configuration file
> that I'm suggesting, that will reject queries from proceeding when a greater
> number of PXF instances are being requested than the max_queries variable.
> Longer term, we should implement a queueing system so we can support external
> tables that partitions data from the source at a very small grain but without
> killing the source database.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)