[ 
https://issues.apache.org/jira/browse/HAWQ-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Radar Lei reassigned HAWQ-1445:
-------------------------------

    Assignee: Oleksandr Diachenko  (was: Radar Lei)

> PXF JDBC Security, Extra Props, and Max Queries
> -----------------------------------------------
>
>                 Key: HAWQ-1445
>                 URL: https://issues.apache.org/jira/browse/HAWQ-1445
>             Project: Apache HAWQ
>          Issue Type: Improvement
>          Components: PXF
>            Reporter: Jon Roberts
>            Assignee: Oleksandr Diachenko
>
> 1) Security
> The example has the username and password in the connection string.  
> LOCATION ('pxf://localhost:51200/demodb.myclass'
>               '?PROFILE=JDBC'
>               '&JDBC_DRIVER=com.mysql.jdbc.Driver'
>               
> '&DB_URL=jdbc:mysql://192.168.200.6:3306/demodb&USER=root&PASS=root'
>               )
> This creates  security issue because anyone that can connect to the database 
> will be able to see the username and password of the JDBC connection.
> I suggest changing the URL to a connection profile that points to a file 
> outside of the database.  For Greenplum database and S3, the LOCATION syntax 
> includes "config=/path/to/config_file".  The config_file contains the S3 
> credentials.  This seems like a good pattern to follow here too.
> 2) Extra Properties
> Some JDBC drivers will need many additional properties beyond the URL and 
> this requires setting it with a put to a Properties variable.  An example of 
> this is Oracle's defaultRowPrefetch property that needs to be updated from 
> the default of 10 which is designed for OLTP to something larger like 2000 
> which is more ideal for data extracts.  
> Additionally, you will need the ability to set the isolation level which is 
> done with setTransactionIsolation on the Connection.  I don't believe you can 
> set this on the connection URL either.  Many SQL Server and DB2 database 
> still don't use snapshot isolation and use dirty reads instead to prevent 
> blocking locks.  The configuration file I suggested above will need an "extra 
> properties" variable that is a delimited list of key/value pairs so you can 
> add multiple extra properties.
> 3) Max Queries
> The external table definition doesn't limit how many concurrent queries can 
> be executed on the remote server.  It would be pretty simple to create a 
> single external table using PXF JDBC that would issue thousands of concurrent 
> queries to a single source database when doing a single SELECT in HAWQ.
> Initially, we should add a max_queries variable to the configuration file 
> that I'm suggesting, that will reject queries from proceeding when a greater 
> number of PXF instances are being requested than the max_queries variable.  
> Longer term, we should implement a queueing system so we can support external 
> tables that partitions data from the source at a very small grain but without 
> killing the source database.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to