[ https://issues.apache.org/jira/browse/HAWQ-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Radar Lei reassigned HAWQ-1445: ------------------------------- Assignee: Oleksandr Diachenko (was: Radar Lei) > PXF JDBC Security, Extra Props, and Max Queries > ----------------------------------------------- > > Key: HAWQ-1445 > URL: https://issues.apache.org/jira/browse/HAWQ-1445 > Project: Apache HAWQ > Issue Type: Improvement > Components: PXF > Reporter: Jon Roberts > Assignee: Oleksandr Diachenko > > 1) Security > The example has the username and password in the connection string. > LOCATION ('pxf://localhost:51200/demodb.myclass' > '?PROFILE=JDBC' > '&JDBC_DRIVER=com.mysql.jdbc.Driver' > > '&DB_URL=jdbc:mysql://192.168.200.6:3306/demodb&USER=root&PASS=root' > ) > This creates security issue because anyone that can connect to the database > will be able to see the username and password of the JDBC connection. > I suggest changing the URL to a connection profile that points to a file > outside of the database. For Greenplum database and S3, the LOCATION syntax > includes "config=/path/to/config_file". The config_file contains the S3 > credentials. This seems like a good pattern to follow here too. > 2) Extra Properties > Some JDBC drivers will need many additional properties beyond the URL and > this requires setting it with a put to a Properties variable. An example of > this is Oracle's defaultRowPrefetch property that needs to be updated from > the default of 10 which is designed for OLTP to something larger like 2000 > which is more ideal for data extracts. > Additionally, you will need the ability to set the isolation level which is > done with setTransactionIsolation on the Connection. I don't believe you can > set this on the connection URL either. Many SQL Server and DB2 database > still don't use snapshot isolation and use dirty reads instead to prevent > blocking locks. The configuration file I suggested above will need an "extra > properties" variable that is a delimited list of key/value pairs so you can > add multiple extra properties. > 3) Max Queries > The external table definition doesn't limit how many concurrent queries can > be executed on the remote server. It would be pretty simple to create a > single external table using PXF JDBC that would issue thousands of concurrent > queries to a single source database when doing a single SELECT in HAWQ. > Initially, we should add a max_queries variable to the configuration file > that I'm suggesting, that will reject queries from proceeding when a greater > number of PXF instances are being requested than the max_queries variable. > Longer term, we should implement a queueing system so we can support external > tables that partitions data from the source at a very small grain but without > killing the source database. -- This message was sent by Atlassian JIRA (v6.4.14#64029)