https://github.com/apache/incubator-hawq/tree/master/pxf/pxf-jdbc
Issue 1: Security
The example has the username and password in the connection string.
LOCATION ('pxf://localhost:51200/demodb.myclass'
'?PROFILE=JDBC'
'&JDBC_DRIVER=com.mysql.jdbc.Driver'
'&DB_URL=jdbc:mysql://
192.168.200.6:3306/demodb&USER=root&PASS=root'
)
Any chance we can get this changed to a connection profile that points to a
file outside of the database? For Greenplum database and S3, the LOCATION
syntax includes "config=/path/to/config_file". The config_file contains
the S3 credentials. This seems like a good pattern to follow.
As it is right now, anyone that can connect to the database will be able to
see the username and password of the JDBC connection.
Issue 2: Extra Properties
Some JDBC drivers will need many additional properties beyond the URL and
this requires setting it with a put to a Properties variable. An example
of this is Oracle's defaultRowPrefetch property that needs to be updated
from the default of 10 which is designed for OLTP to something larger like
2000 which is more ideal for data extracts.
Additionally, you will need the ability to set the isolation level which is
done with setTransactionIsolation on the Connection. I don't believe you
can set this on the connection URL either. Many SQL Server and DB2
database still don't use snapshot isolation and use dirty reads instead.
Without a dirty read here, your query will block modifications to the table
with a "blocking lock".
So for the configuration file will need both an extra properties variable
that is delimited so you can multiple and a isolation level indicator.
Questions.
1. How do you manage the maximum number of pxf instances? For example, if
you partition like this:
"PARTITION_BY=cdate:date&RANGE=2008-01-01:2010-01-01&INTERVAL=1:day", will
you create 730 pxf instances and thus 730 concurrent queries to the source
database?
2. Have you tested special characters like null, escape, carriage return,
newline, and your delimiter? Maybe that is handled automatically by
pxfwritable_import so this isn't an issue.
Jon Roberts
Principal Engineer | [email protected] | 615-426-8661
On Mon, Apr 3, 2017 at 3:18 PM, Michael Pearce <[email protected]>
wrote:
> I mean the readme file :) at that location.
> ________________________________________
> From: Michael Pearce
> Sent: Monday, April 3, 2017 9:15:48 PM
> To: [email protected]
> Subject: Re: PXF JDBC plugin
>
> On master there is the document that was added.
>
> https://github.com/apache/incubator-hawq/tree/master/pxf/pxf-jdbc
>
> Reading this I think answers some of the questions.
>
>
>
>
>
> Sent using OWA for iPhone
> ________________________________________
> From: Vineet Goel <[email protected]>
> Sent: Monday, April 3, 2017 6:33:07 PM
> To: [email protected]
> Subject: Re: PXF JDBC plugin
>
> Devin and others,
>
> It would be great if you can expand a bit more on the implementation or
> point to a document?
>
> Thanks!
>
>
>
> On Mon, Apr 3, 2017 at 7:53 AM Jon Roberts <[email protected]> wrote:
>
> > JDBC PXF is pretty exciting. Are there details on how this works?
> >
> > Does PXF randomly pick a segment to connect to the JDBC source or do you
> > specify a particular node to execute the query? I'm assuming it isn't a
> > parallel query so you don't flood the JDBC source with x number of
> "select
> > *" queries but I could be wrong.
> >
> > How do you register the JDBC connection string with PXF?
> >
> > How do you manage credentials?
> >
> > How do you register the JDBC jar files?
> >
> >
> > Jon Roberts
> >
> > On Sat, Apr 1, 2017 at 5:05 PM, Ed Espino <[email protected]> wrote:
> >
> > > Thank you Michael, this is a well-deserved report highlight. I have
> added
> > > the following to the report:
> > >
> > > 3. Community contribution highlight(s):
> > >
> > > * Leveraging the extensible PXF design, a JDBC PXF plugin
> > > was contributed by Devin Jia (github id: jiadexin). This
> > > contribution came from the community and not from the
> > > company which originally donated HAWQ to the ASF.
> > >
> > > Warm regrds,
> > > -=e
> > >
> > > On Sat, Apr 1, 2017 at 1:14 PM, Michael André Pearce <
> > > [email protected]> wrote:
> > >
> > > >
> > > > I would add note about the new JDBC PXF plugin being contributed by
> > Devin
> > > > Jia. (Kudos to him)
> > > >
> > > > It's a good show of the community contributing and growing outside of
> > > just
> > > > Pivotal staff (I think for incubator review this is important) and
> also
> > > the
> > > > benefit of the extensible PXF design the project has done.
> >
> The information contained in this email is strictly confidential and for
> the use of the addressee only, unless otherwise indicated. If you are not
> the intended recipient, please do not read, copy, use or disclose to others
> this message or any attachment. Please also notify the sender by replying
> to this email or by telephone (+44(020 7896 0011) and then delete the email
> and any copies of it. Opinions, conclusion (etc) that do not relate to the
> official business of this company shall be understood as neither given nor
> endorsed by it. IG is a trading name of IG Markets Limited (a company
> registered in England and Wales, company number 04008957) and IG Index
> Limited (a company registered in England and Wales, company number
> 01190902). Registered address at Cannon Bridge House, 25 Dowgate Hill,
> London EC4R 2YA. Both IG Markets Limited (register number 195355) and IG
> Index Limited (register number 114059) are authorised and regulated by the
> Financial Conduct Authority.
>