[ https://issues.apache.org/jira/browse/HAWQ-1435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16066789#comment-16066789 ]
ASF GitHub Bot commented on HAWQ-1435: -------------------------------------- Github user dyozie commented on a diff in the pull request: https://github.com/apache/incubator-hawq-docs/pull/124#discussion_r124582946 --- Diff: markdown/pxf/JdbcPXF.html.md.erb --- @@ -0,0 +1,213 @@ +--- +title: Accessing External SQL Databases with JDBC (Beta) +--- + +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. +--> + +Some of your data may already reside in an external SQL database. The PXF JDBC plug-in reads data stored in SQL databases including MySQL, ORACLE, PostgreSQL, and Hive. + +This section describes how to use PXF with JDBC, including an example of creating and querying an external table that accesses data in a MySQL database table. + +## <a id="jdbcplugin_prereq"></a>Prerequisites + +Before accessing external SQL databases using HAWQ and PXF, ensure that: + +- The JDBC plug-in is installed on all cluster nodes. See [Installing PXF Plug-ins](InstallPXFPlugins.html) for PXF plug-in installation information. +- The JDBC driver JAR files for the external SQL database are installed on all cluster nodes. +- The file locations of external SQL database JDBC JAR files are added to `pxf-public.classpath`. If you manage your HAWQ cluster with Ambari, add the JARS via the Ambari UI. If you managed your cluster from the command line, edit the `/etc/pxf/conf/pxf-public.classpath` file directly. + + +## <a id="hdfsplugin_queryextdata"></a>Querying External SQL Data +The PXF JDBC plug-in supports the single profile named `Jdbc`. + +Use the following syntax to create a HAWQ external table representing external SQL database tables you access via JDBC: + +``` sql +CREATE [READABLE | WRITABLE] EXTERNAL TABLE <table_name> + ( <column_name> <data_type> [, ...] | LIKE <other_table> ) +LOCATION ('pxf://<host>[:<port>]/<jdbc-schema-name>.<jdbc-database-name>.<jdbc-table-name> + ?PROFILE=Jdbc[&<custom-option>=<value>[...]]') +FORMAT 'CUSTOM' (FORMATTER='pxfwritable_import'); +``` + +JDBC-plug-in-specific keywords and values used in the [CREATE EXTERNAL TABLE](../reference/sql/CREATE-EXTERNAL-TABLE.html) call are described in the table below. + +| Keyword | Value | +|-------|-------------------------------------| +| \<column-name\> | The name of the PXF external table column. The PXF \<column-name\> must exactly match the \<column-name\> used in the external SQL table.| +| \<data-type\> | The data type of the PXF external table column. The PXF \<data-type\> must be equivalent to the data type used for \<column-name\> in the SQL table.| +| \<host\> | The PXF host. While \<host\> may identify any PXF agent node, use the HDFS NameNode as it is guaranteed to be available in a running HDFS cluster. If HDFS High Availability is enabled, \<host\> must identify the HDFS NameService. | +| \<port\> | The PXF port. If \<port\> is omitted, PXF assumes \<host\> identifies a High Availability HDFS Nameservice and connects to the port number designated by the `pxf_service_port` server configuration parameter value. Default is 51200. | +| \<jdbc-schema-name\> | The schema name. The default schema name is `default`. | +| \<jdbc-database-name\> | The database name. The default database name is determined by the external SQL server. | +| \<jdbc-table-name\> | The table name. | +| PROFILE | The `PROFILE` keyword must specify `Jdbc`. | +| \<custom-option\> | The custom options supported by the `Jdbc` profile are discussed later in this section.| +| FORMAT 'CUSTOM' | The JDBC `CUSTOM` `FORMAT` supports only the built-in `'pxfwritable_import'` `FORMATTER` property. | + +*Note*: When creating PXF external tables, you cannot use the `HEADER` option in your `FORMAT` specification. + + +### <a id="profile_jdbcoptipns"></a>JDBC Custom Options + +You may include one or more custom options in the `LOCATION` URI. Preface each option with an ampersand `&`. + +The JDBC plug-in `Jdbc` profile supports the following \<custom-option\>s: + +| Option Name | Description +|---------------|--------------------| +| JDBC_DRIVER | The JDBC driver class name. | +| DB_URL | The URL to the database; includes the hostname, port, and database name. | +| USER | The database user name. | +| PASS | The database password for USER. | +| PARTITION_BY | The partition column, \<column-name\>:\<column-type\>. The JDBC plug-in supports `date`, `int`, and `enum` \<column-type\>s. Use the `yyyy-MM-dd` format for the `date` \<column-type\>. A null `PARTITION_BY` defaults to a single fragment. | +| RANGE | The query range, \<start-value\>[:\<end-value\>]. \<end-value\> may be empty for an `int` \<column-type\>. | --- End diff -- Should probably clarify that RANGE and INTERVAL are only used with PARTITION_BY? > docs - add usage info for pxf jdbc plug-in > ------------------------------------------ > > Key: HAWQ-1435 > URL: https://issues.apache.org/jira/browse/HAWQ-1435 > Project: Apache HAWQ > Issue Type: Improvement > Components: Documentation > Reporter: Lisa Owen > Assignee: David Yozie > > create usage info for the new jdbc plug-in. there is some good info in the > pxf-jdbc README.md. -- This message was sent by Atlassian JIRA (v6.4.14#64029)