Repository: incubator-hawq-docs
Updated Branches:
  refs/heads/master 458ef8ea4 -> 40d9a0b6d


HAWQ-1435 document new pxf jdbc plug-in (closes #124)


Project: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/commit/40d9a0b6
Tree: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/tree/40d9a0b6
Diff: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/diff/40d9a0b6

Branch: refs/heads/master
Commit: 40d9a0b6d27ab37d831827e9e845f3d5d8b0040b
Parents: 458ef8e
Author: Lisa Owen <[email protected]>
Authored: Wed Jun 28 09:43:56 2017 -0700
Committer: David Yozie <[email protected]>
Committed: Wed Jun 28 09:43:56 2017 -0700

----------------------------------------------------------------------
 .../source/subnavs/apache-hawq-nav.erb          |   1 +
 .../pxf/HawqExtensionFrameworkPXF.html.md.erb   |   4 +
 markdown/pxf/InstallPXFPlugins.html.md.erb      |   4 +-
 markdown/pxf/JdbcPXF.html.md.erb                | 211 +++++++++++++++++++
 4 files changed, 218 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/40d9a0b6/book/master_middleman/source/subnavs/apache-hawq-nav.erb
----------------------------------------------------------------------
diff --git a/book/master_middleman/source/subnavs/apache-hawq-nav.erb b/book/master_middleman/source/subnavs/apache-hawq-nav.erb
index f18edef..85eb424 100644
--- a/book/master_middleman/source/subnavs/apache-hawq-nav.erb
+++ b/book/master_middleman/source/subnavs/apache-hawq-nav.erb
@@ -398,6 +398,7 @@
              <li><a href="/docs/userguide/2.2.0.0-incubating/pxf/HivePXF.html">Accessing Hive Data</a></li>
              <li><a href="/docs/userguide/2.2.0.0-incubating/pxf/HBasePXF.html">Accessing HBase Data</a></li>
              <li><a href="/docs/userguide/2.2.0.0-incubating/pxf/JsonPXF.html">Accessing JSON Data</a></li>
+              <li><a href="/docs/userguide/2.2.0.0-incubating/pxf/JdbcPXF.html">Accessing External SQL Databases</a></li>
              <li><a href="/docs/userguide/2.2.0.0-incubating/pxf/HDFSWritablePXF.html">Writing Data to HDFS</a></li>
              <li><a href="/docs/userguide/2.2.0.0-incubating/pxf/ReadWritePXF.html">Using Profiles to Read and Write Data</a></li>
              <li><a href="/docs/userguide/2.2.0.0-incubating/pxf/PXFExternalTableandAPIReference.html">PXF External Tables and API</a></li>

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/40d9a0b6/markdown/pxf/HawqExtensionFrameworkPXF.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/pxf/HawqExtensionFrameworkPXF.html.md.erb b/markdown/pxf/HawqExtensionFrameworkPXF.html.md.erb
index 1a49576..8d82d8a 100644
--- a/markdown/pxf/HawqExtensionFrameworkPXF.html.md.erb
+++ b/markdown/pxf/HawqExtensionFrameworkPXF.html.md.erb
@@ -51,6 +51,10 @@ PXF allows users to create custom connectors to access other parallel data store
 
     This topic describes how to access JSON data using PXF.
 
+-   **[Accessing External SQL Databases](../pxf/JdbcPXF.html)**
+
+    This topic describes how to access data in external SQL databases using PXF.
+
 -   **[Writing Data to HDFS](../pxf/HDFSWritablePXF.html)**
 
     This topic describes how to write to HDFS using PXF.

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/40d9a0b6/markdown/pxf/InstallPXFPlugins.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/pxf/InstallPXFPlugins.html.md.erb b/markdown/pxf/InstallPXFPlugins.html.md.erb
index 4fbda74..8060e5e 100644
--- a/markdown/pxf/InstallPXFPlugins.html.md.erb
+++ b/markdown/pxf/InstallPXFPlugins.html.md.erb
@@ -21,7 +21,7 @@ specific language governing permissions and limitations
 under the License.
 -->
 
-This topic describes how to install the built-in PXF service plug-ins that are required to connect PXF to HDFS, Hive, HBase, and JSON.
+This topic describes how to install the built-in PXF service plug-ins that are required to connect PXF to HDFS, Hive, HBase, JDBC, and JSON.
 
 **Note:** PXF requires that you run Tomcat on the host machine. Tomcat reserves ports 8005, 8080, and 8009. If you have configured Oozie JMX reporting on a host that will run PXF, make sure that the reporting service uses a port other than 8005. This helps to prevent port conflict errors from occurring when you start the PXF service.
 
@@ -48,7 +48,7 @@ Each PXF service plug-in resides in its own RPM.  You may have built these RPMs
 
 Perform the following steps on **_each_** node in your cluster to install PXF:
 
-1. Install the PXF software, including Apache, the PXF service, and all PXF plug-ins: HDFS, HBase, Hive, JSON:
+1. Install the PXF software, including Apache Tomcat, the PXF service, and all PXF plug-ins: HDFS, HBase, Hive, JDBC, JSON:
 
     ```shell
     $ sudo yum install -y pxf

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/40d9a0b6/markdown/pxf/JdbcPXF.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/pxf/JdbcPXF.html.md.erb b/markdown/pxf/JdbcPXF.html.md.erb
new file mode 100644
index 0000000..337de66
--- /dev/null
+++ b/markdown/pxf/JdbcPXF.html.md.erb
@@ -0,0 +1,211 @@
+---
+title: Accessing External SQL Databases with JDBC (Beta)
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+Some of your data may already reside in an external SQL database. The PXF JDBC plug-in reads data stored in SQL databases including MySQL, Oracle, PostgreSQL, and Hive.
+
+This section describes how to use PXF with JDBC, including an example of creating and querying an external table that accesses data in a MySQL database table.
+
+## <a id="jdbcplugin_prereq"></a>Prerequisites
+
+Before accessing external SQL databases using HAWQ and PXF, ensure that:
+
+-   The JDBC plug-in is installed on all cluster nodes. See [Installing PXF Plug-ins](InstallPXFPlugins.html) for PXF plug-in installation information.
+-   The JDBC driver JAR files for the external SQL database are installed on all cluster nodes.
+-   The file locations of the external SQL database JDBC JAR files are added to `pxf-public.classpath`. If you manage your HAWQ cluster with Ambari, add the JARs via the Ambari UI. If you manage your cluster from the command line, edit the `/etc/pxf/conf/pxf-public.classpath` file directly. (An example entry follows this list.)
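+
+For example, an entry referencing a MySQL connector JAR installed in `/usr/share/java` might look like the following; the location is an assumption, so use the actual path on your nodes:
+
+``` pre
+/usr/share/java/mysql-connector*.jar
+```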
+
+
+## <a id="jdbcplugin_queryextdata"></a>Querying External SQL Data
+The PXF JDBC plug-in supports a single profile named `Jdbc`.
+
+Use the following syntax to create a HAWQ external table representing external SQL database tables you access via JDBC:
+
+``` sql
+CREATE [READABLE | WRITABLE] EXTERNAL TABLE <table_name>
+    ( <column_name> <data_type> [, ...] | LIKE <other_table> )
+LOCATION ('pxf://<host>[:<port>]/<jdbc-schema-name>.<jdbc-database-name>.<jdbc-table-name>
+    ?PROFILE=Jdbc[&<custom-option>=<value>[...]]')
+FORMAT 'CUSTOM' (FORMATTER='pxfwritable_import');
+```
+
+JDBC-plug-in-specific keywords and values used in the [CREATE EXTERNAL TABLE](../reference/sql/CREATE-EXTERNAL-TABLE.html) call are described in the table below.
+
+| Keyword  | Value |
+|-------|-------------------------------------|
+| \<column-name\>    | The name of the PXF external table column. The PXF \<column-name\> must exactly match the \<column-name\> used in the external SQL table.|
+| \<data-type\>    | The data type of the PXF external table column. The PXF \<data-type\> must be equivalent to the data type used for \<column-name\> in the SQL table.|
+| \<host\>    | The PXF host. While \<host\> may identify any PXF agent node, use the HDFS NameNode as it is guaranteed to be available in a running HDFS cluster. If HDFS High Availability is enabled, \<host\> must identify the HDFS NameService. |
+| \<port\>    | The PXF port. If \<port\> is omitted, PXF assumes \<host\> identifies a High Availability HDFS NameService and connects to the port number designated by the `pxf_service_port` server configuration parameter value. The default is 51200. |
+| \<jdbc-schema-name\>    | The schema name. The default schema name is `default`. |
+| \<jdbc-database-name\>    | The database name. The default database name is determined by the external SQL server. |
+| \<jdbc-table-name\>    | The table name. |
+| PROFILE    | The `PROFILE` keyword must specify `Jdbc`. |
+| \<custom-option\>  | The custom options supported by the `Jdbc` profile are discussed later in this section.|
+| FORMAT 'CUSTOM' | The JDBC `CUSTOM` `FORMAT` supports only the built-in `'pxfwritable_import'` `FORMATTER` property. |
+
+**Note**: When creating PXF external tables, you cannot use the `HEADER` option in your `FORMAT` specification.
+
+
+### <a id="profile_jdbcoptions"></a>JDBC Custom Options
+
+You include custom options in the `LOCATION` URI. Preface each option with an ampersand `&`.
+
+The `Jdbc` profile supports the following \<custom-option\>s. The JDBC connection setup options are required.
+
+| Option Name   | Description |
+|---------------|--------------------|
+| JDBC_DRIVER | The JDBC driver class name. (Required) |
+| DB_URL | The URL to the database; includes the hostname, port, and database name. (Required) |
+| USER | The database user name. (Required) |
+| PASS | The database password for USER. (Required) |
+| PARTITION_BY | The partition column, \<column-name\>:\<column-type\>. The JDBC plug-in supports `date`, `int`, and `enum` \<column-type\>s. Use the `yyyy-MM-dd` format for the `date` \<column-type\>. If `PARTITION_BY` is omitted, the plug-in reads the data in a single fragment. |
+| RANGE | (Used only when `PARTITION_BY` is specified.) The query range, \<start-value\>[:\<end-value\>]. \<end-value\> may be empty for an `int` \<column-type\>. The `RANGE` is left closed, right open. That is, the range includes the \<start-value\> but does *not* include the \<end-value\>. |
+| INTERVAL | (Used only when `PARTITION_BY` is specified.) The interval, \<interval-num\>[:\<interval-unit\>], of one fragment. `INTERVAL` may be empty for an `enum` \<column-type\>. \<interval-unit\> may be empty for an `int` \<column-type\>. |
+
+Example JDBC \<custom-option\> connection string:
+
+``` pre
+&JDBC_DRIVER=com.mysql.jdbc.Driver&DB_URL=jdbc:mysql://<dbhost>:<dbport>/testdb&USER=user1&PASS=changeme
+```
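+
+The `DB_URL` format is specific to the JDBC driver in use. For example, assuming the standard PostgreSQL and Hive JDBC driver classes, connection substrings for those databases might look like the following:
+
+``` pre
+&JDBC_DRIVER=org.postgresql.Driver&DB_URL=jdbc:postgresql://<dbhost>:<dbport>/testdb&USER=user1&PASS=changeme
+&JDBC_DRIVER=org.apache.hive.jdbc.HiveDriver&DB_URL=jdbc:hive2://<dbhost>:<dbport>/default&USER=user1&PASS=changeme
+```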
+
+Example JDBC \<custom-option\> substrings identifying partitioning parameters:
+
+``` pre
+&PARTITION_BY=year:int&RANGE=2011:2013&INTERVAL=1
+&PARTITION_BY=createdate:date&RANGE=2013-01-01:2016-01-01&INTERVAL=1:month
+&PARTITION_BY=color:enum&RANGE=red:yellow:blue
+```
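+
+Given the left closed, right open `RANGE` semantics described above, the first substring splits the query into two fragments: one covering the year 2011 and one covering 2012.
+
+A complete `LOCATION` clause combining connection and partitioning options might look like the following sketch; the host, database, and credential values are placeholders:
+
+``` pre
+LOCATION ('pxf://<host>:51200/testdb.table1?PROFILE=Jdbc&JDBC_DRIVER=com.mysql.jdbc.Driver&DB_URL=jdbc:mysql://<dbhost>:<dbport>/testdb&USER=user1&PASS=changeme&PARTITION_BY=year:int&RANGE=2011:2013&INTERVAL=1')
+```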
+
+### <a id="jdbc_example_mysql"></a>Example: Using the Jdbc Profile to Access a MySQL Database Table
+
+
+#### <a id="ex_create_mysqltbl"></a>Create a MySQL Table
+
+Perform the following steps to create a MySQL table named `mysql_table1` in the default schema of a database named `mtestdb`:
+
+1. Connect to the MySQL server as the root user:
+
+    ``` shell
+    $ mysql --user=root -p
+    ```
+
+2. Create a database named `mtestdb`:
+
+    ``` sql
+    mysql> CREATE DATABASE mtestdb;
+    ```
+
+2. Assign all privileges on `mtestdb` to a user named `muser1`, and then exit:
+
+    ``` sql
+    mysql> GRANT ALL ON mtestdb.* TO 'muser1'@'localhost' IDENTIFIED BY 'muser1';
+    mysql> \q
+    ```
+ 
+2. Connect to `mtestdb` as user `muser1`:
+
+    ``` shell
+    $ mysql --user=muser1 mtestdb -pmuser1
+    ```
+
+2. Create a table named `mysql_table1` and insert some data into this table:
+
+    ``` sql
+    mysql> CREATE TABLE mysql_table1(id int);
+    mysql> INSERT INTO mysql_table1 VALUES (1);
+    mysql> INSERT INTO mysql_table1 VALUES (2);
+    mysql> INSERT INTO mysql_table1 VALUES (3);
+    ```
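+
+    Optionally, verify the inserted rows:
+
+    ``` sql
+    mysql> SELECT * FROM mysql_table1;
+    ```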
+
+3. Determine the MySQL server hostname and port:
+
+    ``` sql
+    mysql> SHOW VARIABLES LIKE 'hostname';
+    +---------------+-----------------+
+    | Variable_name | Value           |
+    +---------------+-----------------+
+    | hostname      | mydb.server.com |
+    +---------------+-----------------+
+
+    mysql> SHOW VARIABLES LIKE 'port';
+    +---------------+-------+
+    | Variable_name | Value |
+    +---------------+-------+
+    | port          | 3306  |
+    +---------------+-------+
+    1 row in set (0.00 sec)
+    ```
+
+    Make note of the hostname and port values returned.
+
+4. Construct the JDBC connection string, substituting your MySQL server hostname and port number:
+
+    ``` pre
+    &JDBC_DRIVER=com.mysql.jdbc.Driver&DB_URL=jdbc:mysql://mydb.server.com:3306/mtestdb&USER=muser1&PASS=muser1
+    ```
+
+    Save this string for use later.
+
+#### <a id="ex_jdbconfig"></a>Configure PXF
+
+If the MySQL connector JAR file is not already present on your system, download it and copy it to your system, then update `pxf-public.classpath` with the location of this file.
+
+For example, if you manage your HAWQ cluster with Ambari and the MySQL connector JAR file is located in the `/usr/share/java` directory, add the following line:
+
+``` pre
+/usr/share/java/mysql-connector*.jar
+```
+
+in the Ambari **PXF** service **Configs > Advanced pxf-public.classpath** pane, and then use the **Restart** button to restart PXF.
+
+If you manage your HAWQ cluster from the command line, directly edit the `/etc/pxf/conf/pxf-public.classpath` file, adding the MySQL connector JAR file path. Then restart PXF on *each* HAWQ node with the following command:
+
+``` shell
+root@hawq-node$ sudo service pxf-service restart
+```
+
+#### <a id="ex_usejdbc"></a>Query Using the Jdbc Profile
+
+Perform the following steps to create and query an external PXF table to access the `mysql_table1` table you created in the previous section:
+
+1. Use the `Jdbc` profile to create an external table to access the MySQL `mysql_table1` table. For example:
+
+    ``` sql
+    gpadmin=# CREATE EXTERNAL TABLE pxf_jdbc_mysql_table1(id int)
+                LOCATION ('pxf://<host>:51200/mtestdb.mysql_table1?PROFILE=Jdbc&JDBC_DRIVER=com.mysql.jdbc.Driver&DB_URL=jdbc:mysql://<dbserver-host>:<port>/mtestdb&USER=muser1&PASS=muser1')
+                FORMAT 'CUSTOM' (FORMATTER='pxfwritable_import');
+    ```
+
+    Substitute your PXF \<host\>, as well as the connection string values you constructed in the previous exercise.
+
+2. Display all rows of the `pxf_jdbc_mysql_table1` table:
+
+    ``` sql
+    gpadmin=# SELECT * FROM pxf_jdbc_mysql_table1;
+     id
+    ----
+      1
+      2
+      3
+    (3 rows)
+    ```
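+
+3. Query the external table as you would any other HAWQ table. For example, the following illustrative query applies a filter to the external data:
+
+    ``` sql
+    gpadmin=# SELECT * FROM pxf_jdbc_mysql_table1 WHERE id > 1;
+    ```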
+
