Repository: incubator-hawq-docs
Updated Branches:
  refs/heads/develop dcb5cadfc -> 5714ce5b3


HAWQ-1376 - clarify pxf host and port description (closes #99)


Project: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/commit/5714ce5b
Tree: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/tree/5714ce5b
Diff: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/diff/5714ce5b

Branch: refs/heads/develop
Commit: 5714ce5b3efb61387e6479907ada58f5aa8f34aa
Parents: dcb5cad
Author: Lisa Owen <[email protected]>
Authored: Thu Mar 9 18:15:45 2017 -0800
Committer: David Yozie <[email protected]>
Committed: Thu Mar 9 18:15:45 2017 -0800

----------------------------------------------------------------------
 .../HAWQFilespacesandHighAvailabilityEnabledHDFS.html.md.erb    | 4 ++++
 markdown/pxf/HBasePXF.html.md.erb                               | 2 +-
 markdown/pxf/HDFSFileDataPXF.html.md.erb                        | 3 ++-
 markdown/pxf/HDFSWritablePXF.html.md.erb                        | 3 ++-
 markdown/pxf/HivePXF.html.md.erb                                | 3 ++-
 markdown/pxf/JsonPXF.html.md.erb                                | 5 +++--
 markdown/pxf/PXFExternalTableandAPIReference.html.md.erb        | 4 ++--
 markdown/pxf/TroubleshootingPXF.html.md.erb                     | 2 +-
 markdown/reference/sql/CREATE-EXTERNAL-TABLE.html.md.erb        | 4 ++--
 9 files changed, 19 insertions(+), 11 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/5714ce5b/markdown/admin/HAWQFilespacesandHighAvailabilityEnabledHDFS.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/admin/HAWQFilespacesandHighAvailabilityEnabledHDFS.html.md.erb b/markdown/admin/HAWQFilespacesandHighAvailabilityEnabledHDFS.html.md.erb
index 6923494..20892f6 100644
--- a/markdown/admin/HAWQFilespacesandHighAvailabilityEnabledHDFS.html.md.erb
+++ b/markdown/admin/HAWQFilespacesandHighAvailabilityEnabledHDFS.html.md.erb
@@ -240,3 +240,7 @@ For command-line administrators:
        $ hawq init standby -n -M fast
 
        ```
+
+## <a id="pxfnhdfsnamenode"></a>Using PXF with HDFS NameNode HA
+
+If HDFS NameNode High Availability is enabled, use the HDFS Nameservice ID in the `LOCATION` clause \<host\> field when invoking any PXF `CREATE EXTERNAL TABLE` command. If the \<port\> is omitted from the `LOCATION` URI, PXF connects to the port number designated by the `pxf_service_port` server configuration parameter value (default is 51200).
\ No newline at end of file
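
For illustration, a PXF `CREATE EXTERNAL TABLE` invocation under HDFS NameNode HA might look like the following sketch; the Nameservice ID `mycluster`, the table name, and the HDFS path are hypothetical:

``` sql
-- "mycluster" stands in for the dfs.nameservices ID configured in hdfs-site.xml.
-- No port appears in the URI, so PXF uses the pxf_service_port value (51200 by default).
CREATE EXTERNAL TABLE sales_ha (location text, month text, num_orders int)
LOCATION ('pxf://mycluster/data/pxf_examples/sales.csv?PROFILE=HdfsTextSimple')
FORMAT 'TEXT' (DELIMITER ',');
```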

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/5714ce5b/markdown/pxf/HBasePXF.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/pxf/HBasePXF.html.md.erb b/markdown/pxf/HBasePXF.html.md.erb
index 3be06d2..ddb86d5 100644
--- a/markdown/pxf/HBasePXF.html.md.erb
+++ b/markdown/pxf/HBasePXF.html.md.erb
@@ -43,7 +43,7 @@ To create an external HBase table, use the following syntax:
 ``` sql
 CREATE [READABLE|WRITABLE] EXTERNAL TABLE table_name 
     ( column_name data_type [, ...] | LIKE other_table )
-LOCATION ('pxf://namenode[:port]/hbase-table-name?Profile=HBase')
+LOCATION ('pxf://host[:port]/hbase-table-name?Profile=HBase')
 FORMAT 'CUSTOM' (Formatter='pxfwritable_import');
 ```
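
Filling in the syntax above with hypothetical values (a host named `hdfsnamenode` and an HBase table named `orders`) gives a sketch like:

``` sql
-- recordkey exposes the HBase row key; "cf1:total" maps a column-family:qualifier pair.
CREATE EXTERNAL TABLE hbase_orders (recordkey TEXT, "cf1:total" FLOAT8)
LOCATION ('pxf://hdfsnamenode:51200/orders?Profile=HBase')
FORMAT 'CUSTOM' (Formatter='pxfwritable_import');
```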
 

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/5714ce5b/markdown/pxf/HDFSFileDataPXF.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/pxf/HDFSFileDataPXF.html.md.erb b/markdown/pxf/HDFSFileDataPXF.html.md.erb
index 6780650..47b964f 100644
--- a/markdown/pxf/HDFSFileDataPXF.html.md.erb
+++ b/markdown/pxf/HDFSFileDataPXF.html.md.erb
@@ -100,7 +100,8 @@ HDFS-plug-in-specific keywords and values used in the [CREATE EXTERNAL TABLE](..
 
 | Keyword  | Value |
 |-------|-------------------------------------|
-| \<host\>[:\<port\>]    | The HDFS NameNode and port. |
+| \<host\>    | The PXF host. While \<host\> may identify any PXF agent node, use the HDFS NameNode as it is guaranteed to be available in a running HDFS cluster. If HDFS High Availability is enabled, \<host\> must identify the HDFS Nameservice. |
+| \<port\>    | The PXF port. If \<port\> is omitted, PXF assumes \<host\> identifies a High Availability HDFS Nameservice and connects to the port number designated by the `pxf_service_port` server configuration parameter value. Default is 51200. |
 | \<path-to-hdfs-file\>    | The path to the file in the HDFS data store. |
 | PROFILE    | The `PROFILE` keyword must specify one of the values `HdfsTextSimple`, `HdfsTextMulti`, or `Avro`. |
 | \<custom-option\>  | \<custom-option\> is profile-specific. Profile-specific options are discussed in the relevant profile topic later in this section.|
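
A minimal readable HDFS table built from these keywords might look like this sketch (host, path, and column names are hypothetical):

``` sql
-- Comma-delimited file on HDFS, read through the HdfsTextSimple profile.
CREATE EXTERNAL TABLE pxf_hdfs_textsimple (location text, month text, num_orders int)
LOCATION ('pxf://hdfsnamenode:51200/data/pxf_examples/orders.csv?PROFILE=HdfsTextSimple')
FORMAT 'TEXT' (DELIMITER ',');
```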

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/5714ce5b/markdown/pxf/HDFSWritablePXF.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/pxf/HDFSWritablePXF.html.md.erb b/markdown/pxf/HDFSWritablePXF.html.md.erb
index 021b6b9..0c498a2 100644
--- a/markdown/pxf/HDFSWritablePXF.html.md.erb
+++ b/markdown/pxf/HDFSWritablePXF.html.md.erb
@@ -54,7 +54,8 @@ HDFS-plug-in-specific keywords and values used in the [CREATE EXTERNAL TABLE](..
 
 | Keyword  | Value |
 |-------|-------------------------------------|
-| \<host\>[:\<port\>]    | The HDFS NameNode and port. |
+| \<host\>    | The PXF host. While \<host\> may identify any PXF agent node, use the HDFS NameNode as it is guaranteed to be available in a running HDFS cluster. If HDFS High Availability is enabled, \<host\> must identify the HDFS Nameservice. |
+| \<port\>    | The PXF port. If \<port\> is omitted, PXF assumes \<host\> identifies a High Availability HDFS Nameservice and connects to the port number designated by the `pxf_service_port` server configuration parameter value. Default is 51200. |
 | \<path-to-hdfs-file\>    | The path to the file in the HDFS data store. |
 | PROFILE    | The `PROFILE` keyword must specify one of the values `HdfsTextSimple` or `SequenceWritable`. |
 | \<custom-option\>  | \<custom-option\> is profile-specific. These options are discussed in the next topic.|
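
The writable counterpart might be sketched as follows (host, directory, and column names are hypothetical):

``` sql
-- Rows INSERTed into this table are written to the HDFS directory named in the URI.
CREATE WRITABLE EXTERNAL TABLE pxf_hdfs_writable (location text, month text, num_orders int)
LOCATION ('pxf://hdfsnamenode:51200/data/pxf_examples/output?PROFILE=HdfsTextSimple')
FORMAT 'TEXT' (DELIMITER ',');
```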

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/5714ce5b/markdown/pxf/HivePXF.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/pxf/HivePXF.html.md.erb b/markdown/pxf/HivePXF.html.md.erb
index 6101016..bc4e9f6 100644
--- a/markdown/pxf/HivePXF.html.md.erb
+++ b/markdown/pxf/HivePXF.html.md.erb
@@ -332,7 +332,8 @@ Hive-plug-in-specific keywords and values used in the [CREATE EXTERNAL TABLE](..
 
 | Keyword  | Value |
 |-------|-------------------------------------|
-| \<host\>[:<port\>]    | The HDFS NameNode and port. |
+| \<host\>    | The PXF host. While \<host\> may identify any PXF agent node, use the HDFS NameNode as it is guaranteed to be available in a running HDFS cluster. If HDFS High Availability is enabled, \<host\> must identify the HDFS Nameservice. |
+| \<port\>    | The PXF port. If \<port\> is omitted, PXF assumes \<host\> identifies a High Availability HDFS Nameservice and connects to the port number designated by the `pxf_service_port` server configuration parameter value. Default is 51200. |
 | \<hive-db-name\>    | The name of the Hive database. If omitted, defaults to the Hive database named `default`. |
 | \<hive-table-name\>    | The name of the Hive table. |
 | PROFILE    | The `PROFILE` keyword must specify one of the values `Hive`, `HiveText`, or `HiveRC`. |
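
Putting those keywords together, a Hive-backed table might be declared like this sketch (host, database, and table names are hypothetical):

``` sql
-- Path is <hive-db-name>.<hive-table-name>; omitting the database part defaults to "default".
CREATE EXTERNAL TABLE hive_sales (name text, qty int)
LOCATION ('pxf://hdfsnamenode:51200/default.sales_info?PROFILE=Hive')
FORMAT 'CUSTOM' (Formatter='pxfwritable_import');
```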

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/5714ce5b/markdown/pxf/JsonPXF.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/pxf/JsonPXF.html.md.erb b/markdown/pxf/JsonPXF.html.md.erb
index 5f156c4..6aeea7e 100644
--- a/markdown/pxf/JsonPXF.html.md.erb
+++ b/markdown/pxf/JsonPXF.html.md.erb
@@ -169,7 +169,8 @@ JSON-plug-in-specific keywords and values used in the `CREATE EXTERNAL TABLE` ca
 
 | Keyword  | Value |
 |-------|-------------------------------------|
-| \<host\>    | Specify the HDFS NameNode in the \<host\> field. |
+| \<host\>    | The PXF host. While \<host\> may identify any PXF agent node, use the HDFS NameNode as it is guaranteed to be available in a running HDFS cluster. If HDFS High Availability is enabled, \<host\> must identify the HDFS Nameservice. |
+| \<port\>    | The PXF port. If \<port\> is omitted, PXF assumes \<host\> identifies a High Availability HDFS Nameservice and connects to the port number designated by the `pxf_service_port` server configuration parameter value. Default is 51200. |
 | PROFILE    | The `PROFILE` keyword must specify the value `Json`. |
 | IDENTIFIER  | Include the `IDENTIFIER` keyword and \<value\> in the `LOCATION` string only when accessing a JSON file with multi-line records. \<value\> should identify the member name used to determine the encapsulating JSON object to return. (If the JSON file is the multi-line record Example 2 above, `&IDENTIFIER=created_at` would be specified.) |
 | FORMAT    | The `FORMAT` clause must specify `CUSTOM`. |
@@ -213,4 +214,4 @@ To query this external table populated with JSON data:
 
 ``` sql
 SELECT * FROM sample_json_multiline_tbl;
-```
\ No newline at end of file
+```
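
Tying the table entries together, a multi-line-record JSON table might be declared like this sketch (host, path, and field names are hypothetical):

``` sql
-- IDENTIFIER names the member that delimits each record; dotted names reach nested fields.
CREATE EXTERNAL TABLE sample_json_multiline_tbl (created_at TEXT, id_str TEXT, "user.id" INTEGER)
LOCATION ('pxf://hdfsnamenode:51200/data/tweets.json?PROFILE=Json&IDENTIFIER=created_at')
FORMAT 'CUSTOM' (Formatter='pxfwritable_import');
```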

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/5714ce5b/markdown/pxf/PXFExternalTableandAPIReference.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/pxf/PXFExternalTableandAPIReference.html.md.erb b/markdown/pxf/PXFExternalTableandAPIReference.html.md.erb
index 8a29d1d..3681079 100644
--- a/markdown/pxf/PXFExternalTableandAPIReference.html.md.erb
+++ b/markdown/pxf/PXFExternalTableandAPIReference.html.md.erb
@@ -53,8 +53,8 @@ FORMAT 'custom' (formatter='pxfwritable_import|pxfwritable_export');
 
 | Parameter               | Value and description |
 |-------------------------|-----------------------|
-| host                    | The HDFS NameNode. |
-| port                    | Connection port for the PXF service. If the port is omitted, PXF assumes that High Availability (HA) is enabled and connects to the HA name service port, 51200, by default. The HA name service port can be changed by setting the `pxf_service_port` configuration parameter. |
+| \<host\>    | The PXF host. While \<host\> may identify any PXF agent node, use the HDFS NameNode as it is guaranteed to be available in a running HDFS cluster. If HDFS High Availability is enabled, \<host\> must identify the HDFS Nameservice. |
+| \<port\>    | The PXF port. If \<port\> is omitted, PXF assumes \<host\> identifies a High Availability HDFS Nameservice and connects to the port number designated by the `pxf_service_port` server configuration parameter value. Default is 51200. |
 | \<path\-to\-data\>        | A directory, file name, wildcard pattern, table name, etc. |
 | PROFILE              | The profile PXF uses to access the data. PXF supports multiple plug-ins that currently expose profiles named `HBase`, `Hive`, `HiveRC`, `HiveText`, `HiveORC`, `HdfsTextSimple`, `HdfsTextMulti`, `Avro`, `SequenceWritable`, and `Json`. |
 | FRAGMENTER              | The Java class the plug-in uses for fragmenting data. Used for READABLE external tables only. |
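
When no profile fits, the individual plug-in classes can be named directly, roughly as in this sketch (the host, path, and table are hypothetical; the class names are those of the built-in HDFS plug-in, so verify them against your PXF version):

``` sql
-- FRAGMENTER/ACCESSOR/RESOLVER spell out what a PROFILE would otherwise bundle.
CREATE EXTERNAL TABLE pxf_parameterized (a text, b int)
LOCATION ('pxf://hdfsnamenode:51200/data/file.txt?FRAGMENTER=org.apache.hawq.pxf.plugins.hdfs.HdfsDataFragmenter&ACCESSOR=org.apache.hawq.pxf.plugins.hdfs.LineBreakAccessor&RESOLVER=org.apache.hawq.pxf.plugins.hdfs.StringPassResolver')
FORMAT 'TEXT' (DELIMITER ',');
```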

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/5714ce5b/markdown/pxf/TroubleshootingPXF.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/pxf/TroubleshootingPXF.html.md.erb b/markdown/pxf/TroubleshootingPXF.html.md.erb
index 57fe9d5..cf1ef13 100644
--- a/markdown/pxf/TroubleshootingPXF.html.md.erb
+++ b/markdown/pxf/TroubleshootingPXF.html.md.erb
@@ -81,7 +81,7 @@ The following table lists some common errors encountered while using PXF:
 </tr>
 <tr class="odd">
 <td>ERROR: fail to get filesystem credential for uri hdfs://&lt;namenode&gt;:8020/</td>
-<td>Secure PXF: Wrong HDFS host or port is not 8020 (this is a limitation that will be removed in the next release)</td>
+<td>Secure PXF: Wrong HDFS host or port is not 8020</td>
 </tr>
 <tr class="even">
 <td>ERROR: remote component error (413) from '&lt;x&gt;': HTTP status code is 413 but HTTP response string is empty</td>

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/5714ce5b/markdown/reference/sql/CREATE-EXTERNAL-TABLE.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/reference/sql/CREATE-EXTERNAL-TABLE.html.md.erb b/markdown/reference/sql/CREATE-EXTERNAL-TABLE.html.md.erb
index c46870c..c458cae 100644
--- a/markdown/reference/sql/CREATE-EXTERNAL-TABLE.html.md.erb
+++ b/markdown/reference/sql/CREATE-EXTERNAL-TABLE.html.md.erb
@@ -165,7 +165,7 @@ The `FORMAT` clause is used to describe how external table files are formatted.
 <dd>The data type of the column.</dd>
 
 <dt>LOCATION ('\<protocol\>://\<host\>\[:\<port\>\]/\<path\>/\<file\>' \[, ...\])   </dt>
-<dd>For readable external tables, specifies the URI of the external data source(s) to be used to populate the external table or web table. Regular readable external tables allow the `file`, `gpfdist`, and `pxf` protocols. Web external tables allow the `http` protocol. If \<port\> is omitted, the `http` and `gpfdist` protocols assume port `8080` and the `pxf` protocol assumes the \<host\> is a high availability nameservice string. If using the `gpfdist` protocol, the \<path\> is relative to the directory from which `gpfdist` is serving files (the directory specified when you started the `gpfdist` program). Also, the \<path\> can use wildcards (or other C-style pattern matching) in the \<file\> name part of the location to denote multiple files in a directory. For example:
+<dd>For readable external tables, specifies the URI of the external data source(s) to be used to populate the external table or web table. Regular readable external tables allow the `file`, `gpfdist`, and `pxf` protocols. Web external tables allow the `http` protocol. If \<port\> is omitted, the `http` and `gpfdist` protocols assume port `8080` and the `pxf` protocol assumes the \<host\> specifies a high availability Nameservice ID. If using the `gpfdist` protocol, the \<path\> is relative to the directory from which `gpfdist` is serving files (the directory specified when you started the `gpfdist` program). Also, the \<path\> can use wildcards (or other C-style pattern matching) in the \<file\> name part of the location to denote multiple files in a directory. For example:
 
 ``` pre
 'gpfdist://filehost:8081/*'
@@ -183,7 +183,7 @@ For writable external tables, specifies the URI location of the `gpfdist` proces
 
 With two `gpfdist` locations listed as in the above example, half of the segments would send their output data to the `data1.out` file and the other half to the `data2.out` file.
 
-For the `pxf` protocol, the `LOCATION` string specifies the \<host\> and \<port\> of the PXF service, the location of the data, and the PXF plug-ins (Java classes) used to convert the data between storage format and HAWQ format. If the \<port\> is omitted, the \<host\> is taken to be the logical name for the high availability name service and the \<port\> is the value of the `pxf_service_port` configuration variable, 51200 by default. The URL parameters `FRAGMENTER`, `ACCESSOR`, and `RESOLVER` are the names of PXF plug-ins (Java classes) that convert between the external data format and HAWQ data format. The `FRAGMENTER` parameter is only used with readable external tables. PXF allows combinations of these parameters to be configured as profiles so that a single `PROFILE` parameter can be specified to access external data, for example `?PROFILE=Hive`. Additional \<custom-options\> can be added to the LOCATION URI to further describe the external data format or storage options. For details about the plug-ins and profiles provided with PXF and information about creating custom plug-ins for other data sources see [Using PXF with Unmanaged Data](../../pxf/HawqExtensionFrameworkPXF.html).</dd>
+For the `pxf` protocol, the `LOCATION` string specifies the HDFS NameNode \<host\> and the \<port\> of the PXF service, the location of the data, and the PXF profile or Java classes used to convert the data between storage format and HAWQ format. If the \<port\> is omitted, the \<host\> is taken to be the logical name for the high availability Nameservice, and the \<port\> is the value of the `pxf_service_port` configuration parameter, 51200 by default. The URL parameters `FRAGMENTER`, `ACCESSOR`, and `RESOLVER` are the names of PXF plug-ins (Java classes) that convert between the external data format and HAWQ data format. The `FRAGMENTER` parameter is only used with readable external tables. PXF allows combinations of these parameters to be configured as profiles so that a single `PROFILE` parameter can be specified to access external data, for example `?PROFILE=Hive`. Additional \<custom-options\> can be added to the LOCATION URI to further describe the external data format or storage options. For details about the plug-ins and profiles provided with PXF and information about creating custom plug-ins for other data sources see [Using PXF with Unmanaged Data](../../pxf/HawqExtensionFrameworkPXF.html).</dd>
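
The two addressing modes described above can be contrasted in a short sketch (host, Nameservice ID, and paths are hypothetical):

``` sql
-- Explicit host and PXF port.
CREATE EXTERNAL TABLE ext_explicit (id int)
LOCATION ('pxf://namenode.example.com:51200/data/file1.txt?PROFILE=HdfsTextSimple')
FORMAT 'TEXT' (DELIMITER ',');

-- Port omitted: "mycluster" is interpreted as an HA Nameservice ID and
-- PXF connects on the pxf_service_port value.
CREATE EXTERNAL TABLE ext_ha (id int)
LOCATION ('pxf://mycluster/data/file1.txt?PROFILE=HdfsTextSimple')
FORMAT 'TEXT' (DELIMITER ',');
```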
 
 <dt>EXECUTE '\<command\>' ON ...  </dt>
 <dd>Allowed for readable web external tables or writable external tables only. For readable web external tables, specifies the OS command to be executed by the segment instances. The \<command\> can be a single OS command or a script. If \<command\> executes a script, that script must reside in the same location on all of the segment hosts and be executable by the HAWQ superuser (`gpadmin`).
