Github user dyozie commented on a diff in the pull request:
https://github.com/apache/incubator-hawq-docs/pull/94#discussion_r99383249
--- Diff: markdown/pxf/PXFExternalTableandAPIReference.html.md.erb ---
@@ -27,48 +27,66 @@ The PXF Java API lets you extend PXF functionality and add new services and formats
The Fragmenter produces a list of data fragments that can be read in parallel from the data source. The Accessor produces a list of records from a single fragment, and the Resolver both deserializes and serializes records.
-Together, the Fragmenter, Accessor, and Resolver classes implement a connector. PXF includes plug-ins for tables in HDFS, HBase, and Hive.
+Together, the Fragmenter, Accessor, and Resolver classes implement a connector. PXF includes plug-ins for HDFS and JSON files and tables in HBase and Hive.
## <a id="creatinganexternaltable"></a>Creating an External Table
-The syntax for a readable `EXTERNAL TABLE` that uses the PXF protocol is as follows:
+The syntax for an `EXTERNAL TABLE` that uses the PXF protocol is as follows:
``` sql
-CREATE [READABLE|WRITABLE] EXTERNAL TABLE table_name
-    ( column_name data_type [, ...] | LIKE other_table )
-LOCATION('pxf://host[:port]/path-to-data<pxf parameters>[&custom-option=value...]')
+CREATE [READABLE|WRITABLE] EXTERNAL TABLE <table_name>
+    ( <column_name> <data_type> [, ...] | LIKE <other_table> )
+LOCATION('pxf://<host>[:<port>]/<path-to-data>?<pxf-parameters>[&<custom-option>=<value>[...]]')
FORMAT 'custom' (formatter='pxfwritable_import|pxfwritable_export');
```
-where *<pxf parameters>* is:
+where \<pxf\-parameters\> is:
``` pre
-?FRAGMENTER=fragmenter_class&ACCESSOR=accessor_class&RESOLVER=resolver_class]
- | ?PROFILE=profile-name
+ [FRAGMENTER=<fragmenter_class>&ACCESSOR=<accessor_class>&RESOLVER=<resolver_class>] | PROFILE=<profile-name>
```
+
<caption><span class="tablecap">Table 1. Parameter values and description</span></caption>
<a id="creatinganexternaltable__table_pfy_htz_4p"></a>
| Parameter | Value and description |
|-------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| host | The current host of the PXF service. |
-| port | Connection port for the PXF service. If the port is omitted, PXF assumes that High Availability (HA) is enabled and connects to the HA name service port, 51200 by default. The HA name service port can be changed by setting the `pxf_service_port` configuration parameter. |
-| *path\_to\_data* | A directory, file name, wildcard pattern, table name, etc. |
-| FRAGMENTER | The plug-in (Java class) to use for fragmenting data. Used for READABLE external tables only. |
-| ACCESSOR | The plug-in (Java class) to use for accessing the data. Used for READABLE and WRITABLE tables. |
-| RESOLVER | The plug-in (Java class) to use for serializing and deserializing the data. Used for READABLE and WRITABLE tables. |
-| *custom-option*=*value* | Additional values to pass to the plug-in class. The parameters are passed at runtime to the plug-ins indicated above. The plug-ins can look up custom options with `org.apache.hawq.pxf.api.utilities.InputData`. |
+| host | The HDFS NameNode. |
+| port | Connection port for the PXF service. If the port is omitted, PXF assumes that High Availability (HA) is enabled and connects to the HA name service port, 51200 by default. The HA name service port can be changed by setting the `pxf_service_port` configuration parameter. |
+| \<path\-to\-data\> | A directory, file name, wildcard pattern, table name, etc. |
+| PROFILE | The profile PXF should use to access the data. PXF supports multiple plug-ins that currently expose profiles named `HBase`, `Hive`, `HiveRC`, `HiveText`, `HiveORC`, `HdfsTextSimple`, `HdfsTextMulti`, `Avro`, `SequenceWritable`, and `Json`. |
+| FRAGMENTER | The Java class the plug-in uses for fragmenting data. Used for READABLE external tables only. |
+| ACCESSOR | The Java class the plug-in uses for accessing the data. Used for READABLE and WRITABLE tables. |
+| RESOLVER | The Java class the plug-in uses for serializing and deserializing the data. Used for READABLE and WRITABLE tables. |
+| \<custom-option\> | Additional values to pass to the plug-in at runtime. A plug-in can parse custom options with the PXF helper class `org.apache.hawq.pxf.api.utilities.InputData`. |
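
A short sketch may help tie the parameters together. The statements below follow the syntax shown above; the table names, columns, NameNode host, and HDFS path are hypothetical, and the plug-in class names in the second form are assumptions for illustration rather than values taken from this page:

``` sql
-- Hypothetical readable table using a built-in profile; host "namenode",
-- path "/data/sales", and the column list are illustrative only.
CREATE EXTERNAL TABLE ext_sales (id int, amount float8)
LOCATION ('pxf://namenode:51200/data/sales?PROFILE=HdfsTextSimple')
FORMAT 'custom' (formatter='pxfwritable_import');

-- The same table spelled out with explicit FRAGMENTER, ACCESSOR, and
-- RESOLVER classes instead of a profile. The class names below are an
-- assumption for illustration.
CREATE EXTERNAL TABLE ext_sales_classes (id int, amount float8)
LOCATION ('pxf://namenode:51200/data/sales?FRAGMENTER=org.apache.hawq.pxf.plugins.hdfs.HdfsDataFragmenter&ACCESSOR=org.apache.hawq.pxf.plugins.hdfs.LineBreakAccessor&RESOLVER=org.apache.hawq.pxf.plugins.hdfs.StringPassResolver')
FORMAT 'custom' (formatter='pxfwritable_import');
```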
**Note:** When creating PXF external tables, you cannot use the `HEADER` option in your `FORMAT` specification.
-For more information about this example, see [About the Java Class Services and Formats](#aboutthejavaclassservicesandformats).
## <a id="aboutthejavaclassservicesandformats"></a>About the Java Class Services and Formats
-The `LOCATION` string in a PXF `CREATE EXTERNAL TABLE` statement is a URI that specifies the host and port of an external data source and the path to the data in the external data source. The query portion of the URI, introduced by the question mark (?), must include the required parameters `FRAGMENTER` (readable tables only), `ACCESSOR`, and `RESOLVER`, which specify Java class names that extend the base PXF API plug-in classes. Alternatively, the required parameters can be replaced with a `PROFILE` parameter with the name of a profile defined in the `/etc/conf/pxf-profiles.xml` that defines the required classes.
+The `LOCATION` string in a PXF `CREATE EXTERNAL TABLE` statement is a URI that specifies the host and port of an external data source and the path to the data in the external data source. The query portion of the URI, introduced by the question mark (?), must include the PXF profile name or the plug-in's `FRAGMENTER` (readable tables only), `ACCESSOR`, and `RESOLVER` class names.
+
+PXF profiles are defined in the `/etc/pxf/conf/pxf-profiles.xml` file. Profile definitions include plug-in class names. For example, the `HdfsTextSimple` profile definition follows:
--- End diff ---
Change "follows" to "is"
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---