[ https://issues.apache.org/jira/browse/HAWQ-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15623624#comment-15623624 ]
ASF GitHub Bot commented on HAWQ-1119:
--------------------------------------

Github user dyozie commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq-docs/pull/46#discussion_r85793887

    --- Diff: pxf/HDFSWritablePXF.html.md.erb ---
    @@ -0,0 +1,410 @@
    +---
    +title: Writing Data to HDFS
    +---
    +
    +The PXF HDFS plug-in supports writable external tables using the `HdfsTextSimple` and `SequenceWritable` profiles. You might create a writable table to export data from a HAWQ internal table to HDFS.
    +
    +This section describes how to use these PXF profiles to create writable external tables.
    +
    +**Note**: You cannot directly query data in a HAWQ writable table. After creating the external writable table, you must create a HAWQ readable external table accessing the HDFS file, then query that table. ??You can also create a Hive table to access the HDFS file.??
    +
    +## <a id="pxfwrite_prereq"></a>Prerequisites
    +
    +Before working with HDFS file data using HAWQ and PXF, ensure that:
    +
    +- The HDFS plug-in is installed on all cluster nodes. See [Installing PXF Plug-ins](InstallPXFPlugins.html) for PXF plug-in installation information.
    +- All HDFS users have read permissions to HDFS services and that write permissions have been restricted to specific users.
    +
    +## <a id="hdfsplugin_writeextdata"></a>Writing to PXF External Tables
    +The PXF HDFS plug-in supports writable two profiles: `HdfsTextSimple` and `SequenceWritable`.
    +
    +Use the following syntax to create a HAWQ external writable table representing HDFS data:
    +
    +``` sql
    +CREATE EXTERNAL WRITABLE TABLE <table_name>
    +    ( <column_name> <data_type> [, ...] | LIKE <other_table> )
    +LOCATION ('pxf://<host>[:<port>]/<path-to-hdfs-file>
    +    ?PROFILE=HdfsTextSimple|SequenceWritable[&<custom-option>=<value>[...]]')
    +FORMAT '[TEXT|CSV|CUSTOM]' (<formatting-properties>);
    +```
    +
    +HDFS-plug-in-specific keywords and values used in the [CREATE EXTERNAL TABLE](../reference/sql/CREATE-EXTERNAL-TABLE.html) call are described in the table below.
    +
    +| Keyword | Value |
    +|-------|-------------------------------------|
    +| \<host\>[:\<port\>] | The HDFS NameNode and port. |
    +| \<path-to-hdfs-file\> | The path to the file in the HDFS data store. |
    +| PROFILE | The `PROFILE` keyword must specify one of the values `HdfsTextSimple` or `SequenceWritable`. |
    +| \<custom-option\> | \<custom-option\> is profile-specific. These options are discussed in the next topic.|
    +| FORMAT 'TEXT' | Use '`TEXT`' `FORMAT` with the `HdfsTextSimple` profile when \<path-to-hdfs-file\> will reference a plain text delimited file. The `HdfsTextSimple` '`TEXT`' `FORMAT` supports only the built-in `(delimiter=<delim>)` \<formatting-property\>. |
    +| FORMAT 'CSV' | Use '`CSV`' `FORMAT` with `HdfsTextSimple` when \<path-to-hdfs-file\> will reference a comma-separated value file. |
    +| FORMAT 'CUSTOM' | Use the `'CUSTOM'` `FORMAT` with the `SequenceWritable` profile. The `SequenceWritable` '`CUSTOM`' `FORMAT` supports only the built-in `(formatter='pxfwritable_export)` (write) and `(formatter='pxfwritable_import)` (read) \<formatting-properties\>.
    +
    +**Note**: When creating PXF external tables, you cannot use the `HEADER` option in your `FORMAT` specification.
    +
    +## <a id="profile_hdfstextsimple"></a>Custom Options
    +
    +The `HdfsTextSimple` and `SequenceWritable` profiles support the following \<custom-options\>:
    +
    +| Keyword | Value Description |
    +|-------|-------------------------------------|
    +| COMPRESSION_CODEC | The compression codec Java class name. If this option is not provided, no data compression is performed. Supported compression codecs include: `org.apache.hadoop.io.compress.DefaultCodec`, `org.apache.hadoop.io.compress.BZip2Codec`, and `org.apache.hadoop.io.compress.GzipCodec` (`HdfsTextSimple` profile only) |
    --- End diff --

    Instead of including parentheticals here (`HdfsTextSimple` profile only), add a third column to indicate which profile(s) the option applies to.

> create new documentation topic for PXF writable profiles
> --------------------------------------------------------
>
>                 Key: HAWQ-1119
>                 URL: https://issues.apache.org/jira/browse/HAWQ-1119
>             Project: Apache HAWQ
>          Issue Type: Improvement
>          Components: Documentation
>            Reporter: Lisa Owen
>            Assignee: David Yozie
>             Fix For: 2.0.1.0-incubating
>
>
> certain profiles supported by the existing PXF plug-ins support writable
> tables. create some documentation content for these profiles.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)