GitHub user dyozie commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq-docs/pull/46#discussion_r85812589
  
    --- Diff: pxf/HDFSWritablePXF.html.md.erb ---
    @@ -0,0 +1,410 @@
    +---
    +title: Writing Data to HDFS
    +---
    +
    +The PXF HDFS plug-in supports writable external tables using the `HdfsTextSimple` and `SequenceWritable` profiles. You might create a writable table to export data from a HAWQ internal table to HDFS.
    +
    +This section describes how to use these PXF profiles to create writable external tables.
    +
    +**Note**: You cannot directly query data in a HAWQ writable table. After creating the writable external table, you must create a HAWQ readable external table that accesses the HDFS file, then query that table. ??You can also create a Hive table to access the HDFS file.??
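    +
    +For example, you might read back the exported file with a sketch like the following (the host `namenode`, port `51200`, path, and column definitions are placeholder assumptions, not values from this page; they must match your writable table and environment):
    +
    +``` sql
    +-- readable external table over the HDFS file written by a writable table;
    +-- host, port, path, and columns are assumptions for illustration
    +CREATE EXTERNAL TABLE pxf_hdfs_readback (location text, month text, num_orders int)
    +    LOCATION ('pxf://namenode:51200/data/pxf_examples/pxf_hdfs_ts?PROFILE=HdfsTextSimple')
    +    FORMAT 'TEXT' (delimiter=E',');
    +
    +SELECT * FROM pxf_hdfs_readback;
    +```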
    +
    +## <a id="pxfwrite_prereq"></a>Prerequisites
    +
    +Before working with HDFS file data using HAWQ and PXF, ensure that:
    +
    +-   The HDFS plug-in is installed on all cluster nodes. See [Installing PXF Plug-ins](InstallPXFPlugins.html) for PXF plug-in installation information.
    +-   All HDFS users have read permissions to HDFS services, and write permissions are restricted to specific users.
    +
    +## <a id="hdfsplugin_writeextdata"></a>Writing to PXF External Tables
    +
    +The PXF HDFS plug-in supports two writable profiles: `HdfsTextSimple` and `SequenceWritable`.
    +
    +Use the following syntax to create a HAWQ writable external table representing HDFS data:
    +
    +``` sql
    +CREATE WRITABLE EXTERNAL TABLE <table_name>
    +    ( <column_name> <data_type> [, ...] | LIKE <other_table> )
    +LOCATION ('pxf://<host>[:<port>]/<path-to-hdfs-file>?PROFILE=HdfsTextSimple|SequenceWritable[&<custom-option>=<value>[...]]')
    +FORMAT '[TEXT|CSV|CUSTOM]' (<formatting-properties>);
    +```
    +
    +HDFS-plug-in-specific keywords and values used in the [CREATE EXTERNAL TABLE](../reference/sql/CREATE-EXTERNAL-TABLE.html) call are described in the table below.
    +
    +| Keyword  | Value |
    +|-------|-------------------------------------|
    +| \<host\>[:\<port\>]    | The HDFS NameNode and port. |
    +| \<path-to-hdfs-file\>    | The path to the file in the HDFS data store. |
    +| PROFILE    | The `PROFILE` keyword must specify one of the values `HdfsTextSimple` or `SequenceWritable`. |
    +| \<custom-option\>  | \<custom-option\> is profile-specific. These options are discussed in the next topic. |
    +| FORMAT 'TEXT' | Use '`TEXT`' `FORMAT` with the `HdfsTextSimple` profile when \<path-to-hdfs-file\> references a plain text delimited file. The `HdfsTextSimple` '`TEXT`' `FORMAT` supports only the built-in `(delimiter=<delim>)` \<formatting-property\>. |
    +| FORMAT 'CSV' | Use '`CSV`' `FORMAT` with `HdfsTextSimple` when \<path-to-hdfs-file\> references a comma-separated value file. |
    +| FORMAT 'CUSTOM' | Use the `'CUSTOM'` `FORMAT` with the `SequenceWritable` profile. The `SequenceWritable` '`CUSTOM`' `FORMAT` supports only the built-in `(formatter='pxfwritable_export')` (write) and `(formatter='pxfwritable_import')` (read) \<formatting-properties\>. |
    +
    +**Note**: When creating PXF external tables, you cannot use the `HEADER` option in your `FORMAT` specification.
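    +
    +For example, a writable external table using the `HdfsTextSimple` profile might be created and loaded as follows (a sketch; the host, port, path, and columns are placeholder assumptions):
    +
    +``` sql
    +-- writable external table targeting a comma-delimited HDFS file;
    +-- host, port, path, and columns are assumptions for illustration
    +CREATE WRITABLE EXTERNAL TABLE pxf_hdfs_writetbl (location text, month text, num_orders int)
    +    LOCATION ('pxf://namenode:51200/data/pxf_examples/pxf_hdfs_ts?PROFILE=HdfsTextSimple')
    +    FORMAT 'TEXT' (delimiter=E',');
    +
    +-- export a row from HAWQ to the HDFS file
    +INSERT INTO pxf_hdfs_writetbl VALUES ('Frankfurt', 'Mar', 777);
    +```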
    +
    +## <a id="profile_hdfstextsimple"></a>Custom Options
    +
    +The `HdfsTextSimple` and `SequenceWritable` profiles support the following \<custom-options\>:
    +
    +| Keyword  | Value Description |
    +|-------|-------------------------------------|
    +| COMPRESSION_CODEC    | The compression codec Java class name. If this option is not provided, no data compression is performed. Supported compression codecs include: `org.apache.hadoop.io.compress.DefaultCodec`, `org.apache.hadoop.io.compress.BZip2Codec`, and `org.apache.hadoop.io.compress.GzipCodec` (`HdfsTextSimple` profile only) |
    +| COMPRESSION_TYPE    | The compression type to employ; supported values are `RECORD` (the default) or `BLOCK`. |
    +| DATA-SCHEMA    | (`SequenceWritable` profile only) The name of the writer serialization/deserialization class. The jar file in which this class resides must be in the PXF class path. This option has no default value. |
    +| THREAD-SAFE | Boolean value determining if a table query can run in multi-thread mode. Default value is `TRUE`, requests run in multi-threaded mode. When set to `FALSE`, requests will be handled in a single thread. `THREAD-SAFE` should be set appropriately when operations that are not thread-safe are performed (i.e. compression). |
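    +
    +The \<custom-options\> are appended to the `LOCATION` URI as name=value pairs. A sketch combining Gzip compression with single-threaded request handling might look like this (host, port, path, and columns are placeholder assumptions):
    +
    +``` sql
    +-- COMPRESSION_CODEC selects Gzip output; THREAD-SAFE=FALSE forces a single thread
    +-- because compression is not thread-safe
    +CREATE WRITABLE EXTERNAL TABLE pxf_hdfs_writegzip (location text, month text, num_orders int)
    +    LOCATION ('pxf://namenode:51200/data/pxf_examples/pxf_hdfs_gz?PROFILE=HdfsTextSimple&COMPRESSION_CODEC=org.apache.hadoop.io.compress.GzipCodec&THREAD-SAFE=FALSE')
    +    FORMAT 'TEXT' (delimiter=E',');
    +```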
    --- End diff ---
    
    multi-thread -> multi-threaded. Also, some edits for the rest:
    
    The default value is `TRUE`. Set this option to `FALSE` to handle all requests in a single thread for operations that are not thread-safe (for example, compression).

