Github user kavinderd commented on a diff in the pull request:
https://github.com/apache/incubator-hawq-docs/pull/46#discussion_r96086612
--- Diff: markdown/pxf/HDFSWritablePXF.html.md.erb ---
@@ -0,0 +1,416 @@
+---
+title: Writing Data to HDFS
+---
+
+The PXF HDFS plug-in supports writable external tables using the
`HdfsTextSimple` and `SequenceWritable` profiles. You might create a writable
table to export data from a HAWQ internal table to binary or text HDFS files.
+
+Use the `HdfsTextSimple` profile when writing text data. Use the
`SequenceWritable` profile when dealing with binary data.
+
+This section describes how to use these PXF profiles to create writable
external tables.
+
+**Note**: Tables that you create with writable profiles can only be used
for INSERT operations. If you want to query inserted data, you must define a
separate external readable table that references the new HDFS file using the
equivalent readable profile. ??You can also create a Hive table to access the
HDFS file.??
+
+## <a id="pxfwrite_prereq"></a>Prerequisites
+
+Before working with HDFS file data using HAWQ and PXF, ensure that:
+
+- The HDFS plug-in is installed on all cluster nodes. See [Installing
PXF Plug-ins](InstallPXFPlugins.html) for PXF plug-in installation information.
+- All HDFS users have read permissions to HDFS services.
+- HDFS write permissions are provided to a restricted set of users.
+
+## <a id="hdfsplugin_writeextdata"></a>Writing to PXF External Tables
+The PXF HDFS plug-in supports two writable profiles: `HdfsTextSimple` and
`SequenceWritable`.
+
+Use the following syntax to create a HAWQ external writable table
representing HDFS data:
+
+``` sql
+CREATE WRITABLE EXTERNAL TABLE <table_name>
+ ( <column_name> <data_type> [, ...] | LIKE <other_table> )
+LOCATION ('pxf://<host>[:<port>]/<path-to-hdfs-file>
+    ?PROFILE=HdfsTextSimple|SequenceWritable[&<custom-option>=<value>[...]]')
+FORMAT '[TEXT|CSV|CUSTOM]' (<formatting-properties>);
+```
+
+HDFS-plug-in-specific keywords and values used in the [CREATE EXTERNAL
TABLE](../reference/sql/CREATE-EXTERNAL-TABLE.html) call are described in the
table below.
+
+| Keyword | Value |
+|-------|-------------------------------------|
+| \<host\>[:\<port\>] | The HDFS NameNode and port. |
+| \<path-to-hdfs-file\> | The path to the file in the HDFS data store. |
+| PROFILE | The `PROFILE` keyword must specify one of the values
`HdfsTextSimple` or `SequenceWritable`. |
+| \<custom-option\> | \<custom-option\> is profile-specific. These
options are discussed in the next topic.|
+| FORMAT 'TEXT' | Use '`TEXT`' `FORMAT` with the `HdfsTextSimple` profile
to create a plain-text-delimited file at the location specified by
\<path-to-hdfs-file\>. The `HdfsTextSimple` '`TEXT`' `FORMAT` supports only the
built-in `(delimiter=<delim>)` \<formatting-property\>. |
+| FORMAT 'CSV' | Use '`CSV`' `FORMAT` with the `HdfsTextSimple` profile to
create a comma-separated-value file at the location specified by
\<path-to-hdfs-file\>. |
+| FORMAT 'CUSTOM' | Use the `'CUSTOM'` `FORMAT` with the
`SequenceWritable` profile. The `SequenceWritable` '`CUSTOM`' `FORMAT` supports
only the built-in `(formatter='pxfwritable_export')` (write) and
`(formatter='pxfwritable_import')` (read) \<formatting-properties\>. |
+
+**Note**: When creating PXF external tables, you cannot use the `HEADER`
option in your `FORMAT` specification.
+
+## <a id="profile_hdfstextsimple"></a>Custom Options
+
+The `HdfsTextSimple` and `SequenceWritable` profiles support the following
custom options:
+
+| Option | Value Description | Profile |
+|-------|-------------------------------------|--------|
+| COMPRESSION_CODEC | The compression codec Java class name. If this
option is not provided, no data compression is performed. Supported compression
codecs include: `org.apache.hadoop.io.compress.DefaultCodec` and
`org.apache.hadoop.io.compress.BZip2Codec` | HdfsTextSimple, SequenceWritable |
+| | `org.apache.hadoop.io.compress.GzipCodec` | HdfsTextSimple |
+| COMPRESSION_TYPE | The compression type to employ; supported values
are `RECORD` (the default) or `BLOCK`. | HdfsTextSimple, SequenceWritable |
+| DATA-SCHEMA | The name of the writer serialization/deserialization
class. The jar file in which this class resides must be in the PXF classpath.
This option is required for the `SequenceWritable` profile and has no default
value. | SequenceWritable|
+| THREAD-SAFE | Boolean value determining if a table query can run in
multi-threaded mode. The default value is `TRUE`. Set this option to `FALSE` to
handle all requests in a single thread for operations that are not thread-safe
(for example, compression). | HdfsTextSimple, SequenceWritable|
+
+## <a id="profile_hdfstextsimple"></a>HdfsTextSimple Profile
+
+Use the `HdfsTextSimple` profile when writing delimited data to a plain
text file where each row is a single record.
--- End diff --
I think it's more appropriate to say 'where each line is a single record'.
'Row' is more of a database term than a text-file one.
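
To illustrate the distinction, here is a minimal sketch that is not taken from
the PR. The table name `sales_export`, its columns, the host `namenode`, the
port `51200`, and the HDFS path are all hypothetical placeholders. Assuming the
`HdfsTextSimple` profile behaves as the diff describes, each row inserted
through the writable table should end up as one delimited line of text under
the target path.

```sql
-- Hypothetical writable table; host, port, and path are placeholders.
CREATE WRITABLE EXTERNAL TABLE sales_export
  (location text, month text, num_orders int, total_sales float8)
LOCATION ('pxf://namenode:51200/data/pxf_examples/sales_export?PROFILE=HdfsTextSimple')
FORMAT 'TEXT' (delimiter=',');

-- Two database rows in, two text lines out (one record per line).
INSERT INTO sales_export VALUES ('Frankfurt', 'Mar', 777, 3956.98);
INSERT INTO sales_export VALUES ('Cleveland', 'Oct', 3812, 96645.37);
```

"Row" describes what goes into the `INSERT`, while "line" describes what lands
in the HDFS text file, which is the side this sentence is talking about.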