http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/admin/ambari-rest-api.html.md.erb
----------------------------------------------------------------------
diff --git a/admin/ambari-rest-api.html.md.erb 
b/admin/ambari-rest-api.html.md.erb
deleted file mode 100644
index 2cc79e4..0000000
--- a/admin/ambari-rest-api.html.md.erb
+++ /dev/null
@@ -1,163 +0,0 @@
----
-title: Using the Ambari REST API
----
-
-You can monitor and manage the resources in your HAWQ cluster using the Ambari 
REST API.  In addition to providing access to the metrics information in your 
cluster, the API supports viewing, creating, deleting, and updating cluster 
resources.
-
-This section provides an introduction to using the Ambari REST API for HAWQ-related cluster management activities.
-
-Refer to [Ambari API Reference 
v1](https://github.com/apache/ambari/blob/trunk/ambari-server/docs/api/v1/index.md)
 for the official Ambari API documentation, including full REST resource 
definitions and response semantics. *Note*: These APIs may change in new 
versions of Ambari.
-
-
-## <a id="ambari-rest-uri"></a>Manageable HAWQ Resources
-
-HAWQ provides several REST resources to support starting and stopping services, executing service checks, and viewing configuration information, among other activities. HAWQ resources you can manage using the Ambari REST API include:
-
-| Ambari Resource      | Description     |
-|----------------------|------------------------|
-| cluster | The HAWQ cluster. |
-| service | The HAWQ and PXF service. You can manage other Hadoop services as 
well. |
-| component | A specific HAWQ or PXF service component, for example the HAWQ Master or PXF. |
-| configuration | A specific HAWQ/PXF configuration entity, for example the 
hawq-site or pxf-profiles configuration files, or a specific single HAWQ or PXF 
configuration property. |
-| request | A group of tasks. |
-
-## <a id="ambari-rest-uri"></a>URI Structure
-
-The Ambari REST API provides access to HAWQ cluster resources via URI (uniform 
resource identifier) paths. To use the Ambari REST API, you will send HTTP 
requests and parse JSON-formatted HTTP responses.
-
-The Ambari REST API supports standard HTTP request methods including:
-
-- `GET` - read resource properties, metrics
-- `POST` - create new resource
-- `PUT` - update resource
-- `DELETE` - delete resource
-
-URIs for Ambari REST API resources have the following structure:
-
-``` shell
-http://<ambari-server-host>:<port>/api/v1/<resource-path>
-```
-
-The Ambari REST API supports the following HAWQ-related \<resource-paths\>:
-
-| REST Resource Path              | Description     |
-|----------------------|------------------------|
-| clusters/\<cluster\-name\> | The HAWQ cluster name. |
-| clusters/\<cluster\-name\>/services/PXF | The PXF service. |
-| clusters/\<cluster\-name\>/services/HAWQ | The HAWQ service. |
-| clusters/\<cluster\-name\>/services/HAWQ/components | All HAWQ service 
components. |
-| clusters/\<cluster\-name\>/services/HAWQ/components/\<name\> | A specific HAWQ service component, for example HAWQMASTER. |
-| clusters/\<cluster\-name\>/configurations | Cluster configurations. |
-| clusters/\<cluster\-name\>/requests | Group of tasks that run a command. |
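-
-For example, a `GET` request for the HAWQ service resource of a cluster named `TestCluster` (an illustrative cluster name) would use the following URI:
-
-``` shell
-http://<ambari-server-host>:<port>/api/v1/clusters/TestCluster/services/HAWQ
-```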
-
-## <a id="ambari-rest-curl"></a>Submitting Requests with cURL
-
-Your HTTP request to the Ambari REST API should include the following 
information:
-
-- User name and password for basic authentication.
-- An HTTP request header.
-- The HTTP request method.
-- JSON-formatted request data, if required.
-- The URI identifying the Ambari REST resource.
-
-You can use the `curl` command to transfer HTTP request data to, and receive 
data from, the Ambari server using the HTTP protocol.
-
-Use the following syntax to issue a `curl` command for Ambari HAWQ/PXF 
management operations:
-
-``` shell
-$ curl -u <user>:<passwd> -H <header> -X GET|POST|PUT|DELETE -d <data> <URI>
-```
-
-`curl` options relevant to Ambari REST API communication include:
-
-| Option              | Description     |
-|----------------------|------------------------|
-| -u \<user\>:\<passwd\> | Identify the username and password for basic 
authentication to the HTTP server. |
-| -H \<header\>   | Identify an extra header to include in the HTTP request. 
\<header\> must specify `'X-Requested-By:ambari'`.   |
-| -X \<command\>   | Identify the request method. \<command\> may specify `GET` (the default), `POST`, `PUT`, or `DELETE`. |
-| -d \<data\>     | Send the specified \<data\> to the HTTP server along with 
the request. The \<command\> and \<URI\> determine if \<data\> is required, and 
if so, its content.  |
-| \<URI\>    | Path to the Ambari REST resource.  |
-
-
-## <a id="ambari-rest-api-auth"></a>Authenticating with the Ambari REST API
-
-The first step in using the Ambari REST API is to authenticate with the Ambari server. The Ambari REST API supports HTTP basic authentication. With this authentication method, you provide a username and password that are encoded and sent in the HTTP request header.
-
-Example: Testing Authentication
-
-1. Set up some environment variables; replace the values with those 
appropriate for your operating environment.  For example:
-
-    ``` shell
-    $ export AMBUSER=admin
-    $ export AMBPASSWD=admin
-    $ export AMBHOST=<ambari-server>
-    $ export AMBPORT=8080
-    ```
-
-2. Submit a `curl` request to the Ambari server:
-
-    ``` shell
-    $ curl -u $AMBUSER:$AMBPASSWD http://$AMBHOST:$AMBPORT
-    ```
-    
-    If authentication succeeds, Apache license information is displayed.
-
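-    To verify authentication without displaying the full license text, you can check the HTTP status code instead; `200` indicates success and `401` indicates an authentication failure. This variant uses standard `curl` options:
-
-    ``` shell
-    $ curl -s -o /dev/null -w "%{http_code}\n" -u $AMBUSER:$AMBPASSWD http://$AMBHOST:$AMBPORT/api/v1/clusters
-    200
-    ```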
-
-## <a id="ambari-rest-using"></a>Using the Ambari REST API for HAWQ Management
-
-
-### <a id="ambari-rest-ex-clustname"></a>Example: Retrieving the HAWQ Cluster 
Name
-
-1. Set up additional environment variables:
-
-    ``` shell
-    $ export AMBCREDS="$AMBUSER:$AMBPASSWD"
-    $ export AMBURLBASE="http://${AMBHOST}:${AMBPORT}/api/v1/clusters";
-    ```
-    
-    You will use these variables in upcoming examples to simplify `curl` calls.
-    
-2. Use the Ambari REST API to determine the name of your HAWQ cluster; also 
set `$AMBURLBASE` to include the cluster name:
-
-    ``` shell
-    $ export CLUSTER_NAME="$(curl -u ${AMBCREDS} -i -H 'X-Requested-By:ambari' 
$AMBURLBASE | sed -n 's/.*"cluster_name" : "\([^\"]*\)".*/\1/p')"
-    $ echo $CLUSTER_NAME
-    TestCluster
-    $ export AMBURLBASE=$AMBURLBASE/$CLUSTER_NAME
-    ```
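-
-    The `sed` expression extracts the first `cluster_name` value from the JSON response. If the `jq` JSON processor happens to be installed on your system (it is not part of HAWQ), an equivalent extraction is:
-
-    ``` shell
-    $ export CLUSTER_NAME="$(curl -su ${AMBCREDS} -H 'X-Requested-By:ambari' $AMBURLBASE | jq -r '.items[0].Clusters.cluster_name')"
-    ```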
-
-### <a id="ambari-rest-ex-mgmt"></a>Examples: Managing the HAWQ and PXF 
Services
-
-The following subsections provide `curl` commands for common HAWQ cluster 
management activities.
-
-Refer to [API usage scenarios, troubleshooting, and other 
FAQs](https://cwiki.apache.org/confluence/display/AMBARI/API+usage+scenarios%2C+troubleshooting%2C+and+other+FAQs)
 for additional Ambari REST API usage examples.
-
-
-#### <a id="ambari-rest-ex-get"></a>Viewing HAWQ Cluster Service and 
Configuration Information
-
-| Task              |Command           |
-|----------------------|------------------------|
-| View HAWQ service information. | `curl -u $AMBCREDS -X GET -H 
'X-Requested-By:ambari' $AMBURLBASE/services/HAWQ` |
-| List all HAWQ components. | `curl -u $AMBCREDS -X GET -H 
'X-Requested-By:ambari' $AMBURLBASE/services/HAWQ/components` |
-| View information about the HAWQ master. | `curl -u $AMBCREDS -X GET -H 
'X-Requested-By:ambari' $AMBURLBASE/services/HAWQ/components/HAWQMASTER` |
-| View the `hawq-site` configuration settings. | `curl -u $AMBCREDS -X GET -H 
'X-Requested-By:ambari' 
"$AMBURLBASE/configurations?type=hawq-site&tag=TOPOLOGY_RESOLVED"` |
-| View the initial `core-site` configuration settings. | `curl -u $AMBCREDS -X 
GET -H 'X-Requested-By:ambari' 
"$AMBURLBASE/configurations?type=core-site&tag=INITIAL"` |
-| View the `pxf-profiles` configuration file. | `curl -u $AMBCREDS -X GET -H 
'X-Requested-By:ambari' 
"$AMBURLBASE/configurations?type=pxf-profiles&tag=INITIAL"` |
-| View all components on a node. | `curl -u $AMBCREDS -i -X GET -H 'X-Requested-By:ambari' $AMBURLBASE/hosts/<hawq-node>` |
-
-
-#### <a id="ambari-rest-ex-put"></a>Starting/Stopping HAWQ and PXF Services
-
-| Task              |Command           |
-|----------------------|------------------------|
-| Start the HAWQ service. | `curl -u $AMBCREDS -X PUT -H 
'X-Requested-By:ambari' -d '{"RequestInfo": {"context" :"Start HAWQ via REST"}, 
"Body": {"ServiceInfo": {"state": "STARTED"}}}' $AMBURLBASE/services/HAWQ` |
-| Stop the HAWQ service. | `curl -u $AMBCREDS -X PUT -H 
'X-Requested-By:ambari' -d '{"RequestInfo": {"context" :"Stop HAWQ via REST"}, 
"Body": {"ServiceInfo": {"state": "INSTALLED"}}}' $AMBURLBASE/services/HAWQ` |
-| Start the PXF service. | `curl -u $AMBCREDS -X PUT -H 'X-Requested-By:ambari' -d '{"RequestInfo": {"context" :"Start PXF via REST"}, "Body": {"ServiceInfo": {"state": "STARTED"}}}' $AMBURLBASE/services/PXF` |
-| Stop the PXF service. | `curl -u $AMBCREDS -X PUT -H 'X-Requested-By:ambari' 
-d '{"RequestInfo": {"context" :"Stop PXF via REST"}, "Body": {"ServiceInfo": 
{"state": "INSTALLED"}}}' $AMBURLBASE/services/PXF` |
-
-#### <a id="ambari-rest-ex-post"></a>Invoking HAWQ and PXF Service Actions
-
-| Task              |Command           |
-|----------------------|------------------------|
-| Run a HAWQ service check. | `curl -u $AMBCREDS -X POST -H 
'X-Requested-By:ambari' -d '{"RequestInfo":{"context":"HAWQ Service 
Check","command":"HAWQ_SERVICE_CHECK"}, "Requests/resource_filters":[{ 
"service_name":"HAWQ"}]}'  $AMBURLBASE/requests` |
-| Run a PXF service check. | `curl -u $AMBCREDS -X POST -H 
'X-Requested-By:ambari' -d '{"RequestInfo":{"context":"PXF Service 
Check","command":"PXF_SERVICE_CHECK"}, "Requests/resource_filters":[{ 
"service_name":"PXF"}]}'  $AMBURLBASE/requests` |

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/admin/maintain.html.md.erb
----------------------------------------------------------------------
diff --git a/admin/maintain.html.md.erb b/admin/maintain.html.md.erb
deleted file mode 100644
index f4b1491..0000000
--- a/admin/maintain.html.md.erb
+++ /dev/null
@@ -1,31 +0,0 @@
----
-title: Routine System Maintenance Tasks
----
-
-## <a id="overview-topic"></a>Overview
-
-To keep a HAWQ system running efficiently, the database must be regularly 
cleared of expired data and the table statistics must be updated so that the 
query optimizer has accurate information.
-
-HAWQ requires that certain tasks be performed regularly to achieve optimal 
performance. The tasks discussed here are required, but database administrators 
can automate them using standard UNIX tools such as `cron` scripts. An 
administrator sets up the appropriate scripts and checks that they execute 
successfully. See [Recommended Monitoring and Maintenance 
Tasks](RecommendedMonitoringTasks.html) for additional suggested maintenance 
activities you can implement to keep your HAWQ system running optimally.
-
-## <a id="topic10"></a>Database Server Log Files 
-
-HAWQ log output tends to be voluminous, especially at higher debug levels, and 
you do not need to save it indefinitely. Administrators rotate the log files 
periodically so new log files are started and old ones are removed.
-
-HAWQ has log file rotation enabled on the master and all segment instances. 
Daily log files are created in the `pg_log` subdirectory of the master and each 
segment data directory using the following naming convention: 
<code>hawq-<i>YYYY-MM-DD\_hhmmss</i>.csv</code>. Although log files are rolled 
over daily, they are not automatically truncated or deleted. Administrators 
need to implement scripts or programs to periodically clean up old log files in 
the `pg_log` directory of the master and of every segment instance.
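-
-For example, a daily `cron` entry similar to the following sketch (the data directory paths and the 30-day retention period are illustrative; adjust them for your deployment) removes old log files on a host:
-
-``` shell
-0 2 * * * find /data/hawq/*/pg_log -name 'hawq-*.csv' -mtime +30 -delete
-```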
-
-For information about viewing the database server log files, see [Viewing the 
Database Server Log Files](monitor.html).
-
-## <a id="topic11"></a>Management Utility Log Files 
-
-Log files for the HAWQ management utilities are written to `~/hawqAdminLogs` 
by default. The naming convention for management log files is:
-
-<pre><code><i>script_name_date</i>.log
-</code></pre>
-
-The log entry format is:
-
-<pre><code><i>timestamp:utility:host:user</i>:[INFO|WARN|FATAL]:<i>message</i>
-</code></pre>
-
-Each time a utility runs, its log output is appended to that utility's daily log file.

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/admin/monitor.html.md.erb
----------------------------------------------------------------------
diff --git a/admin/monitor.html.md.erb b/admin/monitor.html.md.erb
deleted file mode 100644
index 418c8c3..0000000
--- a/admin/monitor.html.md.erb
+++ /dev/null
@@ -1,444 +0,0 @@
----
-title: Monitoring a HAWQ System
----
-
-You can monitor a HAWQ system using a variety of tools included with the 
system or available as add-ons.
-
-Observing the day-to-day performance of a HAWQ system helps administrators understand system behavior, plan workflow, and troubleshoot problems. This chapter discusses tools for monitoring database performance and activity.
-
-Also, be sure to review [Recommended Monitoring and Maintenance 
Tasks](RecommendedMonitoringTasks.html) for monitoring activities you can 
script to quickly detect problems in the system.
-
-
-## <a id="topic31"></a>Using hawq\_toolkit 
-
-Use HAWQ's administrative schema 
[*hawq\_toolkit*](../reference/toolkit/hawq_toolkit.html) to query the system 
catalogs, log files, and operating environment for system status information. 
The *hawq\_toolkit* schema contains several views you can access using SQL 
commands. The *hawq\_toolkit* schema is accessible to all database users. Some 
objects require superuser permissions. Use a command similar to the following 
to add the *hawq\_toolkit* schema to your schema search path:
-
-```sql
-=> SET ROLE 'gpadmin' ;
-=# SET search_path TO myschema, hawq_toolkit ;
-```
-
-## <a id="topic3"></a>Monitoring System State 
-
-As a HAWQ administrator, you must monitor the system for problem events such 
as a segment going down or running out of disk space on a segment host. The 
following topics describe how to monitor the health of a HAWQ system and 
examine certain state information for a HAWQ system.
-
--   [Checking System State](#topic12)
--   [Checking Disk Space Usage](#topic15)
--   [Viewing Metadata Information about Database Objects](#topic24)
--   [Viewing Query Workfile Usage Information](#topic27)
-
-### <a id="topic12"></a>Checking System State 
-
-A HAWQ system comprises multiple PostgreSQL instances \(the master and segments\) spanning multiple machines. To monitor a HAWQ system, you need to
know information about the system as a whole, as well as status information of 
the individual instances. The `hawq state` utility provides status information 
about a HAWQ system.
-
-#### <a id="topic13"></a>Viewing Master and Segment Status and Configuration 
-
-The default `hawq state` action is to check segment instances and show a brief 
status of the valid and failed segments. For example, to see a quick status of 
your HAWQ system:
-
-```shell
-$ hawq state -b
-```
-
-You can also display information about the HAWQ master data directory by 
invoking `hawq state` with the `-d` option:
-
-```shell
-$ hawq state -d <master_data_dir>
-```
-
-
-### <a id="topic15"></a>Checking Disk Space Usage 
-
-#### <a id="topic16"></a>Checking Sizing of Distributed Databases and Tables 
-
-The *hawq\_toolkit* administrative schema contains several views that you can 
use to determine the disk space usage for a distributed HAWQ database, schema, 
table, or index.
-
-##### <a id="topic17"></a>Viewing Disk Space Usage for a Database 
-
-To see the total size of a database \(in bytes\), use the 
*hawq\_size\_of\_database* view in the *hawq\_toolkit* administrative schema. 
For example:
-
-```sql
-=> SELECT * FROM hawq_toolkit.hawq_size_of_database
-     ORDER BY sodddatname;
-```
-
-##### <a id="topic18"></a>Viewing Disk Space Usage for a Table 
-
-The *hawq\_toolkit* administrative schema contains several views for checking 
the size of a table. The table sizing views list the table by object ID \(not 
by name\). To check the size of a table by name, you must look up the relation 
name \(`relname`\) in the *pg\_class* table. For example:
-
-```sql
-=> SELECT relname AS name, sotdsize AS size, sotdtoastsize
-     AS toast, sotdadditionalsize AS other
-     FROM hawq_toolkit.hawq_size_of_table_disk AS sotd, pg_class
-   WHERE sotd.sotdoid=pg_class.oid ORDER BY relname;
-```
-
-##### <a id="topic19"></a>Viewing Disk Space Usage for Indexes 
-
-The *hawq\_toolkit* administrative schema contains a number of views for 
checking index sizes. To see the total size of all index\(es\) on a table, use 
the *hawq\_size\_of\_all\_table\_indexes* view. To see the size of a particular 
index, use the *hawq\_size\_of\_index* view. The index sizing views list tables 
and indexes by object ID \(not by name\). To check the size of an index by 
name, you must look up the relation name \(`relname`\) in the *pg\_class* 
table. For example:
-
-```sql
-=> SELECT soisize, relname AS indexname
-     FROM pg_class, hawq_size_of_index
-   WHERE pg_class.oid=hawq_size_of_index.soioid
-     AND pg_class.relkind='i';
-```
-
-### <a id="topic24"></a>Viewing Metadata Information about Database Objects 
-
-HAWQ uses its system catalogs to track various metadata information about the 
objects stored in a database (tables, views, indexes and so on), as well as 
global objects including roles and tablespaces.
-
-#### <a id="topic25"></a>Viewing the Last Operation Performed 
-
-You can use the system views *pg\_stat\_operations* and 
*pg\_stat\_partition\_operations* to look up actions performed on a database 
object. For example, to view when the `cust` table was created and when it was 
last analyzed:
-
-```sql
-=> SELECT schemaname AS schema, objname AS table,
-     usename AS role, actionname AS action,
-     subtype AS type, statime AS time
-   FROM pg_stat_operations
-   WHERE objname='cust';
-```
-
-```
- schema | table | role | action  | type  | time
---------+-------+------+---------+-------+--------------------------
-  sales | cust  | main | CREATE  | TABLE | 2010-02-09 18:10:07.867977-08
-  sales | cust  | main | VACUUM  |       | 2010-02-10 13:32:39.068219-08
-  sales | cust  | main | ANALYZE |       | 2010-02-25 16:07:01.157168-08
-(3 rows)
-
-```
-
-#### <a id="topic26"></a>Viewing the Definition of an Object 
-
-You can use the `psql` `\d` meta-command to display the definition of an 
object, such as a table or view. For example, to see the definition of a table 
named `sales`:
-
-``` sql
-=> \d sales
-```
-
-```
-Append-Only Table "public.sales"
- Column |  Type   | Modifiers 
---------+---------+-----------
- id     | integer | 
- year   | integer | 
- qtr    | integer | 
- day    | integer | 
- region | text    | 
-Compression Type: None
-Compression Level: 0
-Block Size: 32768
-Checksum: f
-Distributed by: (id)
-```
-
-
-### <a id="topic27"></a>Viewing Query Workfile Usage Information 
-
-The HAWQ administrative schema *hawq\_toolkit* contains views that display 
information about HAWQ workfiles. HAWQ creates workfiles on disk if it does not 
have sufficient memory to execute the query in memory. This information can be 
used for troubleshooting and tuning queries. The information in the views can 
also be used to specify the values for the HAWQ configuration parameters 
`hawq_workfile_limit_per_query` and `hawq_workfile_limit_per_segment`.
-
-Views in the *hawq\_toolkit* schema include:
-
--   *hawq\_workfile\_entries* - one row for each operator currently using disk 
space for workfiles on a segment
--   *hawq\_workfile\_usage\_per\_query* - one row for each running query 
currently using disk space for workfiles on a segment
--   *hawq\_workfile\_usage\_per\_segment* - one row for each segment where 
each row displays the total amount of disk space currently in use for workfiles 
on the segment
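-
-For example, to view the total workfile disk space currently in use on each segment:
-
-```sql
-=> SELECT * FROM hawq_toolkit.hawq_workfile_usage_per_segment;
-```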
-
-
-## <a id="topic28"></a>Viewing the Database Server Log Files 
-
-Every database instance in HAWQ \(master and segments\) runs a PostgreSQL database server with its own server log file. Daily log files are created in the `pg_log` directory of the master and of each segment data directory.
-
-### <a id="topic29"></a>Log File Format 
-
-The server log files are written in comma-separated values \(CSV\) format. Log 
entries may not include values for all log fields. For example, only log 
entries associated with a query worker process will have the `slice_id` 
populated. You can identify related log entries of a particular query by the 
query's session identifier \(`gp_session_id`\) and command identifier 
\(`gp_command_count`\).
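-
-Because the session identifier is written with a `con` prefix, you can gather all log entries for a given session with a simple text search. For example (the session number `5678` and log file path are illustrative):
-
-``` shell
-$ grep 'con5678' /data/hawq/master/pg_log/hawq-*.csv
-```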
-
-Log entries may include the following fields:
-
-<table>
-  <tr><th>#</th><th>Field Name</th><th>Data Type</th><th>Description</th></tr>
-  <tr><td>1</td><td>event_time</td><td>timestamp with time zone</td><td>Time 
that the log entry was written to the log</td></tr>
-  <tr><td>2</td><td>user_name</td><td>varchar(100)</td><td>The database user 
name</td></tr>
-  <tr><td>3</td><td>database_name</td><td>varchar(100)</td><td>The database 
name</td></tr>
-  <tr><td>4</td><td>process_id</td><td>varchar(10)</td><td>The system process 
ID (prefixed with "p")</td></tr>
-  <tr><td>5</td><td>thread_id</td><td>varchar(50)</td><td>The thread count 
(prefixed with "th")</td></tr>
-  <tr><td>6</td><td>remote_host</td><td>varchar(100)</td><td>On the master, 
the hostname/address of the client machine. On the segment, the 
hostname/address of the master.</td></tr>
-  <tr><td>7</td><td>remote_port</td><td>varchar(10)</td><td>The segment or 
master port number</td></tr>
-  <tr><td>8</td><td>session_start_time</td><td>timestamp with time 
zone</td><td>Time session connection was opened</td></tr>
-  <tr><td>9</td><td>transaction_id</td><td>int</td><td>Top-level transaction 
ID on the master. This ID is the parent of any subtransactions.</td></tr>
-  <tr><td>10</td><td>gp_session_id</td><td>text</td><td>Session identifier 
number (prefixed with "con")</td></tr>
-  <tr><td>11</td><td>gp_command_count</td><td>text</td><td>The command number 
within a session (prefixed with "cmd")</td></tr>
-  <tr><td>12</td><td>gp_segment</td><td>text</td><td>The segment content 
identifier. The master always has a content ID of -1.</td></tr>
-  <tr><td>13</td><td>slice_id</td><td>text</td><td>The slice ID (portion of 
the query plan being executed)</td></tr>
-  <tr><td>14</td><td>distr_tranx_id</td><td>text</td><td>Distributed 
transaction ID</td></tr>
-  <tr><td>15</td><td>local_tranx_id</td><td>text</td><td>Local transaction 
ID</td></tr>
-  <tr><td>16</td><td>sub_tranx_id</td><td>text</td><td>Subtransaction 
ID</td></tr>
-  <tr><td>17</td><td>event_severity</td><td>varchar(10)</td><td>Values 
include: LOG, ERROR, FATAL, PANIC, DEBUG1, DEBUG2</td></tr>
-  <tr><td>18</td><td>sql_state_code</td><td>varchar(10)</td><td>SQL state code 
associated with the log message</td></tr>
-  <tr><td>19</td><td>event_message</td><td>text</td><td>Log or error message 
text</td></tr>
-  <tr><td>20</td><td>event_detail</td><td>text</td><td>Detail message text 
associated with an error or warning message</td></tr>
-  <tr><td>21</td><td>event_hint</td><td>text</td><td>Hint message text 
associated with an error or warning message</td></tr>
-  <tr><td>22</td><td>internal_query</td><td>text</td><td>The 
internally-generated query text</td></tr>
-  <tr><td>23</td><td>internal_query_pos</td><td>int</td><td>The cursor index 
into the internally-generated query text</td></tr>
-  <tr><td>24</td><td>event_context</td><td>text</td><td>The context in which 
this message gets generated</td></tr>
-  <tr><td>25</td><td>debug_query_string</td><td>text</td><td>User-supplied 
query string with full detail for debugging. This string can be modified for 
internal use.</td></tr>
-  <tr><td>26</td><td>error_cursor_pos</td><td>int</td><td>The cursor index 
into the query string</td></tr>
-  <tr><td>27</td><td>func_name</td><td>text</td><td>The function in which this 
message is generated</td></tr>
-  <tr><td>28</td><td>file_name</td><td>text</td><td>The internal code file 
where the message originated</td></tr>
-  <tr><td>29</td><td>file_line</td><td>int</td><td>The line of the code file 
where the message originated</td></tr>
-  <tr><td>30</td><td>stack_trace</td><td>text</td><td>Stack trace text 
associated with this message</td></tr>
-</table>
-### <a id="topic30"></a>Searching the HAWQ Server Log Files 
-
-You can use the `gplogfilter` HAWQ utility to search through a HAWQ log file 
for entries matching specific criteria. By default, this utility searches 
through the HAWQ master log file in the default logging location. For example, 
to display the entries to the master log file starting after 2 pm on a certain 
date:
-
-``` shell
-$ gplogfilter -b '2016-01-18 14:00'
-```
-
-To search through all segment log files simultaneously, run `gplogfilter` 
through the `hawq ssh` utility. For example, specify a \<seg\_hosts\> file that 
includes all segment hosts of interest, then invoke `gplogfilter` to display 
the last three lines of each segment log file on each segment host. (Note: enter the commands at the `=>` prompt; do not type the `=>` itself.):
-
-``` shell
-$ hawq ssh -f <seg_hosts>
-=> source /usr/local/hawq/greenplum_path.sh
-=> gplogfilter -n 3 /data/hawq/segment/pg_log/hawq*.csv
-```
-
-## <a id="topic_jx2_rqg_kp"></a>HAWQ Error Codes 
-
-This section describes SQL error codes for certain database events.
-
-### <a id="topic_pyh_sqg_kp"></a>SQL Standard Error Codes 
-
-The following table lists all the defined error codes. Some are not used, but 
are defined by the SQL standard. The error classes are also shown. For each 
error class there is a standard error code having the last three characters 
000. This code is used only for error conditions that fall within the class but 
do not have any more-specific code assigned.
-
-The PL/pgSQL condition name for each error code is the same as the phrase 
shown in the table, with underscores substituted for spaces. For example, code 
22012, DIVISION BY ZERO, has condition name DIVISION\_BY\_ZERO. Condition names 
can be written in either upper or lower case.
-
-**Note:** PL/pgSQL does not recognize warning, as opposed to error, condition 
names; those are classes 00, 01, and 02.
-
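-A PL/pgSQL function can trap a specific error by its condition name. The following sketch (the function and its argument names are illustrative) traps error code 22012:
-
-```sql
-CREATE FUNCTION safe_divide(numerator numeric, denominator numeric)
-RETURNS numeric AS $$
-BEGIN
-    RETURN numerator / denominator;
-EXCEPTION
-    -- 22012 DIVISION BY ZERO, trapped via its condition name
-    WHEN division_by_zero THEN
-        RETURN NULL;
-END;
-$$ LANGUAGE plpgsql;
-```
-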
-|Error Code|Meaning|Constant|
-|----------|-------|--------|
-|**Class 00** — Successful Completion|
-|00000|SUCCESSFUL COMPLETION|successful\_completion|
-|**Class 01** — Warning|
-|01000|WARNING|warning|
-|0100C|DYNAMIC RESULT SETS RETURNED|dynamic\_result\_sets\_returned|
-|01008|IMPLICIT ZERO BIT PADDING|implicit\_zero\_bit\_padding|
-|01003|NULL VALUE ELIMINATED IN SET 
FUNCTION|null\_value\_eliminated\_in\_set\_function|
-|01007|PRIVILEGE NOT GRANTED|privilege\_not\_granted|
-|01006|PRIVILEGE NOT REVOKED|privilege\_not\_revoked|
-|01004|STRING DATA RIGHT TRUNCATION|string\_data\_right\_truncation|
-|01P01|DEPRECATED FEATURE|deprecated\_feature|
-|**Class 02** — No Data \(this is also a warning class per the SQL 
standard\)|
-|02000|NO DATA|no\_data|
-|02001|NO ADDITIONAL DYNAMIC RESULT SETS 
RETURNED|no\_additional\_dynamic\_result\_sets\_returned|
-|**Class 03** — SQL Statement Not Yet Complete|
-|03000|SQL STATEMENT NOT YET COMPLETE|sql\_statement\_not\_yet\_complete|
-|**Class 08** — Connection Exception|
-|08000|CONNECTION EXCEPTION|connection\_exception|
-|08003|CONNECTION DOES NOT EXIST|connection\_does\_not\_exist|
-|08006|CONNECTION FAILURE|connection\_failure|
-|08001|SQLCLIENT UNABLE TO ESTABLISH 
SQLCONNECTION|sqlclient\_unable\_to\_establish\_sqlconnection|
-|08004|SQLSERVER REJECTED ESTABLISHMENT OF 
SQLCONNECTION|sqlserver\_rejected\_establishment\_of\_sqlconnection|
-|08007|TRANSACTION RESOLUTION UNKNOWN|transaction\_resolution\_unknown|
-|08P01|PROTOCOL VIOLATION|protocol\_violation|
-|**Class 09** — Triggered Action Exception|
-|09000|TRIGGERED ACTION EXCEPTION|triggered\_action\_exception|
-|**Class 0A** — Feature Not Supported|
-|0A000|FEATURE NOT SUPPORTED|feature\_not\_supported|
-|**Class 0B** — Invalid Transaction Initiation|
-|0B000|INVALID TRANSACTION INITIATION|invalid\_transaction\_initiation|
-|**Class 0F** — Locator Exception|
-|0F000|LOCATOR EXCEPTION|locator\_exception|
-|0F001|INVALID LOCATOR SPECIFICATION|invalid\_locator\_specification|
-|**Class 0L** — Invalid Grantor|
-|0L000|INVALID GRANTOR|invalid\_grantor|
-|0LP01|INVALID GRANT OPERATION|invalid\_grant\_operation|
-|**Class 0P** — Invalid Role Specification|
-|0P000|INVALID ROLE SPECIFICATION|invalid\_role\_specification|
-|**Class 21** — Cardinality Violation|
-|21000|CARDINALITY VIOLATION|cardinality\_violation|
-|**Class 22** — Data Exception|
-|22000|DATA EXCEPTION|data\_exception|
-|2202E|ARRAY SUBSCRIPT ERROR|array\_subscript\_error|
-|22021|CHARACTER NOT IN REPERTOIRE|character\_not\_in\_repertoire|
-|22008|DATETIME FIELD OVERFLOW|datetime\_field\_overflow|
-|22012|DIVISION BY ZERO|division\_by\_zero|
-|22005|ERROR IN ASSIGNMENT|error\_in\_assignment|
-|2200B|ESCAPE CHARACTER CONFLICT|escape\_character\_conflict|
-|22022|INDICATOR OVERFLOW|indicator\_overflow|
-|22015|INTERVAL FIELD OVERFLOW|interval\_field\_overflow|
-|2201E|INVALID ARGUMENT FOR LOGARITHM|invalid\_argument\_for\_logarithm|
-|2201F|INVALID ARGUMENT FOR POWER 
FUNCTION|invalid\_argument\_for\_power\_function|
-|2201G|INVALID ARGUMENT FOR WIDTH BUCKET 
FUNCTION|invalid\_argument\_for\_width\_bucket\_function|
-|22018|INVALID CHARACTER VALUE FOR CAST|invalid\_character\_value\_for\_cast|
-|22007|INVALID DATETIME FORMAT|invalid\_datetime\_format|
-|22019|INVALID ESCAPE CHARACTER|invalid\_escape\_character|
-|2200D|INVALID ESCAPE OCTET|invalid\_escape\_octet|
-|22025|INVALID ESCAPE SEQUENCE|invalid\_escape\_sequence|
-|22P06|NONSTANDARD USE OF ESCAPE 
CHARACTER|nonstandard\_use\_of\_escape\_character|
-|22010|INVALID INDICATOR PARAMETER VALUE|invalid\_indicator\_parameter\_value|
-|22020|INVALID LIMIT VALUE|invalid\_limit\_value|
-|22023|INVALID PARAMETER VALUE|invalid\_parameter\_value|
-|2201B|INVALID REGULAR EXPRESSION|invalid\_regular\_expression|
-|22009|INVALID TIME ZONE DISPLACEMENT 
VALUE|invalid\_time\_zone\_displacement\_value|
-|2200C|INVALID USE OF ESCAPE CHARACTER|invalid\_use\_of\_escape\_character|
-|2200G|MOST SPECIFIC TYPE MISMATCH|most\_specific\_type\_mismatch|
-|22004|NULL VALUE NOT ALLOWED|null\_value\_not\_allowed|
-|22002|NULL VALUE NO INDICATOR PARAMETER|null\_value\_no\_indicator\_parameter|
-|22003|NUMERIC VALUE OUT OF RANGE|numeric\_value\_out\_of\_range|
-|22026|STRING DATA LENGTH MISMATCH|string\_data\_length\_mismatch|
-|22001|STRING DATA RIGHT TRUNCATION|string\_data\_right\_truncation|
-|22011|SUBSTRING ERROR|substring\_error|
-|22027|TRIM ERROR|trim\_error|
-|22024|UNTERMINATED C STRING|unterminated\_c\_string|
-|2200F|ZERO LENGTH CHARACTER STRING|zero\_length\_character\_string|
-|22P01|FLOATING POINT EXCEPTION|floating\_point\_exception|
-|22P02|INVALID TEXT REPRESENTATION|invalid\_text\_representation|
-|22P03|INVALID BINARY REPRESENTATION|invalid\_binary\_representation|
-|22P04|BAD COPY FILE FORMAT|bad\_copy\_file\_format|
-|22P05|UNTRANSLATABLE CHARACTER|untranslatable\_character|
-|**Class 23** — Integrity Constraint Violation|
-|23000|INTEGRITY CONSTRAINT VIOLATION|integrity\_constraint\_violation|
-|23001|RESTRICT VIOLATION|restrict\_violation|
-|23502|NOT NULL VIOLATION|not\_null\_violation|
-|23503|FOREIGN KEY VIOLATION|foreign\_key\_violation|
-|23505|UNIQUE VIOLATION|unique\_violation|
-|23514|CHECK VIOLATION|check\_violation|
-|**Class 24** — Invalid Cursor State|
-|24000|INVALID CURSOR STATE|invalid\_cursor\_state|
-|**Class 25** — Invalid Transaction State|
-|25000|INVALID TRANSACTION STATE|invalid\_transaction\_state|
-|25001|ACTIVE SQL TRANSACTION|active\_sql\_transaction|
-|25002|BRANCH TRANSACTION ALREADY ACTIVE|branch\_transaction\_already\_active|
-|25008|HELD CURSOR REQUIRES SAME ISOLATION 
LEVEL|held\_cursor\_requires\_same\_isolation\_level|
-|25003|INAPPROPRIATE ACCESS MODE FOR BRANCH 
TRANSACTION|inappropriate\_access\_mode\_for\_branch\_transaction|
-|25004|INAPPROPRIATE ISOLATION LEVEL FOR BRANCH 
TRANSACTION|inappropriate\_isolation\_level\_for\_branch\_transaction|
-|25005|NO ACTIVE SQL TRANSACTION FOR BRANCH 
TRANSACTION|no\_active\_sql\_transaction\_for\_branch\_transaction|
-|25006|READ ONLY SQL TRANSACTION|read\_only\_sql\_transaction|
-|25007|SCHEMA AND DATA STATEMENT MIXING NOT 
SUPPORTED|schema\_and\_data\_statement\_mixing\_not\_supported|
-|25P01|NO ACTIVE SQL TRANSACTION|no\_active\_sql\_transaction|
-|25P02|IN FAILED SQL TRANSACTION|in\_failed\_sql\_transaction|
-|**Class 26** — Invalid SQL Statement Name|
-|26000|INVALID SQL STATEMENT NAME|invalid\_sql\_statement\_name|
-|**Class 27** — Triggered Data Change Violation|
-|27000|TRIGGERED DATA CHANGE VIOLATION|triggered\_data\_change\_violation|
-|**Class 28** — Invalid Authorization Specification|
-|28000|INVALID AUTHORIZATION 
SPECIFICATION|invalid\_authorization\_specification|
-|**Class 2B** — Dependent Privilege Descriptors Still Exist|
-|2B000|DEPENDENT PRIVILEGE DESCRIPTORS STILL 
EXIST|dependent\_privilege\_descriptors\_still\_exist|
-|2BP01|DEPENDENT OBJECTS STILL EXIST|dependent\_objects\_still\_exist|
-|**Class 2D** — Invalid Transaction Termination|
-|2D000|INVALID TRANSACTION TERMINATION|invalid\_transaction\_termination|
-|**Class 2F** — SQL Routine Exception|
-|2F000|SQL ROUTINE EXCEPTION|sql\_routine\_exception|
-|2F005|FUNCTION EXECUTED NO RETURN 
STATEMENT|function\_executed\_no\_return\_statement|
-|2F002|MODIFYING SQL DATA NOT PERMITTED|modifying\_sql\_data\_not\_permitted|
-|2F003|PROHIBITED SQL STATEMENT 
ATTEMPTED|prohibited\_sql\_statement\_attempted|
-|2F004|READING SQL DATA NOT PERMITTED|reading\_sql\_data\_not\_permitted|
-|**Class 34** — Invalid Cursor Name|
-|34000|INVALID CURSOR NAME|invalid\_cursor\_name|
-|**Class 38** — External Routine Exception|
-|38000|EXTERNAL ROUTINE EXCEPTION|external\_routine\_exception|
-|38001|CONTAINING SQL NOT PERMITTED|containing\_sql\_not\_permitted|
-|38002|MODIFYING SQL DATA NOT PERMITTED|modifying\_sql\_data\_not\_permitted|
-|38003|PROHIBITED SQL STATEMENT 
ATTEMPTED|prohibited\_sql\_statement\_attempted|
-|38004|READING SQL DATA NOT PERMITTED|reading\_sql\_data\_not\_permitted|
-|**Class 39** — External Routine Invocation Exception|
-|39000|EXTERNAL ROUTINE INVOCATION 
EXCEPTION|external\_routine\_invocation\_exception|
-|39001|INVALID SQLSTATE RETURNED|invalid\_sqlstate\_returned|
-|39004|NULL VALUE NOT ALLOWED|null\_value\_not\_allowed|
-|39P01|TRIGGER PROTOCOL VIOLATED|trigger\_protocol\_violated|
-|39P02|SRF PROTOCOL VIOLATED|srf\_protocol\_violated|
-|**Class 3B** — Savepoint Exception|
-|3B000|SAVEPOINT EXCEPTION|savepoint\_exception|
-|3B001|INVALID SAVEPOINT SPECIFICATION|invalid\_savepoint\_specification|
-|**Class 3D** — Invalid Catalog Name|
-|3D000|INVALID CATALOG NAME|invalid\_catalog\_name|
-|**Class 3F** — Invalid Schema Name|
-|3F000|INVALID SCHEMA NAME|invalid\_schema\_name|
-|**Class 40** — Transaction Rollback|
-|40000|TRANSACTION ROLLBACK|transaction\_rollback|
-|40002|TRANSACTION INTEGRITY CONSTRAINT 
VIOLATION|transaction\_integrity\_constraint\_violation|
-|40001|SERIALIZATION FAILURE|serialization\_failure|
-|40003|STATEMENT COMPLETION UNKNOWN|statement\_completion\_unknown|
-|40P01|DEADLOCK DETECTED|deadlock\_detected|
-|**Class 42** — Syntax Error or Access Rule Violation|
-|42000|SYNTAX ERROR OR ACCESS RULE 
VIOLATION|syntax\_error\_or\_access\_rule\_violation|
-|42601|SYNTAX ERROR|syntax\_error|
-|42501|INSUFFICIENT PRIVILEGE|insufficient\_privilege|
-|42846|CANNOT COERCE|cannot\_coerce|
-|42803|GROUPING ERROR|grouping\_error|
-|42830|INVALID FOREIGN KEY|invalid\_foreign\_key|
-|42602|INVALID NAME|invalid\_name|
-|42622|NAME TOO LONG|name\_too\_long|
-|42939|RESERVED NAME|reserved\_name|
-|42804|DATATYPE MISMATCH|datatype\_mismatch|
-|42P18|INDETERMINATE DATATYPE|indeterminate\_datatype|
-|42809|WRONG OBJECT TYPE|wrong\_object\_type|
-|42703|UNDEFINED COLUMN|undefined\_column|
-|42883|UNDEFINED FUNCTION|undefined\_function|
-|42P01|UNDEFINED TABLE|undefined\_table|
-|42P02|UNDEFINED PARAMETER|undefined\_parameter|
-|42704|UNDEFINED OBJECT|undefined\_object|
-|42701|DUPLICATE COLUMN|duplicate\_column|
-|42P03|DUPLICATE CURSOR|duplicate\_cursor|
-|42P04|DUPLICATE DATABASE|duplicate\_database|
-|42723|DUPLICATE FUNCTION|duplicate\_function|
-|42P05|DUPLICATE PREPARED STATEMENT|duplicate\_prepared\_statement|
-|42P06|DUPLICATE SCHEMA|duplicate\_schema|
-|42P07|DUPLICATE TABLE|duplicate\_table|
-|42712|DUPLICATE ALIAS|duplicate\_alias|
-|42710|DUPLICATE OBJECT|duplicate\_object|
-|42702|AMBIGUOUS COLUMN|ambiguous\_column|
-|42725|AMBIGUOUS FUNCTION|ambiguous\_function|
-|42P08|AMBIGUOUS PARAMETER|ambiguous\_parameter|
-|42P09|AMBIGUOUS ALIAS|ambiguous\_alias|
-|42P10|INVALID COLUMN REFERENCE|invalid\_column\_reference|
-|42611|INVALID COLUMN DEFINITION|invalid\_column\_definition|
-|42P11|INVALID CURSOR DEFINITION|invalid\_cursor\_definition|
-|42P12|INVALID DATABASE DEFINITION|invalid\_database\_definition|
-|42P13|INVALID FUNCTION DEFINITION|invalid\_function\_definition|
-|42P14|INVALID PREPARED STATEMENT 
DEFINITION|invalid\_prepared\_statement\_definition|
-|42P15|INVALID SCHEMA DEFINITION|invalid\_schema\_definition|
-|42P16|INVALID TABLE DEFINITION|invalid\_table\_definition|
-|42P17|INVALID OBJECT DEFINITION|invalid\_object\_definition|
-|**Class 44** — WITH CHECK OPTION Violation|
-|44000|WITH CHECK OPTION VIOLATION|with\_check\_option\_violation|
-|**Class 53** — Insufficient Resources|
-|53000|INSUFFICIENT RESOURCES|insufficient\_resources|
-|53100|DISK FULL|disk\_full|
-|53200|OUT OF MEMORY|out\_of\_memory|
-|53300|TOO MANY CONNECTIONS|too\_many\_connections|
-|**Class 54** — Program Limit Exceeded|
-|54000|PROGRAM LIMIT EXCEEDED|program\_limit\_exceeded|
-|54001|STATEMENT TOO COMPLEX|statement\_too\_complex|
-|54011|TOO MANY COLUMNS|too\_many\_columns|
-|54023|TOO MANY ARGUMENTS|too\_many\_arguments|
-|**Class 55** — Object Not In Prerequisite State|
-|55000|OBJECT NOT IN PREREQUISITE STATE|object\_not\_in\_prerequisite\_state|
-|55006|OBJECT IN USE|object\_in\_use|
-|55P02|CANT CHANGE RUNTIME PARAM|cant\_change\_runtime\_param|
-|55P03|LOCK NOT AVAILABLE|lock\_not\_available|
-|**Class 57** — Operator Intervention|
-|57000|OPERATOR INTERVENTION|operator\_intervention|
-|57014|QUERY CANCELED|query\_canceled|
-|57P01|ADMIN SHUTDOWN|admin\_shutdown|
-|57P02|CRASH SHUTDOWN|crash\_shutdown|
-|57P03|CANNOT CONNECT NOW|cannot\_connect\_now|
-|**Class 58** — System Error \(errors external to HAWQ \)|
-|58030|IO ERROR|io\_error|
-|58P01|UNDEFINED FILE|undefined\_file|
-|58P02|DUPLICATE FILE|duplicate\_file|
-|**Class F0** — Configuration File Error|
-|F0000|CONFIG FILE ERROR|config\_file\_error|
-|F0001|LOCK FILE EXISTS|lock\_file\_exists|
-|**Class P0** — PL/pgSQL Error|
-|P0000|PLPGSQL ERROR|plpgsql\_error|
-|P0001|RAISE EXCEPTION|raise\_exception|
-|P0002|NO DATA FOUND|no\_data\_found|
-|P0003|TOO MANY ROWS|too\_many\_rows|
-|**Class XX** — Internal Error|
-|XX000|INTERNAL ERROR|internal\_error|
-|XX001|DATA CORRUPTED|data\_corrupted|
-|XX002|INDEX CORRUPTED|index\_corrupted|

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/admin/setuphawqopenv.html.md.erb
----------------------------------------------------------------------
diff --git a/admin/setuphawqopenv.html.md.erb b/admin/setuphawqopenv.html.md.erb
deleted file mode 100644
index 9d9b731..0000000
--- a/admin/setuphawqopenv.html.md.erb
+++ /dev/null
@@ -1,81 +0,0 @@
----
-title: Introducing the HAWQ Operating Environment
----
-
-Before invoking operations on a HAWQ cluster, you must set up your HAWQ environment. This setup is required for both administrative and non-administrative HAWQ users.
-
-## <a id="hawq_setupenv"></a>Procedure: Setting Up Your HAWQ Operating 
Environment
-
-HAWQ installs a script that you can use to set up your HAWQ cluster 
environment. The `greenplum_path.sh` script, located in your HAWQ root install 
directory, sets `$PATH` and other environment variables to find HAWQ files.  
Most importantly, `greenplum_path.sh` sets the `$GPHOME` environment variable 
to point to the root directory of the HAWQ installation.  If you installed HAWQ 
from a product distribution, the HAWQ root is typically `/usr/local/hawq`. If 
you built HAWQ from source or downloaded the tarball, you will have selected an 
install root directory on your own.
-
-Perform the following steps to set up your HAWQ operating environment:
-
-1. Log in to the HAWQ node as the desired user.  For example:
-
-    ``` shell
-    $ ssh gpadmin@<master>
-    gpadmin@master$ 
-    ```
-
-    Or, if you are already logged in to the HAWQ node as a different user, switch to the desired user. For example:
-    
-    ``` shell
-    gpadmin@master$ su - <hawq-user>
-    Password:
-    hawq-user@master$ 
-    ```
-
-2. Set up your HAWQ operating environment by sourcing the `greenplum_path.sh` 
file:
-
-    ``` shell
-    hawq-node$ source /usr/local/hawq/greenplum_path.sh
-    ```
-
-    If you built HAWQ from source or downloaded the tarball, substitute the 
path to the installed or extracted `greenplum_path.sh` file \(for example 
`/opt/hawq-2.1.0.0/greenplum_path.sh`\).
-
-
-3. Edit your `.bash_profile` or other shell initialization file to source 
`greenplum_path.sh` on login.  For example, add:
-
-    ``` shell
-    source /usr/local/hawq/greenplum_path.sh
-    ```
-    
-4. Set HAWQ-specific environment variables relevant to your deployment in your shell initialization file. These include `PGAPPNAME`, `PGDATABASE`, `PGHOST`, `PGPORT`, and `PGUSER`. For example:
-
-    1.  If you use a custom HAWQ master port number, make this port number the 
default by setting the `PGPORT` environment variable in your shell 
initialization file; add:
-
-        ``` shell
-        export PGPORT=10432
-        ```
-    
-        Setting `PGPORT` simplifies `psql` invocation by providing a default 
for the `-p` (port) option.
-
-    1.  If you will routinely operate on a specific database, make this 
database the default by setting the `PGDATABASE` environment variable in your 
shell initialization file:
-
-        ``` shell
-        export PGDATABASE=<database-name>
-        ```
-    
-        Setting `PGDATABASE` simplifies `psql` invocation by providing a 
default for the `-d` (database) option.
-
-    You may choose to set additional HAWQ deployment-specific environment 
variables. See [Environment 
Variables](../reference/HAWQEnvironmentVariables.html#optionalenvironmentvariables).
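-
-    Taken together, the additions to your shell initialization file might resemble the following sketch (all values shown are illustrative):
-
-    ``` shell
-    source /usr/local/hawq/greenplum_path.sh
-    export PGPORT=10432
-    export PGDATABASE=mydb
-    export PGUSER=gpadmin
-    ```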
-
-## <a id="hawq_env_files_and_dirs"></a>HAWQ Files and Directories
-
-The following table identifies some files and directories of interest in a 
default HAWQ installation.  Unless otherwise specified, the table entries are 
relative to `$GPHOME`.
-
-|File/Directory                   | Contents           |
-|---------------------------------|---------------------|
-| $HOME/hawqAdminLogs/            | Default HAWQ management utility log file 
directory |
-| greenplum_path.sh      | HAWQ environment set-up script |
-| bin/      | HAWQ admin, client, database, and administration utilities |
-| etc/              | HAWQ configuration files, including `hawq-site.xml` |
-| include/          | HDFS, PostgreSQL, `libpq` header files  |
-| lib/              | HAWQ libraries |
-| lib/postgresql/   | PostgreSQL shared libraries and JAR files |
-| share/postgresql/ | PostgreSQL and procedural language samples and scripts |
-| /data/hawq/[master&#124;segment]/ | Default location of HAWQ master and 
segment data directories |
-| /data/hawq/[master&#124;segment]/pg_log/ | Default location of HAWQ master 
and segment log file directories |
-| /etc/pxf/conf/               | PXF service and configuration files |
-| /usr/lib/pxf/                | PXF service and plug-in shared libraries  |
-| /usr/hdp/current/            | HDP runtime and configuration files |

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/admin/startstop.html.md.erb
----------------------------------------------------------------------
diff --git a/admin/startstop.html.md.erb b/admin/startstop.html.md.erb
deleted file mode 100644
index 7aac723..0000000
--- a/admin/startstop.html.md.erb
+++ /dev/null
@@ -1,146 +0,0 @@
----
-title: Starting and Stopping HAWQ
----
-
-In a HAWQ DBMS, the database server instances \(the master and all segments\) 
are started or stopped across all of the hosts in the system in such a way that 
they can work together as a unified DBMS.
-
-Because a HAWQ system is distributed across many machines, the process for 
starting and stopping a HAWQ system is different than the process for starting 
and stopping a regular PostgreSQL DBMS.
-
-Use the `hawq start` *`object`* and `hawq stop` *`object`* commands to start and stop HAWQ, respectively. These management tools are located in the `$GPHOME/bin` directory on your HAWQ master host.
-
-Initializing a HAWQ system also starts the system.
-
-**Important:**
-
-Do not issue a `KILL` command to end any Postgres process. Instead, use the 
database command `pg_cancel_backend()`.
-
-For information about [hawq 
start](../reference/cli/admin_utilities/hawqstart.html) and [hawq 
stop](../reference/cli/admin_utilities/hawqstop.html), see the appropriate 
pages in the HAWQ Management Utility Reference or enter `hawq start -h` or 
`hawq stop -h` on the command line.
-
-
-## <a id="task_hkd_gzv_fp"></a>Starting HAWQ 
-
-When a HAWQ system is first initialized, it is also started. For more 
information about initializing HAWQ, see [hawq 
init](../reference/cli/admin_utilities/hawqinit.html). 
-
-To start a stopped HAWQ system that was previously initialized, run the `hawq 
start` command on the master instance.
-
-You can also use the `hawq start master` command to start only the HAWQ master, without segment nodes, and then start the segments later using `hawq start segment`. If you want HAWQ to ignore hosts that fail SSH validation, use the `hawq start --ignore-bad-hosts` option.
-
-Use the `hawq start cluster` command to start a HAWQ system that has already 
been initialized by the `hawq init cluster` command, but has been stopped by 
the `hawq stop cluster` command. The `hawq start cluster` command starts a HAWQ 
system on the master host and starts all of its segments. The command orchestrates this process and performs it in parallel.
-
-
-## <a id="task_gpdb_restart"></a>Restarting HAWQ 
-
-Stop the HAWQ system and then restart it.
-
-The `hawq restart` command with the appropriate `cluster` or node-type option 
will stop and then restart HAWQ after the shutdown completes. If the master or 
segments are already stopped, restart will have no effect.
-
--   To restart a HAWQ cluster, enter the following command on the master host:
-
-    ```shell
-    $ hawq restart cluster
-    ```
-
-
-## <a id="task_upload_config"></a>Reloading Configuration File Changes Only 
-
-Reload changes to the HAWQ configuration files without interrupting the system.
-
-The `hawq stop` command can reload changes to the `pg_hba.conf` configuration file and to *runtime* parameters in the `hawq-site.xml` and `pg_hba.conf` files
without service interruption. Active sessions pick up changes when they 
reconnect to the database. Many server configuration parameters require a full 
system restart \(`hawq restart cluster`\) to activate. For information about 
server configuration parameters, see the [Server Configuration Parameter 
Reference](../reference/guc/guc_config.html).
-
--   Reload configuration file changes without shutting down the system using 
the `hawq stop` command:
-
-    ```shell
-    $ hawq stop cluster --reload
-    ```
-    
-    Or:
-
-    ```shell
-    $ hawq stop cluster -u
-    ```
-    
-
-## <a id="task_maint_mode"></a>Starting the Master in Maintenance Mode 
-
-Start only the master to perform maintenance or administrative tasks without 
affecting data on the segments.
-
-Maintenance mode is a superuser-only mode that should only be used when 
required for a particular maintenance task. For example, you can connect to a 
database only on the master instance in maintenance mode and edit system 
catalog settings.
-
-1.  Run `hawq start` on the `master` using the `-m` option:
-
-    ```shell
-    $ hawq start master -m
-    ```
-
-2.  Connect to the master in maintenance mode to do catalog maintenance. For 
example:
-
-    ```shell
-    $ PGOPTIONS='-c gp_session_role=utility' psql template1
-    ```
-3.  After completing your administrative tasks, restart the master in 
production mode. 
-
-    ```shell
-    $ hawq restart master 
-    ```
-
-    **Warning:**
-
-    Incorrect use of maintenance mode connections can result in an 
inconsistent HAWQ system state. Only expert users should perform this operation.
-
-
-## <a id="task_gpdb_stop"></a>Stopping HAWQ 
-
-The `hawq stop cluster` command stops or restarts your HAWQ system and always 
runs on the master host. When activated, `hawq stop cluster` stops all 
`postgres` processes in the system, including the master and all segment 
instances. The `hawq stop cluster` command uses a default of up to 64 parallel 
worker threads to bring down the segments that make up the HAWQ cluster. The 
system waits for any active transactions to finish before shutting down. To 
stop HAWQ immediately, use fast mode. The commands `hawq stop master`, `hawq 
stop segment`, `hawq stop standby`, or `hawq stop allsegments` can be used to 
stop the master, the local segment node, standby, or all segments in the 
cluster. Stopping the master stops only the master segment; it does not shut down the cluster.
-
--   To stop HAWQ:
-
-    ```shell
-    $ hawq stop cluster
-    ```
-
--   To stop HAWQ in fast mode:
-
-    ```shell
-    $ hawq stop cluster -M fast
-    ```
-
-
-## <a id="task_tx4_bl3_h5"></a>Best Practices to Start/Stop HAWQ Cluster 
Members 
-
-For best results in using `hawq start` and `hawq stop` to manage your HAWQ 
system, the following best practices are recommended.
-
--   Issue the `CHECKPOINT` command to update and flush all data files to disk 
and update the log file before stopping the cluster. A checkpoint ensures that, 
in the event of a crash, files can be restored from the checkpoint snapshot.
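-
-    For example, you can issue the checkpoint through `psql` before stopping the cluster (connecting to the `template1` database here is illustrative; any database works):
-
-    ```shell
-    $ psql -d template1 -c 'CHECKPOINT;'
-    ```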
-
--   Stop the entire HAWQ system by stopping the cluster on the master host. 
-
-    ```shell
-    $ hawq stop cluster
-    ```
-
--   To stop segments and kill any running queries without causing data loss or 
inconsistency issues, use `fast` or `immediate` mode on the cluster:
-
-    ```shell
-    $ hawq stop cluster -M fast
-    $ hawq stop cluster -M immediate
-    ```
-
--   Use `hawq stop master` to stop the master only. If you cannot stop the 
master due to running transactions, try using `fast` shutdown. If `fast` 
shutdown does not work, use `immediate` shutdown. Use `immediate` shutdown with 
caution, as it will result in a crash-recovery run when the system is restarted.
-
-    ```shell
-    $ hawq stop master -M fast
-    $ hawq stop master -M immediate
-    ```
--   If you have changed server parameter settings and want to reload them on a HAWQ database with active connections, use the following command:
-
-    ```shell
-    $ hawq stop master -u -M fast
-    ```
-
--   When stopping a segment or all segments, use `smart` mode, which is the 
default. Using `fast` or `immediate` mode on segments will have no effect since 
segments are stateless.
-
-    ```shell
-    $ hawq stop segment
-    $ hawq stop allsegments
-    ```
--   You should typically use `hawq start cluster` or `hawq restart cluster` to start the cluster. If you do start nodes individually with `hawq start standby|master|segment`, make sure to always start the standby *before* the active master. Otherwise, the standby can become unsynchronized with the active master.

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/bestpractices/HAWQBestPracticesOverview.html.md.erb
----------------------------------------------------------------------
diff --git a/bestpractices/HAWQBestPracticesOverview.html.md.erb 
b/bestpractices/HAWQBestPracticesOverview.html.md.erb
deleted file mode 100644
index 13b4dca..0000000
--- a/bestpractices/HAWQBestPracticesOverview.html.md.erb
+++ /dev/null
@@ -1,28 +0,0 @@
----
-title: Best Practices
----
-
-This chapter provides best practices for using the components and features of a HAWQ system.
-
-
--   **[Best Practices for Operating 
HAWQ](../bestpractices/operating_hawq_bestpractices.html)**
-
-    This topic provides best practices for operating HAWQ, including 
recommendations for stopping, starting and monitoring HAWQ.
-
--   **[Best Practices for Securing 
HAWQ](../bestpractices/secure_bestpractices.html)**
-
-    To secure your HAWQ deployment, review the recommendations listed in this 
topic.
-
--   **[Best Practices for Managing 
Resources](../bestpractices/managing_resources_bestpractices.html)**
-
-    This topic describes best practices for managing resources in HAWQ.
-
--   **[Best Practices for Managing 
Data](../bestpractices/managing_data_bestpractices.html)**
-
-    This topic describes best practices for creating databases, loading data, partitioning data, and recovering data in HAWQ.
-
--   **[Best Practices for Querying 
Data](../bestpractices/querying_data_bestpractices.html)**
-
-    To obtain the best results when querying data in HAWQ, review the best 
practices described in this topic.
-
-

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/bestpractices/general_bestpractices.html.md.erb
----------------------------------------------------------------------
diff --git a/bestpractices/general_bestpractices.html.md.erb 
b/bestpractices/general_bestpractices.html.md.erb
deleted file mode 100644
index 503887b..0000000
--- a/bestpractices/general_bestpractices.html.md.erb
+++ /dev/null
@@ -1,26 +0,0 @@
----
-title: HAWQ Best Practices
----
-
-This topic addresses general best practices for users who are new to HAWQ.
-
-When using HAWQ, adhere to the following guidelines for best results:
-
--   **Use a consistent `hawq-site.xml` file to configure your entire cluster**:
-
-    Configuration parameters (GUCs) are located in `$GPHOME/etc/hawq-site.xml`. This configuration file resides on all HAWQ instances and can be modified using the `hawq config` utility. Use the same configuration file cluster-wide, across both master and segments.
-    
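-    For example, you can display or change a single property with `hawq config` (the properties shown are illustrative; see the utility help for the full option list):
-
-    ``` shell
-    $ hawq config -s hawq_master_address_port
-    $ hawq config -c default_hash_table_bucket_number -v 24
-    ```
-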
-    If you install and manage HAWQ using Ambari, do not use `hawq config` to 
set or change HAWQ configuration properties. Use the Ambari interface for all 
configuration changes. Configuration changes to `hawq-site.xml` made outside 
the Ambari interface will be overwritten when you restart or reconfigure  HAWQ 
using Ambari.
-
-    **Note:** While `postgresql.conf` still exists in HAWQ, any parameters 
defined in `hawq-site.xml` will overwrite configurations in `postgresql.conf`. 
For this reason, we recommend that you only use `hawq-site.xml` to configure 
your HAWQ cluster.
-
--   **Keep in mind the factors that impact the number of virtual segments used 
for queries. The number of virtual segments used directly impacts the query's 
performance.** The degree of parallelism achieved by a query is determined by 
multiple factors, including the following:
-    -   **Cost of the query**. Small queries use fewer segments and larger 
queries use more segments. Note that there are some techniques you can use when 
defining resource queues to influence the number of virtual segments and 
general resources that are allocated to queries. See [Best Practices for Using 
Resource Queues](managing_resources_bestpractices.html#topic_hvd_pls_wv).
-    -   **Available resources**. Resources available at query time. If more 
resources are available in the resource queue, the resources will be used.
-    -   **Hash table and bucket number**. If the query involves only hash-distributed tables and the bucket number (`bucketnum`) configured for all of the hash tables is the same, or if the query also involves randomly distributed tables whose total size is no more than 1.5 times the size of the hash-distributed tables, then the query's parallelism is fixed (equal to the hash table bucket number). Otherwise, the number of virtual segments depends on the query's cost, and hash-distributed table queries behave like queries on randomly distributed tables.
-    -   **Query Type**: For queries with user-defined functions, or for external tables where calculating resource costs is difficult, the number of virtual segments is controlled by the `hawq_rm_nvseg_perquery_limit` and `hawq_rm_nvseg_perquery_perseg_limit` parameters, as well as by the ON clause and the location list of external tables. If the query has a hash result table (for example, `INSERT INTO hash_table`), then the number of virtual segments must equal the bucket number of the resulting hash table. If the query is performed in utility mode, such as for `COPY` and `ANALYZE` operations, the virtual segment number is calculated by different policies (see the Query Performance topic referenced below).
-    -   **PXF**: PXF external tables use the 
`default_hash_table_bucket_number` parameter, not the 
`hawq_rm_nvseg_perquery_perseg_limit` parameter, to control the number of 
virtual segments. 
-
-    See [Query Performance](../query/query-performance.html#topic38) for more 
details.
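-
-For clusters not managed by Ambari, the following is a minimal sketch of inspecting and changing a cluster-wide property with `hawq config`; the property name is only an example, and the flags should be verified against the `hawq config` reference for your version:
-
-```shell
-$ hawq config -s hawq_rm_stmt_nvseg          # show a property's current value
-$ hawq config -c hawq_rm_stmt_nvseg -v 10    # set the property cluster-wide in hawq-site.xml
-$ hawq restart cluster                       # restart so that the change takes effect
-```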
-
-

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/bestpractices/managing_data_bestpractices.html.md.erb
----------------------------------------------------------------------
diff --git a/bestpractices/managing_data_bestpractices.html.md.erb 
b/bestpractices/managing_data_bestpractices.html.md.erb
deleted file mode 100644
index 11d6e02..0000000
--- a/bestpractices/managing_data_bestpractices.html.md.erb
+++ /dev/null
@@ -1,47 +0,0 @@
----
-title: Best Practices for Managing Data
----
-
-This topic describes best practices for creating databases, loading data, partitioning data, and recovering data in HAWQ.
-
-## <a id="topic_xhy_v2j_1v"></a>Best Practices for Loading Data
-
-Loading data into HDFS is challenging due to the limit on the number of files 
that can be opened concurrently for write on both NameNodes and DataNodes.
-
-To obtain the best performance during data loading, observe the following best 
practices:
-
--   Typically the number of concurrent connections to a NameNode should not exceed 50,000, and the number of open files per DataNode should not exceed 10,000. If you exceed these limits, the NameNode and DataNodes may become overloaded and slow.
--   If the number of partitions in a table is large, the recommended way to load data into the partitioned table is to load the data partition by partition. For example, you can use a query such as the following to load data into only one partition:
-
-    ```sql
-    INSERT INTO target_partitioned_table_part1 SELECT * FROM source_table WHERE filter
-    ```
-
-    where *filter* selects only the data in the target partition.
-
--   To alleviate the load on the NameNode, you can reduce the number of virtual segments used per node. You can do this at the statement level or at the resource queue level. See [Configuring the Maximum Number of Virtual Segments](../resourcemgmt/ConfigureResourceManagement.html#topic_tl5_wq1_f5) for more information.
--   Use resource queues to limit load query and read query concurrency.
-
-The best practice for loading data into partitioned tables is to create an 
intermediate staging table, load it, and then exchange it into your partition 
design. See [Exchanging a Partition](../ddl/ddl-partition.html#topic83).
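-
-A hedged sketch of that staging pattern, with hypothetical table names and a date-range partition key (see the linked topic for the authoritative syntax):
-
-```sql
--- Create and load a staging table that matches the target table's structure
-CREATE TABLE jan_stage (LIKE sales);
-INSERT INTO jan_stage SELECT * FROM source_table
-    WHERE sale_date >= '2017-01-01' AND sale_date < '2017-02-01';
-
--- Swap the loaded staging table into the partition design
-ALTER TABLE sales EXCHANGE PARTITION FOR (DATE '2017-01-01') WITH TABLE jan_stage;
-```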
-
-## <a id="topic_s23_52j_1v"></a>Best Practices for Partitioning Data
-
-### <a id="topic65"></a>Deciding on a Table Partitioning Strategy
-
-Not all tables are good candidates for partitioning. If the answer is *yes* to 
all or most of the following questions, table partitioning is a viable database 
design strategy for improving query performance. If the answer is *no* to most 
of the following questions, table partitioning is not the right solution for 
that table. Test your design strategy to ensure that query performance improves 
as expected.
-
--   **Is the table large enough?** Large fact tables are good candidates for 
table partitioning. If you have millions or billions of records in a table, you 
may see performance benefits from logically breaking that data up into smaller 
chunks. For smaller tables with only a few thousand rows or less, the 
administrative overhead of maintaining the partitions will outweigh any 
performance benefits you might see.
--   **Are you experiencing unsatisfactory performance?** As with any 
performance tuning initiative, a table should be partitioned only if queries 
against that table are producing slower response times than desired.
--   **Do your query predicates have identifiable access patterns?** Examine the `WHERE` clauses of your query workload and look for table columns that are consistently used to access data. For example, if most of your queries tend to look up records by date, then a monthly or weekly date-partitioning design might be beneficial (a sketch follows this list). If you tend to access records by region, consider a list-partitioning design to divide the table by region.
--   **Does your data warehouse maintain a window of historical data?** Another 
consideration for partition design is your organization's business requirements 
for maintaining historical data. For example, your data warehouse may require 
that you keep data for the past twelve months. If the data is partitioned by 
month, you can easily drop the oldest monthly partition from the warehouse and 
load current data into the most recent monthly partition.
--   **Can the data be divided into somewhat equal parts based on some defining criteria?** Choose partitioning criteria that will divide your data as evenly as possible. If the partitions contain a relatively equal number of records, query performance improves with the number of partitions created. For example, by dividing a large table into 10 partitions, a query can execute up to 10 times faster than it would against the unpartitioned table, provided that the partitions are designed to support the query's criteria.
-
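-For instance, the monthly date-partitioning design described above might look like the following sketch (hypothetical schema):
-
-```sql
-CREATE TABLE sales (id int, sale_date date, amt decimal(10,2))
-DISTRIBUTED BY (id)
-PARTITION BY RANGE (sale_date)
-( START (date '2016-01-01') INCLUSIVE
-  END   (date '2017-01-01') EXCLUSIVE
-  EVERY (INTERVAL '1 month') );
-```
-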
-Do not create more partitions than are needed. Creating too many partitions 
can slow down management and maintenance jobs, such as vacuuming, recovering 
segments, expanding the cluster, checking disk usage, and others.
-
-Partitioning does not improve query performance unless the query optimizer can 
eliminate partitions based on the query predicates. Queries that scan every 
partition run slower than if the table were not partitioned, so avoid 
partitioning if few of your queries achieve partition elimination. Check the 
explain plan for queries to make sure that partitions are eliminated. See 
[Query Profiling](../query/query-profiling.html#topic39) for more about 
partition elimination.
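-
-A hedged example of that check, using the hypothetical schema sketched above; if partition elimination works, the plan scans only the matching partition:
-
-```sql
-EXPLAIN SELECT count(*) FROM sales WHERE sale_date = date '2016-03-15';
-```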
-
-Be very careful with multi-level partitioning because the number of partition 
files can grow very quickly. For example, if a table is partitioned by both day 
and city, and there are 1,000 days of data and 1,000 cities, the total number 
of partitions is one million. Column-oriented tables store each column in a 
physical table, so if this table has 100 columns, the system would be required 
to manage 100 million files for the table.
-
-Before settling on a multi-level partitioning strategy, consider a single-level partition design with bitmap indexes. Indexes slow down data loads, so consider performance testing with your data and schema to decide on the best strategy.
-
-

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/bestpractices/managing_resources_bestpractices.html.md.erb
----------------------------------------------------------------------
diff --git a/bestpractices/managing_resources_bestpractices.html.md.erb 
b/bestpractices/managing_resources_bestpractices.html.md.erb
deleted file mode 100644
index f770611..0000000
--- a/bestpractices/managing_resources_bestpractices.html.md.erb
+++ /dev/null
@@ -1,144 +0,0 @@
----
-title: Best Practices for Managing Resources
----
-
-This topic describes best practices for managing resources in HAWQ.
-
-## <a id="topic_ikz_ndx_15"></a>Best Practices for Configuring Resource 
Management
-
-When configuring resource management, you can apply certain best practices to ensure that resources are managed efficiently and that the system performs at its best.
-
-The following is a list of high-level best practices for optimal resource 
management:
-
--   Make sure segments do not have identical IP addresses. See [Segments Do 
Not Appear in 
gp\_segment\_configuration](../troubleshooting/Troubleshooting.html#topic_hlj_zxx_15)
 for an explanation of this problem.
--   Configure all segments to have the same resource capacity. See 
[Configuring Segment Resource 
Capacity](../resourcemgmt/ConfigureResourceManagement.html#topic_htk_fxh_15).
--   To prevent resource fragmentation, ensure that your deployment's segment 
resource capacity (standalone mode) or YARN node resource capacity (YARN mode) 
is a multiple of all virtual segment resource quotas. See [Configuring Segment 
Resource 
Capacity](../resourcemgmt/ConfigureResourceManagement.html#topic_htk_fxh_15) 
(HAWQ standalone mode) and [Setting HAWQ Segment Resource Capacity in 
YARN](../resourcemgmt/YARNIntegration.html#topic_pzf_kqn_c5).
--   Ensure that enough registered segments are available and usable for query 
resource requests. If the number of unavailable or unregistered segments is 
higher than a set limit, then query resource requests are rejected. Also ensure 
that the variance of dispatched virtual segments across physical segments is 
not greater than the configured limit. See [Rejection of Query Resource 
Requests](../troubleshooting/Troubleshooting.html#topic_vm5_znx_15).
--   Use multiple master and segment temporary directories on separate, large disks (2TB or greater) to load balance writes to temporary files (for example, `/disk1/tmp`, `/disk2/tmp`). For a given query, HAWQ will use a separate temp directory (if available) for each virtual segment to store spill files. Multiple HAWQ sessions will also use separate temp directories where available to avoid disk contention. If you configure too few temp directories, or you place multiple temp directories on the same disk, you increase the risk of disk contention or running out of disk space when multiple virtual segments target the same disk.
--   Configure minimum resource levels in YARN, and tune the timeout of when 
idle resources are returned to YARN. See [Tune HAWQ Resource Negotiations with 
YARN](../resourcemgmt/YARNIntegration.html#topic_wp3_4bx_15).
--   Make sure that the property `yarn.scheduler.minimum-allocation-mb` in `yarn-site.xml` divides evenly into 1 GB (1024 MB); for example, 1024 or 512.
-
-## <a id="topic_hvd_pls_wv"></a>Best Practices for Using Resource Queues
-
-Design and configure your resource queues depending on the operational needs 
of your deployment. This topic describes the best practices for creating and 
modifying resource queues within the context of different operational scenarios.
-
-### Modifying Resource Queues for Overloaded HDFS
-
-A high number of concurrent HAWQ queries can cause HDFS to overload, 
especially when querying partitioned tables. Use the `ACTIVE_STATEMENTS` 
attribute to restrict statement concurrency in a resource queue. For example, 
if an external application is executing more than 100 concurrent queries, then 
limiting the number of active statements in your resource queues will instruct 
the HAWQ resource manager to restrict actual statement concurrency within HAWQ. 
You might want to modify an existing resource queue as follows:
-
-```sql
-ALTER RESOURCE QUEUE sampleque1 WITH (ACTIVE_STATEMENTS=20);
-```
-
-In this case, when this DDL is applied to queue `sampleque1`, roles using this queue must wait until fewer than 20 statements are running before their queries execute. Given the 100 concurrent queries in the example above, 80 queries will wait in the queue for later execution. Restricting the number of active query statements helps limit the usage of HDFS resources and protects HDFS. You can alter concurrency even when the resource queue is busy. For example, if a queue already has 40 concurrent statements running and you apply a DDL statement that specifies `ACTIVE_STATEMENTS=20`, then the resource queue pauses the allocation of resources to new queries until more than 20 statements have returned their resources.
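-
-To observe the effect of a concurrency change, you can query the resource queue status view; this is a hedged sketch that assumes the `pg_resqueue_status` view is available in your HAWQ version:
-
-```sql
--- Inspect running and queued statements per resource queue
-SELECT * FROM pg_resqueue_status;
-```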
-
-### Isolating and Protecting Production Workloads
-
-Another best practice is using resource queues to isolate your workloads. 
Workload isolation prevents your production workload from being starved of 
resources. To create this isolation, divide your workload by creating roles for 
specific purposes. For example, you could create one role for production online 
verification and another role for the regular running of production processes.
-
-In this scenario, let us assign `role1` to the production workload and `role2` to production software verification. We can define the following resource queues under the same parent queue `dept1que`, which is the resource queue defined for the entire department.
-
-```sql
-CREATE RESOURCE QUEUE dept1product
-   WITH (PARENT='dept1que', MEMORY_LIMIT_CLUSTER=90%, CORE_LIMIT_CLUSTER=90%, 
RESOURCE_OVERCOMMIT_FACTOR=2);
-
-CREATE RESOURCE QUEUE dept1verification 
-   WITH (PARENT='dept1que', MEMORY_LIMIT_CLUSTER=10%, CORE_LIMIT_CLUSTER=10%, 
RESOURCE_OVERCOMMIT_FACTOR=10);
-
-ALTER ROLE role1 RESOURCE QUEUE dept1product;
-
-ALTER ROLE role2 RESOURCE QUEUE dept1verification;
-```
-
-With these resource queues defined, workload is spread across the resource 
queues as follows:
-
--   When both `role1` and `role2` have workloads, the test verification 
workload gets only 10% of the total available `dept1que` resources, leaving 90% 
of the `dept1que` resources available for running the production workload.
--   When `role1` has a workload but `role2` is idle, then 100% of all 
`dept1que` resources can be consumed by the production workload.
--   When only `role2` has a workload (for example, during a scheduled testing 
window), then 100% of all `dept1que` resources can also be utilized for testing.
-
-Even when the resource queues are busy, you can alter the resource queue's 
memory and core limits to change resource allocation policies before switching 
workloads.
-
-In addition, you can use resource queues to isolate workloads for different departments or different applications. For example, the following DDL statements define three departments, and an administrator can redistribute resource allocations among the departments as usage requirements change (a rebalancing example follows the DDL):
-
-```sql
-ALTER RESOURCE QUEUE pg_default 
-   WITH (MEMORY_LIMIT_CLUSTER=10%, CORE_LIMIT_CLUSTER=10%);
-
-CREATE RESOURCE QUEUE dept1 
-   WITH (PARENT='pg_root', MEMORY_LIMIT_CLUSTER=30%, CORE_LIMIT_CLUSTER=30%);
-
-CREATE RESOURCE QUEUE dept2 
-   WITH (PARENT='pg_root', MEMORY_LIMIT_CLUSTER=30%, CORE_LIMIT_CLUSTER=30%);
-
-CREATE RESOURCE QUEUE dept3 
-   WITH (PARENT='pg_root', MEMORY_LIMIT_CLUSTER=30%, CORE_LIMIT_CLUSTER=30%);
-
-CREATE RESOURCE QUEUE dept11
-   WITH (PARENT='dept1', MEMORY_LIMIT_CLUSTER=50%,CORE_LIMIT_CLUSTER=50%);
-
-CREATE RESOURCE QUEUE dept12
-   WITH (PARENT='dept1', MEMORY_LIMIT_CLUSTER=50%, CORE_LIMIT_CLUSTER=50%);
-```
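-
-If usage later shifts, an administrator might rebalance the departments; a hedged example, keeping the sibling queue percentages summing to 100%:
-
-```sql
-ALTER RESOURCE QUEUE dept2 WITH (MEMORY_LIMIT_CLUSTER=20%, CORE_LIMIT_CLUSTER=20%);
-ALTER RESOURCE QUEUE dept3 WITH (MEMORY_LIMIT_CLUSTER=40%, CORE_LIMIT_CLUSTER=40%);
-```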
-
-### Querying Parquet Tables with a Large Page Size
-
-You can use resource queues to improve query performance on Parquet tables 
with a large page size. This type of query requires a large memory quota for 
virtual segments. Therefore, if one role mostly queries Parquet tables with a 
large page size, alter the resource queue associated with the role to increase 
its virtual segment resource quota. For example:
-
-```sql
-ALTER RESOURCE QUEUE queue1 WITH (VSEG_RESOURCE_QUOTA='mem:2gb');
-```
-
-If there are only occasional queries on Parquet tables with a large page size, 
use a statement level specification instead of altering the resource queue. For 
example:
-
-```sql
-SET hawq_rm_stmt_nvseg=10;
-SET hawq_rm_stmt_vseg_memory='2gb';
-query1;
-SET hawq_rm_stmt_nvseg=0;
-```
-
-### Restricting Resource Consumption for Specific Queries
-
-In general, the HAWQ resource manager attempts to provide as many resources as possible to the current query to achieve high query performance. When a query is complex and large, however, the associated resource queue can use up many virtual segments, causing other resource queues (and queries) to starve. Under these circumstances, you should enable nvseg limits on the resource queue associated with the large query. For example, you can specify that all queries can use no more than 200 virtual segments. To achieve this limit, alter the resource queue as follows:
-
-``` sql
-ALTER RESOURCE QUEUE queue1 WITH (NVSEG_UPPER_LIMIT=200);
-```
-
-To make this limit vary with the dynamic cluster size, use the following statement:
-
-```sql
-ALTER RESOURCE QUEUE queue1 WITH (NVSEG_UPPER_LIMIT_PERSEG=10);
-```
-
-After setting the limit in the above example, the actual limit will be 100 if 
you have a 10-node cluster. If the cluster is expanded to 20 nodes, then the 
limit increases automatically to 200.
-
-### Guaranteeing Resource Allocations for Individual Statements
-
-In general, the minimum number of virtual segments allocated to a statement is 
decided by the resource queue's actual capacity and its concurrency setting. 
For example, if there are 10 nodes in a cluster and the total resource capacity 
of the cluster is 640GB and 160 cores, then a resource queue having 20% 
capacity has a capacity of 128GB (640GB \* .20) and 32 cores (160 \*.20). If 
the virtual segment quota is set to 256MB, then this queue has 512 virtual 
segments allocated (128GB/256MB=512). If the `ACTIVE_STATEMENTS` concurrency 
setting for the resource queue is 20, then the minimum number of allocated 
virtual segments for each query is **25** (*trunc*(512/20)=25). However, this 
minimum number of virtual segments is a soft restriction. If a query statement 
requires only 5 virtual segments, then this minimum number of 25 is ignored 
since it is not necessary to allocate 25 for this statement.
-
-To raise the minimum number of virtual segments available for a query statement, you have two options.
-
--   *Option 1*: Alter the resource queue to reduce concurrency. This is the 
recommended way to achieve the goal. For example:
-
-    ```sql
-    ALTER RESOURCE QUEUE queue1 WITH (ACTIVE_STATEMENTS=10);
-    ```
-
-    If the original concurrency setting is 20, then the minimum number of 
virtual segments is doubled.
-
--   *Option 2*: Alter the nvseg limits of the resource queue. For example:
-
-    ```sql
-    ALTER RESOURCE QUEUE queue1 WITH (NVSEG_LOWER_LIMIT=50);
-    ```
-
-    or, alternately:
-
-    ```sql
-    ALTER RESOURCE QUEUE queue1 WITH (NVSEG_LOWER_LIMIT_PERSEG=5);
-    ```
-
-    In the second DDL, if there are 10 nodes in the cluster, the actual 
minimum number of virtual segments is 50 (5 \* 10 = 50).
-
-

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/bestpractices/operating_hawq_bestpractices.html.md.erb
----------------------------------------------------------------------
diff --git a/bestpractices/operating_hawq_bestpractices.html.md.erb 
b/bestpractices/operating_hawq_bestpractices.html.md.erb
deleted file mode 100644
index 9dc56e9..0000000
--- a/bestpractices/operating_hawq_bestpractices.html.md.erb
+++ /dev/null
@@ -1,298 +0,0 @@
----
-title: Best Practices for Operating HAWQ
----
-
-This topic provides best practices for operating HAWQ, including recommendations for stopping, starting, and monitoring HAWQ.
-
-## <a id="best_practice_config"></a>Best Practices for Configuring HAWQ 
Parameters
-
-The HAWQ configuration parameters (GUCs) are located in `$GPHOME/etc/hawq-site.xml`. This configuration file resides on all HAWQ instances and can be modified either through the Ambari interface or from the command line.
-
-If you install and manage HAWQ using Ambari, use the Ambari interface for all 
configuration changes. Do not use command line utilities such as `hawq config` 
to set or change HAWQ configuration properties for Ambari-managed clusters. 
Configuration changes to `hawq-site.xml` made outside the Ambari interface will 
be overwritten when you restart or reconfigure HAWQ using Ambari.
-
-If you manage your cluster using command line tools instead of Ambari, use a 
consistent `hawq-site.xml` file to configure your entire cluster. 
-
-**Note:** While `postgresql.conf` still exists in HAWQ, any parameters defined in `hawq-site.xml` override configurations in `postgresql.conf`. For this reason, we recommend that you use only `hawq-site.xml` to configure your HAWQ cluster. For Ambari clusters, always use Ambari to configure `hawq-site.xml` parameters.
-
-## <a id="task_qgk_bz3_1v"></a>Best Practices to Start/Stop HAWQ Cluster 
Members
-
-For best results in using `hawq start` and `hawq stop` to manage your HAWQ 
system, the following best practices are recommended.
-
--   Issue the `CHECKPOINT` command to update and flush all data files to disk and update the log file before stopping the cluster. A checkpoint ensures that, in the event of a crash, files can be restored from the checkpoint snapshot (see the sketch at the end of this list).
--   Stop the entire HAWQ system by stopping the cluster on the master host:
-    ```shell
-    $ hawq stop cluster
-    ```
-
--   To stop segments and kill any running queries without causing data loss or 
inconsistency issues, use `fast` or `immediate` mode on the cluster:
-
-    ```shell
-    $ hawq stop cluster -M fast
-    ```
-    ```shell
-    $ hawq stop cluster -M immediate
-    ```
-
--   Use `hawq stop master` to stop the master only. If you cannot stop the 
master due to running transactions, try using fast shutdown. If fast shutdown 
does not work, use immediate shutdown. Use immediate shutdown with caution, as 
it will result in a crash-recovery run when the system is restarted. 
-
-    ```shell
-    $ hawq stop master -M fast
-    ```
-    ```shell
-    $ hawq stop master -M immediate
-    ```
-
--   When stopping a segment or all segments, you can use the default smart mode. Using fast or immediate mode on segments has no effect, since segments are stateless.
-
-    ```shell
-    $ hawq stop segment
-    ```
-    ```shell
-    $ hawq stop allsegments
-    ```
-
--   Typically you should always use `hawq start cluster` or `hawq restart cluster` to start the cluster. If you do end up using `hawq start standby|master|segment` to start nodes individually, always start the standby before the active master; otherwise, the standby can become unsynchronized with the active master. A sketch of this ordering follows this list.
-
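-A hedged sketch of the checkpoint-then-stop sequence and the standby-first start ordering described above (run on the master host; `hawq start allsegments` is assumed by symmetry with `hawq stop allsegments`):
-
-```shell
-$ psql -d postgres -c 'CHECKPOINT;'   # flush data files to disk before stopping
-$ hawq stop cluster
-
-$ hawq start standby                  # start the standby before the active master
-$ hawq start master
-$ hawq start allsegments
-```
-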
-## <a id="id_trr_m1j_1v"></a>Guidelines for Cluster Expansion
-
-This topic provides guidelines for expanding your HAWQ cluster.
-
-There are several recommendations to keep in mind when modifying the size of 
your running HAWQ cluster:
-
--   When you add a new node, install both a DataNode and a physical segment on 
the new node.
--   After adding a new node, you should always rebalance HDFS data to maintain 
cluster performance.
--   Adding or removing a node also necessitates an update to the HDFS metadata 
cache. This update will happen eventually, but can take some time. To speed the 
update of the metadata cache, execute **`select gp_metadata_cache_clear();`**.
--   Note that for hash-distributed tables, expanding the cluster does not immediately improve performance, because hash-distributed tables use a fixed number of virtual segments. To obtain better performance with hash-distributed tables, you must redistribute the table to the updated cluster using either the [ALTER TABLE](../reference/sql/ALTER-TABLE.html) or [CREATE TABLE AS](../reference/sql/CREATE-TABLE-AS.html#topic1) command (see the sketch after this list).
--   If you are using hash tables, consider updating the 
`default_hash_table_bucket_number` server configuration parameter to a larger 
value after expanding the cluster but before redistributing the hash tables.
-
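-As a hedged illustration of these expansion steps (hypothetical table, column, and bucket values; verify syntax in the ALTER TABLE and CREATE TABLE AS references):
-
-```sql
--- Speed the HDFS metadata cache update after adding or removing a node
-SELECT gp_metadata_cache_clear();
-
--- Rebuild a hash table so its data is redistributed across the expanded cluster;
--- bucketnum here reflects an updated default_hash_table_bucket_number
-CREATE TABLE sales_redist WITH (bucketnum=48) AS
-    SELECT * FROM sales
-DISTRIBUTED BY (id);
-```
-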
-## <a id="id_o5n_p1j_1v"></a>Database State Monitoring Activities
-
-<a id="id_o5n_p1j_1v__d112e31"></a>
-
-<table>
-<caption><span class="tablecap">Table 1. Database State Monitoring 
Activities</span></caption>
-<colgroup>
-<col width="33%" />
-<col width="33%" />
-<col width="33%" />
-</colgroup>
-<thead>
-<tr class="header">
-<th>Activity</th>
-<th>Procedure</th>
-<th>Corrective Actions</th>
-</tr>
-</thead>
-<tbody>
-<tr class="odd">
-<td>List segments that are currently down. If any rows are returned, this 
should generate a warning or alert.
-<p>Recommended frequency: run every 5 to 10 minutes</p>
-<p>Severity: IMPORTANT</p></td>
-<td>Run the following query in the <code class="ph codeph">postgres</code> 
database:
-<pre class="pre codeblock"><code>SELECT * FROM gp_segment_configuration
-WHERE status &lt;&gt; &#39;u&#39;;</code></pre></td>
-<td>If the query returns any rows, follow these steps to correct the problem:
-<ol>
-<li>Verify that the hosts with down segments are responsive.</li>
-<li>If hosts are OK, check the <span class="ph filepath">pg_log</span> files 
for the down segments to discover the root cause of the segments going 
down.</li>
-</ol></td>
-</tr>
-</tbody>
-</table>
-
-
-## <a id="id_d3w_p1j_1v"></a>Hardware and Operating System Monitoring
-
-<a id="id_d3w_p1j_1v__d112e111"></a>
-
-<table>
-<caption><span class="tablecap">Table 2. Hardware and Operating System 
Monitoring Activities</span></caption>
-<colgroup>
-<col width="33%" />
-<col width="33%" />
-<col width="33%" />
-</colgroup>
-<thead>
-<tr class="header">
-<th>Activity</th>
-<th>Procedure</th>
-<th>Corrective Actions</th>
-</tr>
-</thead>
-<tbody>
-<tr class="odd">
-<td>Check the underlying platform for required maintenance or for hardware failures.
-<p>Recommended frequency: real-time, if possible, or every 15 minutes</p>
-<p>Severity: CRITICAL</p></td>
-<td>Set up system check for hardware and OS errors.</td>
-<td>If required, remove a machine from the HAWQ cluster to resolve hardware 
and OS issues, then add it back to the cluster after the issues are 
resolved.</td>
-</tr>
-<tr class="even">
-<td>Check disk space usage on volumes used for HAWQ data storage and the OS.
-<p>Recommended frequency: every 5 to 30 minutes</p>
-<p>Severity: CRITICAL</p></td>
-<td><div class="p">
-Set up a disk space check.
-<ul>
-<li>Set a threshold to raise an alert when a disk reaches a percentage of 
capacity. The recommended threshold is 75% full.</li>
-<li>It is not recommended to run the system with capacities approaching 
100%.</li>
-</ul>
-</div></td>
-<td>Free space on the system by removing some data or files.</td>
-</tr>
-<tr class="odd">
-<td>Check for errors or dropped packets on the network interfaces.
-<p>Recommended frequency: hourly</p>
-<p>Severity: IMPORTANT</p></td>
-<td>Set up network interface checks.</td>
-<td><p>Work with network and OS teams to resolve errors.</p></td>
-</tr>
-<tr class="even">
-<td>Check for RAID errors or degraded RAID performance.
-<p>Recommended frequency: every 5 minutes</p>
-<p>Severity: CRITICAL</p></td>
-<td>Set up a RAID check.</td>
-<td><ul>
-<li>Replace failed disks as soon as possible.</li>
-<li>Work with system administration team to resolve other RAID or controller 
errors as soon as possible.</li>
-</ul></td>
-</tr>
-<tr class="odd">
-<td>Check for adequate I/O bandwidth and I/O skew.
-<p>Recommended frequency: when creating a cluster or when hardware issues are suspected.</p></td>
-<td>Run the <code class="ph codeph">hawq checkperf</code> utility.</td>
-<td><div class="p">
-The cluster may be under-specified if data transfer rates are not similar to 
the following:
-<ul>
-<li>2 GB per second disk read</li>
-<li>1 GB per second disk write</li>
-<li>10 Gigabits per second network read and write</li>
-</ul>
-If transfer rates are lower than expected, consult with your data architect 
regarding performance expectations.
-</div>
-<p>If the machines on the cluster display an uneven performance profile, work 
with the system administration team to fix faulty machines.</p></td>
-</tr>
-</tbody>
-</table>
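-
-For the disk-space activity above, the following is a minimal sketch of a threshold check; the mount points and the 75% threshold are illustrative:
-
-```shell
-#!/bin/sh
-# Alert when any monitored volume exceeds 75% capacity
-df -P /data1 /data2 | awk 'NR > 1 && $5 + 0 > 75 { print "ALERT: " $6 " is at " $5 }'
-```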
-
-
-## <a id="id_khd_q1j_1v"></a>Data Maintenance
-
-<a id="id_khd_q1j_1v__d112e279"></a>
-
-<table>
-<caption><span class="tablecap">Table 3. Data Maintenance 
Activities</span></caption>
-<colgroup>
-<col width="33%" />
-<col width="33%" />
-<col width="33%" />
-</colgroup>
-<thead>
-<tr class="header">
-<th>Activity</th>
-<th>Procedure</th>
-<th>Corrective Actions</th>
-</tr>
-</thead>
-<tbody>
-<tr class="odd">
-<td>Check for missing statistics on tables.</td>
-<td>Check the <code class="ph codeph">hawq_stats_missing</code> view in each 
database:
-<pre class="pre codeblock"><code>SELECT * FROM 
hawq_toolkit.hawq_stats_missing;</code></pre></td>
-<td>Run <code class="ph codeph">ANALYZE</code> on tables that are missing 
statistics.</td>
-</tr>
-</tbody>
-</table>
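-
-A hedged example of the check-then-fix flow for missing statistics (the table name is illustrative):
-
-```sql
--- List tables that are missing statistics
-SELECT * FROM hawq_toolkit.hawq_stats_missing;
-
--- Collect statistics for a table reported by the view
-ANALYZE myschema.mytable;
-```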
-
-
-## <a id="id_lx4_q1j_1v"></a>Database Maintenance
-
-<a id="id_lx4_q1j_1v__d112e343"></a>
-
-<table>
-<caption><span class="tablecap">Table 4. Database Maintenance 
Activities</span></caption>
-<colgroup>
-<col width="33%" />
-<col width="33%" />
-<col width="33%" />
-</colgroup>
-<thead>
-<tr class="header">
-<th>Activity</th>
-<th>Procedure</th>
-<th>Corrective Actions</th>
-</tr>
-</thead>
-<tbody>
-<tr class="odd">
-<td>Mark deleted rows in HAWQ system catalogs (tables in the <code class="ph 
codeph">pg_catalog</code> schema) so that the space they occupy can be reused.
-<p>Recommended frequency: daily</p>
-<p>Severity: CRITICAL</p></td>
-<td>Vacuum each system catalog:
-<pre class="pre codeblock"><code>VACUUM &lt;table&gt;;</code></pre></td>
-<td>Vacuum system catalogs regularly to prevent bloating.</td>
-</tr>
-<tr class="even">
-<td>Update table statistics.
-<p>Recommended frequency: after loading data and before executing queries</p>
-<p>Severity: CRITICAL</p></td>
-<td>Analyze user tables:
-<pre class="pre codeblock"><code>ANALYZEDB -d &lt;database&gt; 
-a</code></pre></td>
-<td>Analyze updated tables regularly so that the optimizer can produce 
efficient query execution plans.</td>
-</tr>
-<tr class="odd">
-<td>Backup the database data.
-<p>Recommended frequency: daily, or as required by your backup plan</p>
-<p>Severity: CRITICAL</p></td>
-<td>See <a href="../admin/BackingUpandRestoringHAWQDatabases.html">Backing up 
and Restoring HAWQ Databases</a> for a discussion of backup procedures</td>
-<td>Best practice is to have a current backup ready in case the database must 
be restored.</td>
-</tr>
-<tr class="even">
-<td>Reindex system catalogs (tables in the <code class="ph 
codeph">pg_catalog</code> schema) to maintain an efficient catalog.
-<p>Recommended frequency: weekly, or more often if database objects are 
created and dropped frequently</p></td>
-<td>Run <code class="ph codeph">REINDEX SYSTEM</code> in each database.
-<pre class="pre codeblock"><code>REINDEXDB -s</code></pre></td>
-<td>The optimizer retrieves information from the system tables to create query 
plans. If system tables and indexes are allowed to become bloated over time, 
scanning the system tables increases query execution time.</td>
-</tr>
-</tbody>
-</table>
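-
-A hedged example of the catalog vacuum activity above, naming two commonly bloated catalog tables (run in each database):
-
-```sql
-VACUUM pg_catalog.pg_class;
-VACUUM pg_catalog.pg_attribute;
-```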
-
-
-## <a id="id_blv_q1j_1v"></a>Patching and Upgrading
-
-<a id="id_blv_q1j_1v__d112e472"></a>
-
-<table>
-<caption><span class="tablecap">Table 5. Patch and Upgrade 
Activities</span></caption>
-<colgroup>
-<col width="33%" />
-<col width="33%" />
-<col width="33%" />
-</colgroup>
-<thead>
-<tr class="header">
-<th>Activity</th>
-<th>Procedure</th>
-<th>Corrective Actions</th>
-</tr>
-</thead>
-<tbody>
-<tr class="odd">
-<td>Ensure any bug fixes or enhancements are applied to the kernel.
-<p>Recommended frequency: at least every 6 months</p>
-<p>Severity: IMPORTANT</p></td>
-<td>Follow the vendor's instructions to update the Linux kernel.</td>
-<td>Keep the kernel current to include bug fixes and security fixes, and to 
avoid difficult future upgrades.</td>
-</tr>
-<tr class="even">
-<td>Install HAWQ minor releases.
-<p>Recommended frequency: quarterly</p>
-<p>Severity: IMPORTANT</p></td>
-<td>Always upgrade to the latest in the series.</td>
-<td>Keep the HAWQ software current to incorporate bug fixes, performance 
enhancements, and feature enhancements into your HAWQ cluster.</td>
-</tr>
-</tbody>
-</table>
-
-
-

