http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/ddl/ddl-partition.html.md.erb
----------------------------------------------------------------------
diff --git a/ddl/ddl-partition.html.md.erb b/ddl/ddl-partition.html.md.erb
deleted file mode 100644
index f790161..0000000
--- a/ddl/ddl-partition.html.md.erb
+++ /dev/null
@@ -1,483 +0,0 @@
----
-title: Partitioning Large Tables
----
-
-Table partitioning supports very large tables, such as fact tables, by 
logically dividing them into smaller, more manageable pieces. Partitioned 
tables can improve query performance by allowing the HAWQ query optimizer to 
scan only the data needed to satisfy a given query instead of scanning all the 
contents of a large table.
-
-Partitioning does not change the physical distribution of table data across 
the segments. Table distribution is physical: HAWQ physically divides 
partitioned tables and non-partitioned tables across segments to enable 
parallel query processing. Table *partitioning* is logical: HAWQ logically 
divides big tables to improve query performance and facilitate data warehouse 
maintenance tasks, such as rolling old data out of the data warehouse.
-
-HAWQ supports:
-
--   *range partitioning*: division of data based on a numerical range, such as 
date or price.
--   *list partitioning*: division of data based on a list of values, such as 
sales territory or product line.
--   A combination of both types.
-<a id="im207241"></a>
-
-![](../mdimages/partitions.jpg "Example Multi-level Partition Design")
-
-## <a id="topic64"></a>Table Partitioning in HAWQ 
-
-HAWQ divides tables into parts \(also known as partitions\) to enable 
massively parallel processing. Tables are partitioned during `CREATE TABLE` 
using the `PARTITION BY` \(and optionally the `SUBPARTITION BY`\) clause. 
Partitioning creates a top-level \(or parent\) table with one or more levels of 
sub-tables \(or child tables\). Internally, HAWQ creates an inheritance 
relationship between the top-level table and its underlying partitions, similar 
to the functionality of the `INHERITS` clause of PostgreSQL.
-
-HAWQ uses the partition criteria defined during table creation to create each 
partition with a distinct `CHECK` constraint, which limits the data that table 
can contain. The query optimizer uses `CHECK` constraints to determine which 
table partitions to scan to satisfy a given query predicate.
-
-The HAWQ system catalog stores partition hierarchy information so that rows 
inserted into the top-level parent table propagate correctly to the child table 
partitions. To change the partition design or table structure, alter the parent 
table using `ALTER TABLE` with the `PARTITION` clause.
-
-To insert data into a partitioned table, specify either the root partitioned 
table \(the table created with the `CREATE TABLE` command\) or one of its leaf 
child tables in the `INSERT` command. An error is returned if the data is not 
valid for the specified leaf child table. Specifying a non-leaf child table in 
the `INSERT` command is not supported.
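A minimal sketch of both insert paths, assuming the `sales` table defined in the examples later in this section; `sales_1_prt_1` is a hypothetical auto-generated leaf child table name, and the actual names depend on your partition design:

``` sql
-- Insert through the root table; the row is routed to the matching leaf partition.
INSERT INTO sales VALUES (1, '2008-01-01', 20.50);

-- Insert directly into a leaf child table; this fails with an error if the
-- row violates that partition's CHECK constraint.
INSERT INTO sales_1_prt_1 VALUES (2, '2008-01-01', 12.75);
```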
-
-## <a id="topic65"></a>Deciding on a Table Partitioning Strategy 
-
-Not all tables are good candidates for partitioning. If the answer is *yes* to 
all or most of the following questions, table partitioning is a viable database 
design strategy for improving query performance. If the answer is *no* to most 
of the following questions, table partitioning is not the right solution for 
that table. Test your design strategy to ensure that query performance improves 
as expected.
-
--   **Is the table large enough?** Large fact tables are good candidates for 
table partitioning. If you have millions or billions of records in a table, you 
may see performance benefits from logically breaking that data up into smaller 
chunks. For smaller tables with only a few thousand rows or less, the 
administrative overhead of maintaining the partitions will outweigh any 
performance benefits you might see.
--   **Are you experiencing unsatisfactory performance?** As with any 
performance tuning initiative, a table should be partitioned only if queries 
against that table are producing slower response times than desired.
--   **Do your query predicates have identifiable access patterns?** Examine 
the `WHERE` clauses of your query workload and look for table columns that are 
consistently used to access data. For example, if most of your queries tend to 
look up records by date, then a monthly or weekly date-partitioning design 
might be beneficial. Or if you tend to access records by region, consider a 
list-partitioning design to divide the table by region.
--   **Does your data warehouse maintain a window of historical data?** Another 
consideration for partition design is your organization's business requirements 
for maintaining historical data. For example, your data warehouse may require 
that you keep data for the past twelve months. If the data is partitioned by 
month, you can easily drop the oldest monthly partition from the warehouse and 
load current data into the most recent monthly partition.
-   **Can the data be divided into somewhat equal parts based on some defining 
criteria?** Choose partitioning criteria that will divide your data as evenly 
as possible. If the partitions contain a relatively equal number of records, 
query performance can improve in proportion to the number of partitions 
created. For example, dividing a large table into 10 partitions can allow a 
query to execute up to 10 times faster than it would against the unpartitioned 
table, provided that the partitions are designed to support the query's 
criteria.
-
-Do not create more partitions than are needed. Creating too many partitions 
can slow down management and maintenance jobs, such as vacuuming, recovering 
segments, expanding the cluster, checking disk usage, and others.
-
-Partitioning does not improve query performance unless the query optimizer can 
eliminate partitions based on the query predicates. Queries that scan every 
partition run slower than if the table were not partitioned, so avoid 
partitioning if few of your queries achieve partition elimination. Check the 
explain plan for queries to make sure that partitions are eliminated. See 
[Query Profiling](../query/query-profiling.html) for more about partition 
elimination.
-
-Be very careful with multi-level partitioning because the number of partition 
files can grow very quickly. For example, if a table is partitioned by both day 
and city, and there are 1,000 days of data and 1,000 cities, the total number 
of partitions is one million. Column-oriented tables store each column in a 
physical table, so if this table has 100 columns, the system would be required 
to manage 100 million files for the table.
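The file-count arithmetic above can be checked with a quick query that touches no tables and runs in any SQL session:

``` sql
SELECT 1000 * 1000       AS partitions,    -- 1,000 days x 1,000 cities
       1000 * 1000 * 100 AS column_files;  -- x 100 columns per partition
-- partitions = 1000000, column_files = 100000000
```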
-
-Before settling on a multi-level partitioning strategy, consider a single 
level partition with bitmap indexes. Indexes slow down data loads, so consider 
performance testing with your data and schema to decide on the best strategy.
-
-## <a id="topic66"></a>Creating Partitioned Tables 
-
-You partition tables when you create them with `CREATE TABLE`. This topic 
provides examples of SQL syntax for creating a table with various partition 
designs.
-
-To partition a table:
-
-1.  Decide on the partition design: date range, numeric range, or list of 
values.
-2.  Choose the column\(s\) on which to partition the table.
-3.  Decide how many levels of partitions you want. For example, you can create 
a date range partition table by month and then subpartition the monthly 
partitions by sales region.
-
--   [Defining Date Range Table Partitions](#topic67)
--   [Defining Numeric Range Table Partitions](#topic68)
--   [Defining List Table Partitions](#topic69)
--   [Defining Multi-level Partitions](#topic70)
--   [Partitioning an Existing Table](#topic71)
-
-### <a id="topic67"></a>Defining Date Range Table Partitions 
-
-A date range partitioned table uses a single `date` or `timestamp` column as 
the partition key column. You can use the same partition key column to create 
subpartitions if necessary, for example, to partition by month and then 
subpartition by day. Consider partitioning by the most granular level. For 
example, for a table partitioned by date, you can partition by day and have 365 
daily partitions, rather than partition by year then subpartition by month then 
subpartition by day. A multi-level design can reduce query planning time, but a 
flat partition design runs faster.
-
-You can have HAWQ automatically generate partitions by giving a `START` value, 
an `END` value, and an `EVERY` clause that defines the partition increment 
value. By default, `START` values are always inclusive and `END` values are 
always exclusive. For example:
-
-``` sql
-CREATE TABLE sales (id int, date date, amt decimal(10,2))
-DISTRIBUTED BY (id)
-PARTITION BY RANGE (date)
-( START (date '2008-01-01') INCLUSIVE
-   END (date '2009-01-01') EXCLUSIVE
-   EVERY (INTERVAL '1 day') );
-```
-
-You can also declare and name each partition individually. For example:
-
-``` sql
-CREATE TABLE sales (id int, date date, amt decimal(10,2))
-DISTRIBUTED BY (id)
-PARTITION BY RANGE (date)
-( PARTITION Jan08 START (date '2008-01-01') INCLUSIVE ,
-  PARTITION Feb08 START (date '2008-02-01') INCLUSIVE ,
-  PARTITION Mar08 START (date '2008-03-01') INCLUSIVE ,
-  PARTITION Apr08 START (date '2008-04-01') INCLUSIVE ,
-  PARTITION May08 START (date '2008-05-01') INCLUSIVE ,
-  PARTITION Jun08 START (date '2008-06-01') INCLUSIVE ,
-  PARTITION Jul08 START (date '2008-07-01') INCLUSIVE ,
-  PARTITION Aug08 START (date '2008-08-01') INCLUSIVE ,
-  PARTITION Sep08 START (date '2008-09-01') INCLUSIVE ,
-  PARTITION Oct08 START (date '2008-10-01') INCLUSIVE ,
-  PARTITION Nov08 START (date '2008-11-01') INCLUSIVE ,
-  PARTITION Dec08 START (date '2008-12-01') INCLUSIVE
-                  END (date '2009-01-01') EXCLUSIVE );
-```
-
-You do not have to declare an `END` value for each partition, only the last 
one. In this example, `Jan08` ends where `Feb08` starts.
-
-### <a id="topic68"></a>Defining Numeric Range Table Partitions 
-
-A numeric range partitioned table uses a single numeric data type column as 
the partition key column. For example:
-
-``` sql
-CREATE TABLE rank (id int, rank int, year int, gender
-char(1), count int)
-DISTRIBUTED BY (id)
-PARTITION BY RANGE (year)
-( START (2001) END (2008) EVERY (1),
-  DEFAULT PARTITION extra );
-```
-
-For more information about default partitions, see [Adding a Default 
Partition](#topic80).
-
-### <a id="topic69"></a>Defining List Table Partitions 
-
-A list partitioned table can use any data type column that allows equality 
comparisons as its partition key column. A list partition can also have a 
multi-column \(composite\) partition key, whereas a range partition only allows 
a single column as the partition key. For list partitions, you must declare a 
partition specification for every partition \(list value\) you want to create. 
For example:
-
-``` sql
-CREATE TABLE rank (id int, rank int, year int, gender
-char(1), count int )
-DISTRIBUTED BY (id)
-PARTITION BY LIST (gender)
-( PARTITION girls VALUES ('F'),
-  PARTITION boys VALUES ('M'),
-  DEFAULT PARTITION other );
-```
-
-**Note:** The HAWQ legacy query optimizer allows list partitions with 
multi-column \(composite\) partition keys; GPORCA does not support composite 
partition keys. A range partition always requires a single column as the 
partition key.
-
-For more information about default partitions, see [Adding a Default 
Partition](#topic80).
-
-### <a id="topic70"></a>Defining Multi-level Partitions 
-
-You can create a multi-level partition design with subpartitions of 
partitions. Using a *subpartition template* ensures that every partition has 
the same subpartition design, including partitions that you add later. For 
example, the following SQL creates the two-level partition design shown in 
[Figure 1](#im207241):
-
-``` sql
-CREATE TABLE sales (trans_id int, date date, amount
-decimal(9,2), region text)
-DISTRIBUTED BY (trans_id)
-PARTITION BY RANGE (date)
-SUBPARTITION BY LIST (region)
-SUBPARTITION TEMPLATE
-( SUBPARTITION usa VALUES ('usa'),
-  SUBPARTITION asia VALUES ('asia'),
-  SUBPARTITION europe VALUES ('europe'),
-  DEFAULT SUBPARTITION other_regions)
-  (START (date '2011-01-01') INCLUSIVE
-   END (date '2012-01-01') EXCLUSIVE
-   EVERY (INTERVAL '1 month'),
-   DEFAULT PARTITION outlying_dates );
-```
-
-The following example shows a three-level partition design where the `sales` 
table is partitioned by `year`, then `month`, then `region`. The `SUBPARTITION 
TEMPLATE` clauses ensure that each yearly partition has the same subpartition 
structure. The example declares a `DEFAULT` partition at each level of the 
hierarchy.
-
-``` sql
-CREATE TABLE p3_sales (id int, year int, month int, day int,
-region text)
-DISTRIBUTED BY (id)
-PARTITION BY RANGE (year)
-    SUBPARTITION BY RANGE (month)
-      SUBPARTITION TEMPLATE (
-        START (1) END (13) EVERY (1),
-        DEFAULT SUBPARTITION other_months )
-           SUBPARTITION BY LIST (region)
-             SUBPARTITION TEMPLATE (
-               SUBPARTITION usa VALUES ('usa'),
-               SUBPARTITION europe VALUES ('europe'),
-               SUBPARTITION asia VALUES ('asia'),
-               DEFAULT SUBPARTITION other_regions )
-( START (2002) END (2012) EVERY (1),
-  DEFAULT PARTITION outlying_years );
-```
-
-**CAUTION**:
-
-When you create multi-level partitions on ranges, it is easy to create a large 
number of subpartitions, some containing little or no data. This can add many 
entries to the system tables, which increases the time and memory required to 
optimize and execute queries. Increase the range interval or choose a different 
partitioning strategy to reduce the number of subpartitions created.
-
-### <a id="topic71"></a>Partitioning an Existing Table 
-
-Tables can be partitioned only at creation. If you have a table that you want 
to partition, you must create a partitioned table, load the data from the 
original table into the new table, drop the original table, and rename the 
partitioned table with the original table's name. You must also re-grant any 
table permissions. For example:
-
-``` sql
-CREATE TABLE sales2 (LIKE sales)
-PARTITION BY RANGE (date)
-( START (date '2008-01-01') INCLUSIVE
-   END (date '2009-01-01') EXCLUSIVE
-   EVERY (INTERVAL '1 month') );
-INSERT INTO sales2 SELECT * FROM sales;
-DROP TABLE sales;
-ALTER TABLE sales2 RENAME TO sales;
-GRANT ALL PRIVILEGES ON sales TO admin;
-GRANT SELECT ON sales TO guest;
-```
-
-## <a id="topic73"></a>Loading Partitioned Tables 
-
-After you create the partitioned table structure, top-level parent tables are 
empty. Data is routed to the bottom-level child table partitions. In a 
multi-level partition design, only the subpartitions at the bottom of the 
hierarchy can contain data.
-
-Rows that cannot be mapped to a child table partition are rejected and the 
load fails. To avoid unmapped rows being rejected at load time, define your 
partition hierarchy with a `DEFAULT` partition. Any rows that do not match a 
partition's `CHECK` constraints load into the `DEFAULT` partition. See [Adding 
a Default Partition](#topic80).
-
-At runtime, the query optimizer scans the entire table inheritance hierarchy 
and uses the `CHECK` table constraints to determine which of the child table 
partitions to scan to satisfy the query's conditions. The `DEFAULT` partition 
\(if your hierarchy has one\) is always scanned. `DEFAULT` partitions that 
contain data slow down the overall scan time.
-
-When you use `COPY` or `INSERT` to load data into a parent table, the data is 
automatically rerouted to the correct partition, just like a regular table.
-
-Best practice for loading data into partitioned tables is to create an 
intermediate staging table, load it, and then exchange it into your partition 
design. See [Exchanging a Partition](#topic83).
-
-## <a id="topic74"></a>Verifying Your Partition Strategy 
-
-When a table is partitioned on columns that appear in your query predicates, 
you can use `EXPLAIN` to examine the query plan and verify that the query 
optimizer scans only the relevant data.
-
-For example, suppose a *sales* table is date-range partitioned by month and 
subpartitioned by region as shown in [Figure 1](#im207241). For the following 
query:
-
-``` sql
-EXPLAIN SELECT * FROM sales WHERE date='01-07-12' AND
-region='usa';
-```
-
-The query plan for this query should show a table scan of only the following 
tables:
-
--   the default partition returning 0-1 rows \(if your partition design has 
one\)
--   the January 2012 partition \(*sales\_1\_prt\_1*\) returning 0-1 rows
--   the USA region subpartition \(*sales\_1\_2\_prt\_usa*\) returning *some 
number* of rows.
-
-The following example shows the relevant portion of the query plan.
-
-``` pre
-->  Seq Scan on sales_1_prt_1 sales (cost=0.00..0.00 rows=0 width=0)
-      Filter: "date"=01-07-12::date AND region='usa'::text
-->  Seq Scan on sales_1_2_prt_usa sales (cost=0.00..9.87 rows=20 width=40)
-```
-
-Ensure that the query optimizer does not scan unnecessary partitions or 
subpartitions \(for example, scans of months or regions not specified in the 
query predicate\), and that scans of the top-level tables return 0-1 rows.
-
-### <a id="topic75"></a>Troubleshooting Selective Partition Scanning 
-
-The following limitations can result in a query plan that shows a 
non-selective scan of your partition hierarchy.
-
-   The query optimizer can selectively scan partitioned tables only when the 
query contains a direct and simple restriction of the table using immutable 
operators, such as `=`, `<`, `<=`, `>`, `>=`, and `<>`.
-
--   Selective scanning recognizes `STABLE` and `IMMUTABLE` functions, but does 
not recognize `VOLATILE` functions within a query. For example, `WHERE` clauses 
such as `date > CURRENT_DATE` cause the query optimizer to selectively scan 
partitioned tables, but `time > TIMEOFDAY` does not.
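As an illustration of these rules, assuming the date-range-partitioned `sales` table from the earlier examples, the following plans can be compared with `EXPLAIN`:

``` sql
-- Simple immutable comparison: eligible for selective partition scanning.
EXPLAIN SELECT * FROM sales WHERE date = '2008-06-15';

-- STABLE function (CURRENT_DATE): still eligible for selective scanning.
EXPLAIN SELECT * FROM sales WHERE date > CURRENT_DATE;
```

Predicates built on `VOLATILE` functions, by contrast, show a scan of every partition in the plan.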
-
-## <a id="topic76"></a>Viewing Your Partition Design 
-
-You can look up information about your partition design using the 
*pg\_partitions* view. For example, to see the partition design of the *sales* 
table:
-
-``` sql
-SELECT partitionboundary, partitiontablename, partitionname,
-partitionlevel, partitionrank
-FROM pg_partitions
-WHERE tablename='sales';
-```
-
-The following table and views show information about partitioned tables.
-
--   *pg\_partition* - Tracks partitioned tables and their inheritance level 
relationships.
--   *pg\_partition\_templates* - Shows the subpartitions created using a 
subpartition template.
--   *pg\_partition\_columns* - Shows the partition key columns used in a 
partition design.
-
-## <a id="topic77"></a>Maintaining Partitioned Tables 
-
-To maintain a partitioned table, use the `ALTER TABLE` command against the 
top-level parent table. The most common scenario is to drop old partitions and 
add new ones to maintain a rolling window of data in a range partition design. 
If you have a default partition in your partition design, you add a partition 
by *splitting* the default partition.
-
--   [Adding a Partition](#topic78)
--   [Renaming a Partition](#topic79)
--   [Adding a Default Partition](#topic80)
--   [Dropping a Partition](#topic81)
--   [Truncating a Partition](#topic82)
--   [Exchanging a Partition](#topic83)
--   [Splitting a Partition](#topic84)
--   [Modifying a Subpartition Template](#topic85)
-
-**Note:** When using multi-level partition designs, the following operations 
are not supported with `ALTER TABLE`:
-
-   `ADD DEFAULT PARTITION`
-   `ADD PARTITION`
-   `DROP DEFAULT PARTITION`
-   `DROP PARTITION`
-   `SPLIT PARTITION`
-   All operations that involve modifying subpartitions.
-
-**Important:** When defining and altering partition designs, use the given 
partition name, not the table object name. Although you can query and load any 
table \(including partitioned tables\) directly using SQL commands, you can 
only modify the structure of a partitioned table using the `ALTER 
TABLE...PARTITION` clauses.
-
-Partitions are not required to have names. If a partition does not have a 
name, use one of the following expressions to specify a part: `PARTITION FOR 
(value)` or `PARTITION FOR (RANK(number))`.
-
-### <a id="topic78"></a>Adding a Partition 
-
-You can add a partition to a partition design with the `ALTER TABLE` command. 
If the original partition design included subpartitions defined by a 
*subpartition template*, the newly added partition is subpartitioned according 
to that template. For example:
-
-``` sql
-ALTER TABLE sales ADD PARTITION
-    START (date '2009-02-01') INCLUSIVE
-    END (date '2009-03-01') EXCLUSIVE;
-```
-
-If you did not use a subpartition template when you created the table, you 
define subpartitions when adding a partition:
-
-``` sql
-ALTER TABLE sales ADD PARTITION
-    START (date '2009-02-01') INCLUSIVE
-    END (date '2009-03-01') EXCLUSIVE
-     ( SUBPARTITION usa VALUES ('usa'),
-       SUBPARTITION asia VALUES ('asia'),
-       SUBPARTITION europe VALUES ('europe') );
-```
-
-When you add a subpartition to an existing partition, you can specify the 
partition to alter. For example:
-
-``` sql
-ALTER TABLE sales ALTER PARTITION FOR (RANK(12))
-      ADD PARTITION africa VALUES ('africa');
-```
-
-**Note:** You cannot add a partition to a partition design that has a default 
partition. You must split the default partition to add a partition. See 
[Splitting a Partition](#topic84).
-
-### <a id="topic79"></a>Renaming a Partition 
-
-Partitioned tables use the following naming convention. Partitioned subtable 
names are subject to uniqueness requirements and length limitations.
-
-<pre><code><i>&lt;parentname&gt;</i>_<i>&lt;level&gt;</i>_prt_<i>&lt;partition_name&gt;</i></code></pre>
-
-For example:
-
-```
-sales_1_prt_jan08
-```
-
-For auto-generated range partitions, a number is assigned when no name is 
given:
-
-```
-sales_1_prt_1
-```
-
-To rename a partitioned child table, rename the top-level parent table. The 
*&lt;parentname&gt;* changes in the table names of all associated child table 
partitions. For example, the following command:
-
-``` sql
-ALTER TABLE sales RENAME TO globalsales;
-```
-
-Changes the associated table names:
-
-```
-globalsales_1_prt_1
-```
-
-You can change the name of a partition to make it easier to identify. For 
example:
-
-``` sql
-ALTER TABLE sales RENAME PARTITION FOR ('2008-01-01') TO jan08;
-```
-
-Changes the associated table name as follows:
-
-```
-sales_1_prt_jan08
-```
-
-When altering partitioned tables with the `ALTER TABLE` command, always refer 
to the tables by their partition name \(*jan08*\) and not their full table name 
\(*sales\_1\_prt\_jan08*\).
-
-**Note:** You cannot use a child table's full table name in place of the 
partition name in an `ALTER TABLE` statement. For example, `ALTER TABLE 
sales...` is correct; `ALTER TABLE sales_1_prt_jan08...` is not allowed.
-
-### <a id="topic80"></a>Adding a Default Partition 
-
-You can add a default partition to a partition design with the `ALTER TABLE` 
command.
-
-``` sql
-ALTER TABLE sales ADD DEFAULT PARTITION other;
-```
-
-If incoming data does not match a partition's `CHECK` constraint and there is 
no default partition, the data is rejected. Default partitions ensure that 
incoming data that does not match a partition is inserted into the default 
partition.
-
-### <a id="topic81"></a>Dropping a Partition 
-
-You can drop a partition from your partition design using the `ALTER TABLE` 
command. When you drop a partition that has subpartitions, the subpartitions 
\(and all data in them\) are automatically dropped as well. For range 
partitions, it is common to drop the older partitions from the range as old 
data is rolled out of the data warehouse. For example:
-
-``` sql
-ALTER TABLE sales DROP PARTITION FOR (RANK(1));
-```
-
-### <a id="topic_enm_vrk_kv"></a>Sorting AORO Partitioned Tables 
-
-HDFS access for append-only, row-oriented \(AORO\) tables with large numbers 
of partitions can be tuned using the 
`optimizer_parts_to_force_sort_on_insert` parameter to control how HDFS opens 
files. This parameter controls how the optimizer sorts tuples during `INSERT` 
operations, to maximize HDFS performance.
-
-The user-tunable parameter `optimizer_parts_to_force_sort_on_insert` can force 
the GPORCA query optimizer to generate a plan that sorts tuples during 
insertion into an append-only, row-oriented \(AORO\) partitioned table. 
Sorting the insert tuples reduces the number of partition switches, improving 
overall `INSERT` performance. For a given AORO table, if its number of leaf 
partitions is greater than or equal to the value of 
`optimizer_parts_to_force_sort_on_insert`, the plan generated by GPORCA sorts 
inserts by their partition IDs before performing the `INSERT` operation. 
Otherwise, the inserts are not sorted. The default value of 
`optimizer_parts_to_force_sort_on_insert` is 160.
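A hedged sketch of adjusting this parameter for the current session; the threshold of 100 is an arbitrary illustration, not a recommended value:

``` sql
-- Sort inserts for any AORO partitioned table with 100 or more leaf partitions.
SET optimizer_parts_to_force_sort_on_insert = 100;
SHOW optimizer_parts_to_force_sort_on_insert;
```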
-
-### <a id="topic82"></a>Truncating a Partition 
-
-You can truncate a partition using the `ALTER TABLE` command. When you 
truncate a partition that has subpartitions, the subpartitions are 
automatically truncated as well.
-
-``` sql
-ALTER TABLE sales TRUNCATE PARTITION FOR (RANK(1));
-```
-
-### <a id="topic83"></a>Exchanging a Partition 
-
-You can exchange a partition using the `ALTER TABLE` command. Exchanging a 
partition swaps one table in place of an existing partition. You can exchange 
partitions only at the lowest level of your partition hierarchy \(only 
partitions that contain data can be exchanged\).
-
-Partition exchange can be useful for data loading. For example, load a staging 
table and swap the loaded table into your partition design. You can use 
partition exchange to change the storage type of older partitions to 
append-only tables. For example:
-
-``` sql
-CREATE TABLE jan12 (LIKE sales) WITH (appendonly=true);
-INSERT INTO jan12 SELECT * FROM sales_1_prt_1 ;
-ALTER TABLE sales EXCHANGE PARTITION FOR (DATE '2012-01-01')
-WITH TABLE jan12;
-```
-
-**Note:** This example refers to the single-level definition of the table 
`sales`, before partitions were added and altered in the previous examples.
-
-### <a id="topic84"></a>Splitting a Partition 
-
-Splitting a partition divides a partition into two partitions. You can split a 
partition using the `ALTER TABLE` command. You can split partitions only at the 
lowest level of your partition hierarchy: only partitions that contain data can 
be split. The split value you specify goes into the *latter* partition.
-
-For example, to split a monthly partition into two with the first partition 
containing dates January 1-15 and the second partition containing dates January 
16-31:
-
-``` sql
-ALTER TABLE sales SPLIT PARTITION FOR ('2008-01-01')
-AT ('2008-01-16')
-INTO (PARTITION jan081to15, PARTITION jan0816to31);
-```
-
-If your partition design has a default partition, you must split the default 
partition to add a partition.
-
-When using the `INTO` clause, specify the current default partition as the 
second partition name. For example, to split a default range partition to add a 
new monthly partition for January 2009:
-
-``` sql
-ALTER TABLE sales SPLIT DEFAULT PARTITION
-START ('2009-01-01') INCLUSIVE
-END ('2009-02-01') EXCLUSIVE
-INTO (PARTITION jan09, default partition);
-```
-
-### <a id="topic85"></a>Modifying a Subpartition Template 
-
-Use `ALTER TABLE SET SUBPARTITION TEMPLATE` to modify the subpartition 
template of a partitioned table. Partitions added after you set a new 
subpartition template have the new partition design. Existing partitions are 
not modified.
-
-The following example alters the subpartition template of this partitioned 
table:
-
-``` sql
-CREATE TABLE sales (trans_id int, date date, amount decimal(9,2), region text)
-  DISTRIBUTED BY (trans_id)
-  PARTITION BY RANGE (date)
-  SUBPARTITION BY LIST (region)
-  SUBPARTITION TEMPLATE
-    ( SUBPARTITION usa VALUES ('usa'),
-      SUBPARTITION asia VALUES ('asia'),
-      SUBPARTITION europe VALUES ('europe'),
-      DEFAULT SUBPARTITION other_regions )
-  ( START (date '2014-01-01') INCLUSIVE
-    END (date '2014-04-01') EXCLUSIVE
-    EVERY (INTERVAL '1 month') );
-```
-
-This `ALTER TABLE` command modifies the subpartition template.
-
-``` sql
-ALTER TABLE sales SET SUBPARTITION TEMPLATE
-( SUBPARTITION usa VALUES ('usa'),
-  SUBPARTITION asia VALUES ('asia'),
-  SUBPARTITION europe VALUES ('europe'),
-  SUBPARTITION africa VALUES ('africa'),
-  DEFAULT SUBPARTITION regions );
-```
-
-When you add a date-range partition to the table `sales`, it includes the new 
regional list subpartition for Africa. For example, the following command 
creates the subpartitions `usa`, `asia`, `europe`, `africa`, and a default 
subpartition named `regions`:
-
-``` sql
-ALTER TABLE sales ADD PARTITION "4"
-  START ('2014-04-01') INCLUSIVE
-  END ('2014-05-01') EXCLUSIVE ;
-```
-
-To view the tables created for the partitioned table `sales`, you can use the 
command `\dt sales*` from the psql command line.
-
-To remove a subpartition template, use `SET SUBPARTITION TEMPLATE` with empty 
parentheses. For example, to clear the sales table subpartition template:
-
-``` sql
-ALTER TABLE sales SET SUBPARTITION TEMPLATE ();
-```

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/ddl/ddl-schema.html.md.erb
----------------------------------------------------------------------
diff --git a/ddl/ddl-schema.html.md.erb b/ddl/ddl-schema.html.md.erb
deleted file mode 100644
index 7c361ba..0000000
--- a/ddl/ddl-schema.html.md.erb
+++ /dev/null
@@ -1,88 +0,0 @@
----
-title: Creating and Managing Schemas
----
-
-Schemas logically organize objects and data in a database. Schemas allow you 
to have more than one object \(such as tables\) with the same name in the 
database without conflict if the objects are in different schemas.
-
-## <a id="topic18"></a>The Default "Public" Schema 
-
-Every database has a default schema named *public*. If you do not create any 
schemas, objects are created in the *public* schema. All database roles 
\(users\) have `CREATE` and `USAGE` privileges in the *public* schema. When you 
create a schema, you grant privileges to your users to allow access to the 
schema.
-
-## <a id="topic19"></a>Creating a Schema 
-
-Use the `CREATE SCHEMA` command to create a new schema. For example:
-
-``` sql
-=> CREATE SCHEMA myschema;
-```
-
-To create or access objects in a schema, write a qualified name consisting of 
the schema name and table name separated by a period. For example:
-
-```
-myschema.table
-```
-
-See [Schema Search Paths](#topic20) for information about accessing a schema.
-
-You can create a schema owned by someone else, for example, to restrict the 
activities of your users to well-defined namespaces. The syntax is:
-
-``` sql
-=> CREATE SCHEMA schemaname AUTHORIZATION username;
-```
-
-## <a id="topic20"></a>Schema Search Paths 
-
-To specify an object's location in a database, use the schema-qualified name. 
For example:
-
-``` sql
-=> SELECT * FROM myschema.mytable;
-```
-
-You can set the `search_path` configuration parameter to specify the order in 
which to search the available schemas for objects. The schema listed first in 
the search path becomes the *default* schema. If a schema is not specified, 
objects are created in the default schema.
-
-### <a id="topic21"></a>Setting the Schema Search Path 
-
-The `search_path` configuration parameter sets the schema search order. The 
`ALTER DATABASE` command sets the search path. For example:
-
-``` sql
-=> ALTER DATABASE mydatabase SET search_path TO myschema,
-public, pg_catalog;
-```
-
-### <a id="topic22"></a>Viewing the Current Schema 
-
-Use the `current_schema()` function to view the current schema. For example:
-
-``` sql
-=> SELECT current_schema();
-```
-
-Use the `SHOW` command to view the current search path. For example:
-
-``` sql
-=> SHOW search_path;
-```
-
-## <a id="topic23"></a>Dropping a Schema 
-
-Use the `DROP SCHEMA` command to drop \(delete\) a schema. For example:
-
-``` sql
-=> DROP SCHEMA myschema;
-```
-
-By default, the schema must be empty before you can drop it. To drop a schema 
and all of its objects \(tables, data, functions, and so on\) use:
-
-``` sql
-=> DROP SCHEMA myschema CASCADE;
-```
-
-## <a id="topic24"></a>System Schemas 
-
-The following system-level schemas exist in every database:
-
--   `pg_catalog` contains the system catalog tables, built-in data types, 
functions, and operators. It is always part of the schema search path, even if 
it is not explicitly named in the search path.
--   `information_schema` consists of a standardized set of views that contain 
information about the objects in the database. These views get system 
information from the system catalog tables in a standardized way.
--   `pg_toast` stores large objects such as records that exceed the page size. 
This schema is used internally by the HAWQ system.
--   `pg_bitmapindex` stores bitmap index objects such as lists of values. This 
schema is used internally by the HAWQ system.
--   `hawq_toolkit` is an administrative schema that contains external tables, 
views, and functions that you can access with SQL commands. All database users 
can access `hawq_toolkit` to view and query the system log files and other 
system metrics.

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/ddl/ddl-storage.html.md.erb
----------------------------------------------------------------------
diff --git a/ddl/ddl-storage.html.md.erb b/ddl/ddl-storage.html.md.erb
deleted file mode 100644
index 264e552..0000000
--- a/ddl/ddl-storage.html.md.erb
+++ /dev/null
@@ -1,71 +0,0 @@
----
-title: Table Storage Model and Distribution Policy
----
-
-HAWQ supports several storage models, and you can mix storage models within a database. When you create a table, you choose how to store its data. This topic explains the table storage options and how to choose the best storage model for your workload.
-
-**Note:** To simplify the creation of database tables, you can specify the 
default values for some table storage options with the HAWQ server 
configuration parameter `gp_default_storage_options`.
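-
-For example, a session-level setting might look like the following \(the option names and values shown are illustrative; consult the documentation for your HAWQ release for the supported options\):
-
-``` sql
-=> SET gp_default_storage_options = 'appendonly=true,orientation=parquet';
-```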
-
-## <a id="topic39"></a>Row-Oriented Storage 
-
-HAWQ provides two storage orientation models: row-oriented tables and Parquet tables. Evaluate performance using your own data and query workloads to determine the best alternative.
-
--   Row-oriented storage: good for OLTP-style workloads with many iterative transactions, where many columns of a single row are needed at once, so retrieval is efficient.
-
-    **Note:** Column-oriented storage is no longer available. Use Parquet storage instead.
-
-Row-oriented storage provides the best options for the following situations:
-
--   **Frequent INSERTs.** Choose a row-oriented model where rows are frequently inserted into the table.
--   **Number of columns requested in queries.** Choose a row-oriented model where you typically request all or the majority of columns in the `SELECT` list or `WHERE` clause of your queries.
--   **Number of columns in the table.** Row-oriented storage is most efficient when many columns are required at the same time, or when the row-size of a table is relatively small.
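-
-To choose Parquet storage instead, specify the storage options in the `WITH` clause when you create the table. For example \(a minimal sketch; the table and column names are illustrative\):
-
-``` sql
-=> CREATE TABLE events (id int, payload text)
-   WITH (APPENDONLY=true, ORIENTATION=parquet);
-```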
-
-## <a id="topic55"></a>Altering a Table 
-
-The `ALTER TABLE` command changes the definition of a table. Use `ALTER TABLE` to change table attributes such as column definitions, distribution policy, storage model, and partition structure \(see also [Maintaining Partitioned Tables](ddl-partition.html)\). For example, to add a not-null constraint to a table column:
-
-``` sql
-=> ALTER TABLE address ALTER COLUMN street SET NOT NULL;
-```
-
-### <a id="topic56"></a>Altering Table Distribution 
-
-`ALTER TABLE` provides options to change a table's distribution policy. When the table distribution options change, the table data is redistributed on disk, which can be resource-intensive. You can also redistribute table data using the existing distribution policy.
-
-### <a id="topic57"></a>Changing the Distribution Policy 
-
-For partitioned tables, changes to the distribution policy apply recursively 
to the child partitions. This operation preserves the ownership and all other 
attributes of the table. For example, the following command redistributes the 
table sales across all segments using the customer\_id column as the 
distribution key:
-
-``` sql
-ALTER TABLE sales SET DISTRIBUTED BY (customer_id);
-```
-
-When you change the hash distribution of a table, table data is automatically 
redistributed. Changing the distribution policy to a random distribution does 
not cause the data to be redistributed. For example:
-
-``` sql
-ALTER TABLE sales SET DISTRIBUTED RANDOMLY;
-```
-
-### <a id="topic58"></a>Redistributing Table Data 
-
-To redistribute table data for tables with a random distribution policy \(or 
when the hash distribution policy has not changed\) use `REORGANIZE=TRUE`. 
Reorganizing data may be necessary to correct a data skew problem, or when 
segment resources are added to the system. For example, the following command 
redistributes table data across all segments using the current distribution 
policy, including random distribution.
-
-``` sql
-ALTER TABLE sales SET WITH (REORGANIZE=TRUE);
-```
-
-## <a id="topic62"></a>Dropping a Table 
-
-The `DROP TABLE` command removes tables from the database. For example:
-
-``` sql
-DROP TABLE mytable;
-```
-
-`DROP TABLE` always removes any indexes, rules, triggers, and constraints that exist for the target table. Specify `CASCADE` to drop a table that is referenced by a view. `CASCADE` removes dependent views.
-
-To empty a table of rows without removing the table definition, use 
`TRUNCATE`. For example:
-
-``` sql
-TRUNCATE mytable;
-```

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/ddl/ddl-table.html.md.erb
----------------------------------------------------------------------
diff --git a/ddl/ddl-table.html.md.erb b/ddl/ddl-table.html.md.erb
deleted file mode 100644
index bc4f0c4..0000000
--- a/ddl/ddl-table.html.md.erb
+++ /dev/null
@@ -1,149 +0,0 @@
----
-title: Creating and Managing Tables
----
-
-HAWQ tables are similar to tables in any relational database, except that table rows are distributed across the different segments in the system. When you create a table, you specify the table's distribution policy.
-
-## <a id="topic26"></a>Creating a Table 
-
-The `CREATE TABLE` command creates a table and defines its structure. When you 
create a table, you define:
-
--   The columns of the table and their associated data types. See [Choosing 
Column Data Types](#topic27).
--   Any table constraints to limit the data that a column or table can 
contain. See [Setting Table Constraints](#topic28).
--   The distribution policy of the table, which determines how HAWQ divides data across the segments. See [Choosing the Table Distribution Policy](#topic34).
--   The way the table is stored on disk.
--   The table partitioning strategy for large tables, which specifies how the data should be divided. See [Partitioning Large Tables](ddl-partition.html).
-
-### <a id="topic27"></a>Choosing Column Data Types 
-
-The data type of a column determines the types of data values the column can 
contain. Choose the data type that uses the least possible space but can still 
accommodate your data and that best constrains the data. For example, use 
character data types for strings, date or timestamp data types for dates, and 
numeric data types for numbers.
-
-There are no performance differences among the character data types `CHAR`, 
`VARCHAR`, and `TEXT` apart from the increased storage size when you use the 
blank-padded type. In most situations, use `TEXT` or `VARCHAR` rather than 
`CHAR`.
-
-Use the smallest numeric data type that will accommodate your numeric data and 
allow for future expansion. For example, using `BIGINT` for data that fits in 
`INT` or `SMALLINT` wastes storage space. If you expect that your data values 
will expand over time, consider that changing from a smaller datatype to a 
larger datatype after loading large amounts of data is costly. For example, if 
your current data values fit in a `SMALLINT` but it is likely that the values 
will expand, `INT` is the better long-term choice.
-
-Use the same data types for columns that you plan to use in cross-table joins. 
When the data types are different, the database must convert one of them so 
that the data values can be compared correctly, which adds unnecessary overhead.
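-
-For example, if a `customers.id` column is declared `INT`, declare the referencing `orders.customer_id` column as `INT` as well \(the table and column names here are illustrative\):
-
-``` sql
-=> CREATE TABLE customers (id int, name text) DISTRIBUTED BY (id);
-=> CREATE TABLE orders (order_no int, customer_id int) DISTRIBUTED BY (customer_id);
-```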
-
-HAWQ supports the parquet columnar storage format, which can increase 
performance on large queries. Use parquet tables for HAWQ internal tables.
-
-### <a id="topic28"></a>Setting Table Constraints 
-
-You can define constraints to restrict the data in your tables. HAWQ support for constraints is the same as in PostgreSQL, with some limitations:
-
--   `CHECK` constraints can refer only to the table on which they are defined.
--   `FOREIGN KEY` constraints are allowed, but not enforced.
--   Constraints that you define on partitioned tables apply to the partitioned 
table as a whole. You cannot define constraints on the individual parts of the 
table.
-
-#### <a id="topic29"></a>Check Constraints 
-
-Check constraints allow you to specify that the value in a certain column must 
satisfy a Boolean \(truth-value\) expression. For example, to require positive 
product prices:
-
-``` sql
-=> CREATE TABLE products
-     ( product_no integer,
-       name text,
-       price numeric CHECK (price > 0) );
-```
-
-#### <a id="topic30"></a>Not-Null Constraints 
-
-Not-null constraints specify that a column must not assume the null value. A 
not-null constraint is always written as a column constraint. For example:
-
-``` sql
-=> CREATE TABLE products
-     ( product_no integer NOT NULL,
-       name text NOT NULL,
-       price numeric );
-```
-
-#### <a id="topic33"></a>Foreign Keys 
-
-Foreign keys are not supported. You can declare them, but referential 
integrity is not enforced.
-
-Foreign key constraints specify that the values in a column or a group of 
columns must match the values appearing in some row of another table to 
maintain referential integrity between two related tables. Referential 
integrity checks cannot be enforced between the distributed table segments of a 
HAWQ database.
-
-### <a id="topic34"></a>Choosing the Table Distribution Policy 
-
-All HAWQ tables are distributed. The default policy, `DISTRIBUTED RANDOMLY` \(round-robin distribution\), determines the table row distribution. However, when you create or alter a table, you can optionally specify `DISTRIBUTED BY` to distribute data according to a hash-based policy. In this case, the `bucketnum` attribute sets the number of hash buckets used by a hash-distributed table. Columns of geometric or user-defined data types are not eligible as HAWQ distribution key columns.
-
-Randomly distributed tables have benefits over hash distributed tables. For 
example, after expansion, HAWQ's elasticity feature lets it automatically use 
more resources without needing to redistribute the data. For extremely large 
tables, redistribution is very expensive. Also, data locality for randomly 
distributed tables is better, especially after the underlying HDFS 
redistributes its data during rebalancing or because of DataNode failures. This 
is quite common when the cluster is large.
-
-However, hash-distributed tables can be faster than randomly distributed tables. For example, hash-distributed tables can show performance benefits for TPC-H queries. Choose the distribution policy that best suits your application scenario. When you `CREATE TABLE`, you can also specify the `bucketnum` option. The `bucketnum` determines the number of hash buckets used in creating a hash-distributed table or for PXF external table intermediate processing. The number of buckets also affects how many virtual segments will be created when processing this data. The bucket number of a gpfdist external table is the number of gpfdist locations, and the bucket number of a command external table is `ON #num`. PXF external tables use the `default_hash_table_bucket_number` parameter to control virtual segments.
-
-HAWQ's elastic execution runtime is based on virtual segments, which are 
allocated on demand, based on the cost of the query. Each node uses one 
physical segment and a number of dynamically allocated virtual segments 
distributed to different hosts, thus simplifying performance tuning. Large 
queries use large numbers of virtual segments, while smaller queries use fewer 
virtual segments. Tables do not need to be redistributed when nodes are added 
or removed.
-
-In general, the more virtual segments are used, the faster the query executes. You can tune the `default_hash_table_bucket_number` and `hawq_rm_nvseg_perquery_limit` parameters to adjust performance by controlling the number of virtual segments used for a query. However, be aware that if the value of `default_hash_table_bucket_number` is changed, data must be redistributed, which can be costly. Therefore, it is better to set `default_hash_table_bucket_number` up front if you expect to need a larger number of virtual segments. You might still need to adjust the value of `default_hash_table_bucket_number` after cluster expansion, but take care not to exceed the number of virtual segments per query set in `hawq_rm_nvseg_perquery_limit`. Refer to the recommended guidelines for setting the value of `default_hash_table_bucket_number`, later in this section.
-
-For random or gpfdist external tables, as well as user-defined functions, the value set in the `hawq_rm_nvseg_perquery_perseg_limit` parameter limits the number of virtual segments used per segment for one query, to optimize query resources. Resetting this parameter is not recommended.
-
-Consider the following points when deciding on a table distribution policy.
-
--   **Even Data Distribution** — For the best possible performance, all 
segments should contain equal portions of data. If the data is unbalanced or 
skewed, the segments with more data must work harder to perform their portion 
of the query processing.
--   **Local and Distributed Operations** — Local operations are faster than 
distributed operations. Query processing is fastest if the work associated with 
join, sort, or aggregation operations is done locally, at the segment level. 
Work done at the system level requires distributing tuples across the segments, 
which is less efficient. When tables share a common distribution key, the work 
of joining or sorting on their shared distribution key columns is done locally. 
With a random distribution policy, local join operations are not an option.
--   **Even Query Processing** — For best performance, all segments should handle an equal share of the query workload. Query workload can be skewed if a table's data distribution policy and the query predicates are not well matched. For example, suppose that a sales transactions table is distributed based on a column that contains corporate names \(the distribution key\), and the hashing algorithm distributes the data based on those values. If a predicate in a query references a single value from the distribution key, query processing runs on only one segment. This works if your query predicates usually select data on a criterion other than corporation name. For queries that use corporation name in their predicates, it's possible that only one segment instance will handle the query workload.
-
-HAWQ utilizes dynamic parallelism, which can affect the performance of a query 
execution significantly. Performance depends on the following factors:
-
--   The size of a randomly distributed table.
--   The `bucketnum` of a hash distributed table.
--   Data locality.
--   The values of `default_hash_table_bucket_number`, and 
`hawq_rm_nvseg_perquery_limit` \(including defaults and user-defined values\).
-
-For any specific query, the first three factors are fixed values, while the configuration parameters in the last item can be used to tune performance of the query execution. In querying a random table, the query resource load is related to the data size of the table, usually one virtual segment for one HDFS block. As a result, querying a large table could use a large number of resources.
-
-The `bucketnum` for a hash table specifies the number of hash buckets to be used in creating virtual segments. A hash-distributed table is created with `default_hash_table_bucket_number` buckets. The default bucket value can be changed at the session level or in the `CREATE TABLE` DDL by using the `bucketnum` storage parameter.
-
-In an Ambari-managed HAWQ cluster, the default bucket number 
\(`default_hash_table_bucket_number`\) is derived from the number of segment 
nodes. In command-line-managed HAWQ environments, you can use the 
`--bucket_number` option of `hawq init` to explicitly set 
`default_hash_table_bucket_number` during cluster initialization.
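-
-For example \(a sketch; the exact `hawq init` invocation depends on your deployment\):
-
-``` shell
-$ hawq init cluster --bucket_number 96
-```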
-
-**Note:** For best performance with large tables, the number of buckets should 
not exceed the value of the `default_hash_table_bucket_number` parameter. Small 
tables can use one segment node, `WITH bucketnum=1`. For larger tables, the 
`bucketnum` is set to a multiple of the number of segment nodes, for the best 
load balancing on different segment nodes. The elastic runtime will attempt to 
find the optimal number of buckets for the number of nodes being processed. 
Larger tables need more virtual segments, and hence use larger numbers of 
buckets.
-
-The following statement creates a table “sales” with 8 buckets, which 
would be similar to a hash-distributed table on 8 segments.
-
-``` sql
-=> CREATE TABLE sales(id int, profit float)  WITH (bucketnum=8) DISTRIBUTED BY 
(id);
-```
-
-There are four ways to create a new table from an origin table. Each method and its syntax are listed below.
-
-<table>
-  <tr>
-    <th>Method</th>
-    <th>Syntax</th>
-  </tr>
-  <tr><td>INHERITS</td><td><pre><code>CREATE TABLE new_table INHERITS 
(origintable) [WITH(bucketnum=x)] <br/>[DISTRIBUTED BY 
col]</code></pre></td></tr>
-  <tr><td>LIKE</td><td><pre><code>CREATE TABLE new_table (LIKE origintable) 
[WITH(bucketnum=x)] <br/>[DISTRIBUTED BY col]</code></pre></td></tr>
-  <tr><td>AS</td><td><pre><code>CREATE TABLE new_table [WITH(bucketnum=x)] AS 
SUBQUERY [DISTRIBUTED BY col]</code></pre></td></tr>
-  <tr><td>SELECT INTO</td><td><pre><code>CREATE TABLE origintable 
[WITH(bucketnum=x)] [DISTRIBUTED BY col]; SELECT * <br/>INTO new_table FROM 
origintable;</code></pre></td></tr>
-</table>
-
-The optional `INHERITS` clause specifies a list of tables from which the new table automatically inherits all columns. Hash tables inherit bucket numbers from their origin table if not otherwise specified. If `WITH` specifies `bucketnum` in creating a hash-distributed table, it is copied. If distribution is specified by column, the new table inherits it. Otherwise, the table uses the default distribution from `default_hash_table_bucket_number`.
-
-The `LIKE` clause specifies a table from which the new table automatically 
copies all column names, data types, not-null constraints, and distribution 
policy. If a `bucketnum` is specified, it will be copied. Otherwise, the table 
will use default distribution.
-
-For hash tables, the `SELECT INTO` method always uses random distribution.
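-
-For example, the following statement creates a new table from the `sales` table defined earlier, copying its column definitions while overriding the bucket number:
-
-``` sql
-=> CREATE TABLE sales_copy (LIKE sales) WITH (bucketnum=4) DISTRIBUTED BY (id);
-```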
-
-#### <a id="topic_kjg_tqm_gv"></a>Declaring Distribution Keys 
-
-`CREATE TABLE`'s optional clause `DISTRIBUTED BY` specifies the distribution policy for a table. The default is a random distribution policy. You can also choose to distribute data according to a hash-based policy, where the `bucketnum` attribute sets the number of hash buckets used by a hash-distributed table. Hash-distributed tables are created with the number of hash buckets specified by the `default_hash_table_bucket_number` parameter.
-
-Policies for different application scenarios can be specified to optimize performance. The number of virtual segments used for query execution can be tuned using the `hawq_rm_nvseg_perquery_limit` and `hawq_rm_nvseg_perquery_perseg_limit` parameters, in connection with the `default_hash_table_bucket_number` parameter, which sets the default `bucketnum`. For more information, see the guidelines for Virtual Segments in the next section and in [Query Performance](../query/query-performance.html#topic38).
-
-#### <a id="topic_wff_mqm_gv"></a>Performance Tuning 
-
-Adjusting the values of the configuration parameters `default_hash_table_bucket_number` and `hawq_rm_nvseg_perquery_limit` can tune performance by controlling the number of virtual segments being used. In most circumstances, HAWQ's elastic runtime dynamically allocates virtual segments to optimize performance, so further tuning should not be needed.
-
-Hash tables are created using the value specified in 
`default_hash_table_bucket_number`. Queries for hash tables use a fixed number 
of buckets, regardless of the amount of data present. Explicitly setting 
`default_hash_table_bucket_number` can be useful in managing resources. If you 
desire a larger or smaller number of hash buckets, set this value before you 
create tables. Resources are dynamically allocated to a multiple of the number 
of nodes. If you use `hawq init --bucket_number` to set the value of 
`default_hash_table_bucket_number` during cluster initialization or expansion, 
the value should not exceed the value of `hawq_rm_nvseg_perquery_limit`. This 
server parameter defines the maximum number of virtual segments that can be 
used for a query \(default = 512, with a maximum of 65535\). Modifying the 
value to greater than 1000 segments is not recommended.
-
-The following per-node guidelines apply to values for 
`default_hash_table_bucket_number`.
-
-|Number of Nodes|default\_hash\_table\_bucket\_number value|
-|---------------|------------------------------------------|
-|<= 85|6 \* \#nodes|
-|\> 85 and <= 102|5 \* \#nodes|
-|\> 102 and <= 128|4 \* \#nodes|
-|\> 128 and <= 170|3 \* \#nodes|
-|\> 170 and <= 256|2 \* \#nodes|
-|\> 256 and <= 512|1 \* \#nodes|
-|\> 512|512|
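-
-For example, a 100-node cluster falls in the "\> 85 and <= 102" row, which suggests a value of 5 \* 100 = 500. You can then apply the value with the `hawq config` utility \(a sketch; run as `gpadmin` on the master, and restart the cluster for the change to take effect\):
-
-``` shell
-$ hawq config -c default_hash_table_bucket_number -v 500
-```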
-
-Reducing the value of `hawq_rm_nvseg_perquery_perseg_limit` can improve concurrency, and increasing the value of `hawq_rm_nvseg_perquery_perseg_limit` could possibly increase the degree of parallelism. However, for some queries, increasing the degree of parallelism will not improve performance if the query has reached the limits set by the hardware. Therefore, increasing the value of `hawq_rm_nvseg_perquery_perseg_limit` above the default value is not recommended. Also, changing the value of `default_hash_table_bucket_number` after initializing a cluster means the hash table data must be redistributed. If you are expanding a cluster, you might wish to change this value, but be aware that retuning could adversely affect performance.

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/ddl/ddl-tablespace.html.md.erb
----------------------------------------------------------------------
diff --git a/ddl/ddl-tablespace.html.md.erb b/ddl/ddl-tablespace.html.md.erb
deleted file mode 100644
index 8720665..0000000
--- a/ddl/ddl-tablespace.html.md.erb
+++ /dev/null
@@ -1,154 +0,0 @@
----
-title: Creating and Managing Tablespaces
----
-
-Tablespaces allow database administrators to have multiple file systems per 
machine and decide how to best use physical storage to store database objects. 
They are named locations within a filespace in which you can create objects. 
Tablespaces allow you to assign different storage for frequently and 
infrequently used database objects or to control the I/O performance on certain 
database objects. For example, place frequently-used tables on file systems 
that use high performance solid-state drives \(SSD\), and place other tables on 
standard hard drives.
-
-A tablespace requires a file system location to store its database files. In 
HAWQ, the master and each segment require a distinct storage location. The 
collection of file system locations for all components in a HAWQ system is a 
*filespace*. Filespaces can be used by one or more tablespaces.
-
-## <a id="topic10"></a>Creating a Filespace 
-
-A filespace sets aside storage for your HAWQ system. A filespace is a symbolic 
storage identifier that maps onto a set of locations in your HAWQ hosts' file 
systems. To create a filespace, prepare the logical file systems on all of your 
HAWQ hosts, then use the `hawq filespace` utility to define the filespace. You 
must be a database superuser to create a filespace.
-
-**Note:** HAWQ is not directly aware of the file system boundaries on your 
underlying systems. It stores files in the directories that you tell it to use. 
You cannot control the location on disk of individual files within a logical 
file system.
-
-### <a id="im178954"></a>To create a filespace using hawq filespace 
-
-1.  Log in to the HAWQ master as the `gpadmin` user.
-
-    ``` shell
-    $ su - gpadmin
-    ```
-
-2.  Create a filespace configuration file:
-
-    ``` shell
-    $ hawq filespace -o hawqfilespace_config
-    ```
-
-3.  At the prompt, enter a name for the filespace, a master file system 
location, and the primary segment file system locations. For example:
-
-    ``` shell
-    $ hawq filespace -o hawqfilespace_config
-    ```
-    ``` pre
-    Enter a name for this filespace
-    > testfs
-    Enter replica num for filespace. If 0, default replica num is used 
(default=3)
-    > 
-
-    Please specify the DFS location for the filespace (for example: 
localhost:9000/fs)
-    location> localhost:8020/fs        
-    20160409:16:53:25:028082 hawqfilespace:gpadmin:gpadmin-[INFO]:-[created]
-    20160409:16:53:25:028082 hawqfilespace:gpadmin:gpadmin-[INFO]:-
-    To add this filespace to the database please run the command:
-       hawqfilespace --config 
/Users/gpadmin/curwork/git/hawq/hawqfilespace_config
-    ```
-       
-    ``` shell
-    $ cat /Users/gpadmin/curwork/git/hawq/hawqfilespace_config
-    ```
-    ``` pre
-    filespace:testfs
-    fsreplica:3
-    dfs_url::localhost:8020/fs
-    ```
-    ``` shell
-    $ hawq filespace --config 
/Users/gpadmin/curwork/git/hawq/hawqfilespace_config
-    ```
-    ``` pre
-    Reading Configuration file: 
'/Users/gpadmin/curwork/git/hawq/hawqfilespace_config'
-
-    CREATE FILESPACE testfs ON hdfs 
-    ('localhost:8020/fs/testfs') WITH (NUMREPLICA = 3);
-    20160409:16:57:56:028104 hawqfilespace:gpadmin:gpadmin-[INFO]:-Connecting 
to database
-    20160409:16:57:56:028104 hawqfilespace:gpadmin:gpadmin-[INFO]:-Filespace 
"testfs" successfully created
-
-    ```
-
-
-4.  `hawq filespace` creates a configuration file. Examine the file to verify 
that the hawq filespace configuration is correct. The following is a sample 
configuration file:
-
-    ```
-    filespace:fastdisk
-    mdw:1:/hawq_master_filespc/gp-1
-    sdw1:2:/hawq_pri_filespc/gp0
-    sdw2:3:/hawq_pri_filespc/gp1
-    ```
-
-5.  Run `hawq filespace` again to create the filespace based on the configuration file:
-
-    ``` shell
-    $ hawq filespace -c hawqfilespace_config
-    ```
-
-
-## <a id="topic13"></a>Creating a Tablespace 
-
-After you create a filespace, use the `CREATE TABLESPACE` command to define a 
tablespace that uses that filespace. For example:
-
-``` sql
-=# CREATE TABLESPACE fastspace FILESPACE fastdisk;
-```
-
-Database superusers define tablespaces and grant access to database users with the `GRANT CREATE` command. For example:
-
-``` sql
-=# GRANT CREATE ON TABLESPACE fastspace TO admin;
-```
-
-## <a id="topic14"></a>Using a Tablespace to Store Database Objects 
-
-Users with the `CREATE` privilege on a tablespace can create database objects 
in that tablespace, such as tables, indexes, and databases. The command is:
-
-``` sql
-CREATE TABLE tablename(options) TABLESPACE spacename
-```
-
-For example, the following command creates a table in the tablespace *space1*:
-
-``` sql
-CREATE TABLE foo(i int) TABLESPACE space1;
-```
-
-You can also use the `default_tablespace` parameter to specify the default 
tablespace for `CREATE TABLE` and `CREATE INDEX` commands that do not specify a 
tablespace:
-
-``` sql
-SET default_tablespace = space1;
-CREATE TABLE foo(i int);
-```
-
-The tablespace associated with a database stores that database's system catalogs and temporary files created by server processes using that database. It is also the default tablespace for tables and indexes created within the database when no `TABLESPACE` is specified at object creation. If you do not specify a tablespace when you create a database, the database uses the same tablespace used by its template database.
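-
-For example, to create a database that uses a specific tablespace \(standard PostgreSQL syntax, assuming the tablespace already exists\):
-
-``` sql
-=# CREATE DATABASE mydatabase TABLESPACE fastspace;
-```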
-
-You can use a tablespace from any database if you have appropriate privileges.
-
-## <a id="topic15"></a>Viewing Existing Tablespaces and Filespaces 
-
-Every HAWQ system has the following default tablespaces.
-
--   `pg_global` for shared system catalogs.
--   `pg_default`, the default tablespace. Used by the *template1* and 
*template0* databases.
-
-These tablespaces use the system default filespace, `pg_system`, the data 
directory location created at system initialization.
-
-To see filespace information, look in the *pg\_filespace* and 
*pg\_filespace\_entry* catalog tables. You can join these tables with 
*pg\_tablespace* to see the full definition of a tablespace. For example:
-
-``` sql
-=# SELECT spcname AS tblspc, fsname AS filespc,
-          fsedbid AS seg_dbid, fselocation AS datadir
-   FROM   pg_tablespace pgts, pg_filespace pgfs,
-          pg_filespace_entry pgfse
-   WHERE  pgts.spcfsoid=pgfse.fsefsoid
-          AND pgfse.fsefsoid=pgfs.oid
-   ORDER BY tblspc, seg_dbid;
-```
-
-## <a id="topic16"></a>Dropping Tablespaces and Filespaces 
-
-To drop a tablespace, you must be the tablespace owner or a superuser. You 
cannot drop a tablespace until all objects in all databases using the 
tablespace are removed.
-
-Only a superuser can drop a filespace. A filespace cannot be dropped until all 
tablespaces using that filespace are removed.
-
-The `DROP TABLESPACE` command removes an empty tablespace.
-
-The `DROP FILESPACE` command removes an empty filespace.

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/ddl/ddl-view.html.md.erb
----------------------------------------------------------------------
diff --git a/ddl/ddl-view.html.md.erb b/ddl/ddl-view.html.md.erb
deleted file mode 100644
index 35da41e..0000000
--- a/ddl/ddl-view.html.md.erb
+++ /dev/null
@@ -1,25 +0,0 @@
----
-title: Creating and Managing Views
----
-
-Views enable you to save frequently used or complex queries, then access them 
in a `SELECT` statement as if they were a table. A view is not physically 
materialized on disk: the query runs as a subquery when you access the view.
-
-If a subquery is associated with a single query, consider using the `WITH` 
clause of the `SELECT` command instead of creating a seldom-used view.
-
-## <a id="topic101"></a>Creating Views 
-
-The `CREATE VIEW` command defines a view of a query. For example:
-
-``` sql
-CREATE VIEW comedies AS SELECT * FROM films WHERE kind = 'comedy';
-```
-
-A view ignores `ORDER BY` and sort operations stored in its definition; to order results, specify `ORDER BY` when you query the view.
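-
-For example, to return ordered results, sort at query time using the `comedies` view created above \(the `title` column is illustrative\):
-
-``` sql
-SELECT * FROM comedies ORDER BY title;
-```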
-
-## <a id="topic102"></a>Dropping Views 
-
-The `DROP VIEW` command removes a view. For example:
-
-``` sql
-DROP VIEW topten;
-```

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/ddl/ddl.html.md.erb
----------------------------------------------------------------------
diff --git a/ddl/ddl.html.md.erb b/ddl/ddl.html.md.erb
deleted file mode 100644
index 7873fe7..0000000
--- a/ddl/ddl.html.md.erb
+++ /dev/null
@@ -1,19 +0,0 @@
----
-title: Defining Database Objects
----
-
-This section covers data definition language \(DDL\) in HAWQ and how to create 
and manage database objects.
-
-Creating objects in a HAWQ database involves making up-front choices about data distribution, storage options, data loading, and other HAWQ features that affect the ongoing performance of your database system. Understanding the available options, and how the database will be used, helps you make the right decisions.
-
-Most of the advanced HAWQ features are enabled with extensions to the SQL 
`CREATE` DDL statements.
-
-This section contains the topics:
-
-*  <a class="subnav" href="./ddl-database.html">Creating and Managing 
Databases</a>
-*  <a class="subnav" href="./ddl-tablespace.html">Creating and Managing 
Tablespaces</a>
-*  <a class="subnav" href="./ddl-schema.html">Creating and Managing Schemas</a>
-*  <a class="subnav" href="./ddl-table.html">Creating and Managing Tables</a>
-*  <a class="subnav" href="./ddl-storage.html">Table Storage Model and 
Distribution Policy</a>
-*  <a class="subnav" href="./ddl-partition.html">Partitioning Large Tables</a>
-*  <a class="subnav" href="./ddl-view.html">Creating and Managing Views</a>

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/images/02-pipeline.png
----------------------------------------------------------------------
diff --git a/images/02-pipeline.png b/images/02-pipeline.png
deleted file mode 100644
index 26fec1b..0000000
Binary files a/images/02-pipeline.png and /dev/null differ

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/images/03-gpload-files.jpg
----------------------------------------------------------------------
diff --git a/images/03-gpload-files.jpg b/images/03-gpload-files.jpg
deleted file mode 100644
index d50435f..0000000
Binary files a/images/03-gpload-files.jpg and /dev/null differ

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/images/basic_query_flow.png
----------------------------------------------------------------------
diff --git a/images/basic_query_flow.png b/images/basic_query_flow.png
deleted file mode 100644
index 59172a2..0000000
Binary files a/images/basic_query_flow.png and /dev/null differ

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/images/ext-tables-xml.png
----------------------------------------------------------------------
diff --git a/images/ext-tables-xml.png b/images/ext-tables-xml.png
deleted file mode 100644
index f208828..0000000
Binary files a/images/ext-tables-xml.png and /dev/null differ

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/images/ext_tables.jpg
----------------------------------------------------------------------
diff --git a/images/ext_tables.jpg b/images/ext_tables.jpg
deleted file mode 100644
index d5a0940..0000000
Binary files a/images/ext_tables.jpg and /dev/null differ

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/images/ext_tables_multinic.jpg
----------------------------------------------------------------------
diff --git a/images/ext_tables_multinic.jpg b/images/ext_tables_multinic.jpg
deleted file mode 100644
index fcf09c4..0000000
Binary files a/images/ext_tables_multinic.jpg and /dev/null differ

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/images/gangs.jpg
----------------------------------------------------------------------
diff --git a/images/gangs.jpg b/images/gangs.jpg
deleted file mode 100644
index 0d14585..0000000
Binary files a/images/gangs.jpg and /dev/null differ

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/images/gporca.png
----------------------------------------------------------------------
diff --git a/images/gporca.png b/images/gporca.png
deleted file mode 100644
index 2909443..0000000
Binary files a/images/gporca.png and /dev/null differ

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/images/hawq_hcatalog.png
----------------------------------------------------------------------
diff --git a/images/hawq_hcatalog.png b/images/hawq_hcatalog.png
deleted file mode 100644
index 35b74c3..0000000
Binary files a/images/hawq_hcatalog.png and /dev/null differ

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/images/slice_plan.jpg
----------------------------------------------------------------------
diff --git a/images/slice_plan.jpg b/images/slice_plan.jpg
deleted file mode 100644
index ad8da83..0000000
Binary files a/images/slice_plan.jpg and /dev/null differ

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/install/aws-config.html.md.erb
----------------------------------------------------------------------
diff --git a/install/aws-config.html.md.erb b/install/aws-config.html.md.erb
deleted file mode 100644
index 21cadf5..0000000
--- a/install/aws-config.html.md.erb
+++ /dev/null
@@ -1,123 +0,0 @@
----
-title: Amazon EC2 Configuration
----
-
-Amazon Elastic Compute Cloud (EC2) is a service provided by Amazon Web 
Services (AWS).  You can install and configure HAWQ on virtual servers provided 
by Amazon EC2. The following information describes some considerations when 
deploying a HAWQ cluster in an Amazon EC2 environment.
-
-## <a id="topic_wqv_yfx_y5"></a>About Amazon EC2 
-
-You can use Amazon EC2 to launch as many virtual servers as you need, configure security and networking, and manage storage. An EC2 *instance* is a virtual server in the AWS cloud virtual computing environment.
-
-EC2 instances are managed by AWS. AWS isolates your EC2 instances from other 
users in a virtual private cloud (VPC) and lets you control access to the 
instances. You can configure instance features such as operating system, 
network connectivity (network ports and protocols, IP addresses), access to the 
Internet, and size and type of disk storage. 
-
-For information about Amazon EC2, see the [EC2 User 
Guide](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/concepts.html).
-
-## <a id="topic_nhk_df4_2v"></a>Create and Launch HAWQ Instances
-
-Use the *Amazon EC2 Console* to launch instances and configure, start, stop, 
and terminate (delete) virtual servers. When you launch a HAWQ instance, you 
select and configure key attributes via the EC2 Console.
-
-
-### <a id="topic_amitype"></a>Choose AMI Type
-
-An Amazon Machine Image (AMI) is a template that contains a software 
configuration including the operating system, application server, and 
applications that best suit your purpose. When configuring a HAWQ virtual 
instance, we recommend you use a *hardware virtualized* AMI running 64-bit Red 
Hat Enterprise Linux version 6.4 or 6.5 or 64-bit CentOS 6.4 or 6.5.  Obtain 
the licenses and instances directly from the OS provider.
-
-### <a id="topic_selcfgstorage"></a>Consider Storage
-
-EC2 instances can be launched as either Elastic Block Store (EBS)-backed or instance store-backed.
-
-Instance store-backed storage generally performs better than EBS and is recommended for HAWQ's large data workloads. SSD (solid state) instance store is preferred over magnetic drives.
-
-**Note:** EC2 *instance store* provides temporary block-level storage. This
storage is located on disks that are physically attached to the host computer. 
While instance store provides high performance, powering off the instance 
causes data loss. Soft reboots preserve instance store data. 
-     
-Virtual devices for instance store volumes on HAWQ EC2 instances are named `ephemeralN` (where *N* varies by instance type). CentOS instance store block devices are named `/dev/xvdletter` (where *letter* is a lower-case letter of the alphabet).
-
-### <a id="topic_cfgplacegrp"></a>Configure Placement Group 
-
-A placement group is a logical grouping of instances within a single 
availability zone that together participate in a low-latency, 10 Gbps network.  
Your HAWQ master and segment cluster instances should support enhanced 
networking and reside in a single placement group (and subnet) for optimal 
network performance.  
-
-If your Ambari node is not a DataNode, locating the Ambari node instance in a 
subnet separate from the HAWQ master/segment placement group enables you to 
manage multiple HAWQ clusters from a single Ambari instance.
-
-Amazon recommends that you use the same instance type for all instances in the 
placement group and that you launch all instances within the placement group at 
the same time.
-
-Membership in a placement group has some implications on your HAWQ cluster.  
Specifically, growing the cluster over capacity may require shutting down all 
HAWQ instances in the current placement group and restarting the instances to a 
new placement group. Instance store volumes are lost in this scenario.
-
-### <a id="topic_selinsttype"></a>Select EC2 Instance Type
-
-An EC2 instance type is a specific combination of CPU, memory, default 
storage, and networking capacity.  
-
-Several instance store-backed EC2 instance types have shown acceptable 
performance for HAWQ nodes in development and production environments: 
-
-| Instance Type  | Env | vCPUs | Memory (GB) | Disk Capacity (GB) | Storage Type |
-|-------|-----|------|--------|----------|--------|
-| cc2.8xlarge  | Dev | 32 | 60.5 | 4 x 840 | HDD |
-| d2.2xlarge  | Dev | 8 | 60 | 6 x 2000 | HDD |
-| d2.4xlarge  | Dev/QA | 16 | 122 | 12 x 2000 | HDD |
-| i2.8xlarge  | Prod | 32 | 244 | 8 x 800 | SSD |
-| hs1.8xlarge  | Prod | 16 | 117 | 24 x 2000 | HDD |
-| d2.8xlarge  | Prod | 36 | 244 | 24 x 2000 | HDD |
- 
-For optimal network performance, the chosen HAWQ instance type should support 
EC2 enhanced networking. Enhanced networking results in higher performance, 
lower latency, and lower jitter. Refer to [Enhanced Networking on Linux 
Instances](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/enhanced-networking.html)
 for detailed information on enabling enhanced networking in your instances.
-
-All instance types identified in the table above support enhanced networking.
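-
-One quick way to confirm enhanced networking from within a running Linux instance is to inspect the network driver in use (a sketch; the interface name `eth0` may differ on your instance):
-
-```shell
-# A driver of ixgbevf or ena indicates an enhanced networking driver is active
-$ ethtool -i eth0 | grep '^driver'
-```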
-
-### <a id="topic_cfgnetw"></a>Configure Networking 
-
-Your HAWQ cluster instances should be in a single VPC and on the same subnet. 
Instances are always assigned a VPC internal IP address. This internal IP 
address should be used for HAWQ communication between hosts. You can also use 
the internal IP address to access an instance from another instance within the 
HAWQ VPC.
-
-You may choose to locate your Ambari node on a separate subnet in the VPC. 
Both a public IP address for the instance and an Internet gateway configured 
for the EC2 VPC are required to access the Ambari instance from an external 
source and for the instance to access the Internet. 
-
-Ensure your Ambari and HAWQ master instances are each assigned a public IP 
address for external and internet access. We recommend you also assign an 
Elastic IP Address to the HAWQ master instance.
-
-
-### <a id="topic_cfgsecgrp"></a>Configure Security Groups
-
-A security group is a set of rules that control network traffic to and from 
your HAWQ instance.  One or more rules may be associated with a security group, 
and one or more security groups may be associated with an instance.
-
-To configure HAWQ communication between nodes in the HAWQ cluster, include and 
open the following ports in the appropriate security group for the HAWQ master 
and segment nodes:
-
-| Port  | Application |
-|-------|-------------------------------------|
-| 22    | ssh - secure connect to other hosts |
-
-To allow access to/from a source external to the Ambari management node, 
include and open the following ports in an appropriate security group for your 
Ambari node:
-
-| Port  | Application |
-|-------|-------------------------------------|
-| 22    | ssh - secure connect to other hosts |
-| 8080  | Ambari - HAWQ admin/config web console |  
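-
-As a sketch, equivalent rules can also be added with the AWS CLI; the security group ID and CIDR range below are placeholders for your own values:
-
-```shell
-# Allow ssh (22) and the Ambari web console (8080) from an administrative network
-$ aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 \
-      --protocol tcp --port 22 --cidr 203.0.113.0/24
-$ aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 \
-      --protocol tcp --port 8080 --cidr 203.0.113.0/24
-```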
-
-
-### <a id="topic_cfgkeypair"></a>Generate Key Pair
-
-AWS uses public-key cryptography to secure the login information for your 
instance. You use the EC2 console to generate and name a key pair when you 
launch your instance.  
-
-A key pair for an EC2 instance consists of a *public key* that AWS stores, and 
a *private key file* that you maintain. Together, they allow you to connect to 
your instance securely. The private key file name typically has a `.pem` suffix.
-
-This example logs in to an EC2 instance from an external location as user `user1`, using the private key file `my-test.pem`. In this example, the instance is configured with the public IP address `192.0.2.0`, and the private key file resides in the current directory.
-
-```shell
-$ ssh -i my-test.pem [email protected]
-```
-
-## <a id="topic_mj4_524_2v"></a>Additional HAWQ Considerations
-
-After launching your HAWQ instance, you will connect to and configure the instance. The *Instances* page of the EC2 Console lists the running instances and their associated network access information.
-
-Before installing HAWQ, set up the EC2 instances as you would local host 
server machines. Configure the host operating system, configure host network 
information (for example, update the `/etc/hosts` file), set operating system 
parameters, and install operating system packages. For information about how to 
prepare your operating system environment for HAWQ, see [Apache HAWQ System 
Requirements](../requirements/system-requirements.html) and [Select HAWQ Host 
Machines](../install/select-hosts.html).
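-
-For example, configuring host network information might include adding VPC-internal IP entries for each cluster host to `/etc/hosts`; the addresses and host names below are placeholders:
-
-```shell
-$ sudo tee -a /etc/hosts <<'EOF'
-10.0.0.10   hawq-master
-10.0.0.11   hawq-standby
-10.0.0.12   hawq-segment1
-EOF
-```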
-
-### <a id="topic_pwdlessssh_cc"></a>Passwordless SSH Configuration
-
-HAWQ hosts will be configured during the installation process to use 
passwordless SSH for intra-cluster communications. Temporary password-based 
authentication must be enabled on each HAWQ host in preparation for this 
configuration. Password authentication is typically disabled by default in 
cloud images. Update the cloud configuration in `/etc/cloud/cloud.cfg` to 
enable password authentication in your AMI(s). Set `ssh_pwauth: True` in this 
file. If desired, disable password authentication after HAWQ installation by 
setting the property back to `False`.
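-
-A minimal sketch of that edit, assuming `ssh_pwauth` appears at the start of a line in `/etc/cloud/cloud.cfg`:
-
-```shell
-# Enable password authentication before HAWQ installation (run on each host)
-$ sudo sed -i 's/^ssh_pwauth:.*/ssh_pwauth: True/' /etc/cloud/cloud.cfg
-# Optionally revert after installation completes
-$ sudo sed -i 's/^ssh_pwauth:.*/ssh_pwauth: False/' /etc/cloud/cloud.cfg
-```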
-  
-## <a id="topic_hgz_zwy_bv"></a>References
-
-Links to related Amazon Web Services and EC2 features and information.
-
-- [Amazon Web Services](https://aws.amazon.com)
-- [Amazon Machine Image 
\(AMI\)](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AMIs.html)
-- [EC2 Instance 
Store](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/InstanceStorage.html)
-- [Elastic Block 
Store](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSOptimized.html)
-- [EC2 Key 
Pairs](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html)
-- [Elastic IP 
Address](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/elastic-ip-addresses-eip.html)
-- [Enhanced Networking on Linux 
Instances](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/enhanced-networking.html)
-- [Internet Gateways](http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Internet_Gateway.html)
-- [Subnet Public IP 
Addressing](http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-ip-addressing.html#subnet-public-ip)
-- [Virtual Private 
Cloud](http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Introduction.html)

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/install/select-hosts.html.md.erb
----------------------------------------------------------------------
diff --git a/install/select-hosts.html.md.erb b/install/select-hosts.html.md.erb
deleted file mode 100644
index ecbe0b5..0000000
--- a/install/select-hosts.html.md.erb
+++ /dev/null
@@ -1,19 +0,0 @@
----
-title: Select HAWQ Host Machines
----
-
-Before you begin to install HAWQ, follow these steps to select and prepare the 
host machines.
-
-Complete this procedure for all HAWQ deployments:
-
-1.  **Choose the host machines that will host a HAWQ segment.** Keep in mind 
these restrictions and requirements:
-    -   Each host must meet the system requirements for the version of HAWQ 
you are installing.
-    -   Each HAWQ segment must be co-located on a host that runs an HDFS 
DataNode.
-    -   The HAWQ master segment and standby master segment must be hosted on 
separate machines.
-2.  **Choose the host machines that will run PXF.** Keep in mind these 
restrictions and requirements:
-    -   PXF must be installed on the HDFS NameNode *and* on all HDFS DataNodes.
-    -   If you have configured Hadoop with high availability, PXF must also be installed on all HDFS nodes that run NameNode services.
-    -   If you want to use PXF with HBase or Hive, you must first install the 
HBase client \(hbase-client\) and/or Hive client \(hive-client\) on each 
machine where you intend to install PXF. See the [HDP installation 
documentation](https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/index.html)
 for more information.
-3.  **Verify that required ports on all machines are unused.** By default, a 
HAWQ master or standby master service configuration uses port 5432. Hosts that 
run other PostgreSQL instances cannot be used to run a default HAWQ master or 
standby service configuration because the default PostgreSQL port \(5432\) 
conflicts with the default HAWQ port. You must either change the default port 
configuration of the running PostgreSQL instance or change the HAWQ master port 
setting during the HAWQ service installation to avoid port conflicts.
-    
-    **Note:** The Ambari server node uses PostgreSQL as the default metadata 
database. The Hive Metastore uses MySQL as the default metadata database.
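-
-    One way to check whether the default port is already in use on a host, assuming the `ss` utility is available (a sketch, not part of the original procedure):
-
-    ```shell
-    # List listening TCP sockets and filter for port 5432; no output means the port is free
-    $ ss -lnt | grep 5432
-    ```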
\ No newline at end of file
