http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/query/defining-queries.html.md.erb
----------------------------------------------------------------------
diff --git a/query/defining-queries.html.md.erb 
b/query/defining-queries.html.md.erb
deleted file mode 100644
index b796511..0000000
--- a/query/defining-queries.html.md.erb
+++ /dev/null
@@ -1,528 +0,0 @@
----
-title: Defining Queries
----
-
-HAWQ is based on the PostgreSQL implementation of the SQL standard. SQL 
commands are typically entered using the standard PostgreSQL interactive 
terminal `psql`, but other programs that have similar functionality can be used 
as well.
-
-
-## <a id="topic3"></a>SQL Lexicon
-
-SQL is a standard language for accessing databases. The language consists of 
elements that enable data storage, retrieval, analysis, viewing, and so on. You 
use SQL commands to construct queries and commands that the HAWQ engine 
understands.
-
-SQL queries consist of a sequence of commands. Commands consist of a sequence 
of valid tokens in correct syntax order, terminated by a semicolon (`;`).
-
-HAWQ uses PostgreSQL's structure and syntax, with some exceptions. For more 
information about SQL rules and concepts in PostgreSQL, see "SQL Syntax" in the 
PostgreSQL documentation.
-
-## <a id="topic4"></a>SQL Value Expressions
-
-SQL value expressions consist of one or more values, symbols, operators, SQL 
functions, and data. The expressions compare data or perform calculations and 
return a value as the result. Calculations include logical, arithmetic, and set 
operations.
-
-The following are value expressions:
-
--   Aggregate expressions
--   Array constructors
--   Column references
--   Constant or literal values
--   Correlated subqueries
--   Field selection expressions
--   Function calls
--   New column values in an `INSERT`
--   Operator invocation column references
--   Positional parameter references, in the body of a function definition or 
prepared statement
--   Row constructors
--   Scalar subqueries
--   Search conditions in a `WHERE` clause
--   Target lists of a `SELECT` command
--   Type casts
--   Value expressions in parentheses, useful to group sub-expressions and 
override precedence
--   Window expressions
-
-SQL constructs such as functions and operators are expressions but do not 
follow any general syntax rules. For more information about these constructs, 
see [Using Functions and Operators](functions-operators.html#topic26).
-
-### <a id="topic5"></a>Column References
-
-A column reference has the form:
-
-```
-correlation.columnname
-```
-
-Here, `correlation` is the name of a table (possibly qualified with a schema 
name) or an alias for a table defined with a `FROM` clause or one of the 
keywords `NEW` or `OLD`. `NEW` and `OLD` can appear only in rewrite rules, but 
you can use other correlation names in any SQL statement. If the column name is 
unique across all tables in the query, you can omit the "`correlation.`" part 
of the column reference.
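-
-For example, assuming a hypothetical table `employees` with a `salary`
-column, both of the following queries reference the same column; the second
-omits the "`correlation.`" part because the column name is unambiguous:
-
-``` sql
-SELECT employees.salary FROM employees;
-SELECT salary FROM employees;
-```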
-
-### <a id="topic6"></a>Positional Parameters
-
-Positional parameters are arguments to SQL statements or functions that you 
reference by their positions in a series of arguments. For example, `$1` refers 
to the first argument, `$2` to the second argument, and so on. The values of 
positional parameters are set from arguments external to the SQL statement or 
supplied when SQL functions are invoked. Some client libraries support 
specifying data values separately from the SQL command, in which case 
parameters refer to the out-of-line data values. A parameter reference has the 
form:
-
-```
-$number
-```
-
-For example:
-
-``` pre
-CREATE FUNCTION dept(text) RETURNS dept
-    AS $$ SELECT * FROM dept WHERE name = $1 $$
-    LANGUAGE SQL;
-```
-
-Here, the `$1` references the value of the first function argument whenever 
the function is invoked.
-
-### <a id="topic7"></a>Subscripts
-
-If an expression yields a value of an array type, you can extract a specific 
element of the array value as follows:
-
-``` pre
-expression[subscript]
-```
-
-You can extract multiple adjacent elements, called an array slice, as follows 
(including the brackets):
-
-``` pre
-expression[lower_subscript:upper_subscript]
-```
-
-Each subscript is an expression and yields an integer value.
-
-Array expressions usually must be in parentheses, but you can omit the 
parentheses when the expression to be subscripted is a column reference or 
positional parameter. You can concatenate multiple subscripts when the original 
array is multidimensional. For example (including the parentheses):
-
-``` pre
-mytable.arraycolumn[4]
-```
-
-``` pre
-mytable.two_d_column[17][34]
-```
-
-``` pre
-$1[10:42]
-```
-
-``` pre
-(arrayfunction(a,b))[42]
-```
-
-### <a id="topic8"></a>Field Selections
-
-If an expression yields a value of a composite type (row type), you can 
extract a specific field of the row as follows:
-
-```
-expression.fieldname
-```
-
-The row expression usually must be in parentheses, but you can omit these 
parentheses when the expression to be selected from is a table reference or 
positional parameter. For example:
-
-``` pre
-mytable.mycolumn
-```
-
-``` pre
-$1.somecolumn
-```
-
-``` pre
-(rowfunction(a,b)).col3
-```
-
-A qualified column reference is a special case of field selection syntax.
-
-### <a id="topic9"></a>Operator Invocations
-
-Operator invocations have the following possible syntaxes:
-
-``` pre
-expression operator expression   (binary infix operator)
-```
-
-``` pre
-operator expression   (unary prefix operator)
-```
-
-``` pre
-expression operator   (unary postfix operator)
-```
-
-Where *operator* is an operator token, one of the key words `AND`, `OR`, or 
`NOT`, or a qualified operator name in the form:
-
-``` pre
-OPERATOR(schema.operatorname)
-```
-
-The available operators, and whether they are unary or binary, depend on the 
operators that the system or user defines. For more information about built-in 
operators, see [Built-in Functions and 
Operators](functions-operators.html#topic29).
-
-### <a id="topic10"></a>Function Calls
-
-The syntax for a function call is the name of a function (possibly qualified 
with a schema name), followed by its argument list enclosed in parentheses:
-
-``` pre
-function ([expression [, expression ... ]])
-```
-
-For example, the following function call computes the square root of 2:
-
-``` pre
-sqrt(2)
-```
-
-### <a id="topic11"></a>Aggregate Expressions
-
-An aggregate expression applies an aggregate function across the rows that a 
query selects. An aggregate function performs a calculation on a set of values 
and returns a single value, such as the sum or average of the set of values. 
The syntax of an aggregate expression is one of the following:
-
--   `aggregate_name (expression [ , ... ] ) [FILTER (WHERE condition)]` — 
operates across all input rows for which the expected result value is 
non-null. `ALL` is the default.
--   `aggregate_name (ALL expression [ , ... ] ) [FILTER (WHERE condition)]` 
— operates identically to the first form because `ALL` is the default.
--   `aggregate_name (DISTINCT expression [ , ... ] ) [FILTER (WHERE 
condition)]` — operates across all distinct non-null values of input rows.
--   `aggregate_name (*) [FILTER (WHERE condition)]` — operates on all rows, 
with values both null and non-null. Generally, this form is most useful for 
the `count(*)` aggregate function.
-
-Where *aggregate\_name* is a previously defined aggregate (possibly 
schema-qualified) and *expression* is any value expression that does not 
contain an aggregate expression.
-
-For example, `count(*)` yields the total number of input rows, `count(f1)` 
yields the number of input rows in which `f1` is non-null, and 
`count(distinct f1)` yields the number of distinct non-null values of `f1`.
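-
-The three forms can be compared in a single query; this sketch assumes a
-hypothetical table `mytab` with a nullable column `f1`:
-
-``` sql
-SELECT count(*)           AS total_rows,
-       count(f1)          AS non_null_f1,
-       count(DISTINCT f1) AS distinct_f1
-FROM mytab;
-```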
-
-You can specify a condition with the `FILTER` clause to limit the input rows 
to the aggregate function. For example:
-
-``` sql
-SELECT count(*) FILTER (WHERE gender='F') FROM employee;
-```
-
-The `WHERE condition` of the `FILTER` clause cannot contain a set-returning 
function, subquery, window function, or outer reference. If you use a 
user-defined aggregate function, declare the state transition function as 
`STRICT` (see `CREATE AGGREGATE`).
-
-For predefined aggregate functions, see [Built-in Functions and 
Operators](functions-operators.html#topic29). You can also add custom aggregate 
functions.
-
-HAWQ provides the `MEDIAN` aggregate function, which returns the fiftieth 
percentile of the `PERCENTILE_CONT` result, as well as special aggregate 
expressions for inverse distribution functions:
-
-``` sql
-PERCENTILE_CONT(percentage) WITHIN GROUP (ORDER BY expression)
-```
-
-``` sql
-PERCENTILE_DISC(percentage) WITHIN GROUP (ORDER BY expression)
-```
-
-Currently you can use only these two expressions with the keyword 
`WITHIN GROUP`.
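-
-For example, the following hypothetical queries (they assume a `sales` table
-with a `price` column) return the median sale price, first with `MEDIAN` and
-then with the equivalent `PERCENTILE_CONT` expression:
-
-``` sql
-SELECT MEDIAN(price) FROM sales;
-SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY price) FROM sales;
-```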
-
-#### <a id="topic12"></a>Limitations of Aggregate Expressions
-
-The following are current limitations of the aggregate expressions:
-
--   HAWQ does not support the following keywords: `ALL`, `DISTINCT`, 
`FILTER`, and `OVER`. See [Advanced Aggregate 
Functions](functions-operators.html#topic31__in2073121) for more details.
--   An aggregate expression can appear only in the result list or `HAVING` 
clause of a `SELECT` command. It is forbidden in other clauses, such as 
`WHERE`, because those clauses are logically evaluated before the results of 
aggregates form. This restriction applies to the query level to which the 
aggregate belongs.
--   When an aggregate expression appears in a subquery, the aggregate is 
normally evaluated over the rows of the subquery. If the aggregate's arguments 
contain only outer-level variables, the aggregate belongs to the nearest such 
outer level and evaluates over the rows of that query. The aggregate expression 
as a whole is then an outer reference for the subquery in which it appears, and 
the aggregate expression acts as a constant over any one evaluation of that 
subquery. See [Scalar Subqueries](#topic15) and [Built-in functions and 
operators](functions-operators.html#topic29__in204913).
--   HAWQ does not support `DISTINCT` with multiple input expressions.
-
-### <a id="topic13"></a>Window Expressions
-
-Window expressions allow application developers to more easily compose complex 
online analytical processing (OLAP) queries using standard SQL commands. For 
example, with window expressions, users can calculate moving averages or sums 
over various intervals, reset aggregations and ranks as selected column values 
change, and express complex ratios in simple terms.
-
-A window expression represents the application of a *window function* to a 
*window frame*, which is defined in a special `OVER()` clause. A window 
partition is a set of rows that are grouped together to apply a window 
function. Unlike aggregate functions, which return a result value for each 
group of rows, window functions return a result value for every row, but that 
value is calculated with respect to the rows in a particular window partition. 
If no partition is specified, the window function is computed over the complete 
intermediate result set.
-
-The syntax of a window expression is:
-
-``` pre
-window_function ( [expression [, ...]] ) OVER ( window_specification )
-```
-
-Where *`window_function`* is one of the functions listed in [Window 
functions](functions-operators.html#topic30__in164369), *`expression`* is any 
value expression that does not contain a window expression, and 
*`window_specification`* is:
-
-```
-[window_name]
-[PARTITION BY expression [, ...]]
-[[ORDER BY expression [ASC | DESC | USING operator] [, ...]
-    [{RANGE | ROWS} 
-       { UNBOUNDED PRECEDING
-       | expression PRECEDING
-       | CURRENT ROW
-       | BETWEEN window_frame_bound AND window_frame_bound }]]
-```
-
-and where `window_frame_bound` can be one of:
-
-``` 
-    UNBOUNDED PRECEDING
-    expression PRECEDING
-    CURRENT ROW
-    expression FOLLOWING
-    UNBOUNDED FOLLOWING
-```
-
-A window expression can appear only in the select list of a `SELECT` command. 
For example:
-
-``` sql
-SELECT count(*) OVER(PARTITION BY customer_id), * FROM sales;
-```
-
-The `OVER` clause differentiates window functions from other aggregate or 
reporting functions. The `OVER` clause defines the *`window_specification`* to 
which the window function is applied. A window specification has the following 
characteristics:
-
--   The `PARTITION BY` clause defines the window partitions to which the 
window function is applied. If omitted, the entire result set is treated as one 
partition.
--   The `ORDER BY` clause defines the expression(s) for sorting rows within a 
window partition. The `ORDER BY` clause of a window specification is separate 
and distinct from the `ORDER BY` clause of a regular query expression. The 
`ORDER BY` clause is required for the window functions that calculate rankings, 
as it identifies the measure(s) for the ranking values. For OLAP aggregations, 
the `ORDER BY` clause is required to use window frames (the `ROWS` | `RANGE` 
clause).
-
-**Note:** Columns of data types without a coherent ordering, such as `time`, 
are not good candidates for use in the `ORDER BY` clause of a window 
specification. `Time`, with or without a specified time zone, lacks a coherent 
ordering because addition and subtraction do not have the expected effects. For 
example, the following is not generally true: `x::time < x::time + '2 
hour'::interval`
-
--   The `ROWS/RANGE` clause defines a window frame for aggregate (non-ranking) 
window functions. A window frame defines a set of rows within a window 
partition. When a window frame is defined, the window function computes on the 
contents of this moving frame rather than the fixed contents of the entire 
window partition. Window frames are row-based (`ROWS`) or value-based (`RANGE`).
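-
-As a sketch of a row-based window frame, the following hypothetical query
-(assuming a `sales` table with `sale_date` and `amount` columns) computes a
-three-row moving average over rows ordered by date:
-
-``` sql
-SELECT sale_date, amount,
-       avg(amount) OVER (ORDER BY sale_date
-                         ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
-FROM sales;
-```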
-
-### <a id="topic14"></a>Type Casts
-
-A type cast specifies a conversion from one data type to another. HAWQ accepts 
two equivalent syntaxes for type casts:
-
-``` sql
-CAST ( expression AS type )
-expression::type
-```
-
-The `CAST` syntax conforms to SQL; the syntax with `::` is historical 
PostgreSQL usage.
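-
-For example, both of the following expressions convert the string literal
-`'100'` to an integer; the first uses the SQL-conforming syntax and the
-second the historical PostgreSQL syntax:
-
-``` sql
-SELECT CAST('100' AS integer);
-SELECT '100'::integer;
-```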
-
-A cast applied to a value expression of a known type is a run-time type 
conversion. The cast succeeds only if a suitable type conversion function is 
defined. This differs from the use of casts with constants. A cast applied to a 
string literal represents the initial assignment of a type to a literal 
constant value, so it succeeds for any type if the contents of the string 
literal are acceptable input syntax for the data type.
-
-You can usually omit an explicit type cast if there is no ambiguity about the 
type a value expression must produce; for example, when it is assigned to a 
table column, the system automatically applies a type cast. The system applies 
automatic casting only to casts marked "OK to apply implicitly" in system 
catalogs. Other casts must be invoked with explicit casting syntax to prevent 
unexpected conversions from being applied without the user's knowledge.
-
-### <a id="topic15"></a>Scalar Subqueries
-
-A scalar subquery is a `SELECT` query in parentheses that returns exactly one 
row with one column. Do not use a `SELECT` query that returns multiple rows or 
columns as a scalar subquery. The query runs and uses the returned value in the 
surrounding value expression. A correlated scalar subquery contains references 
to the outer query block.
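-
-For example, the following hypothetical query (assuming `employees` and
-`departments` tables) uses a correlated scalar subquery to return each
-department name with its maximum salary:
-
-``` sql
-SELECT d.name,
-       (SELECT max(e.salary) FROM employees e
-        WHERE e.dept_id = d.id) AS top_salary
-FROM departments d;
-```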
-
-### <a id="topic16"></a>Correlated Subqueries
-
-A correlated subquery (CSQ) is a `SELECT` query with a `WHERE` clause or 
target list that contains references to the parent outer clause. CSQs 
efficiently express results in terms of results of another query. HAWQ supports 
correlated subqueries that provide compatibility with many existing 
applications. A CSQ is a scalar or table subquery, depending on whether it 
returns one or multiple rows. HAWQ does not support correlated subqueries with 
skip-level correlations.
-
-### <a id="topic17"></a>Correlated Subquery Examples
-
-#### <a id="topic18"></a>Example 1 – Scalar correlated subquery
-
-``` sql
-SELECT * FROM t1 WHERE t1.x 
-> (SELECT MAX(t2.x) FROM t2 WHERE t2.y = t1.y);
-```
-
-#### <a id="topic19"></a>Example 2 – Correlated EXISTS subquery
-
-``` sql
-SELECT * FROM t1 WHERE 
-EXISTS (SELECT 1 FROM t2 WHERE t2.x = t1.x);
-```
-
-HAWQ uses one of the following methods to run CSQs:
-
--   Unnest the CSQ into join operations – This method is most efficient, and 
it is how HAWQ runs most CSQs, including queries from the TPC-H benchmark.
--   Run the CSQ on every row of the outer query – This method is relatively 
inefficient, and it is how HAWQ runs queries that contain CSQs in the `SELECT` 
list or are connected by `OR` conditions.
-
-The following examples illustrate how to rewrite some of these types of 
queries to improve performance.
-
-#### <a id="topic20"></a>Example 3 - CSQ in the Select List
-
-*Original Query*
-
-``` sql
-SELECT T1.a,
-(SELECT COUNT(DISTINCT T2.z) FROM t2 WHERE t1.x = t2.y) dt2 
-FROM t1;
-```
-
-Rewrite this query to perform an inner join with `t1` first and then perform a 
left join with `t1` again. The rewrite applies for only an equijoin in the 
correlated condition.
-
-*Rewritten Query*
-
-``` sql
-SELECT t1.a, dt2 FROM t1 
-LEFT JOIN 
-(SELECT t2.y AS csq_y, COUNT(DISTINCT t2.z) AS dt2 
-FROM t1, t2 WHERE t1.x = t2.y 
-GROUP BY t1.x) csq 
-ON (t1.x = csq_y);
-```
-
-#### <a id="topic21"></a>Example 4 - CSQs connected by OR Clauses
-
-*Original Query*
-
-``` sql
-SELECT * FROM t1 
-WHERE 
-x > (SELECT COUNT(*) FROM t2 WHERE t1.x = t2.x) 
-OR x < (SELECT COUNT(*) FROM t3 WHERE t1.y = t3.y)
-```
-
-Rewrite this query to separate it into two parts with a union on the `OR` 
conditions.
-
-*Rewritten Query*
-
-``` sql
-SELECT * FROM t1 
-WHERE x > (SELECT count(*) FROM t2 WHERE t1.x = t2.x) 
-UNION 
-SELECT * FROM t1 
-WHERE x < (SELECT count(*) FROM t3 WHERE t1.y = t3.y)
-```
-
-To view the query plan, use `EXPLAIN SELECT` or `EXPLAIN ANALYZE SELECT`. 
Subplan nodes in the query plan indicate that the query will run on 
every row of the outer query, and the query is a candidate for rewriting. For 
more information about these statements, see [Query 
Profiling](query-profiling.html#topic39).
-
-### <a id="topic22"></a>Advanced Table Functions
-
-HAWQ supports table functions with `TABLE` value expressions. You can sort 
input rows for advanced table functions with an `ORDER BY` clause, and you can 
redistribute them with a `SCATTER BY` clause, which specifies one or more 
columns or an expression so that rows with the same values are available to 
the same process. This usage is similar to using a `DISTRIBUTED BY` clause 
when creating a table, but the redistribution occurs when the query runs.
-
-**Note:**
-Based on the distribution of data, HAWQ automatically parallelizes table 
functions with `TABLE` value parameters over the nodes of the cluster.
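-
-As a sketch of this syntax (the table function `transform` and the table `t`
-are hypothetical), the following query sorts and redistributes the input
-rows before the function consumes them:
-
-``` sql
-SELECT * FROM transform(
-    TABLE( SELECT * FROM t ORDER BY id SCATTER BY id )
-);
-```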
-
-### <a id="topic23"></a>Array Constructors
-
-An array constructor is an expression that builds an array value from values 
for its member elements. A simple array constructor consists of the key word 
`ARRAY`, a left square bracket `[`, one or more expressions separated by commas 
for the array element values, and a right square bracket `]`. For example,
-
-``` sql
-SELECT ARRAY[1,2,3+4];
-```
-
-```
-  array
----------
- {1,2,7}
-```
-
-The array element type is the common type of its member expressions, 
determined using the same rules as for `UNION` or `CASE` constructs.
-
-You can build multidimensional array values by nesting array constructors. In 
the inner constructors, you can omit the keyword `ARRAY`. For example, the 
following two `SELECT` statements produce the same result:
-
-``` sql
-SELECT ARRAY[ARRAY[1,2], ARRAY[3,4]];
-SELECT ARRAY[[1,2],[3,4]];
-```
-
-```
-     array
----------------
- {{1,2},{3,4}}
-```
-
-Since multidimensional arrays must be rectangular, inner constructors at the 
same level must produce sub-arrays of identical dimensions.
-
-Multidimensional array constructor elements are not limited to a sub-`ARRAY` 
construct; they are anything that produces an array of the proper kind. For 
example:
-
-``` sql
-CREATE TABLE arr(f1 int[], f2 int[]);
-INSERT INTO arr VALUES (ARRAY[[1,2],[3,4]], 
-ARRAY[[5,6],[7,8]]);
-SELECT ARRAY[f1, f2, '{{9,10},{11,12}}'::int[]] FROM arr;
-```
-
-```
-                     array
-------------------------------------------------
- {{{1,2},{3,4}},{{5,6},{7,8}},{{9,10},{11,12}}}
-```
-
-You can construct an array from the results of a subquery. Write the array 
constructor with the keyword `ARRAY` followed by a subquery in parentheses. For 
example:
-
-``` sql
-SELECT ARRAY(SELECT oid FROM pg_proc WHERE proname LIKE 'bytea%');
-```
-
-```
-                          ?column?
------------------------------------------------------------
- {2011,1954,1948,1952,1951,1244,1950,2005,1949,1953,2006,31}
-```
-
-The subquery must return a single column. The resulting one-dimensional array 
has an element for each row in the subquery result, with an element type 
matching that of the subquery's output column. The subscripts of an array value 
built with `ARRAY` always begin with `1`.
-
-### <a id="topic24"></a>Row Constructors
-
-A row constructor is an expression that builds a row value (also called a 
composite value) from values for its member fields. For example,
-
-``` sql
-SELECT ROW(1,2.5,'this is a test');
-```
-
-Row constructors have the syntax `rowvalue.*`, which expands to a list of the 
elements of the row value, as when you use the syntax `.*` at the top level of 
a `SELECT` list. For example, if table `t` has columns `f1` and `f2`, the 
following queries are the same:
-
-``` sql
-SELECT ROW(t.*, 42) FROM t;
-SELECT ROW(t.f1, t.f2, 42) FROM t;
-```
-
-By default, the value created by a `ROW` expression has an anonymous record 
type. If necessary, it can be cast to a named composite type — either the row 
type of a table, or a composite type created with `CREATE TYPE AS`. To avoid 
ambiguity, you can explicitly cast the value if necessary. For example:
-
-``` sql
-CREATE TABLE mytable(f1 int, f2 float, f3 text);
-CREATE FUNCTION getf1(mytable) RETURNS int AS 'SELECT $1.f1' 
-LANGUAGE SQL;
-```
-
-In the following query, you do not need to cast the value because there is 
only one `getf1()` function and therefore no ambiguity:
-
-``` sql
-SELECT getf1(ROW(1,2.5,'this is a test'));
-```
-
-```
- getf1
--------
-     1
-```
-
-``` sql
-CREATE TYPE myrowtype AS (f1 int, f2 text, f3 numeric);
-CREATE FUNCTION getf1(myrowtype) RETURNS int AS 'SELECT 
-$1.f1' LANGUAGE SQL;
-```
-
-Now we need a cast to indicate which function to call:
-
-``` sql
-SELECT getf1(ROW(1,2.5,'this is a test'));
-```
-```
-ERROR:  function getf1(record) is not unique
-```
-
-``` sql
-SELECT getf1(ROW(1,2.5,'this is a test')::mytable);
-```
-
-```
- getf1
--------
-     1
-```
-
-``` sql
-SELECT getf1(CAST(ROW(11,'this is a test',2.5) AS myrowtype));
-```
-
-```
- getf1
--------
-    11
-```
-
-You can use row constructors to build composite values to be stored in a 
composite-type table column or to be passed to a function that accepts a 
composite parameter.
-
-### <a id="topic25"></a>Expression Evaluation Rules
-
-The order of evaluation of subexpressions is undefined. The inputs of an 
operator or function are not necessarily evaluated left-to-right or in any 
other fixed order.
-
-If you can determine the result of an expression by evaluating only some parts 
of the expression, then other subexpressions might not be evaluated at all. For 
example, in the following expression:
-
-``` sql
-SELECT true OR somefunc();
-```
-
-`somefunc()` would probably not be called at all. The same is true in the 
following expression:
-
-``` sql
-SELECT somefunc() OR true;
-```
-
-This is not the same as the left-to-right evaluation order that Boolean 
operators enforce in some programming languages.
-
-Do not use functions with side effects as part of complex expressions, 
especially in `WHERE` and `HAVING` clauses, because those clauses are 
extensively reprocessed when developing an execution plan. Boolean expressions 
(`AND`/`OR`/`NOT` combinations) in those clauses can be reorganized in any 
manner that Boolean algebra laws allow.
-
-Use a `CASE` construct to force evaluation order. The following example is an 
untrustworthy way to avoid division by zero in a `WHERE` clause:
-
-``` sql
-SELECT ... WHERE x <> 0 AND y/x > 1.5;
-```
-
-The following example shows a trustworthy evaluation order:
-
-``` sql
-SELECT ... WHERE CASE WHEN x <> 0 THEN y/x > 1.5 ELSE false 
-END;
-```
-
-This `CASE` construct usage defeats optimization attempts; use it only when 
necessary.
-
-

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/query/functions-operators.html.md.erb
----------------------------------------------------------------------
diff --git a/query/functions-operators.html.md.erb 
b/query/functions-operators.html.md.erb
deleted file mode 100644
index 8f14ee6..0000000
--- a/query/functions-operators.html.md.erb
+++ /dev/null
@@ -1,437 +0,0 @@
----
-title: Using Functions and Operators
----
-
-HAWQ evaluates functions and operators used in SQL expressions.
-
-## <a id="topic27"></a>Using Functions in HAWQ
-
-In HAWQ, functions can be run only on the master.
-
-<a id="topic27__in201681"></a>
-
-<span class="tablecap">Table 1. Functions in HAWQ</span>
-
-
-| Function Type | HAWQ Support | Description | Comments |
-|---------------|--------------|-------------|----------|
-| IMMUTABLE | Yes | Relies only on information directly in its argument list. Given the same argument values, always returns the same result. | |
-| STABLE | Yes, in most cases | Within a single table scan, returns the same result for same argument values, but results change across SQL statements. | Results depend on database lookups or parameter values. The `current_timestamp` family of functions is `STABLE`; values do not change within an execution. |
-| VOLATILE | Restricted | Function values can change within a single table scan. For example: `random()`, `currval()`, `timeofday()`. | Any function with side effects is volatile, even if its result is predictable. For example: `setval()`. |
-
-HAWQ does not support functions that return a table reference (`rangeFuncs`) 
or functions that use the `refCursor` datatype.
-
-## <a id="topic28"></a>User-Defined Functions
-
-HAWQ supports user-defined functions. See [Extending 
SQL](http://www.postgresql.org/docs/8.2/static/extend.html) in the PostgreSQL 
documentation for more information.
-
-In HAWQ, the shared library files for user-created functions must reside in 
the same library path location on every host in the HAWQ array (masters and 
segments).
-
-**Important:**
-HAWQ does not support the following:
-
--   Enhanced table functions
--   PL/Java Type Maps
-
-
-Use the `CREATE FUNCTION` statement to register user-defined functions that 
are used as described in [Using Functions in HAWQ](#topic27). By default, 
user-defined functions are declared as `VOLATILE`, so if your user-defined 
function is `IMMUTABLE` or `STABLE`, you must specify the correct volatility 
level when you register your function.
-
-### <a id="functionvolatility"></a>Function Volatility
-
-Every function has a **volatility** classification, with the possibilities 
being `VOLATILE`, `STABLE`, or `IMMUTABLE`. `VOLATILE` is the default if the 
[CREATE FUNCTION](../reference/sql/CREATE-FUNCTION.html) command does not 
specify a category. The volatility category is a promise to the optimizer about 
the behavior of the function:
-
--   A `VOLATILE` function can do anything, including modifying the database. 
It can return different results on successive calls with the same arguments. 
The optimizer makes no assumptions about the behavior of such functions. A 
query using a volatile function will re-evaluate the function at every row 
where its value is needed.
--   A `STABLE` function cannot modify the database and is guaranteed to return 
the same results given the same arguments for all rows within a single 
statement. This category allows the optimizer to optimize multiple calls of the 
function to a single call.
--   An `IMMUTABLE` function cannot modify the database and is guaranteed to 
return the same results given the same arguments forever. This category allows 
the optimizer to pre-evaluate the function when a query calls it with constant 
arguments. For example, a query like `SELECT ... WHERE x = 2 + 2` can be 
simplified on sight to `SELECT ... WHERE x = 4`, because the function underlying 
the integer addition operator is marked `IMMUTABLE`.
-
-For best optimization results, you should label your functions with the 
strictest volatility category that is valid for them.
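-
-For example, a pure arithmetic function can safely be declared `IMMUTABLE`
-when it is registered (the function itself is hypothetical):
-
-``` sql
-CREATE FUNCTION add_tax(price numeric) RETURNS numeric AS
-    $$ SELECT $1 * 1.08 $$
-    LANGUAGE SQL IMMUTABLE;
-```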
-
-Any function with side effects must be labeled `VOLATILE`, so that calls to 
it cannot be optimized away. Even a function with no side effects needs to be 
labeled `VOLATILE` if its value can change within a single query; some examples 
are `random()`, `currval()`, `timeofday()`.
-
-Another important example is that the `current_timestamp` family of 
functions qualifies as `STABLE`, since their values do not change within a 
transaction.
-
-There is relatively little difference between the `STABLE` and `IMMUTABLE` 
categories when considering simple interactive queries that are planned and 
immediately executed: it doesn't matter a lot whether a function is executed 
once during planning or once during query execution startup. But there is a big 
difference if the plan is saved and reused later. Labeling a function 
`IMMUTABLE` when it really isn't might allow it to be prematurely folded to a 
constant during planning, resulting in a stale value being re-used during 
subsequent uses of the plan. This is a hazard when using prepared statements or 
when using function languages that cache plans (such as PL/pgSQL).
-
-For functions written in SQL or in any of the standard procedural languages, 
there is a second important property determined by the volatility category, 
namely the visibility of any data changes that have been made by the SQL 
command that is calling the function. A `VOLATILE` function will see such 
changes, a `STABLE` or `IMMUTABLE` function will not. `STABLE` and `IMMUTABLE` 
functions use a snapshot established as of the start of the calling query, 
whereas `VOLATILE` functions obtain a fresh snapshot at the start of each query 
they execute.
-
-Because of this snapshotting behavior, a function containing only `SELECT` 
commands can safely be marked `STABLE`, even if it selects from tables that 
might be undergoing modifications by concurrent queries. PostgreSQL will 
execute all commands of a `STABLE` function using the snapshot established for 
the calling query, and so it will see a fixed view of the database throughout 
that query.
-
-The same snapshotting behavior is used for `SELECT` commands within 
`IMMUTABLE` functions. It is generally unwise to select from database tables 
within an `IMMUTABLE` function at all, since the immutability will be broken if 
the table contents ever change. However, PostgreSQL does not enforce that you 
do not do that.
-
-A common error is to label a function IMMUTABLE when its results depend on a 
configuration parameter. For example, a function that manipulates timestamps 
might well have results that depend on the timezone setting. For safety, such 
functions should be labeled STABLE instead.
-
-When you create user-defined functions, avoid using fatal errors or 
destructive calls. HAWQ may respond to such errors with a sudden shutdown or 
restart.
-
-### <a id="nestedUDFs"></a>Nested Function Query Limitations
-
-HAWQ queries that invoke nested user-defined functions fail when they are 
dispatched to segment nodes. 
-
-HAWQ stores the system catalog only on the master node. User-defined functions 
are stored in system catalog tables. HAWQ has no built-in knowledge about how 
to interpret the source text of a user-defined function. Consequently, the text 
is not parsed by HAWQ.
-
-This behavior is problematic for queries in which a user-defined function 
invokes another, nested user-defined function. When a query includes a 
user-defined function, the metadata passed to the query executor includes 
function invocation information. If the query runs on the HAWQ master node, 
the nested function is recognized. If the query is dispatched to a segment, 
the nested function is not found and the query throws an error.
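-
-A minimal illustration of the limitation (function and table names are 
hypothetical):
-
-``` sql
-CREATE FUNCTION inner_fn(i int) RETURNS int
-    AS 'SELECT $1 + 1' LANGUAGE SQL IMMUTABLE;
-CREATE FUNCTION outer_fn(i int) RETURNS int
-    AS 'SELECT inner_fn($1)' LANGUAGE SQL IMMUTABLE;
-
-SELECT outer_fn(1);          -- evaluated on the master: succeeds
-SELECT outer_fn(a) FROM t;   -- dispatched to segments, where inner_fn
-                             -- cannot be resolved: the query errors out
-```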
-
-## <a id="userdefinedtypes"></a>User Defined Types
-
-HAWQ can be extended to support new data types. This section describes how to 
define new base types, which are data types defined below the level of the 
SQL language. Creating a new base type requires implementing functions to 
operate on the type in a low-level language, usually C.
-
-A user-defined type must always have input and output functions.  These 
functions determine how the type appears in strings (for input by the user and 
output to the user) and how the type is organized in memory. The input function 
takes a null-terminated character string as its argument and returns the 
internal (in memory) representation of the type. The output function takes the 
internal representation of the type as argument and returns a null-terminated 
character string. If we want to do anything more with the type than merely 
store it, we must provide additional functions to implement whatever operations 
we'd like to have for the type.
-
-You should be careful to make the input and output functions inverses of each 
other. If you do not, you will have severe problems when you need to dump your 
data into a file and then read it back in. This is a particularly common 
problem when floating-point numbers are involved.
-
-Optionally, a user-defined type can provide binary input and output routines. 
Binary I/O is normally faster but less portable than textual I/O. As with 
textual I/O, it is up to you to define exactly what the external binary 
representation is. Most of the built-in data types try to provide a 
machine-independent binary representation. 
-
-Once we have written the I/O functions and compiled them into a shared 
library, we can define the complex type in SQL. First we declare it as a 
shell type:
-
-``` sql
-CREATE TYPE complex;
-```
-
-This serves as a placeholder that allows us to reference the type while 
defining its I/O functions. Now we can define the I/O functions:
-
-``` sql
-CREATE FUNCTION complex_in(cstring)
-    RETURNS complex
-    AS 'filename'
-    LANGUAGE C IMMUTABLE STRICT;
-
-CREATE FUNCTION complex_out(complex)
-    RETURNS cstring
-    AS 'filename'
-    LANGUAGE C IMMUTABLE STRICT;
-
-CREATE FUNCTION complex_recv(internal)
-   RETURNS complex
-   AS 'filename'
-   LANGUAGE C IMMUTABLE STRICT;
-
-CREATE FUNCTION complex_send(complex)
-   RETURNS bytea
-   AS 'filename'
-   LANGUAGE C IMMUTABLE STRICT;
-```
-
-Finally, we can provide the full definition of the data type:
-
-``` sql
-CREATE TYPE complex (
-   internallength = 16, 
-   input = complex_in,
-   output = complex_out,
-   receive = complex_recv,
-   send = complex_send,
-   alignment = double
-);
-```
-
-When you define a new base type, HAWQ automatically provides support for 
arrays of that type. For historical reasons, the array type has the same name 
as the base type with the underscore character (\_) prepended.
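-
-For example, once the `complex` type above exists, array columns can be 
declared without any further setup (the table name is illustrative):
-
-``` sql
-CREATE TABLE test_complex2 (
-    id   int,
-    val  complex,
-    vals complex[]   -- backed by the automatically created _complex type
-);
-```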
-
-Once the data type exists, we can declare additional functions to provide 
useful operations on the data type. Operators can then be defined atop the 
functions, and if needed, operator classes can be created to support indexing 
of the data type. 
-
-For further details, see the description of the [CREATE 
TYPE](../reference/sql/CREATE-TYPE.html) command.
-
-## <a id="userdefinedoperators"></a>User Defined Operators
-
-Every operator is "syntactic sugar" for a call to an underlying function 
that does the real work; so you must first create the underlying function 
before you can create the operator. However, an operator is not merely 
syntactic sugar, because it carries additional information that helps the query 
planner optimize queries that use the operator. The next section will be 
devoted to explaining that additional information.
-
-HAWQ supports left unary, right unary, and binary operators. Operators can be 
overloaded; that is, the same operator name can be used for different 
operators that have different numbers and types of operands. When a query is 
executed, the system determines the operator to call from the number and types 
of the provided operands.
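-
-The built-in operators illustrate this resolution (the unary `@` operator 
computes the absolute value):
-
-``` sql
-SELECT 2 + 3;                      -- integer + integer
-SELECT interval '1 day' + now();   -- interval + timestamptz
-SELECT @ (-5.0);                   -- left unary operator
-```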
-
-Here is an example of creating an operator for adding two complex numbers. We 
assume we've already created the definition of type complex. First we need a 
function that does the work, then we can define the operator:
-
-``` sql
-CREATE FUNCTION complex_add(complex, complex)
-    RETURNS complex
-    AS 'filename', 'complex_add'
-    LANGUAGE C IMMUTABLE STRICT;
-
-CREATE OPERATOR + (
-    leftarg = complex,
-    rightarg = complex,
-    procedure = complex_add,
-    commutator = +
-);
-```
-
-Now we could execute a query like this:
-
-``` sql
-SELECT (a + b) AS c FROM test_complex;
-```
-
-```
-        c
------------------
- (5.2,6.05)
- (133.42,144.95)
-```
-
-We've shown how to create a binary operator here. To create unary operators, 
just omit one of `leftarg` (for left unary) or `rightarg` (for right unary). 
The `procedure` clause and the argument clauses are the only required items in 
`CREATE OPERATOR`. The `commutator` clause shown in the example is an optional 
hint to the query optimizer. Further details about `commutator` and other 
optimizer hints appear in the next section.
-
-## <a id="topic29"></a>Built-in Functions and Operators
-
-The following table lists the categories of built-in functions and operators 
supported by PostgreSQL. All functions and operators are supported in HAWQ as 
in PostgreSQL with the exception of `STABLE` and `VOLATILE` functions, which 
are subject to the restrictions noted in [Using Functions in HAWQ](#topic27). 
See the [Functions and 
Operators](http://www.postgresql.org/docs/8.2/static/functions.html) section of 
the PostgreSQL documentation for more information about these built-in 
functions and operators.
-
-<a id="topic29__in204913"></a>
-
-<table>
-<caption><span class="tablecap">Table 2. Built-in functions and 
operators</span></caption>
-<colgroup>
-<col width="33%" />
-<col width="33%" />
-<col width="33%" />
-</colgroup>
-<thead>
-<tr class="header">
-<th>Operator/Function Category</th>
-<th>VOLATILE Functions</th>
-<th>STABLE Functions</th>
-</tr>
-</thead>
-<tbody>
-<tr class="odd">
-<td><a href="http://www.postgresql.org/docs/8.2/static/functions.html#FUNCTIONS-LOGICAL">Logical Operators</a></td>
-<td> </td>
-<td> </td>
-</tr>
-<tr class="even">
-<td><a href="http://www.postgresql.org/docs/8.2/static/functions-comparison.html">Comparison Operators</a></td>
-<td> </td>
-<td> </td>
-</tr>
-<tr class="odd">
-<td><a href="http://www.postgresql.org/docs/8.2/static/functions-math.html">Mathematical Functions and Operators</a></td>
-<td>random
-<p>setseed</p></td>
-<td> </td>
-</tr>
-<tr class="even">
-<td><a href="http://www.postgresql.org/docs/8.2/static/functions-string.html">String Functions and Operators</a></td>
-<td><em>All built-in conversion functions</em></td>
-<td>convert
-<p>pg_client_encoding</p></td>
-</tr>
-<tr class="odd">
-<td><a href="http://www.postgresql.org/docs/8.2/static/functions-binarystring.html">Binary String Functions and Operators</a></td>
-<td> </td>
-<td> </td>
-</tr>
-<tr class="even">
-<td><a href="http://www.postgresql.org/docs/8.2/static/functions-bitstring.html">Bit String Functions and Operators</a></td>
-<td> </td>
-<td> </td>
-</tr>
-<tr class="odd">
-<td><a href="http://www.postgresql.org/docs/8.3/static/functions-matching.html">Pattern Matching</a></td>
-<td> </td>
-<td> </td>
-</tr>
-<tr class="even">
-<td><a href="http://www.postgresql.org/docs/8.2/static/functions-formatting.html">Data Type Formatting Functions</a></td>
-<td> </td>
-<td>to_char
-<p>to_timestamp</p></td>
-</tr>
-<tr class="odd">
-<td><a href="http://www.postgresql.org/docs/8.2/static/functions-datetime.html">Date/Time Functions and Operators</a></td>
-<td>timeofday</td>
-<td>age
-<p>current_date</p>
-<p>current_time</p>
-<p>current_timestamp</p>
-<p>localtime</p>
-<p>localtimestamp</p>
-<p>now</p></td>
-</tr>
-<tr class="even">
-<td><a href="http://www.postgresql.org/docs/8.2/static/functions-geometry.html">Geometric Functions and Operators</a></td>
-<td> </td>
-<td> </td>
-</tr>
-<tr class="odd">
-<td><a href="http://www.postgresql.org/docs/8.2/static/functions-net.html">Network Address Functions and Operators</a></td>
-<td> </td>
-<td> </td>
-</tr>
-<tr class="even">
-<td><a href="http://www.postgresql.org/docs/8.2/static/functions-sequence.html">Sequence Manipulation Functions</a></td>
-<td>currval
-<p>lastval</p>
-<p>nextval</p>
-<p>setval</p></td>
-<td> </td>
-</tr>
-<tr class="odd">
-<td><a href="http://www.postgresql.org/docs/8.2/static/functions-conditional.html">Conditional Expressions</a></td>
-<td> </td>
-<td> </td>
-</tr>
-<tr class="even">
-<td><a href="http://www.postgresql.org/docs/8.2/static/functions-array.html">Array Functions and Operators</a></td>
-<td> </td>
-<td><em>All array functions</em></td>
-</tr>
-<tr class="odd">
-<td><a href="http://www.postgresql.org/docs/8.2/static/functions-aggregate.html">Aggregate Functions</a></td>
-<td> </td>
-<td> </td>
-</tr>
-<tr class="even">
-<td><a href="http://www.postgresql.org/docs/8.2/static/functions-subquery.html">Subquery Expressions</a></td>
-<td> </td>
-<td> </td>
-</tr>
-<tr class="odd">
-<td><a href="http://www.postgresql.org/docs/8.2/static/functions-comparisons.html">Row and Array Comparisons</a></td>
-<td> </td>
-<td> </td>
-</tr>
-<tr class="even">
-<td><a href="http://www.postgresql.org/docs/8.2/static/functions-srf.html">Set Returning Functions</a></td>
-<td>generate_series</td>
-<td> </td>
-</tr>
-<tr class="odd">
-<td><a href="http://www.postgresql.org/docs/8.2/static/functions-info.html">System Information Functions</a></td>
-<td> </td>
-<td><em>All session information functions</em>
-<p><em>All access privilege inquiry functions</em></p>
-<p><em>All schema visibility inquiry functions</em></p>
-<p><em>All system catalog information functions</em></p>
-<p><em>All comment information functions</em></p></td>
-</tr>
-<tr class="even">
-<td><a href="http://www.postgresql.org/docs/8.2/static/functions-admin.html">System Administration Functions</a></td>
-<td>set_config
-<p>pg_cancel_backend</p>
-<p>pg_reload_conf</p>
-<p>pg_rotate_logfile</p>
-<p>pg_start_backup</p>
-<p>pg_stop_backup</p>
-<p>pg_size_pretty</p>
-<p>pg_ls_dir</p>
-<p>pg_read_file</p>
-<p>pg_stat_file</p></td>
-<td>current_setting
-<p><em>All database object size functions</em></p></td>
-</tr>
-<tr class="odd">
-<td><a href="http://www.postgresql.org/docs/9.1/interactive/functions-xml.html">XML Functions</a></td>
-<td> </td>
-<td>xmlagg(xml)
-<p>xmlexists(text, xml)</p>
-<p>xml_is_well_formed(text)</p>
-<p>xml_is_well_formed_document(text)</p>
-<p>xml_is_well_formed_content(text)</p>
-<p>xpath(text, xml)</p>
-<p>xpath(text, xml, text[])</p>
-<p>xpath_exists(text, xml)</p>
-<p>xpath_exists(text, xml, text[])</p>
-<p>xml(text)</p>
-<p>text(xml)</p>
-<p>xmlcomment(xml)</p>
-<p>xmlconcat2(xml, xml)</p></td>
-</tr>
-</tbody>
-</table>
-
-## <a id="topic30"></a>Window Functions
-
-The following built-in window functions are HAWQ extensions to the PostgreSQL 
database. All window functions are *immutable*. For more information about 
window functions, see [Window Expressions](defining-queries.html#topic13).
-
-<a id="topic30__in164369"></a>
-
-<span class="tablecap">Table 3. Window functions</span>
-
-| Function | Return Type | Full Syntax | Description |
-|----------|-------------|-------------|-------------|
-| `cume_dist()` | `double precision` | `CUME_DIST () OVER ( [PARTITION BY expr] ORDER BY expr )` | Calculates the cumulative distribution of a value in a group of values. Rows with equal values always evaluate to the same cumulative distribution value. |
-| `dense_rank()` | `bigint` | `DENSE_RANK () OVER ( [PARTITION BY expr] ORDER BY expr )` | Computes the rank of a row in an ordered group of rows without skipping rank values. Rows with equal values are given the same rank value. |
-| `first_value(expr)` | same as input *expr* type | `FIRST_VALUE( expr ) OVER ( [PARTITION BY expr] ORDER BY expr [ROWS\|RANGE frame_expr] )` | Returns the first value in an ordered set of values. |
-| `lag(expr [,offset] [,default])` | same as input *expr* type | `LAG( expr [, offset ] [, default ]) OVER ( [PARTITION BY expr] ORDER BY expr )` | Provides access to more than one row of the same table without doing a self join. Given a series of rows returned from a query and a position of the cursor, `LAG` provides access to a row at a given physical offset prior to that position. The default *offset* is 1. *default* sets the value that is returned if the offset goes beyond the scope of the window. If *default* is not specified, the default value is null. |
-| `last_value(expr)` | same as input *expr* type | `LAST_VALUE( expr ) OVER ( [PARTITION BY expr] ORDER BY expr [ROWS\|RANGE frame_expr] )` | Returns the last value in an ordered set of values. |
-| `lead(expr [,offset] [,default])` | same as input *expr* type | `LEAD( expr [, offset ] [, default ]) OVER ( [PARTITION BY expr] ORDER BY expr )` | Provides access to more than one row of the same table without doing a self join. Given a series of rows returned from a query and a position of the cursor, `lead` provides access to a row at a given physical offset after that position. If *offset* is not specified, the default offset is 1. *default* sets the value that is returned if the offset goes beyond the scope of the window. If *default* is not specified, the default value is null. |
-| `ntile(expr)` | `bigint` | `NTILE( expr ) OVER ( [PARTITION BY expr] ORDER BY expr )` | Divides an ordered data set into a number of buckets (as defined by *expr*) and assigns a bucket number to each row. |
-| `percent_rank()` | `double precision` | `PERCENT_RANK () OVER ( [PARTITION BY expr] ORDER BY expr )` | Calculates the rank of a hypothetical row `R` minus 1, divided by 1 less than the number of rows being evaluated (within a window partition). |
-| `rank()` | `bigint` | `RANK () OVER ( [PARTITION BY expr] ORDER BY expr )` | Calculates the rank of a row in an ordered group of values. Rows with equal values for the ranking criteria receive the same rank. The number of tied rows are added to the rank number to calculate the next rank value. Ranks may not be consecutive numbers in this case. |
-| `row_number()` | `bigint` | `ROW_NUMBER () OVER ( [PARTITION BY expr] ORDER BY expr )` | Assigns a unique number to each row to which it is applied (either each row in a window partition or each row of the query). |
-
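-As a usage sketch (the `sale` table is hypothetical), several window 
functions can be combined in a single query:
-
-``` sql
-SELECT region, amount,
-       rank()       OVER (PARTITION BY region ORDER BY amount DESC) AS rnk,
-       row_number() OVER (PARTITION BY region ORDER BY amount DESC) AS rn,
-       cume_dist()  OVER (PARTITION BY region ORDER BY amount DESC) AS cd
-FROM sale;
-```
-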
-
-## <a id="topic31"></a>Advanced Aggregate Functions
-
-The following built-in advanced aggregate functions are HAWQ extensions of the 
PostgreSQL database.
-
-<a id="topic31__in2073121"></a>
-
-<table>
-
-<caption><span class="tablecap">Table 4. Advanced Aggregate 
Functions</span></caption>
-<colgroup>
-<col width="25%" />
-<col width="25%" />
-<col width="25%" />
-<col width="25%" />
-</colgroup>
-<thead>
-<tr class="header">
-<th>Function</th>
-<th>Return Type</th>
-<th>Full Syntax</th>
-<th>Description</th>
-</tr>
-</thead>
-<tbody>
-<tr class="odd">
-<td><code class="ph codeph">MEDIAN (expr)</code></td>
-<td><code class="ph codeph">timestamp, timestamptz, interval, float</code></td>
-<td><code class="ph codeph">MEDIAN (expression)</code>
-<p><em>Example:</em></p>
-<pre class="pre codeblock"><code>SELECT department_id, MEDIAN(salary) 
-FROM employees 
-GROUP BY department_id; </code></pre></td>
-<td>Computes the median (50th percentile) of a set of values; equivalent to 
<code class="ph codeph">PERCENTILE_CONT(0.5)</code> with the same 
ordering.</td>
-</tr>
-<tr class="even">
-<td><code class="ph codeph">PERCENTILE_CONT (expr) WITHIN GROUP (ORDER BY expr [DESC/ASC])</code></td>
-<td><code class="ph codeph">timestamp, timestamptz, interval, float</code></td>
-<td><code class="ph codeph">PERCENTILE_CONT(percentage) WITHIN GROUP (ORDER BY expression)</code>
-<p><em>Example:</em></p>
-<pre class="pre codeblock"><code>SELECT department_id,
-PERCENTILE_CONT (0.5) WITHIN GROUP (ORDER BY salary DESC) &quot;Median_cont&quot;
-FROM employees GROUP BY department_id;</code></pre></td>
-<td>Performs an inverse distribution function that assumes a continuous 
distribution model. It takes a percentile value and a sort specification and 
returns the same datatype as the numeric datatype of the argument. The 
returned value is computed by linear interpolation. Nulls are ignored in this 
calculation.</td>
-</tr>
-<tr class="odd">
-<td><code class="ph codeph">PERCENTILE_DISC (expr) WITHIN GROUP (ORDER BY expr [DESC/ASC])</code></td>
-<td><code class="ph codeph">timestamp, timestamptz, interval, float</code></td>
-<td><code class="ph codeph">PERCENTILE_DISC(percentage) WITHIN GROUP (ORDER BY expression)</code>
-<p><em>Example:</em></p>
-<pre class="pre codeblock"><code>SELECT department_id, 
-PERCENTILE_DISC (0.5) WITHIN GROUP (ORDER BY salary DESC) &quot;Median_desc&quot;
-FROM employees GROUP BY department_id;</code></pre></td>
-<td>Performs an inverse distribution function that assumes a discrete 
distribution model. It takes a percentile value and a sort specification. The 
returned value is an element from the data set. Nulls are ignored in this 
calculation.</td>
-</tr>
-<tr class="even">
-<td><code class="ph codeph">sum(array[])</code></td>
-<td><code class="ph codeph">smallint[], int[], bigint[], float[]</code></td>
-<td><code class="ph codeph">sum(array[[1,2],[3,4]])</code>
-<p><em>Example:</em></p>
-<pre class="pre codeblock"><code>CREATE TABLE mymatrix (myvalue int[]);
-INSERT INTO mymatrix VALUES (array[[1,2],[3,4]]);
-INSERT INTO mymatrix VALUES (array[[0,1],[1,0]]);
-SELECT sum(myvalue) FROM mymatrix;
- sum 
----------------
- {{1,3},{4,4}}</code></pre></td>
-<td>Performs matrix summation. Can take as input a two-dimensional array that 
is treated as a matrix.</td>
-</tr>
-<tr class="odd">
-<td><code class="ph codeph">pivot_sum (label[], label, expr)</code></td>
-<td><code class="ph codeph">int[], bigint[], float[]</code></td>
-<td><code class="ph codeph">pivot_sum( array['A1','A2'], attr, 
value)</code></td>
-<td>A pivot aggregation using sum to resolve duplicate entries.</td>
-</tr>
-</tbody>
-</table>
-
-

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/query/gporca/query-gporca-changed.html.md.erb
----------------------------------------------------------------------
diff --git a/query/gporca/query-gporca-changed.html.md.erb 
b/query/gporca/query-gporca-changed.html.md.erb
deleted file mode 100644
index 041aa4b..0000000
--- a/query/gporca/query-gporca-changed.html.md.erb
+++ /dev/null
@@ -1,17 +0,0 @@
----
-title: Changed Behavior with GPORCA
----
-
-<span class="shortdesc">When GPORCA is enabled, HAWQ's behavior changes. This 
topic describes these changes.</span>
-
--   The command `CREATE TABLE AS` distributes table data randomly if the 
`DISTRIBUTED BY` clause is not specified and no primary or unique keys are 
specified.
--   Statistics are required on the root table of a partitioned table. The 
`ANALYZE` command generates statistics on both root and individual partition 
tables (leaf child tables). See the `ROOTPARTITION` clause of the `ANALYZE` 
command.
--   Additional Result nodes in the query plan:
-    -   Query plan `Assert` operator.
-    -   Query plan `Partition selector` operator.
-    -   Query plan `Split` operator.
--   When running `EXPLAIN`, the query plan generated by GPORCA is different 
than the plan generated by the legacy query optimizer.
--   HAWQ adds the log file message `Planner produced plan` when GPORCA is 
enabled and HAWQ falls back to the legacy query optimizer to generate the query 
plan.
--   HAWQ issues a warning when statistics are missing from one or more table 
columns. When executing an SQL command with GPORCA, HAWQ issues a warning if 
the command performance could be improved by collecting statistics on a column 
or set of columns referenced by the command. The warning is issued on the 
command line and information is added to the HAWQ log file. For information 
about collecting statistics on table columns, see the `ANALYZE` command.
-
-

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/query/gporca/query-gporca-enable.html.md.erb
----------------------------------------------------------------------
diff --git a/query/gporca/query-gporca-enable.html.md.erb 
b/query/gporca/query-gporca-enable.html.md.erb
deleted file mode 100644
index e8cc93f..0000000
--- a/query/gporca/query-gporca-enable.html.md.erb
+++ /dev/null
@@ -1,95 +0,0 @@
----
-title: Enabling GPORCA
----
-
-<span class="shortdesc">Precompiled versions of HAWQ that include the GPORCA 
query optimizer enable it by default; no additional configuration is required. 
To use the GPORCA query optimizer in a HAWQ built from source, your build must 
include GPORCA. You must also enable specific HAWQ server configuration 
parameters at or after install time:</span>
-
--   [Set the <code class="ph 
codeph">optimizer\_analyze\_root\_partition</code> parameter to <code class="ph 
codeph">on</code>](#topic_r5d_hv1_kr) to enable statistics collection for the 
root partition of a partitioned table.
--   Set the `optimizer` parameter to `on` to enable GPORCA. You can set the 
parameter at these levels:
-    -   [A HAWQ system](#topic_byp_lqk_br)
-    -   [A specific HAWQ database](#topic_pzr_3db_3r)
-    -   [A session or query](#topic_lx4_vqk_br)
-
-**Important:** If you intend to execute queries on partitioned tables with 
GPORCA enabled, you must collect statistics on the partitioned table root 
partition with the `ANALYZE ROOTPARTITION` command. The command 
`ANALYZE ROOTPARTITION` collects statistics on the root partition of a 
partitioned table without collecting statistics on the leaf partitions. If you 
specify a list of column names for a partitioned table, the statistics for the 
columns and the root partition are collected. For information on the `ANALYZE` 
command, see [ANALYZE](../../reference/sql/ANALYZE.html).
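-
-For example, assuming a partitioned table named `sales_part`:
-
-``` sql
--- Collect statistics on the root partition only:
-ANALYZE ROOTPARTITION sales_part;
-
--- Restrict collection to specific columns:
-ANALYZE ROOTPARTITION sales_part (amount, region);
-```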
-
-You can also use the HAWQ utility `analyzedb` to update table statistics. The 
HAWQ utility `analyzedb` can update statistics for multiple tables in parallel. 
The utility can also check table statistics and update statistics only if the 
statistics are not current or do not exist. For information about the 
`analyzedb` utility, see 
[analyzedb](../../reference/cli/admin_utilities/analyzedb.html#topic1).
-
-As part of routine database maintenance, you should refresh statistics on the 
root partition when there are significant changes to child leaf partition data.
-
-## <a id="topic_r5d_hv1_kr"></a>Setting the 
optimizer\_analyze\_root\_partition Parameter
-
-When the configuration parameter `optimizer_analyze_root_partition` is set to 
`on`, root partition statistics will be collected when `ANALYZE` is run on a 
partitioned table. Root partition statistics are required by GPORCA.
-
-You will perform different procedures to set optimizer configuration 
parameters for your whole HAWQ cluster depending upon whether you manage your 
cluster from the command line or use Ambari. If you use Ambari to manage your 
HAWQ cluster, you must ensure that you update server configuration parameters 
only via the Ambari Web UI. If you manage your HAWQ cluster from the command 
line, you will use the `hawq config` command line utility to set optimizer 
server configuration parameters.
-
-If you use Ambari to manage your HAWQ cluster:
-
-1. Set the `optimizer_analyze_root_partition` configuration property to `on` 
via the HAWQ service **Configs > Advanced > Custom hawq-site** drop down. 
-2. Select **Service Actions > Restart All** to load the updated configuration.
-
-If you manage your HAWQ cluster from the command line:
-
-1.  Log in to the HAWQ master host as a HAWQ administrator and source the file 
`/usr/local/hawq/greenplum_path.sh`.
-
-    ``` shell
-    $ source /usr/local/hawq/greenplum_path.sh
-    ```
-
-2. Use the `hawq config` utility to set `optimizer_analyze_root_partition`:
-
-    ``` shell
-    $ hawq config -c optimizer_analyze_root_partition -v on
-    ```
-3. Reload the HAWQ configuration:
-
-    ``` shell
-    $ hawq stop cluster -u
-    ```
-
-## <a id="topic_byp_lqk_br"></a>Enabling GPORCA for a System
-
-Set the server configuration parameter `optimizer` for the HAWQ system.
-
-If you use Ambari to manage your HAWQ cluster:
-
-1. Set the `optimizer` configuration property to `on` via the HAWQ service 
**Configs > Advanced > Custom hawq-site** drop down. 
-2. Select **Service Actions > Restart All** to load the updated configuration.
-
-If you manage your HAWQ cluster from the command line:
-
-1.  Log in to the HAWQ master host as a HAWQ administrator and source the file 
`/usr/local/hawq/greenplum_path.sh`.
-
-    ``` shell
-    $ source /usr/local/hawq/greenplum_path.sh
-    ```
-
-2. Use the `hawq config` utility to set `optimizer`:
-
-    ``` shell
-    $ hawq config -c optimizer -v on
-    ```
-3. Reload the HAWQ configuration:
-
-    ``` shell
-    $ hawq stop cluster -u
-    ```
-
-## <a id="topic_pzr_3db_3r"></a>Enabling GPORCA for a Database
-
-Set the server configuration parameter `optimizer` for individual HAWQ 
databases with the `ALTER DATABASE` command. For example, this command enables 
GPORCA for the database *test\_db*.
-
-``` sql
-=> ALTER DATABASE test_db SET optimizer = ON ;
-```
-
-## <a id="topic_lx4_vqk_br"></a>Enabling GPORCA for a Session or a Query
-
-You can use the `SET` command to set `optimizer` server configuration 
parameter for a session. For example, after you use the `psql` utility to 
connect to HAWQ, this `SET` command enables GPORCA:
-
-``` sql
-=> SET optimizer = on ;
-```
-
-To set the parameter for a specific query, include the `SET` command prior to 
running the query.
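-
-For example (the table name is hypothetical):
-
-``` sql
-=> SET optimizer = on ;
-=> SELECT count(*) FROM sales_part ;
-=> SET optimizer = off ;
-```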
-
-

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/query/gporca/query-gporca-fallback.html.md.erb
----------------------------------------------------------------------
diff --git a/query/gporca/query-gporca-fallback.html.md.erb 
b/query/gporca/query-gporca-fallback.html.md.erb
deleted file mode 100644
index 999e9a7..0000000
--- a/query/gporca/query-gporca-fallback.html.md.erb
+++ /dev/null
@@ -1,142 +0,0 @@
----
-title: Determining The Query Optimizer In Use
----
-
-<span class="shortdesc"> When GPORCA is enabled, you can determine if HAWQ is 
using GPORCA or is falling back to the legacy query optimizer. </span>
-
-There are two ways to determine which query optimizer HAWQ used to execute a 
query:
-
--   Examine `EXPLAIN` query plan output for the query. (Your output may 
include other settings.)
-    -   When GPORCA generates the query plan, the GPORCA version is displayed 
near the end of the query plan. For example:
-
-        ``` pre
-         Settings:  optimizer=on
-         Optimizer status:  PQO version 1.627
-        ```
-
-        When HAWQ falls back to the legacy optimizer to generate the plan, 
`legacy query optimizer` is displayed near the end of the query plan. For 
example:
-
-        ``` pre
-         Settings:  optimizer=on
-         Optimizer status: legacy query optimizer
-        ```
-
-        When the server configuration parameter `optimizer` is `off`, the 
following lines are displayed near the end of a query plan.
-
-        ``` pre
-         Settings:  optimizer=off
-         Optimizer status: legacy query optimizer
-        ```
-
-    -   These plan items appear only in the `EXPLAIN` plan output generated by 
GPORCA. The items are not supported in a legacy optimizer query plan.
-        -   Assert operator
-        -   Sequence operator
-        -   DynamicIndexScan
-        -   DynamicTableScan
-        -   Table Scan
-    -   When GPORCA generates the plan for a query against a partitioned 
table, the `EXPLAIN` plan lists only the number of partitions that are 
eliminated; the scanned partitions are not shown. The `EXPLAIN` plan generated 
by the legacy optimizer lists the scanned partitions.
-
--   View the log messages in the HAWQ log file.
-
-    The log file contains messages that indicate which query optimizer was 
used. In the log file message, the `[OPT]` flag appears when GPORCA attempts to 
optimize a query. If HAWQ falls back to the legacy optimizer, an error message 
is added to the log file, indicating the unsupported feature. Also, in the 
message, the label `Planner produced plan:` appears before the query when HAWQ 
falls back to the legacy optimizer.
-
-    **Note:** You can configure HAWQ to display log messages on the psql 
command line by setting the HAWQ server configuration parameter 
`client_min_messages` to `LOG`. See [Server Configuration Parameter 
Reference](../../reference/HAWQSiteConfig.html) for information about the 
parameter.
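-
-    For example, this `psql` sequence (a sketch; `sales` is a hypothetical 
table) displays the optimizer log messages in the client session:
-
-    ``` sql
-    => SET client_min_messages = log;
-    => SELECT * FROM sales;
-    ```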
-
-## <a id="topic_n4w_nb5_xr"></a>Example
-
-This example shows the differences for a query that is run against partitioned 
tables when GPORCA is enabled.
-
-This `CREATE TABLE` statement creates a table with single level partitions:
-
-``` sql
-CREATE TABLE sales (trans_id int, date date, 
-    amount decimal(9,2), region text)
-   DISTRIBUTED BY (trans_id)
-   PARTITION BY RANGE (date)
-      (START (date '2011-01-01') 
-       INCLUSIVE END (date '2012-01-01') 
-       EXCLUSIVE EVERY (INTERVAL '1 month'),
-   DEFAULT PARTITION outlying_dates);
-```
-
-This query against the table is supported by GPORCA and does not generate 
errors in the log file:
-
-``` sql
-SELECT * FROM sales;
-```
-
-The `EXPLAIN` plan output lists only the number of selected partitions.
-
-``` 
- ->  Partition Selector for sales (dynamic scan id: 1)  (cost=10.00..100.00 
rows=50 width=4)
-       Partitions selected:  13 (out of 13)
-```
-
-Output from the log file indicates that GPORCA attempted to optimize the query:
-
-``` 
-2015-05-06 15:00:53.293451 PDT,"gpadmin","test",p2809,th297883424,"[local]",
-  ,2015-05-06 14:59:21 PDT,1120,con6,cmd1,seg-1,,dx3,x1120,sx1,"LOG","00000"
-  ,"statement: explain select * from sales
-;",,,,,,"explain select * from sales
-;",0,,"postgres.c",1566,
-
-2015-05-06 15:00:54.258412 PDT,"gpadmin","test",p2809,th297883424,"[local]",
-  ,2015-05-06 14:59:21 PDT,1120,con6,cmd1,seg-1,,dx3,x1120,sx1,"LOG","00000","
-[OPT]: Using default search strategy",,,,,,"explain select * from sales
-;",0,,"COptTasks.cpp",677,
-```
-
-The following cube query is not supported by GPORCA.
-
-``` sql
-SELECT count(*) FROM foo GROUP BY cube(a,b);
-```
-
-The following `EXPLAIN` plan output includes the message "Feature not 
supported by the GPORCA: Cube".
-
-``` sql
-postgres=# EXPLAIN SELECT count(*) FROM foo GROUP BY cube(a,b);
-```
-```
-LOG:  statement: explain select count(*) from foo group by cube(a,b);
-LOG:  2016-04-14 16:26:15:487935 PDT,THD000,NOTICE,"Feature not supported by 
the GPORCA: Cube",
-LOG:  Planner produced plan :0
-                                                        QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
- Gather Motion 3:1  (slice3; segments: 3)  (cost=9643.62..19400.26 rows=40897 
width=28)
-   ->  Append  (cost=9643.62..19400.26 rows=13633 width=28)
-         ->  HashAggregate  (cost=9643.62..9993.39 rows=9328 width=28)
-               Group By: "rollup".unnamed_attr_2, "rollup".unnamed_attr_1, 
"rollup"."grouping", "rollup"."group_id"
-               ->  Subquery Scan "rollup"  (cost=8018.50..9589.81 rows=1435 
width=28)
-                     ->  Redistribute Motion 3:3  (slice1; segments: 3)  
(cost=8018.50..9546.76 rows=1435 width=28)
-                           Hash Key: "rollup".unnamed_attr_2, 
"rollup".unnamed_attr_1, "grouping", group_id()
-                           ->  GroupAggregate  (cost=8018.50..9460.66 
rows=1435 width=28)
-                                 Group By: "rollup"."grouping", 
"rollup"."group_id"
-                                 ->  Subquery Scan "rollup"  
(cost=8018.50..9326.13 rows=2153 width=28)
-                                       ->  GroupAggregate  
(cost=8018.50..9261.56 rows=2153 width=28)
-                                             Group By: 
"rollup".unnamed_attr_2, "rollup"."grouping", "rollup"."group_id"
-                                             ->  Subquery Scan "rollup"  
(cost=8018.50..9073.22 rows=2870 width=28)
-                                                   ->  GroupAggregate  
(cost=8018.50..8987.12 rows=2870 width=28)
-                                                         Group By: 
public.foo.b, public.foo.a
-                                                         ->  Sort  
(cost=8018.50..8233.75 rows=28700 width=8)
-                                                               Sort Key: 
public.foo.b, public.foo.a
-                                                               ->  Seq Scan on 
foo  (cost=0.00..961.00 rows=28700 width=8)
-         ->  HashAggregate  (cost=9116.27..9277.71 rows=4305 width=28)
-               Group By: "rollup".unnamed_attr_1, "rollup".unnamed_attr_2, 
"rollup"."grouping", "rollup"."group_id"
-               ->  Subquery Scan "rollup"  (cost=8018.50..9062.46 rows=1435 
width=28)
-                     ->  Redistribute Motion 3:3  (slice2; segments: 3)  
(cost=8018.50..9019.41 rows=1435 width=28)
-                           Hash Key: public.foo.a, public.foo.b, "grouping", 
group_id()
-                           ->  GroupAggregate  (cost=8018.50..8933.31 
rows=1435 width=28)
-                                 Group By: public.foo.a
-                                 ->  Sort  (cost=8018.50..8233.75 rows=28700 
width=8)
-                                       Sort Key: public.foo.a
-                                       ->  Seq Scan on foo  (cost=0.00..961.00 
rows=28700 width=8)
- Settings:  optimizer=on
- Optimizer status: legacy query optimizer
-(30 rows)
-```
-
-Since this query is not supported by GPORCA, HAWQ falls back to the legacy 
optimizer.
-
-

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/query/gporca/query-gporca-features.html.md.erb
----------------------------------------------------------------------
diff --git a/query/gporca/query-gporca-features.html.md.erb 
b/query/gporca/query-gporca-features.html.md.erb
deleted file mode 100644
index 4941866..0000000
--- a/query/gporca/query-gporca-features.html.md.erb
+++ /dev/null
@@ -1,215 +0,0 @@
----
-title: GPORCA Features and Enhancements
----
-
-GPORCA includes enhancements for specific types of queries and operations.  
GPORCA also includes these optimization enhancements:
-
--   Improved join ordering
--   Join-Aggregate reordering
--   Sort order optimization
--   Data skew estimates included in query optimization
-
-## <a id="topic_dwy_zml_gr"></a>Queries Against Partitioned Tables
-
-GPORCA includes these enhancements for queries against partitioned tables:
-
--   Partition elimination is improved.
--   Query plan can contain the `Partition selector` operator.
--   Partitions are not enumerated in `EXPLAIN` plans.
-
-    For queries that involve static partition selection where the partitioning 
key is compared to a constant, GPORCA lists the number of partitions to be 
scanned in the `EXPLAIN` output under the Partition Selector operator. This 
example Partition Selector operator shows the filter and number of partitions 
selected:
-
-    ``` pre
-    Partition Selector for Part_Table (dynamic scan id: 1) 
-           Filter: a > 10
-           Partitions selected:  1 (out of 3)
-    ```
-
-    For queries that involve dynamic partition selection where the 
partitioning key is compared to a variable, the number of partitions that are 
scanned will be known only during query execution. The partitions selected are 
not shown in the `EXPLAIN` output.
-
--   Plan size is independent of number of partitions.
--   Out of memory errors caused by number of partitions are reduced.
-
-This example `CREATE TABLE` command creates a range partitioned table.
-
-``` sql
-CREATE TABLE sales(order_id int, item_id int, amount numeric(15,2), 
-      date date, yr_qtr int)
-   PARTITION BY RANGE (yr_qtr)
-      -- illustrative bounds: one yearly partition per 100 yr_qtr values
-      (START (201101) INCLUSIVE END (201501) EXCLUSIVE EVERY (100));
-```
-
-GPORCA improves on these types of queries against partitioned tables:
-
--   Full table scan. Partitions are not enumerated in plans.
-
-    ``` sql
-    SELECT * FROM sales;
-    ```
-
--   Query with a constant filter predicate. Partition elimination is performed.
-
-    ``` sql
-    SELECT * FROM sales WHERE yr_qtr = 201201;
-    ```
-
--   Range selection. Partition elimination is performed.
-
-    ``` sql
-    SELECT * FROM sales WHERE yr_qtr BETWEEN 201301 AND 201404;
-    ```
-
--   Joins involving partitioned tables. In this example, the partitioned 
dimension table *date\_dim* is joined with fact table *catalog\_sales*:
-
-    ``` sql
-    SELECT * FROM catalog_sales
-       WHERE date_id IN (SELECT id FROM date_dim WHERE month=12);
-    ```
-
-## <a id="topic_vph_wml_gr"></a>Queries that Contain Subqueries
-
-GPORCA handles subqueries more efficiently. A subquery is a query that is 
nested inside an outer query block. In the following query, the `SELECT` in the 
`WHERE` clause is a subquery.
-
-``` sql
-SELECT * FROM part
-  WHERE price > (SELECT avg(price) FROM part);
-```
-
-GPORCA also handles queries that contain a correlated subquery (CSQ) more 
efficiently. A correlated subquery is a subquery that uses values from the 
outer query. In the following query, the `price` column is used in both the 
outer query and the subquery.
-
-``` sql
-SELECT * FROM part p1
-  WHERE price > (SELECT avg(price) FROM part p2 
-  WHERE  p2.brand = p1.brand);
-```
-
-GPORCA generates more efficient plans for the following types of subqueries:
-
--   CSQ in the `SELECT` list.
-
-    ``` sql
-    SELECT *,
-     (SELECT min(price) FROM part p2 WHERE p1.brand = p2.brand)
-     AS foo
-    FROM part p1;
-    ```
-
--   CSQ in disjunctive (`OR`) filters.
-
-    ``` sql
-    SELECT * FROM part p1 WHERE p_size > 40 OR 
-          p_retailprice > 
-          (SELECT avg(p_retailprice) 
-              FROM part p2 
-              WHERE p2.p_brand = p1.p_brand);
-    ```
-
--   Nested CSQ with skip level correlations
-
-    ``` sql
-    SELECT * FROM part p1 WHERE p1.p_partkey 
-    IN (SELECT p_partkey FROM part p2 WHERE p2.p_retailprice = 
-         (SELECT min(p_retailprice)
-           FROM part p3 
-           WHERE p3.p_brand = p1.p_brand)
-    );
-    ```
-
-    **Note:** Nested CSQ with skip level correlations are not supported by the 
legacy query optimizer.
-
--   CSQ with aggregate and inequality. This example contains a CSQ with an 
inequality.
-
-    ``` sql
-    SELECT * FROM part p1 WHERE p1.p_retailprice =
-     (SELECT min(p_retailprice) FROM part p2 WHERE p2.p_brand <> p1.p_brand);
-    ```
-
-<!-- -->
-
--   CSQ that must return one row.
-
-    ``` sql
-    SELECT p_partkey, 
-      (SELECT p_retailprice FROM part p2 WHERE p2.p_brand = p1.p_brand )
-    FROM part p1;
-    ```
-
-## <a id="topic_c3v_rml_gr"></a>Queries that Contain Common Table Expressions
-
-GPORCA handles queries that contain the `WITH` clause. The `WITH` clause, also 
known as a common table expression (CTE), generates temporary tables that exist 
only for the query. This example query contains a CTE.
-
-``` sql
-WITH v AS (SELECT a, sum(b) as s FROM T WHERE c < 10 GROUP BY a)
-  SELECT * FROM v AS v1, v AS v2
-  WHERE v1.a <> v2.a AND v1.s < v2.s;
-```
-
-As part of query optimization, GPORCA can push down predicates into a CTE. In 
the following example query, GPORCA pushes the equality predicates into the CTE.
-
-``` sql
-WITH v AS (SELECT a, sum(b) as s FROM T GROUP BY a)
-  SELECT *
-  FROM v as v1, v as v2, v as v3
-  WHERE v1.a < v2.a
-    AND v1.s < v3.s
-    AND v1.a = 10
-    AND v2.a = 20
-    AND v3.a = 30;
-```
-
-GPORCA can handle these types of CTEs:
-
--   CTE that defines one or multiple tables. In this query, the CTE defines 
two tables.
-
-    ``` sql
-    WITH cte1 AS (SELECT a, sum(b) as s FROM T 
-                   WHERE c < 10 GROUP BY a),
-          cte2 AS (SELECT a, s FROM cte1 WHERE s > 1000)
-      SELECT *
-      FROM cte1 as v1, cte2 as v2, cte2 as v3
-      WHERE v1.a < v2.a AND v1.s < v3.s;
-    ```
-
--   Nested CTEs.
-
-    ``` sql
-    WITH v AS (WITH w AS (SELECT a, b FROM foo 
-                          WHERE b < 5) 
-               SELECT w1.a, w2.b 
-               FROM w AS w1, w AS w2 
-               WHERE w1.a = w2.a AND w1.a > 2)
-      SELECT v1.a, v2.a, v2.b
-      FROM v as v1, v as v2
-      WHERE v1.a < v2.a; 
-    ```
-
-## <a id="topic_plx_mml_gr"></a>DML Operation Enhancements with GPORCA
-
-GPORCA contains enhancements for DML operations such as `INSERT`.
-
--   A DML node in a query plan is a query plan operator.
-    -   Can appear anywhere in the plan, as a regular node (top slice only for 
now)
-    -   Can have consumers
--   New query plan operator `Assert` is used for constraints checking.
-
-    This example plan shows the `Assert` operator.
-
-    ```
-    QUERY PLAN
-    ------------------------------------------------------------
-     Insert  (cost=0.00..4.61 rows=3 width=8)
-       ->  Assert  (cost=0.00..3.37 rows=3 width=24)
-             Assert Cond: (dmlsource.a > 2) IS DISTINCT FROM 
-    false
-             ->  Assert  (cost=0.00..2.25 rows=3 width=24)
-                   Assert Cond: NOT dmlsource.b IS NULL
-                   ->  Result  (cost=0.00..1.14 rows=3 width=24)
-                         ->  Table Scan on dmlsource
-    ```
-
-## <a id="topic_anl_t3t_pv"></a>Queries with Distinct Qualified Aggregates 
(DQA)
-
-GPORCA improves performance for queries that contain distinct qualified 
aggregates (DQA) without a grouping column and when the table is not 
distributed on the columns used by the DQA. When encountering these types of 
queries, GPORCA uses an alternative plan that evaluates the aggregate functions 
in three stages (local, intermediate, and global aggregations).
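-
-For example, this query (a sketch; it assumes a table *sales* that is not 
distributed on the *region* column) contains a DQA with no grouping column, the 
case where GPORCA can choose the three-stage aggregation plan:
-
-``` sql
-SELECT count(DISTINCT region) FROM sales;
-```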
-
-See 
[optimizer\_prefer\_scalar\_dqa\_multistage\_agg](../../reference/guc/parameter_definitions.html#optimizer_prefer_scalar_dqa_multistage_agg)
 for information on the configuration parameter that controls this behavior.
-
-

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/query/gporca/query-gporca-limitations.html.md.erb
----------------------------------------------------------------------
diff --git a/query/gporca/query-gporca-limitations.html.md.erb 
b/query/gporca/query-gporca-limitations.html.md.erb
deleted file mode 100644
index b63f0d2..0000000
--- a/query/gporca/query-gporca-limitations.html.md.erb
+++ /dev/null
@@ -1,37 +0,0 @@
----
-title: GPORCA Limitations
----
-
-<span class="shortdesc">There are limitations in HAWQ when GPORCA is enabled. 
GPORCA and the legacy query optimizer currently coexist in HAWQ because GPORCA 
does not support all HAWQ features. </span>
-
-
-## <a id="topic_kgn_vxl_vp"></a>Unsupported SQL Query Features
-
-These HAWQ features are unsupported when GPORCA is enabled:
-
--   Indexed expressions
--   `PERCENTILE` window function
--   External parameters
--   SortMergeJoin (SMJ)
--   Ordered aggregations
--   These analytics extensions:
-    -   CUBE
-    -   Multiple grouping sets
--   These scalar operators:
-    -   `ROW`
-    -   `ROWCOMPARE`
-    -   `FIELDSELECT`
--   Multiple `DISTINCT` qualified aggregate functions
--   Inverse distribution functions
-
-## <a id="topic_u4t_vxl_vp"></a>Performance Regressions
-
-When GPORCA is enabled in HAWQ, the following known performance regressions 
can occur:
-
--   Short running queries - For GPORCA, short running queries might encounter 
additional overhead due to GPORCA enhancements for determining an optimal query 
execution plan.
--   `ANALYZE` - For GPORCA, the `ANALYZE` command generates root partition 
statistics for partitioned tables. For the legacy optimizer, these statistics 
are not generated.
--   DML operations - For GPORCA, DML enhancements including the support of 
updates on partition and distribution keys might require additional overhead.
-
-Also, enhanced functionality of these features could increase the time GPORCA 
requires to execute SQL statements that use them.
-
-

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/query/gporca/query-gporca-notes.html.md.erb
----------------------------------------------------------------------
diff --git a/query/gporca/query-gporca-notes.html.md.erb 
b/query/gporca/query-gporca-notes.html.md.erb
deleted file mode 100644
index ed943e4..0000000
--- a/query/gporca/query-gporca-notes.html.md.erb
+++ /dev/null
@@ -1,28 +0,0 @@
----
-title: Considerations when Using GPORCA
----
-
-<span class="shortdesc"> To execute queries optimally with GPORCA, consider 
certain criteria for the query. </span>
-
-Ensure the following criteria are met:
-
--   The table does not contain multi-column partition keys.
--   The table does not contain multi-level partitioning.
--   The query does not run against master-only tables, such as the system 
table *pg\_attribute*.
--   Statistics have been collected on the root partition of a partitioned 
table.
-
-If the partitioned table contains more than 20,000 partitions, consider a 
redesign of the table schema.
-
-GPORCA generates minidumps to describe the optimization context for a given 
query. Use the minidump files to analyze HAWQ issues. The minidump file is 
located under the master data directory and uses the following naming format:
-
-`Minidump_date_time.mdp`
-
-For information about the minidump file, see the server configuration 
parameter `optimizer_minidump`.
-
-When the `EXPLAIN ANALYZE` command uses GPORCA, the `EXPLAIN` plan shows only 
the number of partitions that are being eliminated. The scanned partitions are 
not shown. To show the names of the scanned partitions in the segment logs, set 
the server configuration parameter `gp_log_dynamic_partition_pruning` to `on`. 
This example `SET` command enables the parameter.
-
-``` sql
-SET gp_log_dynamic_partition_pruning = on;
-```
-
-

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/query/gporca/query-gporca-optimizer.html.md.erb
----------------------------------------------------------------------
diff --git a/query/gporca/query-gporca-optimizer.html.md.erb 
b/query/gporca/query-gporca-optimizer.html.md.erb
deleted file mode 100644
index 11814f8..0000000
--- a/query/gporca/query-gporca-optimizer.html.md.erb
+++ /dev/null
@@ -1,39 +0,0 @@
----
-title: About GPORCA
----
-
-In HAWQ, you can use GPORCA or the legacy query optimizer.
-
-**Note:** To use the GPORCA query optimizer, you must be running a version of 
HAWQ built with GPORCA, and GPORCA must be enabled in your HAWQ deployment.
-
-These sections describe GPORCA functionality and usage:
-
--   **[Overview of GPORCA](../../query/gporca/query-gporca-overview.html)**
-
-    GPORCA extends the planning and optimization capabilities of the HAWQ 
legacy optimizer.
-
--   **[GPORCA Features and 
Enhancements](../../query/gporca/query-gporca-features.html)**
-
-    GPORCA includes enhancements for specific types of queries and operations.
-
--   **[Enabling GPORCA](../../query/gporca/query-gporca-enable.html)**
-
-    Precompiled versions of HAWQ that include the GPORCA query optimizer 
enable it by default; no additional configuration is required. To use the 
GPORCA query optimizer in a HAWQ built from source, your build must include 
GPORCA. You must also enable specific HAWQ server configuration parameters at 
or after install time.
-
--   **[Considerations when Using 
GPORCA](../../query/gporca/query-gporca-notes.html)**
-
-    To execute queries optimally with GPORCA, consider certain criteria for 
the query.
-
--   **[Determining The Query Optimizer In 
Use](../../query/gporca/query-gporca-fallback.html)**
-
-    When GPORCA is enabled, you can determine if HAWQ is using GPORCA or is 
falling back to the legacy query optimizer.
-
--   **[Changed Behavior with 
GPORCA](../../query/gporca/query-gporca-changed.html)**
-
-    When GPORCA is enabled, HAWQ's behavior changes. This topic describes 
these changes.
-
--   **[GPORCA Limitations](../../query/gporca/query-gporca-limitations.html)**
-
-    There are limitations in HAWQ when GPORCA is enabled. GPORCA and the 
legacy query optimizer currently coexist in HAWQ because GPORCA does not 
support all HAWQ features.
-
-

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/query/gporca/query-gporca-overview.html.md.erb
----------------------------------------------------------------------
diff --git a/query/gporca/query-gporca-overview.html.md.erb 
b/query/gporca/query-gporca-overview.html.md.erb
deleted file mode 100644
index 56f97eb..0000000
--- a/query/gporca/query-gporca-overview.html.md.erb
+++ /dev/null
@@ -1,23 +0,0 @@
----
-title: Overview of GPORCA
----
-
-<span class="shortdesc">GPORCA extends the planning and optimization 
capabilities of the HAWQ legacy optimizer. </span> GPORCA is extensible and 
achieves better optimization in multi-core architecture environments. When 
GPORCA is available in your HAWQ installation and enabled, HAWQ uses GPORCA to 
generate an execution plan for a query when possible.
-
-GPORCA also enhances HAWQ query performance tuning in the following areas:
-
--   Queries against partitioned tables
--   Queries that contain a common table expression (CTE)
--   Queries that contain subqueries
-
-The legacy and GPORCA query optimizers coexist in HAWQ. The default query 
optimizer is GPORCA. When GPORCA is available and enabled in your HAWQ 
installation, HAWQ uses GPORCA to generate an execution plan for a query when 
possible. If GPORCA cannot be used, the legacy query optimizer is used.
-
-The following flow chart shows how GPORCA fits into the query planning 
architecture:
-
-<img src="../../images/gporca.png" id="topic1__image_rf5_svc_fv" class="image" 
width="672" />
-
-You can inspect the log to determine whether GPORCA or the legacy query 
optimizer produced the plan. The log message "Optimizer produced plan" 
indicates that GPORCA generated the plan for your query. If the legacy query 
optimizer generated the plan, the log message reads "Planner produced plan". 
See [Determining The Query Optimizer In Use](query-gporca-fallback.html#topic1).
-
-**Note:** All legacy query optimizer (planner) server configuration parameters 
are ignored by GPORCA. However, if HAWQ falls back to the legacy optimizer, the 
planner server configuration parameters will impact the query plan generation.
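-
-For example, a planner parameter such as `enable_seqscan` (a standard 
PostgreSQL planner parameter) has no effect on plans generated by GPORCA, but 
it is honored if HAWQ falls back to the legacy optimizer. This `SET` command is 
a sketch:
-
-``` sql
-SET enable_seqscan = off;
-```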
-
-
