[04/51] [partial] incubator-impala git commit: IMPALA-4181 [DOCS] Publish rendered Impala documentation to ASF site

jbapple Wed, 12 Apr 2017 11:25:32 -0700

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/75c46918/docs/build/html/topics/impala_struct.html
----------------------------------------------------------------------
diff --git a/docs/build/html/topics/impala_struct.html 
b/docs/build/html/topics/impala_struct.html
new file mode 100644
index 0000000..c796fe9
--- /dev/null
+++ b/docs/build/html/topics/impala_struct.html
@@ -0,0 +1,500 @@
+<!DOCTYPE html
+  SYSTEM "about:legacy-compat">
+<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; 
charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) 
Copyright 2017"><meta name="DC.rights.owner" content="(C) Copyright 2017"><meta 
name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" 
content="../topics/impala_datatypes.html"><meta name="prodname" 
content="Impala"><meta name="prodname" content="Impala"><meta name="version" 
content="Impala 2.8.x"><meta name="version" content="Impala 2.8.x"><meta 
name="DC.Format" content="XHTML"><meta name="DC.Identifier" 
content="struct"><link rel="stylesheet" type="text/css" 
href="../commonltr.css"><title>STRUCT Complex Type (Impala 2.3 or higher 
only)</title></head><body id="struct"><main role="main"><article role="article" 
aria-labelledby="ariaid-title1">
+
+  <h1 class="title topictitle1" id="ariaid-title1">STRUCT Complex Type (<span 
class="keyword">Impala 2.3</span> or higher only)</h1>
+
+  
+
+  <div class="body conbody">
+
+    <p class="p">
+      A complex data type, representing multiple fields of a single item. 
Frequently used as the element type of an <code class="ph codeph">ARRAY</code>
+      or the <code class="ph codeph">VALUE</code> part of a <code class="ph 
codeph">MAP</code>.
+    </p>
+
+    <p class="p">
+        <strong class="ph b">Syntax:</strong>
+      </p>
+
+<pre class="pre codeblock"><code><var class="keyword 
varname">column_name</var> STRUCT &lt; <var class="keyword varname">name</var> 
: <var class="keyword varname">type</var> [COMMENT '<var class="keyword 
varname">comment_string</var>'], ... &gt;
+
+type ::= <var class="keyword varname">primitive_type</var> | <var 
class="keyword varname">complex_type</var>
+</code></pre>
+
+    <p class="p">
+      The names and number of fields within the <code class="ph 
codeph">STRUCT</code> are fixed. Each field can be a different type. A field 
within a
+      <code class="ph codeph">STRUCT</code> can also be another <code 
class="ph codeph">STRUCT</code>, or an <code class="ph codeph">ARRAY</code> or 
a <code class="ph codeph">MAP</code>, allowing
+      you to create nested data structures with a maximum nesting depth of 100.
+    </p>
+
+    <p class="p">
+      A <code class="ph codeph">STRUCT</code> can be the top-level type for a 
column, or can itself be an item within an <code class="ph codeph">ARRAY</code> 
or the
+      value part of the key-value pair in a <code class="ph codeph">MAP</code>.
+    </p>
+
+    <p class="p">
+      When a <code class="ph codeph">STRUCT</code> is used as an <code 
class="ph codeph">ARRAY</code> element or a <code class="ph codeph">MAP</code> 
value, you use a join clause to
+      bring the <code class="ph codeph">ARRAY</code> or <code class="ph 
codeph">MAP</code> elements into the result set, and then refer to
+      <code class="ph codeph"><var class="keyword 
varname">array_name</var>.ITEM.<var class="keyword varname">field</var></code> 
or
+      <code class="ph codeph"><var class="keyword 
varname">map_name</var>.VALUE.<var class="keyword varname">field</var></code>. 
In the case of a <code class="ph codeph">STRUCT</code> directly inside
+      an <code class="ph codeph">ARRAY</code> or <code class="ph 
codeph">MAP</code>, you can omit the <code class="ph codeph">.ITEM</code> and 
<code class="ph codeph">.VALUE</code> pseudocolumns
+      and refer directly to <code class="ph codeph"><var class="keyword 
varname">array_name</var>.<var class="keyword varname">field</var></code> or
+      <code class="ph codeph"><var class="keyword varname">map_name</var>.<var 
class="keyword varname">field</var></code>.
+    </p>
+
+    <p class="p">
+        <strong class="ph b">Usage notes:</strong>
+      </p>
+
+    <p class="p">
+        Because complex types are often used in combination,
+        for example an <code class="ph codeph">ARRAY</code> of <code class="ph 
codeph">STRUCT</code>
+        elements, if you are unfamiliar with the Impala complex types,
+        start with <a class="xref" 
href="../shared/../topics/impala_complex_types.html#complex_types">Complex 
Types (Impala 2.3 or higher only)</a> for
+        background information and usage examples.
+      </p>
+
+    <p class="p">
+      A <code class="ph codeph">STRUCT</code> is similar conceptually to a 
table row: it contains a fixed number of named fields, each with a predefined
+      type. To combine two related tables, while using complex types to 
minimize repetition, the typical way to represent that data is as an
+      <code class="ph codeph">ARRAY</code> of <code class="ph 
codeph">STRUCT</code> elements.
+    </p>
+
+    <p class="p">
+      Because a <code class="ph codeph">STRUCT</code> has a fixed number of 
named fields, it typically does not make sense to have a
+      <code class="ph codeph">STRUCT</code> as the type of a table column. In 
such a case, you could just make each field of the <code class="ph 
codeph">STRUCT</code>
+      into a separate column of the table. The <code class="ph 
codeph">STRUCT</code> type is most useful as an item of an <code class="ph 
codeph">ARRAY</code> or the
+      value part of the key-value pair in a <code class="ph 
codeph">MAP</code>. A nested type column with a <code class="ph 
codeph">STRUCT</code> at the lowest level
+      lets you associate a variable number of row-like objects with each row 
of the table.
+    </p>
+
+    <p class="p">
+      The <code class="ph codeph">STRUCT</code> type is straightforward to 
reference within a query. You do not need to include the
+      <code class="ph codeph">STRUCT</code> column in a join clause or give it 
a table alias, as is required for the <code class="ph codeph">ARRAY</code> and
+      <code class="ph codeph">MAP</code> types. You refer to the individual 
fields using dot notation, such as
+      <code class="ph codeph"><var class="keyword 
varname">struct_column_name</var>.<var class="keyword 
varname">field_name</var></code>, without any pseudocolumn such as
+      <code class="ph codeph">ITEM</code> or <code class="ph 
codeph">VALUE</code>.
+    </p>
+
+    <p class="p">
+        You can pass a multi-part qualified name to <code class="ph 
codeph">DESCRIBE</code>
+        to specify an <code class="ph codeph">ARRAY</code>, <code class="ph 
codeph">STRUCT</code>, or <code class="ph codeph">MAP</code>
+        column and visualize its structure as if it were a table.
+        For example, if table <code class="ph codeph">T1</code> contains an 
<code class="ph codeph">ARRAY</code> column
+        <code class="ph codeph">A1</code>, you could issue the statement <code 
class="ph codeph">DESCRIBE t1.a1</code>.
+        If table <code class="ph codeph">T1</code> contained a <code class="ph 
codeph">STRUCT</code> column <code class="ph codeph">S1</code>,
+        and a field <code class="ph codeph">F1</code> within the <code 
class="ph codeph">STRUCT</code> was a <code class="ph codeph">MAP</code>,
+        you could issue the statement <code class="ph codeph">DESCRIBE 
t1.s1.f1</code>.
+        An <code class="ph codeph">ARRAY</code> is shown as a two-column 
table, with
+        <code class="ph codeph">ITEM</code> and <code class="ph 
codeph">POS</code> columns.
+        A <code class="ph codeph">STRUCT</code> is shown as a table with each 
field
+        representing a column in the table.
+        A <code class="ph codeph">MAP</code> is shown as a two-column table, 
with
+        <code class="ph codeph">KEY</code> and <code class="ph 
codeph">VALUE</code> columns.
+      </p>
+
+    <p class="p">
+        <strong class="ph b">Internal details:</strong>
+      </p>
+
+    <p class="p">
+      Within the Parquet data file, the values for each <code class="ph 
codeph">STRUCT</code> field are stored adjacent to each other, so that they can 
be
+      encoded and compressed using all the Parquet techniques for storing sets 
of similar or repeated values. The adjacency applies even
+      when the <code class="ph codeph">STRUCT</code> values are part of an 
<code class="ph codeph">ARRAY</code> or <code class="ph codeph">MAP</code>. 
During a query, Impala avoids
+      unnecessary I/O by reading only the portions of the Parquet data file 
containing the requested <code class="ph codeph">STRUCT</code> fields.
+    </p>
+
+    <p class="p">
+        <strong class="ph b">Added in:</strong> <span class="keyword">Impala 
2.3.0</span>
+      </p>
+
+    <p class="p">
+        <strong class="ph b">Restrictions:</strong>
+      </p>
+
+    <ul class="ul">
+        <li class="li">
+          <p class="p">
+            Columns with this data type can only be used in tables or 
partitions with the Parquet file format.
+          </p>
+        </li>
+        <li class="li">
+          <p class="p">
+            Columns with this data type cannot be used as partition key 
columns in a partitioned table.
+          </p>
+        </li>
+        <li class="li">
+          <p class="p">
+            The <code class="ph codeph">COMPUTE STATS</code> statement does 
not produce any statistics for columns of this data type.
+          </p>
+        </li>
+        <li class="li">
+          <p class="p" id="struct__d6e2889">
+            The maximum length of the column definition for any complex type, 
including declarations for any nested types,
+            is 4000 characters.
+          </p>
+        </li>
+        <li class="li">
+          <p class="p">
+            See <a class="xref" 
href="../shared/../topics/impala_complex_types.html#complex_types_limits">Limitations
 and Restrictions for Complex Types</a> for a full list of limitations
+            and associated guidelines about complex type columns.
+          </p>
+        </li>
+      </ul>
+
+    <p class="p">
+        <strong class="ph b">Kudu considerations:</strong>
+      </p>
+    <p class="p">
+        Currently, the data types <code class="ph codeph">DECIMAL</code>, 
<code class="ph codeph">TIMESTAMP</code>, <code class="ph codeph">CHAR</code>, 
<code class="ph codeph">VARCHAR</code>,
+        <code class="ph codeph">ARRAY</code>, <code class="ph 
codeph">MAP</code>, and <code class="ph codeph">STRUCT</code> cannot be used 
with Kudu tables.
+      </p>
+
+    <p class="p">
+        <strong class="ph b">Examples:</strong>
+      </p>
+
+    <div class="note note note_note"><span class="note__title 
notetitle">Note:</span> 
+      Many of the complex type examples refer to tables
+      such as <code class="ph codeph">CUSTOMER</code> and <code class="ph 
codeph">REGION</code>
+      adapted from the tables used in the TPC-H benchmark.
+      See <a class="xref" 
href="../shared/../topics/impala_complex_types.html#complex_sample_schema">Sample
 Schema and Data for Experimenting with Impala Complex Types</a>
+      for the table definitions.
+      </div>
+
+    <p class="p">
+      The following example shows a table with various kinds of <code 
class="ph codeph">STRUCT</code> columns, both at the top level and nested within
+      other complex types. Practice the <code class="ph codeph">CREATE 
TABLE</code> and query notation for complex type columns using empty tables, 
until
+      you can visualize a complex data structure and construct corresponding 
SQL statements reliably.
+    </p>
+
+<pre class="pre codeblock"><code>CREATE TABLE struct_demo
+(
+  id BIGINT,
+  name STRING,
+
+-- A STRUCT as a top-level column. Demonstrates how the table ID column
+-- and the ID field within the STRUCT can coexist without a name conflict.
+  employee_info STRUCT &lt; employer: STRING, id: BIGINT, address: STRING &gt;,
+
+-- A STRUCT as the element type of an ARRAY.
+  places_lived ARRAY &lt; STRUCT &lt;street: STRING, city: STRING, country: 
STRING &gt;&gt;,
+
+-- A STRUCT as the value portion of the key-value pairs in a MAP.
+  memorable_moments MAP &lt; STRING, STRUCT &lt; year: INT, place: STRING, 
details: STRING &gt;&gt;,
+
+-- A STRUCT where one of the fields is another STRUCT.
+  current_address STRUCT &lt; street_address: STRUCT &lt;street_number: INT, 
street_name: STRING, street_type: STRING&gt;, country: STRING, postal_code: 
STRING &gt;
+)
+STORED AS PARQUET;
+
+</code></pre>
+
+    <p class="p">
+      The following example shows how to examine the structure of a table 
containing one or more <code class="ph codeph">STRUCT</code> columns by using
+      the <code class="ph codeph">DESCRIBE</code> statement. You can visualize 
each <code class="ph codeph">STRUCT</code> as its own table, with columns named 
the same
+      as each field of the <code class="ph codeph">STRUCT</code>. If the <code 
class="ph codeph">STRUCT</code> is nested inside another complex type, such as
+      <code class="ph codeph">ARRAY</code>, you can extend the qualified name 
passed to <code class="ph codeph">DESCRIBE</code> until the output shows just 
the
+      <code class="ph codeph">STRUCT</code> fields.
+    </p>
+
+<pre class="pre codeblock"><code>DESCRIBE struct_demo;
++-------------------+--------------------------+
+| name              | type                     |
++-------------------+--------------------------+
+| id                | bigint                   |
+| name              | string                   |
+| employee_info     | struct&lt;                  |
+|                   |   employer:string,       |
+|                   |   id:bigint,             |
+|                   |   address:string         |
+|                   | &gt;                        |
+| places_lived      | array&lt;struct&lt;            |
+|                   |   street:string,         |
+|                   |   city:string,           |
+|                   |   country:string         |
+|                   | &gt;&gt;                       |
+| memorable_moments | map&lt;string,struct&lt;       |
+|                   |   year:int,              |
+|                   |   place:string,          |
+|                   |   details:string         |
+|                   | &gt;&gt;                       |
+| current_address   | struct&lt;                  |
+|                   |   street_address:struct&lt; |
+|                   |     street_number:int,   |
+|                   |     street_name:string,  |
+|                   |     street_type:string   |
+|                   |   &gt;,                     |
+|                   |   country:string,        |
+|                   |   postal_code:string     |
+|                   | &gt;                        |
++-------------------+--------------------------+
+
+</code></pre>
+
+    <p class="p">
+      The top-level column <code class="ph codeph">EMPLOYEE_INFO</code> is a 
<code class="ph codeph">STRUCT</code>. Describing
+      <code class="ph codeph"><var class="keyword 
varname">table_name</var>.<var class="keyword varname">struct_name</var></code> 
displays the fields of the <code class="ph codeph">STRUCT</code> as if
+      they were columns of a table:
+    </p>
+
+<pre class="pre codeblock"><code>DESCRIBE struct_demo.employee_info;
++----------+--------+
+| name     | type   |
++----------+--------+
+| employer | string |
+| id       | bigint |
+| address  | string |
++----------+--------+
+
+</code></pre>
+
+    <p class="p">
+      Because <code class="ph codeph">PLACES_LIVED</code> is a <code class="ph 
codeph">STRUCT</code> inside an <code class="ph codeph">ARRAY</code>, the 
initial
+      <code class="ph codeph">DESCRIBE</code> shows the structure of the <code 
class="ph codeph">ARRAY</code>:
+    </p>
+
+<pre class="pre codeblock"><code>DESCRIBE struct_demo.places_lived;
++------+------------------+
+| name | type             |
++------+------------------+
+| item | struct&lt;          |
+|      |   street:string, |
+|      |   city:string,   |
+|      |   country:string |
+|      | &gt;                |
+| pos  | bigint           |
++------+------------------+
+
+</code></pre>
+
+    <p class="p">
+      Ask for the details of the <code class="ph codeph">ITEM</code> field of 
the <code class="ph codeph">ARRAY</code> to see just the layout of the
+      <code class="ph codeph">STRUCT</code>:
+    </p>
+
+<pre class="pre codeblock"><code>DESCRIBE struct_demo.places_lived.item;
++---------+--------+
+| name    | type   |
++---------+--------+
+| street  | string |
+| city    | string |
+| country | string |
++---------+--------+
+
+</code></pre>
+
+    <p class="p">
+      Likewise, <code class="ph codeph">MEMORABLE_MOMENTS</code> has a <code 
class="ph codeph">STRUCT</code> inside a <code class="ph codeph">MAP</code>, 
which requires an extra
+      level of qualified name to see just the <code class="ph 
codeph">STRUCT</code> part:
+    </p>
+
+<pre class="pre codeblock"><code>DESCRIBE struct_demo.memorable_moments;
++-------+------------------+
+| name  | type             |
++-------+------------------+
+| key   | string           |
+| value | struct&lt;          |
+|       |   year:int,      |
+|       |   place:string,  |
+|       |   details:string |
+|       | &gt;                |
++-------+------------------+
+
+</code></pre>
+
+    <p class="p">
+      For a <code class="ph codeph">MAP</code>, ask to see the <code class="ph 
codeph">VALUE</code> field to see the corresponding <code class="ph 
codeph">STRUCT</code> fields in a
+      table-like structure:
+    </p>
+
+<pre class="pre codeblock"><code>DESCRIBE struct_demo.memorable_moments.value;
++---------+--------+
+| name    | type   |
++---------+--------+
+| year    | int    |
+| place   | string |
+| details | string |
++---------+--------+
+
+</code></pre>
+
+    <p class="p">
+      For a <code class="ph codeph">STRUCT</code> inside a <code class="ph 
codeph">STRUCT</code>, we can see the fields of the outer <code class="ph 
codeph">STRUCT</code>:
+    </p>
+
+<pre class="pre codeblock"><code>DESCRIBE struct_demo.current_address;
++----------------+-----------------------+
+| name           | type                  |
++----------------+-----------------------+
+| street_address | struct&lt;               |
+|                |   street_number:int,  |
+|                |   street_name:string, |
+|                |   street_type:string  |
+|                | &gt;                     |
+| country        | string                |
+| postal_code    | string                |
++----------------+-----------------------+
+
+</code></pre>
+
+    <p class="p">
+      Then we can use a further qualified name to see just the fields of the 
inner <code class="ph codeph">STRUCT</code>:
+    </p>
+
+<pre class="pre codeblock"><code>DESCRIBE 
struct_demo.current_address.street_address;
++---------------+--------+
+| name          | type   |
++---------------+--------+
+| street_number | int    |
+| street_name   | string |
+| street_type   | string |
++---------------+--------+
+
+</code></pre>
+
+    <p class="p">
+      The following example shows how to examine the structure of a table 
containing one or more <code class="ph codeph">STRUCT</code> columns by using
+      the <code class="ph codeph">DESCRIBE</code> statement. You can visualize 
each <code class="ph codeph">STRUCT</code> as its own table, with columns named 
the same
+      as each field of the <code class="ph codeph">STRUCT</code>. If the <code 
class="ph codeph">STRUCT</code> is nested inside another complex type, such as
+      <code class="ph codeph">ARRAY</code>, you can extend the qualified name 
passed to <code class="ph codeph">DESCRIBE</code> until the output shows just 
the
+      <code class="ph codeph">STRUCT</code> fields.
+    </p>
+
+
+
+<pre class="pre codeblock"><code>DESCRIBE struct_demo;
++-------------------+--------------------------+---------+
+| name              | type                     | comment |
++-------------------+--------------------------+---------+
+| id                | bigint                   |         |
+| name              | string                   |         |
+| employee_info     | struct&lt;                  |         |
+|                   |   employer:string,       |         |
+|                   |   id:bigint,             |         |
+|                   |   address:string         |         |
+|                   | &gt;                        |         |
+| places_lived      | array&lt;struct&lt;            |         |
+|                   |   street:string,         |         |
+|                   |   city:string,           |         |
+|                   |   country:string         |         |
+|                   | &gt;&gt;                       |         |
+| memorable_moments | map&lt;string,struct&lt;       |         |
+|                   |   year:int,              |         |
+|                   |   place:string,          |         |
+|                   |   details:string         |         |
+|                   | &gt;&gt;                       |         |
+| current_address   | struct&lt;                  |         |
+|                   |   street_address:struct&lt; |         |
+|                   |     street_number:int,   |         |
+|                   |     street_name:string,  |         |
+|                   |     street_type:string   |         |
+|                   |   &gt;,                     |         |
+|                   |   country:string,        |         |
+|                   |   postal_code:string     |         |
+|                   | &gt;                        |         |
++-------------------+--------------------------+---------+
+
+SELECT id, employee_info.id FROM struct_demo;
+
+SELECT id, employee_info.id AS employee_id FROM struct_demo;
+
+SELECT id, employee_info.id AS employee_id, employee_info.employer
+  FROM struct_demo;
+
+SELECT id, name, street, city, country
+  FROM struct_demo, struct_demo.places_lived;
+
+SELECT id, name, places_lived.pos, places_lived.street, places_lived.city, 
places_lived.country
+  FROM struct_demo, struct_demo.places_lived;
+
+SELECT id, name, pl.pos, pl.street, pl.city, pl.country
+  FROM struct_demo, struct_demo.places_lived AS pl;
+
+SELECT id, name, places_lived.pos, places_lived.street, places_lived.city, 
places_lived.country
+  FROM struct_demo, struct_demo.places_lived;
+
+SELECT id, name, pos, street, city, country
+  FROM struct_demo, struct_demo.places_lived;
+
+SELECT id, name, memorable_moments.key,
+  memorable_moments.value.year,
+  memorable_moments.value.place,
+  memorable_moments.value.details
+FROM struct_demo, struct_demo.memorable_moments
+WHERE memorable_moments.key IN ('Birthday','Anniversary','Graduation');
+
+SELECT id, name, mm.key, mm.value.year, mm.value.place, mm.value.details
+  FROM struct_demo, struct_demo.memorable_moments AS mm
+WHERE mm.key IN ('Birthday','Anniversary','Graduation');
+
+SELECT id, name, memorable_moments.key, memorable_moments.value.year,
+  memorable_moments.value.place, memorable_moments.value.details
+FROM struct_demo, struct_demo.memorable_moments
+WHERE key IN ('Birthday','Anniversary','Graduation');
+
+SELECT id, name, key, value.year, value.place, value.details
+  FROM struct_demo, struct_demo.memorable_moments
+WHERE key IN ('Birthday','Anniversary','Graduation');
+
+SELECT id, name, key, year, place, details
+  FROM struct_demo, struct_demo.memorable_moments
+WHERE key IN ('Birthday','Anniversary','Graduation');
+
+SELECT id, name,
+  current_address.street_address.street_number,
+  current_address.street_address.street_name,
+  current_address.street_address.street_type,
+  current_address.country,
+  current_address.postal_code
+FROM struct_demo;
+
+</code></pre>
+
+    <p class="p">
+      For example, this table uses a struct that encodes several data values 
for each phone number associated with a person. Each person can
+      have a variable-length array of associated phone numbers, and queries 
can refer to the category field to locate specific home, work,
+      mobile, and so on kinds of phone numbers.
+    </p>
+
+<pre class="pre codeblock"><code>CREATE TABLE contact_info_many_structs
+(
+  id BIGINT, name STRING,
+  phone_numbers ARRAY &lt; STRUCT &lt;category:STRING, country_code:STRING, 
area_code:SMALLINT, full_number:STRING, mobile:BOOLEAN, carrier:STRING &gt; &gt;
+) STORED AS PARQUET;
+
+</code></pre>
+
+    <p class="p">
+      Because structs are naturally suited to composite values where the 
fields have different data types, you might use them to decompose
+      things such as addresses:
+    </p>
+
+<pre class="pre codeblock"><code>CREATE TABLE contact_info_detailed_address
+(
+  id BIGINT, name STRING,
+  address STRUCT &lt; house_number:INT, street:STRING, street_type:STRING, 
apartment:STRING, city:STRING, region:STRING, country:STRING &gt;
+);
+
+</code></pre>
+
+    <p class="p">
+      In a big data context, splitting out data fields such as the number part 
of the address and the street name could let you do analysis
+      on each field independently. For example, which streets have the largest 
number range of addresses, what are the statistical
+      properties of the street names, which areas have a higher proportion of 
<span class="q">"Roads"</span>, <span class="q">"Courts"</span> or <span 
class="q">"Boulevards"</span>, and so on.
+    </p>
+
+    <p class="p">
+        <strong class="ph b">Related information:</strong>
+      </p>
+
+    <p class="p">
+      <a class="xref" href="impala_complex_types.html#complex_types">Complex 
Types (Impala 2.3 or higher only)</a>, <a class="xref" 
href="impala_array.html#array">ARRAY Complex Type (Impala 2.3 or higher 
only)</a>,
+
+      <a class="xref" href="impala_map.html#map">MAP Complex Type (Impala 2.3 
or higher only)</a>
+    </p>
+
+  </div>
+
+<nav role="navigation" class="related-links"><div class="familylinks"><div 
class="parentlink"><strong>Parent topic:</strong> <a class="link" 
href="../topics/impala_datatypes.html">Data 
Types</a></div></div></nav></article></main></body></html>
\ No newline at end of file


http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/75c46918/docs/build/html/topics/impala_subqueries.html
----------------------------------------------------------------------
diff --git a/docs/build/html/topics/impala_subqueries.html 
b/docs/build/html/topics/impala_subqueries.html
new file mode 100644
index 0000000..2be2880
--- /dev/null
+++ b/docs/build/html/topics/impala_subqueries.html
@@ -0,0 +1,316 @@
+<!DOCTYPE html
+  SYSTEM "about:legacy-compat">
+<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; 
charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) 
Copyright 2017"><meta name="DC.rights.owner" content="(C) Copyright 2017"><meta 
name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" 
content="../topics/impala_select.html"><meta name="prodname" 
content="Impala"><meta name="prodname" content="Impala"><meta name="version" 
content="Impala 2.8.x"><meta name="version" content="Impala 2.8.x"><meta 
name="DC.Format" content="XHTML"><meta name="DC.Identifier" 
content="subqueries"><link rel="stylesheet" type="text/css" 
href="../commonltr.css"><title>Subqueries in Impala SELECT 
Statements</title></head><body id="subqueries"><main role="main"><article 
role="article" aria-labelledby="ariaid-title1">
+
+  <h1 class="title topictitle1" id="ariaid-title1">Subqueries in Impala SELECT 
Statements</h1>
+  
+  
+
+  <div class="body conbody">
+
+    <p class="p">
+      
+      A <dfn class="term">subquery</dfn> is a query that is nested within 
another query. Subqueries let queries on one table
+      dynamically adapt based on the contents of another table. This technique 
provides great flexibility and
+      expressive power for SQL queries.
+    </p>
+
+    <p class="p">
+      A subquery can return a result set for use in the <code class="ph 
codeph">FROM</code> or <code class="ph codeph">WITH</code> clauses, or
+      with operators such as <code class="ph codeph">IN</code> or <code 
class="ph codeph">EXISTS</code>.
+    </p>
+
+    <p class="p">
+      A <dfn class="term">scalar subquery</dfn> produces a result set with a 
single row containing a single column, typically
+      produced by an aggregation function such as <code class="ph 
codeph">MAX()</code> or <code class="ph codeph">SUM()</code>. This single
+      result value can be substituted in scalar contexts such as arguments to 
comparison operators. If the result
+      set is empty, the value of the scalar subquery is <code class="ph 
codeph">NULL</code>. For example, the following query
+      finds the maximum value of <code class="ph codeph">T2.Y</code> and then 
substitutes that value into the
+      <code class="ph codeph">WHERE</code> clause of the outer block that 
queries <code class="ph codeph">T1</code>:
+    </p>
+
+<pre class="pre codeblock"><code>SELECT x FROM t1 WHERE x &gt; (SELECT MAX(y) 
FROM t2);
+</code></pre>
+
+    <p class="p">
+      <dfn class="term">Uncorrelated subqueries</dfn> do not refer to any 
tables from the outer block of the query. The same
+      value or set of values produced by the subquery is used when evaluating 
each row from the outer query block.
+      In this example, the subquery returns an arbitrary number of values from 
<code class="ph codeph">T2.Y</code>, and each
+      value of <code class="ph codeph">T1.X</code> is tested for membership in 
that same set of values:
+    </p>
+
+<pre class="pre codeblock"><code>SELECT x FROM t1 WHERE x IN (SELECT y FROM 
t2);
+</code></pre>
+
+    <p class="p">
+      <dfn class="term">Correlated subqueries</dfn> compare one or more values 
from the outer query block to values referenced
+      in the <code class="ph codeph">WHERE</code> clause of the subquery. Each 
row evaluated by the outer <code class="ph codeph">WHERE</code>
+      clause can be evaluated using a different set of values. These kinds of 
subqueries are restricted in the
+      kinds of comparisons they can do between columns of the inner and outer 
tables. (See the following
+      <strong class="ph b">Restrictions</strong> item.)
+    </p>
+
+    <p class="p">
+      For example, the following query finds all the employees with salaries 
that are higher than average for their
+      department. The subquery potentially computes a different <code 
class="ph codeph">AVG()</code> value for each employee.
+    </p>
+
+
+
+<pre class="pre codeblock"><code>SELECT employee_name, employee_id FROM 
employees one WHERE
+  salary &gt; (SELECT avg(salary) FROM employees two WHERE one.dept_id = 
two.dept_id);
+</code></pre>
+
+    <p class="p">
+        <strong class="ph b">Syntax:</strong>
+      </p>
+
+    <p class="p">
+      <strong class="ph b">Subquery in the <code class="ph codeph">FROM</code> 
clause:</strong>
+    </p>
+
+<pre class="pre codeblock"><code>SELECT <var class="keyword 
varname">select_list</var> FROM <var class="keyword varname">table_ref</var> [, 
<var class="keyword varname">table_ref</var> ...]
+
+<var class="keyword varname">table_ref</var> ::= <var class="keyword 
varname">table_name</var> | (<var class="keyword 
varname">select_statement</var>)
+</code></pre>
+
+    <p class="p">
+      <strong class="ph b">Subqueries in <code class="ph codeph">WHERE</code> 
clause:</strong>
+    </p>
+
+<pre class="pre codeblock"><code>WHERE <var class="keyword 
varname">value</var> <var class="keyword varname">comparison_operator</var> 
(<var class="keyword varname">scalar_select_statement</var>)
+WHERE <var class="keyword varname">value</var> [NOT] IN (<var class="keyword 
varname">select_statement</var>)
+WHERE [NOT] EXISTS (<var class="keyword 
varname">correlated_select_statement</var>)
+WHERE NOT EXISTS (<var class="keyword 
varname">correlated_select_statement</var>)
+</code></pre>
+
+    <p class="p">
+      <code class="ph codeph">comparison_operator</code> is a numeric 
comparison such as <code class="ph codeph">=</code>,
+      <code class="ph codeph">&lt;=</code>, <code class="ph codeph">!=</code>, 
and so on, or a string comparison operator such as
+      <code class="ph codeph">LIKE</code> or <code class="ph 
codeph">REGEXP</code>.
+    </p>
+
+    <p class="p">
+      Although you can use non-equality comparison operators such as <code 
class="ph codeph">&lt;</code> or
+      <code class="ph codeph">&gt;=</code>, the subquery must include at least 
one equality comparison between the columns of the
+      inner and outer query blocks.
+    </p>
+
+    <p class="p">
+      All syntax is available for both correlated and uncorrelated queries, 
except that the <code class="ph codeph">NOT
+      EXISTS</code> clause cannot be used with an uncorrelated subquery.
+    </p>
+
+    <p class="p">
+      Impala subqueries can be nested arbitrarily deep.
+    </p>
+
+    <p class="p">
+      <strong class="ph b">Standards compliance:</strong> Introduced in
+      <a class="xref" href="http://en.wikipedia.org/wiki/SQL:1999"; 
target="_blank">SQL:1999</a>.
+    </p>
+
+    <p class="p">
+        <strong class="ph b">Examples:</strong>
+      </p>
+
+    <p class="p">
+      This example illustrates how subqueries can be used in the <code 
class="ph codeph">FROM</code> clause to organize the table
+      names, column names, and column values by producing intermediate result 
sets, especially for join queries.
+    </p>
+
+<pre class="pre codeblock"><code>SELECT avg(t1.x), max(t2.y) FROM
+  (SELECT id, cast(a AS DECIMAL(10,5)) AS x FROM raw_data WHERE a BETWEEN 0 
AND 100) AS t1
+  JOIN
+  (SELECT id, length(s) AS y FROM raw_data WHERE s LIKE 'A%') AS t2;
+  USING (id);
+</code></pre>
+
+    <p class="p">
+      These examples show how a query can test for the existence of values in 
a separate table using the
+      <code class="ph codeph">EXISTS()</code> operator with a subquery.
+
+    </p>
+
+    <p class="p">
+      The following examples show how a value can be compared against a set of 
values returned by a subquery.
+    </p>
+
+<pre class="pre codeblock"><code>SELECT count(x) FROM t1 WHERE EXISTS(SELECT 1 
FROM t2 WHERE t1.x = t2.y * 10);
+
+SELECT x FROM t1 WHERE x IN (SELECT y FROM t2 WHERE state = 'CA');
+</code></pre>
+
+    <p class="p">
+      The following examples demonstrate scalar subqueries. When a subquery is 
known to return a single value, you
+      can substitute it where you would normally put a constant value.
+    </p>
+
+<pre class="pre codeblock"><code>SELECT x FROM t1 WHERE y = (SELECT max(z) 
FROM t2);
+SELECT x FROM t1 WHERE y &gt; (SELECT count(z) FROM t2);
+</code></pre>
+
+
+
+
+
+
+
+    <p class="p">
+        <strong class="ph b">Usage notes:</strong>
+      </p>
+
+    <p class="p">
+      If the same table is referenced in both the outer and inner query 
blocks, construct a table alias in the
+      outer query block and use a fully qualified name to distinguish the 
inner and outer table references:
+    </p>
+
+
+
+<pre class="pre codeblock"><code>SELECT * FROM t1 one WHERE id IN (SELECT 
parent FROM t1 two WHERE t1.parent = t2.id);
+</code></pre>
+
+    <p class="p">
+        <strong class="ph b">Internal details:</strong>
+      </p>
+
+    <p class="p">
+      Internally, subqueries involving <code class="ph codeph">IN</code>, 
<code class="ph codeph">NOT IN</code>, <code class="ph codeph">EXISTS</code>, or
+      <code class="ph codeph">NOT EXISTS</code> clauses are rewritten into 
join queries. Depending on the syntax, the subquery
+      might be rewritten to an outer join, semi join, cross join, or anti join.
+    </p>
+
+    <p class="p">
+      A query is processed differently depending on whether the subquery calls 
any aggregation functions. There are
+      correlated and uncorrelated forms, with and without calls to aggregation 
functions. Each of these four
+      categories is rewritten differently.
+    </p>
+
+    <p class="p">
+        <strong class="ph b">Column statistics considerations:</strong>
+      </p>
+
+    <p class="p">
+      Because queries that include correlated and uncorrelated subqueries in 
the <code class="ph codeph">WHERE</code> clause are
+      written into join queries, to achieve best performance, follow the same 
guidelines for running the
+      <code class="ph codeph">COMPUTE STATS</code> statement as you do for 
tables involved in regular join queries. Run the
+      <code class="ph codeph">COMPUTE STATS</code> statement for each 
associated tables after loading or substantially changing
+      the data in that table. See <a class="xref" 
href="impala_perf_stats.html#perf_stats">Table and Column Statistics</a> for 
details.
+    </p>
+
+    <p class="p">
+      <strong class="ph b">Added in:</strong> Subqueries are substantially 
enhanced starting in Impala 2.0. Now,
+      they can be used in the <code class="ph codeph">WHERE</code> clause, in 
combination with clauses such as
+      <code class="ph codeph">EXISTS</code> and <code class="ph 
codeph">IN</code>, rather than just in the <code class="ph codeph">FROM</code> 
clause.
+    </p>
+
+    <p class="p">
+        <strong class="ph b">Restrictions:</strong>
+      </p>
+
+    <p class="p">
+      The initial Impala support for nested subqueries addresses the most 
common use cases. Some restrictions
+      remain:
+    </p>
+
+    <ul class="ul">
+      <li class="li">
+        <p class="p">
+          Although you can use subqueries in a query involving <code class="ph 
codeph">UNION</code> or <code class="ph codeph">UNION ALL</code>
+          in Impala 2.1.0 and higher, currently you cannot construct a union 
of two subqueries (for example, in the
+          argument of an <code class="ph codeph">IN</code> or <code class="ph 
codeph">EXISTS</code> operator).
+        </p>
+      </li>
+
+      <li class="li">
+        <p class="p">
+          Subqueries returning scalar values cannot be used with the operators 
<code class="ph codeph">ANY</code> or
+          <code class="ph codeph">ALL</code>. (Impala does not currently have 
a <code class="ph codeph">SOME</code> operator, but if it did,
+          the same restriction would apply.)
+        </p>
+      </li>
+
+      <li class="li">
+        <p class="p">
+          For the <code class="ph codeph">EXISTS</code> and <code class="ph 
codeph">NOT EXISTS</code> clauses, any subquery comparing values
+          from the outer query block to another table must use at least one 
equality comparison, not exclusively
+          other kinds of comparisons such as less than, greater than, <code 
class="ph codeph">BETWEEN</code>, or
+          <code class="ph codeph">!=</code>.
+        </p>
+      </li>
+
+      <li class="li">
+
+        <p class="p">
+          Currently, a scalar subquery cannot be used as the first or second 
argument to the
+          <code class="ph codeph">BETWEEN</code> operator.
+        </p>
+      </li>
+
+      <li class="li">
+        <p class="p">
+          A subquery cannot be used inside an <code class="ph 
codeph">OR</code> conjunction. Expressions inside a subquery, for
+          example in the <code class="ph codeph">WHERE</code> clause, can use 
<code class="ph codeph">OR</code> conjunctions; the restriction
+          only applies to parts of the query <span class="q">"above"</span> 
the subquery.
+        </p>
+      </li>
+
+      <li class="li">
+        <p class="p">
+          Scalar subqueries are only supported in numeric contexts. You cannot 
use a scalar subquery as an argument
+          to the <code class="ph codeph">LIKE</code>, <code class="ph 
codeph">REGEXP</code>, or <code class="ph codeph">RLIKE</code> operators, or 
compare it
+          to a value of a non-numeric type such as <code class="ph 
codeph">TIMESTAMP</code> or <code class="ph codeph">BOOLEAN</code>.
+        </p>
+      </li>
+
+      <li class="li">
+        <p class="p">
+
+          You cannot use subqueries with the <code class="ph 
codeph">CASE</code> function to generate the comparison value, the
+          values to be compared against, or the return value.
+        </p>
+      </li>
+
+      <li class="li">
+        <p class="p">
+          A subquery is not allowed in the filter condition for the <code 
class="ph codeph">HAVING</code> clause. (Strictly
+          speaking, a subquery cannot appear anywhere outside the <code 
class="ph codeph">WITH</code>, <code class="ph codeph">FROM</code>, and
+          <code class="ph codeph">WHERE</code> clauses.)
+        </p>
+      </li>
+
+      <li class="li">
+        <p class="p">
+          You must use a fully qualified name
+          (<code class="ph codeph"><var class="keyword 
varname">table_name</var>.<var class="keyword varname">column_name</var></code> 
or
+          <code class="ph codeph"><var class="keyword 
varname">database_name</var>.<var class="keyword varname">table_name</var>.<var 
class="keyword varname">column_name</var></code>)
+          when referring to any column from the outer query block within a 
subquery.
+        </p>
+      </li>
+    </ul>
+
+    <p class="p">
+        <strong class="ph b">Complex type considerations:</strong>
+      </p>
+
+    <p class="p">
+      For the complex types (<code class="ph codeph">ARRAY</code>, <code 
class="ph codeph">STRUCT</code>, and <code class="ph codeph">MAP</code>)
+      available in <span class="keyword">Impala 2.3</span> and higher, the 
join queries that <span class="q">"unpack"</span> complex type
+      columns often use correlated subqueries in the <code class="ph 
codeph">FROM</code> clause.
+      For example, if the first table in the join clause is <code class="ph 
codeph">CUSTOMER</code>, the second
+      join clause might have a subquery that selects from the column <code 
class="ph codeph">CUSTOMER.C_ORDERS</code>,
+      which is an <code class="ph codeph">ARRAY</code>. The subquery 
re-evaluates the <code class="ph codeph">ARRAY</code> elements
+      corresponding to each row from the <code class="ph 
codeph">CUSTOMER</code> table.
+      See <a class="xref" 
href="impala_complex_types.html#complex_types">Complex Types (Impala 2.3 or 
higher only)</a> for details and examples of
+      using subqueries with complex types.
+    </p>
+
+    <p class="p">
+        <strong class="ph b">Related information:</strong>
+      </p>
+
+    <p class="p">
+      <a class="xref" href="impala_operators.html#exists">EXISTS Operator</a>, 
<a class="xref" href="impala_operators.html#in">IN Operator</a>
+    </p>
+  </div>
+<nav role="navigation" class="related-links"><div class="familylinks"><div 
class="parentlink"><strong>Parent topic:</strong> <a class="link" 
href="../topics/impala_select.html">SELECT 
Statement</a></div></div></nav></article></main></body></html>
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/75c46918/docs/build/html/topics/impala_sum.html
----------------------------------------------------------------------
diff --git a/docs/build/html/topics/impala_sum.html 
b/docs/build/html/topics/impala_sum.html
new file mode 100644
index 0000000..95ac90e
--- /dev/null
+++ b/docs/build/html/topics/impala_sum.html
@@ -0,0 +1,333 @@
+<!DOCTYPE html
+  SYSTEM "about:legacy-compat">
+<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; 
charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) 
Copyright 2017"><meta name="DC.rights.owner" content="(C) Copyright 2017"><meta 
name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" 
content="../topics/impala_aggregate_functions.html"><meta name="prodname" 
content="Impala"><meta name="prodname" content="Impala"><meta name="version" 
content="Impala 2.8.x"><meta name="version" content="Impala 2.8.x"><meta 
name="DC.Format" content="XHTML"><meta name="DC.Identifier" content="sum"><link 
rel="stylesheet" type="text/css" href="../commonltr.css"><title>SUM 
Function</title></head><body id="sum"><main role="main"><article role="article" 
aria-labelledby="ariaid-title1">
+
+  <h1 class="title topictitle1" id="ariaid-title1">SUM Function</h1>
+  
+  
+
+  <div class="body conbody">
+
+    <p class="p">
+      
+      An aggregate function that returns the sum of a set of numbers. Its 
single argument can be numeric column, or
+      the numeric result of a function or expression applied to the column 
value. Rows with a <code class="ph codeph">NULL</code>
+      value for the specified column are ignored. If the table is empty, or 
all the values supplied to
+      <code class="ph codeph">MIN</code> are <code class="ph 
codeph">NULL</code>, <code class="ph codeph">SUM</code> returns <code class="ph 
codeph">NULL</code>.
+    </p>
+
+    <p class="p">
+        <strong class="ph b">Syntax:</strong>
+      </p>
+
+<pre class="pre codeblock"><code>SUM([DISTINCT | ALL] <var class="keyword 
varname">expression</var>) [OVER (<var class="keyword 
varname">analytic_clause</var>)]</code></pre>
+
+    <p class="p">
+      When the query contains a <code class="ph codeph">GROUP BY</code> 
clause, returns one value for each combination of
+      grouping values.
+    </p>
+
+    <p class="p">
+      <strong class="ph b">Return type:</strong> <code class="ph 
codeph">BIGINT</code> for integer arguments, <code class="ph 
codeph">DOUBLE</code> for floating-point
+      arguments
+    </p>
+
+    <p class="p">
+        <strong class="ph b">Complex type considerations:</strong>
+      </p>
+
+    <p class="p">
+        To access a column with a complex type (<code class="ph 
codeph">ARRAY</code>, <code class="ph codeph">STRUCT</code>, or <code class="ph 
codeph">MAP</code>)
+        in an aggregation function, you unpack the individual elements using 
join notation in the query,
+        and then apply the function to the final scalar item, field, key, or 
value at the bottom of any nested type hierarchy in the column.
+        See <a class="xref" 
href="../shared/../topics/impala_complex_types.html#complex_types">Complex 
Types (Impala 2.3 or higher only)</a> for details about using complex types in 
Impala.
+      </p>
+
+    <div class="p">
+The following example demonstrates calls to several aggregation functions
+using values from a column containing nested complex types
+(an <code class="ph codeph">ARRAY</code> of <code class="ph 
codeph">STRUCT</code> items).
+The array is unpacked inside the query using join notation.
+The array elements are referenced using the <code class="ph codeph">ITEM</code>
+pseudocolumn, and the structure fields inside the array elements
+are referenced using dot notation.
+Numeric values such as <code class="ph codeph">SUM()</code> and <code 
class="ph codeph">AVG()</code>
+are computed using the numeric <code class="ph codeph">R_NATIONKEY</code> 
field, and
+the general-purpose <code class="ph codeph">MAX()</code> and <code class="ph 
codeph">MIN()</code>
+values are computed from the string <code class="ph codeph">N_NAME</code> 
field.
+<pre class="pre codeblock"><code>describe region;
++-------------+-------------------------+---------+
+| name        | type                    | comment |
++-------------+-------------------------+---------+
+| r_regionkey | smallint                |         |
+| r_name      | string                  |         |
+| r_comment   | string                  |         |
+| r_nations   | array&lt;struct&lt;           |         |
+|             |   n_nationkey:smallint, |         |
+|             |   n_name:string,        |         |
+|             |   n_comment:string      |         |
+|             | &gt;&gt;                      |         |
++-------------+-------------------------+---------+
+
+select r_name, r_nations.item.n_nationkey
+  from region, region.r_nations as r_nations
+order by r_name, r_nations.item.n_nationkey;
++-------------+------------------+
+| r_name      | item.n_nationkey |
++-------------+------------------+
+| AFRICA      | 0                |
+| AFRICA      | 5                |
+| AFRICA      | 14               |
+| AFRICA      | 15               |
+| AFRICA      | 16               |
+| AMERICA     | 1                |
+| AMERICA     | 2                |
+| AMERICA     | 3                |
+| AMERICA     | 17               |
+| AMERICA     | 24               |
+| ASIA        | 8                |
+| ASIA        | 9                |
+| ASIA        | 12               |
+| ASIA        | 18               |
+| ASIA        | 21               |
+| EUROPE      | 6                |
+| EUROPE      | 7                |
+| EUROPE      | 19               |
+| EUROPE      | 22               |
+| EUROPE      | 23               |
+| MIDDLE EAST | 4                |
+| MIDDLE EAST | 10               |
+| MIDDLE EAST | 11               |
+| MIDDLE EAST | 13               |
+| MIDDLE EAST | 20               |
++-------------+------------------+
+
+select
+  r_name,
+  count(r_nations.item.n_nationkey) as count,
+  sum(r_nations.item.n_nationkey) as sum,
+  avg(r_nations.item.n_nationkey) as avg,
+  min(r_nations.item.n_name) as minimum,
+  max(r_nations.item.n_name) as maximum,
+  ndv(r_nations.item.n_nationkey) as distinct_vals
+from
+  region, region.r_nations as r_nations
+group by r_name
+order by r_name;
++-------------+-------+-----+------+-----------+----------------+---------------+
+| r_name      | count | sum | avg  | minimum   | maximum        | 
distinct_vals |
++-------------+-------+-----+------+-----------+----------------+---------------+
+| AFRICA      | 5     | 50  | 10   | ALGERIA   | MOZAMBIQUE     | 5            
 |
+| AMERICA     | 5     | 47  | 9.4  | ARGENTINA | UNITED STATES  | 5            
 |
+| ASIA        | 5     | 68  | 13.6 | CHINA     | VIETNAM        | 5            
 |
+| EUROPE      | 5     | 77  | 15.4 | FRANCE    | UNITED KINGDOM | 5            
 |
+| MIDDLE EAST | 5     | 58  | 11.6 | EGYPT     | SAUDI ARABIA   | 5            
 |
++-------------+-------+-----+------+-----------+----------------+---------------+
+</code></pre>
+</div>
+
+    <p class="p">
+        <strong class="ph b">Examples:</strong>
+      </p>
+
+    <p class="p">
+      The following example shows how to use <code class="ph 
codeph">SUM()</code> to compute the total for all the values in the
+      table, a subset of values, or the sum for each combination of values in 
the <code class="ph codeph">GROUP BY</code> clause:
+    </p>
+
+<pre class="pre codeblock"><code>-- Total all the values for this column in 
the table.
+select sum(c1) from t1;
+-- Find the total for this column from a subset of the table.
+select sum(c1) from t1 where month = 'January' and year = '2013';
+-- Find the total from a set of numeric function results.
+select sum(length(s)) from t1;
+-- Often used with functions that return predefined values to compute a score.
+select sum(case when grade = 'A' then 1.0 when grade = 'B' then 0.75 else 0) 
as class_honors from test_scores;
+-- Can also be used in combination with DISTINCT and/or GROUP BY.
+-- Return more than one result.
+select month, year, sum(purchase_price) from store_stats group by month, year;
+-- Filter the input to eliminate duplicates before performing the calculation.
+select sum(distinct x) from t1;
+</code></pre>
+
+    <div class="p">
+      The following examples show how to use <code class="ph 
codeph">SUM()</code> in an analytic context. They use a table
+      containing integers from 1 to 10. Notice how the <code class="ph 
codeph">SUM()</code> is reported for each input value, as
+      opposed to the <code class="ph codeph">GROUP BY</code> clause which 
condenses the result set.
+<pre class="pre codeblock"><code>select x, property, sum(x) <strong class="ph 
b">over (partition by property)</strong> as sum from int_t where property in 
('odd','even');
++----+----------+-----+
+| x  | property | sum |
++----+----------+-----+
+| 2  | even     | 30  |
+| 4  | even     | 30  |
+| 6  | even     | 30  |
+| 8  | even     | 30  |
+| 10 | even     | 30  |
+| 1  | odd      | 25  |
+| 3  | odd      | 25  |
+| 5  | odd      | 25  |
+| 7  | odd      | 25  |
+| 9  | odd      | 25  |
++----+----------+-----+
+</code></pre>
+
+Adding an <code class="ph codeph">ORDER BY</code> clause lets you experiment 
with results that are cumulative or apply to a moving
+set of rows (the <span class="q">"window"</span>). The following examples use 
<code class="ph codeph">SUM()</code> in an analytic context
+(that is, with an <code class="ph codeph">OVER()</code> clause) to produce a 
running total of all the even values,
+then a running total of all the odd values. The basic <code class="ph 
codeph">ORDER BY x</code> clause implicitly
+activates a window clause of <code class="ph codeph">RANGE BETWEEN UNBOUNDED 
PRECEDING AND CURRENT ROW</code>,
+which is effectively the same as <code class="ph codeph">ROWS BETWEEN 
UNBOUNDED PRECEDING AND CURRENT ROW</code>,
+therefore all of these examples produce the same results:
+<pre class="pre codeblock"><code>select x, property,
+  sum(x) over (partition by property <strong class="ph b">order by x</strong>) 
as 'cumulative total'
+  from int_t where property in ('odd','even');
++----+----------+------------------+
+| x  | property | cumulative total |
++----+----------+------------------+
+| 2  | even     | 2                |
+| 4  | even     | 6                |
+| 6  | even     | 12               |
+| 8  | even     | 20               |
+| 10 | even     | 30               |
+| 1  | odd      | 1                |
+| 3  | odd      | 4                |
+| 5  | odd      | 9                |
+| 7  | odd      | 16               |
+| 9  | odd      | 25               |
++----+----------+------------------+
+
+select x, property,
+  sum(x) over
+  (
+    partition by property
+    <strong class="ph b">order by x</strong>
+    <strong class="ph b">range between unbounded preceding and current 
row</strong>
+  ) as 'cumulative total'
+from int_t where property in ('odd','even');
++----+----------+------------------+
+| x  | property | cumulative total |
++----+----------+------------------+
+| 2  | even     | 2                |
+| 4  | even     | 6                |
+| 6  | even     | 12               |
+| 8  | even     | 20               |
+| 10 | even     | 30               |
+| 1  | odd      | 1                |
+| 3  | odd      | 4                |
+| 5  | odd      | 9                |
+| 7  | odd      | 16               |
+| 9  | odd      | 25               |
++----+----------+------------------+
+
+select x, property,
+  sum(x) over
+  (
+    partition by property
+    <strong class="ph b">order by x</strong>
+    <strong class="ph b">rows between unbounded preceding and current 
row</strong>
+  ) as 'cumulative total'
+  from int_t where property in ('odd','even');
++----+----------+------------------+
+| x  | property | cumulative total |
++----+----------+------------------+
+| 2  | even     | 2                |
+| 4  | even     | 6                |
+| 6  | even     | 12               |
+| 8  | even     | 20               |
+| 10 | even     | 30               |
+| 1  | odd      | 1                |
+| 3  | odd      | 4                |
+| 5  | odd      | 9                |
+| 7  | odd      | 16               |
+| 9  | odd      | 25               |
++----+----------+------------------+
+</code></pre>
+
+Changing the direction of the <code class="ph codeph">ORDER BY</code> clause 
causes the intermediate
+results of the cumulative total to be calculated in a different order:
+
+<pre class="pre codeblock"><code>select sum(x) over (partition by property 
<strong class="ph b">order by x desc</strong>) as 'cumulative total'
+  from int_t where property in ('odd','even');
++----+----------+------------------+
+| x  | property | cumulative total |
++----+----------+------------------+
+| 10 | even     | 10               |
+| 8  | even     | 18               |
+| 6  | even     | 24               |
+| 4  | even     | 28               |
+| 2  | even     | 30               |
+| 9  | odd      | 9                |
+| 7  | odd      | 16               |
+| 5  | odd      | 21               |
+| 3  | odd      | 24               |
+| 1  | odd      | 25               |
++----+----------+------------------+
+</code></pre>
+
+The following examples show how to construct a moving window, with a running 
total taking into account 1 row before
+and 1 row after the current row, within the same partition (all the even 
values or all the odd values).
+Because of a restriction in the Impala <code class="ph codeph">RANGE</code> 
syntax, this type of
+moving window is possible with the <code class="ph codeph">ROWS BETWEEN</code> 
clause but not the <code class="ph codeph">RANGE BETWEEN</code>
+clause:
+<pre class="pre codeblock"><code>select x, property,
+  sum(x) over
+  (
+    partition by property
+    <strong class="ph b">order by x</strong>
+    <strong class="ph b">rows between 1 preceding and 1 following</strong>
+  ) as 'moving total'
+  from int_t where property in ('odd','even');
++----+----------+--------------+
+| x  | property | moving total |
++----+----------+--------------+
+| 2  | even     | 6            |
+| 4  | even     | 12           |
+| 6  | even     | 18           |
+| 8  | even     | 24           |
+| 10 | even     | 18           |
+| 1  | odd      | 4            |
+| 3  | odd      | 9            |
+| 5  | odd      | 15           |
+| 7  | odd      | 21           |
+| 9  | odd      | 16           |
++----+----------+--------------+
+
+-- Doesn't work because of syntax restriction on RANGE clause.
+select x, property,
+  sum(x) over
+  (
+    partition by property
+    <strong class="ph b">order by x</strong>
+    <strong class="ph b">range between 1 preceding and 1 following</strong>
+  ) as 'moving total'
+from int_t where property in ('odd','even');
+ERROR: AnalysisException: RANGE is only supported with both the lower and 
upper bounds UNBOUNDED or one UNBOUNDED and the other CURRENT ROW.
+</code></pre>
+    </div>
+
+    <p class="p">
+        <strong class="ph b">Restrictions:</strong>
+      </p>
+
+
+
+    <p class="p">
+        Due to the way arithmetic on <code class="ph codeph">FLOAT</code> and 
<code class="ph codeph">DOUBLE</code> columns uses
+        high-performance hardware instructions, and distributed queries can 
perform these operations in different
+        order for each query, results can vary slightly for aggregate function 
calls such as <code class="ph codeph">SUM()</code>
+        and <code class="ph codeph">AVG()</code> for <code class="ph 
codeph">FLOAT</code> and <code class="ph codeph">DOUBLE</code> columns, 
particularly on
+        large data sets where millions or billions of values are summed or 
averaged. For perfect consistency and
+        repeatability, use the <code class="ph codeph">DECIMAL</code> data 
type for such operations instead of
+        <code class="ph codeph">FLOAT</code> or <code class="ph 
codeph">DOUBLE</code>.
+      </p>
+
+    <p class="p">
+        <strong class="ph b">Related information:</strong>
+      </p>
+
+    <p class="p">
+      <a class="xref" 
href="impala_analytic_functions.html#analytic_functions">Impala Analytic 
Functions</a>
+    </p>
+
+  </div>
+<nav role="navigation" class="related-links"><div class="familylinks"><div 
class="parentlink"><strong>Parent topic:</strong> <a class="link" 
href="../topics/impala_aggregate_functions.html">Impala Aggregate 
Functions</a></div></div></nav></article></main></body></html>
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/75c46918/docs/build/html/topics/impala_support_start_over.html
----------------------------------------------------------------------
diff --git a/docs/build/html/topics/impala_support_start_over.html 
b/docs/build/html/topics/impala_support_start_over.html
new file mode 100644
index 0000000..f813773
--- /dev/null
+++ b/docs/build/html/topics/impala_support_start_over.html
@@ -0,0 +1,30 @@
+<!DOCTYPE html
+  SYSTEM "about:legacy-compat">
+<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; 
charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) 
Copyright 2017"><meta name="DC.rights.owner" content="(C) Copyright 2017"><meta 
name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" 
content="../topics/impala_query_options.html"><meta name="prodname" 
content="Impala"><meta name="prodname" content="Impala"><meta name="version" 
content="Impala 2.8.x"><meta name="version" content="Impala 2.8.x"><meta 
name="DC.Format" content="XHTML"><meta name="DC.Identifier" 
content="support_start_over"><link rel="stylesheet" type="text/css" 
href="../commonltr.css"><title>SUPPORT_START_OVER Query 
Option</title></head><body id="support_start_over"><main role="main"><article 
role="article" aria-labelledby="ariaid-title1">
+
+  <h1 class="title topictitle1" id="ariaid-title1">SUPPORT_START_OVER Query 
Option</h1>
+  
+  
+
+  <div class="body conbody">
+
+    <p class="p">
+      
+      Leave this setting at its default value.
+      It is a read-only setting, tested by some client applications such as 
Hue.
+    </p>
+    <p class="p">
+      If you accidentally change it through <span class="keyword 
cmdname">impala-shell</span>,
+      subsequent queries encounter errors until you undo the change
+      by issuing <code class="ph codeph">UNSET support_start_over</code>.
+    </p>
+
+    <p class="p">
+        <strong class="ph b">Type:</strong> Boolean; recognized values are 1 
and 0, or <code class="ph codeph">true</code> and <code class="ph 
codeph">false</code>;
+        any other value interpreted as <code class="ph codeph">false</code>
+      </p>
+    <p class="p">
+        <strong class="ph b">Default:</strong> <code class="ph 
codeph">false</code>
+      </p>
+  </div>
+<nav role="navigation" class="related-links"><div class="familylinks"><div 
class="parentlink"><strong>Parent topic:</strong> <a class="link" 
href="../topics/impala_query_options.html">Query Options for the SET 
Statement</a></div></div></nav></article></main></body></html>
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/75c46918/docs/build/html/topics/impala_sync_ddl.html
----------------------------------------------------------------------
diff --git a/docs/build/html/topics/impala_sync_ddl.html 
b/docs/build/html/topics/impala_sync_ddl.html
new file mode 100644
index 0000000..b19e266
--- /dev/null
+++ b/docs/build/html/topics/impala_sync_ddl.html
@@ -0,0 +1,55 @@
+<!DOCTYPE html
+  SYSTEM "about:legacy-compat">
+<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; 
charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) 
Copyright 2017"><meta name="DC.rights.owner" content="(C) Copyright 2017"><meta 
name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" 
content="../topics/impala_query_options.html"><meta name="prodname" 
content="Impala"><meta name="prodname" content="Impala"><meta name="version" 
content="Impala 2.8.x"><meta name="version" content="Impala 2.8.x"><meta 
name="DC.Format" content="XHTML"><meta name="DC.Identifier" 
content="sync_ddl"><link rel="stylesheet" type="text/css" 
href="../commonltr.css"><title>SYNC_DDL Query Option</title></head><body 
id="sync_ddl"><main role="main"><article role="article" 
aria-labelledby="ariaid-title1">
+
+  <h1 class="title topictitle1" id="ariaid-title1">SYNC_DDL Query Option</h1>
+  
+  
+
+  <div class="body conbody">
+
+    <p class="p">
+      
+      When enabled, causes any DDL operation such as <code class="ph 
codeph">CREATE TABLE</code> or <code class="ph codeph">ALTER TABLE</code>
+      to return only when the changes have been propagated to all other Impala 
nodes in the cluster by the Impala
+      catalog service. That way, if you issue a subsequent <code class="ph 
codeph">CONNECT</code> statement in
+      <span class="keyword cmdname">impala-shell</span> to connect to a 
different node in the cluster, you can be sure that other
+      node will already recognize any added or changed tables. (The catalog 
service automatically broadcasts the
+      DDL changes to all nodes automatically, but without this option there 
could be a period of inconsistency if
+      you quickly switched to another node, such as by issuing a subsequent 
query through a load-balancing proxy.)
+    </p>
+
+    <p class="p">
+      Although <code class="ph codeph">INSERT</code> is classified as a DML 
statement, when the <code class="ph codeph">SYNC_DDL</code> option
+      is enabled, <code class="ph codeph">INSERT</code> statements also delay 
their completion until all the underlying data and
+      metadata changes are propagated to all Impala nodes. Internally, Impala 
inserts have similarities with DDL
+      statements in traditional database systems, because they create metadata 
needed to track HDFS block locations
+      for new files and they potentially add new partitions to partitioned 
tables.
+    </p>
+
+    <div class="note note note_note"><span class="note__title 
notetitle">Note:</span> 
+      Because this option can introduce a delay after each write operation, if 
you are running a sequence of
+      <code class="ph codeph">CREATE DATABASE</code>, <code class="ph 
codeph">CREATE TABLE</code>, <code class="ph codeph">ALTER TABLE</code>,
+      <code class="ph codeph">INSERT</code>, and similar statements within a 
setup script, to minimize the overall delay you can
+      enable the <code class="ph codeph">SYNC_DDL</code> query option only 
near the end, before the final DDL statement.
+    </div>
+
+    <p class="p">
+        <strong class="ph b">Type:</strong> Boolean; recognized values are 1 
and 0, or <code class="ph codeph">true</code> and <code class="ph 
codeph">false</code>;
+        any other value interpreted as <code class="ph codeph">false</code>
+      </p>
+    <p class="p">
+        <strong class="ph b">Default:</strong> <code class="ph 
codeph">false</code> (shown as 0 in output of <code class="ph 
codeph">SET</code> statement)
+      </p>
+
+    
+
+    <p class="p">
+        <strong class="ph b">Related information:</strong>
+      </p>
+    <p class="p">
+      <a class="xref" href="impala_ddl.html#ddl">DDL Statements</a>
+    </p>
+
+  </div>
+<nav role="navigation" class="related-links"><div class="familylinks"><div 
class="parentlink"><strong>Parent topic:</strong> <a class="link" 
href="../topics/impala_query_options.html">Query Options for the SET 
Statement</a></div></div></nav></article></main></body></html>
\ No newline at end of file

[04/51] [partial] incubator-impala git commit: IMPALA-4181 [DOCS] Publish rendered Impala documentation to ASF site

Reply via email to