http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3be0f122/docs/topics/impala_describe.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_describe.xml b/docs/topics/impala_describe.xml new file mode 100644 index 0000000..c4070c3 --- /dev/null +++ b/docs/topics/impala_describe.xml @@ -0,0 +1,689 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd"> +<concept id="describe"> + + <title id="desc">DESCRIBE Statement</title> + <titlealts audience="PDF"><navtitle>DESCRIBE</navtitle></titlealts> + <prolog> + <metadata> + <data name="Category" value="Impala"/> + <data name="Category" value="Impala Data Types"/> + <data name="Category" value="SQL"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> + <data name="Category" value="Tables"/> + <data name="Category" value="Reports"/> + <data name="Category" value="Schemas"/> + </metadata> + </prolog> + + <conbody> + + <p> + <indexterm audience="Cloudera">DESCRIBE statement</indexterm> + The <codeph>DESCRIBE</codeph> statement displays metadata about a table, such as the column names and their + data types. + <ph rev="2.3.0">In <keyword keyref="impala23_full"/> and higher, you can specify the name of a complex type column, which takes + the form of a dotted path. The path might include multiple components in the case of a nested type definition.</ph> + <ph rev="2.5.0">In <keyword keyref="impala25_full"/> and higher, the <codeph>DESCRIBE DATABASE</codeph> form can display + information about a database.</ph> + </p> + + <p conref="../shared/impala_common.xml#common/syntax_blurb"/> + +<codeblock rev="2.5.0">DESCRIBE [DATABASE] [FORMATTED|EXTENDED] <varname>object_name</varname> + +object_name ::= + [<varname>db_name</varname>.]<varname>table_name</varname>[.<varname>complex_col_name</varname> ...] + | <varname>db_name</varname> +</codeblock> + + <p> + You can use the abbreviation <codeph>DESC</codeph> for the <codeph>DESCRIBE</codeph> statement. + </p> + + <p rev="1.1"> + The <codeph>DESCRIBE FORMATTED</codeph> variation displays additional information, in a format familiar to + users of Apache Hive. The extra information includes low-level details such as whether the table is internal + or external, when it was created, the file format, the location of the data in HDFS, whether the object is a + table or a view, and (for views) the text of the query from the view definition. + </p> + + <note> + The <codeph>Compressed</codeph> field is not a reliable indicator of whether the table contains compressed + data. It typically always shows <codeph>No</codeph>, because the compression settings only apply during the + session that loads data and are not stored persistently with the table metadata. + </note> + +<p rev="2.5.0 IMPALA-2196"> + <b>Describing databases:</b> +</p> + +<p rev="2.5.0"> + By default, the <codeph>DESCRIBE</codeph> output for a database includes the location + and the comment, which can be set by the <codeph>LOCATION</codeph> and <codeph>COMMENT</codeph> + clauses on the <codeph>CREATE DATABASE</codeph> statement. +</p> + +<p rev="2.5.0"> + The additional information displayed by the <codeph>FORMATTED</codeph> or <codeph>EXTENDED</codeph> + keyword includes the HDFS user ID that is considered the owner of the database, and any + optional database properties. 
The properties could be specified by the <codeph>WITH DBPROPERTIES</codeph> + clause if the database is created using a Hive <codeph>CREATE DATABASE</codeph> statement. + Impala currently does not set or do any special processing based on those properties. +</p> + +<p rev="2.5.0"> +The following examples show the variations in syntax and output for +describing databases. This feature is available in <keyword keyref="impala25_full"/> +and higher. +</p> + +<codeblock rev="2.5.0"> +describe database default; ++---------+----------------------+-----------------------+ +| name | location | comment | ++---------+----------------------+-----------------------+ +| default | /user/hive/warehouse | Default Hive database | ++---------+----------------------+-----------------------+ + +describe database formatted default; ++---------+----------------------+-----------------------+ +| name | location | comment | ++---------+----------------------+-----------------------+ +| default | /user/hive/warehouse | Default Hive database | +| Owner: | | | +| | public | ROLE | ++---------+----------------------+-----------------------+ + +describe database extended default; ++---------+----------------------+-----------------------+ +| name | location | comment | ++---------+----------------------+-----------------------+ +| default | /user/hive/warehouse | Default Hive database | +| Owner: | | | +| | public | ROLE | ++---------+----------------------+-----------------------+ +</codeblock> + +<p> + <b>Describing tables:</b> +</p> + +<p> + If the <codeph>DATABASE</codeph> keyword is omitted, the default + for the <codeph>DESCRIBE</codeph> statement is to refer to a table. +</p> + +<codeblock> +-- By default, the table is assumed to be in the current database. +describe my_table; ++------+--------+---------+ +| name | type | comment | ++------+--------+---------+ +| x | int | | +| s | string | | ++------+--------+---------+ + +-- Use a fully qualified table name to specify a table in any database. +describe my_database.my_table; ++------+--------+---------+ +| name | type | comment | ++------+--------+---------+ +| x | int | | +| s | string | | ++------+--------+---------+ + +-- The formatted or extended output includes additional useful information. +-- The LOCATION field is especially useful to know for DDL statements and HDFS commands +-- during ETL jobs. (The LOCATION includes a full hdfs:// URL, omitted here for readability.) +describe formatted my_table; ++------------------------------+----------------------------------------------+----------------------+ +| name | type | comment | ++------------------------------+----------------------------------------------+----------------------+ +| # col_name | data_type | comment | +| | NULL | NULL | +| x | int | NULL | +| s | string | NULL | +| | NULL | NULL | +| # Detailed Table Information | NULL | NULL | +| Database: | my_database | NULL | +| Owner: | jrussell | NULL | +| CreateTime: | Fri Mar 18 15:58:00 PDT 2016 | NULL | +| LastAccessTime: | UNKNOWN | NULL | +| Protect Mode: | None | NULL | +| Retention: | 0 | NULL | +| Location: | /user/hive/warehouse/my_database.db/my_table | NULL | +| Table Type: | MANAGED_TABLE | NULL | +| Table Parameters: | NULL | NULL | +| | transient_lastDdlTime | 1458341880 | +| | NULL | NULL | +| # Storage Information | NULL | NULL | +| SerDe Library: | org. ... .LazySimpleSerDe | NULL | +| InputFormat: | org.apache.hadoop.mapred.TextInputFormat | NULL | +| OutputFormat: | org. ... 
.HiveIgnoreKeyTextOutputFormat | NULL | +| Compressed: | No | NULL | +| Num Buckets: | 0 | NULL | +| Bucket Columns: | [] | NULL | +| Sort Columns: | [] | NULL | ++------------------------------+----------------------------------------------+----------------------+ +</codeblock> + + <p conref="../shared/impala_common.xml#common/complex_types_blurb"/> + + <p rev="2.3.0"> + Because the column definitions for complex types can become long, particularly when such types are nested, + the <codeph>DESCRIBE</codeph> statement uses special formatting for complex type columns to make the output readable. + </p> + + <p rev="2.3.0"> + For the <codeph>ARRAY</codeph>, <codeph>STRUCT</codeph>, and <codeph>MAP</codeph> types available in + <keyword keyref="impala23_full"/> and higher, the <codeph>DESCRIBE</codeph> output is formatted to avoid + excessively long lines for multiple fields within a <codeph>STRUCT</codeph>, or a nested sequence of + complex types. + </p> + + <p conref="../shared/impala_common.xml#common/complex_types_describe"/> + + <p rev="2.3.0"> + For example, here is the <codeph>DESCRIBE</codeph> output for a table containing a single top-level column + of each complex type: + </p> + +<codeblock rev="2.3.0"><![CDATA[create table t1 (x int, a array<int>, s struct<f1: string, f2: bigint>, m map<string,int>) stored as parquet; + +describe t1; ++------+-----------------+---------+ +| name | type | comment | ++------+-----------------+---------+ +| x | int | | +| a | array<int> | | +| s | struct< | | +| | f1:string, | | +| | f2:bigint | | +| | > | | +| m | map<string,int> | | ++------+-----------------+---------+ +]]> +</codeblock> + + <p rev="2.3.0"> + Here are examples showing how to <q>drill down</q> into the layouts of complex types, including + using multi-part names to examine the definitions of nested types. + The <codeph>< ></codeph> delimiters identify the columns with complex types; + these are the columns where you can descend another level to see the parts that make up + the complex type. + This technique helps you to understand the multi-part names you use as table references in queries + involving complex types, and the corresponding column names you refer to in the <codeph>SELECT</codeph> list. + These tables are from the <q>nested TPC-H</q> schema, shown in detail in + <xref href="impala_complex_types.xml#complex_sample_schema"/>. + </p> + + <p> + The <codeph>REGION</codeph> table contains an <codeph>ARRAY</codeph> of <codeph>STRUCT</codeph> + elements: + </p> + + <ul> + <li> + <p> + The first <codeph>DESCRIBE</codeph> specifies the table name, to display the definition + of each top-level column. + </p> + </li> + <li> + <p> + The second <codeph>DESCRIBE</codeph> specifies the name of a complex + column, <codeph>REGION.R_NATIONS</codeph>, showing that when you include the name of an <codeph>ARRAY</codeph> + column in a <codeph>FROM</codeph> clause, that table reference acts like a two-column table with + columns <codeph>ITEM</codeph> and <codeph>POS</codeph>. + </p> + </li> + <li> + <p> + The final <codeph>DESCRIBE</codeph> specifies the fully qualified name of the <codeph>ITEM</codeph> field, + to display the layout of its underlying <codeph>STRUCT</codeph> type in table format, with the fields + mapped to column names. + </p> + </li> + </ul> + +<codeblock rev="2.3.0"><![CDATA[ +-- #1: The overall layout of the entire table. 
+describe region;
++-------------+-------------------------+---------+
+| name | type | comment |
++-------------+-------------------------+---------+
+| r_regionkey | smallint | |
+| r_name | string | |
+| r_comment | string | |
+| r_nations | array<struct< | |
+| | n_nationkey:smallint, | |
+| | n_name:string, | |
+| | n_comment:string | |
+| | >> | |
++-------------+-------------------------+---------+
+
+-- #2: The ARRAY column within the table.
+describe region.r_nations;
++------+-------------------------+---------+
+| name | type | comment |
++------+-------------------------+---------+
+| item | struct< | |
+| | n_nationkey:smallint, | |
+| | n_name:string, | |
+| | n_comment:string | |
+| | > | |
+| pos | bigint | |
++------+-------------------------+---------+
+
+-- #3: The STRUCT that makes up each ARRAY element.
+-- The fields of the STRUCT act like columns of a table.
+describe region.r_nations.item;
++-------------+----------+---------+
+| name | type | comment |
++-------------+----------+---------+
+| n_nationkey | smallint | |
+| n_name | string | |
+| n_comment | string | |
++-------------+----------+---------+
+]]>
+</codeblock>
+
+ <p>
+ The <codeph>CUSTOMER</codeph> table contains an <codeph>ARRAY</codeph> of <codeph>STRUCT</codeph>
+ elements, where one field in the <codeph>STRUCT</codeph> is another <codeph>ARRAY</codeph> of
+ <codeph>STRUCT</codeph> elements:
+ </p>
+ <ul>
+ <li>
+ <p>
+ Again, the initial <codeph>DESCRIBE</codeph> specifies only the table name.
+ </p>
+ </li>
+ <li>
+ <p>
+ The second <codeph>DESCRIBE</codeph> specifies the qualified name of the complex
+ column, <codeph>CUSTOMER.C_ORDERS</codeph>, showing how an <codeph>ARRAY</codeph>
+ is represented as a two-column table with columns <codeph>ITEM</codeph> and <codeph>POS</codeph>.
+ </p>
+ </li>
+ <li>
+ <p>
+ The third <codeph>DESCRIBE</codeph> specifies the qualified name of the <codeph>ITEM</codeph>
+ of the <codeph>ARRAY</codeph> column, to see the structure of the nested <codeph>ARRAY</codeph>.
+ Again, it has two parts, <codeph>ITEM</codeph> and <codeph>POS</codeph>. Because the
+ <codeph>ARRAY</codeph> contains a <codeph>STRUCT</codeph>, the layout of the <codeph>STRUCT</codeph>
+ is shown.
+ </p>
+ </li>
+ <li>
+ <p>
+ The fourth and fifth <codeph>DESCRIBE</codeph> statements drill down into a <codeph>STRUCT</codeph> field that
+ is itself a complex type, an <codeph>ARRAY</codeph> of <codeph>STRUCT</codeph>.
+ The <codeph>ITEM</codeph> portion of the qualified name is only required when the <codeph>ARRAY</codeph>
+ elements are anonymous. The fields of the <codeph>STRUCT</codeph> give names to any other complex types
+ nested inside the <codeph>STRUCT</codeph>. Therefore, the <codeph>DESCRIBE</codeph> parameters
+ <codeph>CUSTOMER.C_ORDERS.ITEM.O_LINEITEMS</codeph> and <codeph>CUSTOMER.C_ORDERS.O_LINEITEMS</codeph>
+ are equivalent. (For brevity, leave out the <codeph>ITEM</codeph> portion of
+ a qualified name when it is not required.)
+ </p>
+ </li>
+ <li>
+ <p>
+ The final <codeph>DESCRIBE</codeph> shows the layout of the deeply nested <codeph>STRUCT</codeph> type.
+ Because there are no more complex types nested inside this <codeph>STRUCT</codeph>, this is as far
+ as you can drill down into the layout for this table.
+ </p>
+ </li>
+ </ul>
+
+<codeblock rev="2.3.0"><![CDATA[-- #1: The overall layout of the entire table.
+describe customer; ++--------------+------------------------------------+ +| name | type | ++--------------+------------------------------------+ +| c_custkey | bigint | +... more scalar columns ... +| c_orders | array<struct< | +| | o_orderkey:bigint, | +| | o_orderstatus:string, | +| | o_totalprice:decimal(12,2), | +| | o_orderdate:string, | +| | o_orderpriority:string, | +| | o_clerk:string, | +| | o_shippriority:int, | +| | o_comment:string, | +| | o_lineitems:array<struct< | +| | l_partkey:bigint, | +| | l_suppkey:bigint, | +| | l_linenumber:int, | +| | l_quantity:decimal(12,2), | +| | l_extendedprice:decimal(12,2), | +| | l_discount:decimal(12,2), | +| | l_tax:decimal(12,2), | +| | l_returnflag:string, | +| | l_linestatus:string, | +| | l_shipdate:string, | +| | l_commitdate:string, | +| | l_receiptdate:string, | +| | l_shipinstruct:string, | +| | l_shipmode:string, | +| | l_comment:string | +| | >> | +| | >> | ++--------------+------------------------------------+ + +-- #2: The ARRAY column within the table. +describe customer.c_orders; ++------+------------------------------------+ +| name | type | ++------+------------------------------------+ +| item | struct< | +| | o_orderkey:bigint, | +| | o_orderstatus:string, | +... more struct fields ... +| | o_lineitems:array<struct< | +| | l_partkey:bigint, | +| | l_suppkey:bigint, | +... more nested struct fields ... +| | l_comment:string | +| | >> | +| | > | +| pos | bigint | ++------+------------------------------------+ + +-- #3: The STRUCT that makes up each ARRAY element. +-- The fields of the STRUCT act like columns of a table. +describe customer.c_orders.item; ++-----------------+----------------------------------+ +| name | type | ++-----------------+----------------------------------+ +| o_orderkey | bigint | +| o_orderstatus | string | +| o_totalprice | decimal(12,2) | +| o_orderdate | string | +| o_orderpriority | string | +| o_clerk | string | +| o_shippriority | int | +| o_comment | string | +| o_lineitems | array<struct< | +| | l_partkey:bigint, | +| | l_suppkey:bigint, | +... more struct fields ... +| | l_comment:string | +| | >> | ++-----------------+----------------------------------+ + +-- #4: The ARRAY nested inside the STRUCT elements of the first ARRAY. +describe customer.c_orders.item.o_lineitems; ++------+----------------------------------+ +| name | type | ++------+----------------------------------+ +| item | struct< | +| | l_partkey:bigint, | +| | l_suppkey:bigint, | +... more struct fields ... +| | l_comment:string | +| | > | +| pos | bigint | ++------+----------------------------------+ + +-- #5: Shorter form of the previous DESCRIBE. Omits the .ITEM portion of the name +-- because O_LINEITEMS and other field names provide a way to refer to things +-- inside the ARRAY element. +describe customer.c_orders.o_lineitems; ++------+----------------------------------+ +| name | type | ++------+----------------------------------+ +| item | struct< | +| | l_partkey:bigint, | +| | l_suppkey:bigint, | +... more struct fields ... +| | l_comment:string | +| | > | +| pos | bigint | ++------+----------------------------------+ + +-- #6: The STRUCT representing ARRAY elements nested inside +-- another ARRAY of STRUCTs. The lack of any complex types +-- in this output means this is as far as DESCRIBE can +-- descend into the table layout. +describe customer.c_orders.o_lineitems.item; ++-----------------+---------------+ +| name | type | ++-----------------+---------------+ +| l_partkey | bigint | +| l_suppkey | bigint | +... 
more scalar columns ... +| l_comment | string | ++-----------------+---------------+ +]]> +</codeblock> + +<p conref="../shared/impala_common.xml#common/usage_notes_blurb"/> + +<p> + After the <cmdname>impalad</cmdname> daemons are restarted, the first query against a table can take longer + than subsequent queries, because the metadata for the table is loaded before the query is processed. This + one-time delay for each table can cause misleading results in benchmark tests or cause unnecessary concern. + To <q>warm up</q> the Impala metadata cache, you can issue a <codeph>DESCRIBE</codeph> statement in advance + for each table you intend to access later. +</p> + +<p> + When you are dealing with data files stored in HDFS, sometimes it is important to know details such as the + path of the data files for an Impala table, and the hostname for the namenode. You can get this information + from the <codeph>DESCRIBE FORMATTED</codeph> output. You specify HDFS URIs or path specifications with + statements such as <codeph>LOAD DATA</codeph> and the <codeph>LOCATION</codeph> clause of <codeph>CREATE + TABLE</codeph> or <codeph>ALTER TABLE</codeph>. You might also use HDFS URIs or paths with Linux commands + such as <cmdname>hadoop</cmdname> and <cmdname>hdfs</cmdname> to copy, rename, and so on, data files in HDFS. +</p> + +<p conref="../shared/impala_common.xml#common/sync_ddl_blurb"/> + +<p rev="1.2.1"> + Each table can also have associated table statistics and column statistics. To see these categories of + information, use the <codeph>SHOW TABLE STATS <varname>table_name</varname></codeph> and <codeph>SHOW COLUMN + STATS <varname>table_name</varname></codeph> statements. +<!-- +For example, the table statistics can often show you the number +and total size of the files in the table, even if you have not +run <codeph>COMPUTE STATS</codeph>. +--> + See <xref href="impala_show.xml#show"/> for details. +</p> + +<note conref="../shared/impala_common.xml#common/compute_stats_next"/> + +<p conref="../shared/impala_common.xml#common/example_blurb"/> + +<p> + The following example shows the results of both a standard <codeph>DESCRIBE</codeph> and <codeph>DESCRIBE + FORMATTED</codeph> for different kinds of schema objects: +</p> + + <ul> + <li> + <codeph>DESCRIBE</codeph> for a table or a view returns the name, type, and comment for each of the + columns. For a view, if the column value is computed by an expression, the column name is automatically + generated as <codeph>_c0</codeph>, <codeph>_c1</codeph>, and so on depending on the ordinal number of the + column. + </li> + + <li> + A table created with no special format or storage clauses is designated as a <codeph>MANAGED_TABLE</codeph> + (an <q>internal table</q> in Impala terminology). Its data files are stored in an HDFS directory under the + default Hive data directory. By default, it uses Text data format. + </li> + + <li> + A view is designated as <codeph>VIRTUAL_VIEW</codeph> in <codeph>DESCRIBE FORMATTED</codeph> output. Some + of its properties are <codeph>NULL</codeph> or blank because they are inherited from the base table. The + text of the query that defines the view is part of the <codeph>DESCRIBE FORMATTED</codeph> output. + </li> + + <li> + A table with additional clauses in the <codeph>CREATE TABLE</codeph> statement has differences in + <codeph>DESCRIBE FORMATTED</codeph> output. 
The output for <codeph>T2</codeph> includes the + <codeph>EXTERNAL_TABLE</codeph> keyword because of the <codeph>CREATE EXTERNAL TABLE</codeph> syntax, and + different <codeph>InputFormat</codeph> and <codeph>OutputFormat</codeph> fields to reflect the Parquet file + format. + </li> + </ul> + +<codeblock>[localhost:21000] > create table t1 (x int, y int, s string); +Query: create table t1 (x int, y int, s string) +[localhost:21000] > describe t1; +Query: describe t1 +Query finished, fetching results ... ++------+--------+---------+ +| name | type | comment | ++------+--------+---------+ +| x | int | | +| y | int | | +| s | string | | ++------+--------+---------+ +Returned 3 row(s) in 0.13s +[localhost:21000] > describe formatted t1; +Query: describe formatted t1 +Query finished, fetching results ... ++------------------------------+--------------------------------------------+------------+ +| name | type | comment | ++------------------------------+--------------------------------------------+------------+ +| # col_name | data_type | comment | +| | NULL | NULL | +| x | int | None | +| y | int | None | +| s | string | None | +| | NULL | NULL | +| # Detailed Table Information | NULL | NULL | +| Database: | describe_formatted | NULL | +| Owner: | cloudera | NULL | +| CreateTime: | Mon Jul 22 17:03:16 EDT 2013 | NULL | +| LastAccessTime: | UNKNOWN | NULL | +| Protect Mode: | None | NULL | +| Retention: | 0 | NULL | +| Location: | hdfs://127.0.0.1:8020/user/hive/warehouse/ | | +| | describe_formatted.db/t1 | NULL | +| Table Type: | MANAGED_TABLE | NULL | +| Table Parameters: | NULL | NULL | +| | transient_lastDdlTime | 1374526996 | +| | NULL | NULL | +| # Storage Information | NULL | NULL | +| SerDe Library: | org.apache.hadoop.hive.serde2.lazy. | | +| | LazySimpleSerDe | NULL | +| InputFormat: | org.apache.hadoop.mapred.TextInputFormat | NULL | +| OutputFormat: | org.apache.hadoop.hive.ql.io. | | +| | HiveIgnoreKeyTextOutputFormat | NULL | +| Compressed: | No | NULL | +| Num Buckets: | 0 | NULL | +| Bucket Columns: | [] | NULL | +| Sort Columns: | [] | NULL | ++------------------------------+--------------------------------------------+------------+ +Returned 26 row(s) in 0.03s +[localhost:21000] > create view v1 as select x, upper(s) from t1; +Query: create view v1 as select x, upper(s) from t1 +[localhost:21000] > describe v1; +Query: describe v1 +Query finished, fetching results ... ++------+--------+---------+ +| name | type | comment | ++------+--------+---------+ +| x | int | | +| _c1 | string | | ++------+--------+---------+ +Returned 2 row(s) in 0.10s +[localhost:21000] > describe formatted v1; +Query: describe formatted v1 +Query finished, fetching results ... 
++------------------------------+------------------------------+----------------------+ +| name | type | comment | ++------------------------------+------------------------------+----------------------+ +| # col_name | data_type | comment | +| | NULL | NULL | +| x | int | None | +| _c1 | string | None | +| | NULL | NULL | +| # Detailed Table Information | NULL | NULL | +| Database: | describe_formatted | NULL | +| Owner: | cloudera | NULL | +| CreateTime: | Mon Jul 22 16:56:38 EDT 2013 | NULL | +| LastAccessTime: | UNKNOWN | NULL | +| Protect Mode: | None | NULL | +| Retention: | 0 | NULL | +| Table Type: | VIRTUAL_VIEW | NULL | +| Table Parameters: | NULL | NULL | +| | transient_lastDdlTime | 1374526598 | +| | NULL | NULL | +| # Storage Information | NULL | NULL | +| SerDe Library: | null | NULL | +| InputFormat: | null | NULL | +| OutputFormat: | null | NULL | +| Compressed: | No | NULL | +| Num Buckets: | 0 | NULL | +| Bucket Columns: | [] | NULL | +| Sort Columns: | [] | NULL | +| | NULL | NULL | +| # View Information | NULL | NULL | +| View Original Text: | SELECT x, upper(s) FROM t1 | NULL | +| View Expanded Text: | SELECT x, upper(s) FROM t1 | NULL | ++------------------------------+------------------------------+----------------------+ +Returned 28 row(s) in 0.03s +[localhost:21000] > create external table t2 (x int, y int, s string) stored as parquet location '/user/cloudera/sample_data'; +[localhost:21000] > describe formatted t2; +Query: describe formatted t2 +Query finished, fetching results ... ++------------------------------+----------------------------------------------------+------------+ +| name | type | comment | ++------------------------------+----------------------------------------------------+------------+ +| # col_name | data_type | comment | +| | NULL | NULL | +| x | int | None | +| y | int | None | +| s | string | None | +| | NULL | NULL | +| # Detailed Table Information | NULL | NULL | +| Database: | describe_formatted | NULL | +| Owner: | cloudera | NULL | +| CreateTime: | Mon Jul 22 17:01:47 EDT 2013 | NULL | +| LastAccessTime: | UNKNOWN | NULL | +| Protect Mode: | None | NULL | +| Retention: | 0 | NULL | +| Location: | hdfs://127.0.0.1:8020/user/cloudera/sample_data | NULL | +| Table Type: | EXTERNAL_TABLE | NULL | +| Table Parameters: | NULL | NULL | +| | EXTERNAL | TRUE | +| | transient_lastDdlTime | 1374526907 | +| | NULL | NULL | +| # Storage Information | NULL | NULL | +| SerDe Library: | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | NULL | +| InputFormat: | com.cloudera.impala.hive.serde.ParquetInputFormat | NULL | +| OutputFormat: | com.cloudera.impala.hive.serde.ParquetOutputFormat | NULL | +| Compressed: | No | NULL | +| Num Buckets: | 0 | NULL | +| Bucket Columns: | [] | NULL | +| Sort Columns: | [] | NULL | ++------------------------------+----------------------------------------------------+------------+ +Returned 27 row(s) in 0.17s</codeblock> + + <p conref="../shared/impala_common.xml#common/cancel_blurb_no"/> + + <p conref="../shared/impala_common.xml#common/permissions_blurb"/> + <p rev="CDH-19187"> + The user ID that the <cmdname>impalad</cmdname> daemon runs under, + typically the <codeph>impala</codeph> user, must have read and execute + permissions for all directories that are part of the table. + (A table could span multiple different HDFS directories if it is partitioned. + The directories could be widely scattered because a partition can reside + in an arbitrary HDFS directory based on its <codeph>LOCATION</codeph> attribute.) 
+ </p> + + <p conref="../shared/impala_common.xml#common/related_info"/> + + <p> + <xref href="impala_tables.xml#tables"/>, <xref href="impala_create_table.xml#create_table"/>, + <xref href="impala_show.xml#show_tables"/>, <xref href="impala_show.xml#show_create_table"/> + </p> + </conbody> +</concept>
http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3be0f122/docs/topics/impala_development.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_development.xml b/docs/topics/impala_development.xml new file mode 100644 index 0000000..1b35046 --- /dev/null +++ b/docs/topics/impala_development.xml @@ -0,0 +1,231 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd"> +<concept id="intro_dev"> + + <title>Developing Impala Applications</title> + <titlealts audience="PDF"><navtitle>Developing Applications</navtitle></titlealts> + <prolog> + <metadata> + <data name="Category" value="Impala"/> + <data name="Category" value="SQL"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> + <data name="Category" value="Concepts"/> + </metadata> + </prolog> + + <conbody> + + <p> + The core development language with Impala is SQL. You can also use Java or other languages to interact with + Impala through the standard JDBC and ODBC interfaces used by many business intelligence tools. For + specialized kinds of analysis, you can supplement the SQL built-in functions by writing + <xref href="impala_udf.xml#udfs">user-defined functions (UDFs)</xref> in C++ or Java. + </p> + + <p outputclass="toc inpage"/> + </conbody> + + <concept id="intro_sql"> + + <title>Overview of the Impala SQL Dialect</title> + <prolog> + <metadata> + <data name="Category" value="SQL"/> + <data name="Category" value="Concepts"/> + </metadata> + </prolog> + + <conbody> + + <p> + The Impala SQL dialect is highly compatible with the SQL syntax used in the Apache Hive component (HiveQL). As + such, it is familiar to users who are already familiar with running SQL queries on the Hadoop + infrastructure. Currently, Impala SQL supports a subset of HiveQL statements, data types, and built-in + functions. Impala also includes additional built-in functions for common industry features, to simplify + porting SQL from non-Hadoop systems. + </p> + + <p> + For users coming to Impala from traditional database or data warehousing backgrounds, the following aspects of the SQL dialect + might seem familiar: + </p> + + <ul> + <li> + <p> + The <xref href="impala_select.xml#select">SELECT statement</xref> includes familiar clauses such as <codeph>WHERE</codeph>, + <codeph>GROUP BY</codeph>, <codeph>ORDER BY</codeph>, and <codeph>WITH</codeph>. + You will find familiar notions such as + <xref href="impala_joins.xml#joins">joins</xref>, <xref href="impala_functions.xml#builtins">built-in + functions</xref> for processing strings, numbers, and dates, + <xref href="impala_aggregate_functions.xml#aggregate_functions">aggregate functions</xref>, + <xref href="impala_subqueries.xml#subqueries">subqueries</xref>, and + <xref href="impala_operators.xml#comparison_operators">comparison operators</xref> + such as <codeph>IN()</codeph> and <codeph>BETWEEN</codeph>. + The <codeph>SELECT</codeph> statement is the place where SQL standards compliance is most important. + </p> + </li> + + <li> + <p> + From the data warehousing world, you will recognize the notion of + <xref href="impala_partitioning.xml#partitioning">partitioned tables</xref>. + One or more columns serve as partition keys, and the data is physically arranged so that + queries that refer to the partition key columns in the <codeph>WHERE</codeph> clause + can skip partitions that do not match the filter conditions. 
For example, if you have 10
+ years' worth of data and use a clause such as <codeph>WHERE year = 2015</codeph>,
+ <codeph>WHERE year > 2010</codeph>, or <codeph>WHERE year IN (2014, 2015)</codeph>,
+ Impala skips all the data for non-matching years, greatly reducing the amount of I/O
+ for the query.
+ </p>
+ </li>
+
+ <li rev="1.2">
+ <p>
+ In Impala 1.2 and higher, <xref href="impala_udf.xml#udfs">UDFs</xref> let you perform custom comparisons
+ and transformation logic during <codeph>SELECT</codeph> and <codeph>INSERT...SELECT</codeph> statements.
+ </p>
+ </li>
+ </ul>
+
+ <p>
+ For users coming to Impala from traditional database or data warehousing backgrounds, the following aspects of the SQL dialect
+ might require some learning and practice for you to become proficient in the Hadoop environment:
+ </p>
+
+ <ul>
+ <li>
+ <p>
+ Impala SQL is focused on queries and includes relatively little DML. There is no <codeph>UPDATE</codeph>
+ or <codeph>DELETE</codeph> statement. Stale data is typically discarded (by <codeph>DROP TABLE</codeph>
+ or <codeph>ALTER TABLE ... DROP PARTITION</codeph> statements) or replaced (by <codeph>INSERT
+ OVERWRITE</codeph> statements).
+ </p>
+ </li>
+
+ <li>
+ <p>
+ All data creation is done by <codeph>INSERT</codeph> statements, which typically insert data in bulk by
+ querying from other tables. There are two variations: <codeph>INSERT INTO</codeph>, which appends to the
+ existing data, and <codeph>INSERT OVERWRITE</codeph>, which replaces the entire contents of a table or
+ partition (similar to <codeph>TRUNCATE TABLE</codeph> followed by a new <codeph>INSERT</codeph>).
+ Although there is an <codeph>INSERT ... VALUES</codeph> syntax to insert a small number of rows in
+ a single statement, it is far more efficient to use <codeph>INSERT ... SELECT</codeph> to copy
+ and transform large amounts of data from one table to another in a single operation.
+ </p>
+ </li>
+
+ <li>
+ <p>
+ You often construct Impala table definitions and data files in some other environment, and then attach
+ Impala so that it can run real-time queries. The same data files and table metadata are shared with other
+ components of the Hadoop ecosystem. In particular, Impala can access tables created by Hive or data
+ inserted by Hive, and Hive can access tables and data produced by Impala. Many other Hadoop components
+ can write files in formats such as Parquet and Avro, which can then be queried by Impala.
+ </p>
+ </li>
+
+ <li>
+ <p>
+ Because Hadoop and Impala are focused on data warehouse-style operations on large data sets, Impala SQL
+ includes some idioms that you might find in the import utilities for traditional database systems. For
+ example, you can create a table that reads comma-separated or tab-separated text files, specifying the
+ separator in the <codeph>CREATE TABLE</codeph> statement. You can create <b>external tables</b> that read
+ existing data files but do not move or transform them.
+ </p>
+ </li>
+
+ <li>
+ <p>
+ Because Impala reads large quantities of data that might not be perfectly tidy and predictable, it does
+ not require length constraints on string data types. For example, you can define a database column as
+ <codeph>STRING</codeph> with unlimited length, rather than <codeph>CHAR(1)</codeph> or
+ <codeph>VARCHAR(64)</codeph>.
<ph rev="2.0.0">(Although in Impala 2.0 and later, you can also use
+ length-constrained <codeph>CHAR</codeph> and <codeph>VARCHAR</codeph> types.)</ph>
+ </p>
+ </li>
+
+ </ul>
+
+ <p>
+ <b>Related information:</b> <xref href="impala_langref.xml#langref"/>, especially
+ <xref href="impala_langref_sql.xml#langref_sql"/> and <xref href="impala_functions.xml#builtins"/>
+ </p>
+ </conbody>
+ </concept>
+
+<!-- Bunch of potential concept topics for future consideration. Major areas of Impala modelled on areas of discussion for Oracle Database, and distributed databases in general. -->
+
+ <concept id="intro_datatypes" audience="Cloudera">
+
+ <title>Overview of Impala SQL Data Types</title>
+
+ <conbody/>
+ </concept>
+
+ <concept id="intro_network" audience="Cloudera">
+
+ <title>Overview of Impala Network Topology</title>
+
+ <conbody/>
+ </concept>
+
+ <concept id="intro_cluster" audience="Cloudera">
+
+ <title>Overview of Impala Cluster Topology</title>
+
+ <conbody/>
+ </concept>
+
+ <concept id="intro_apis">
+
+ <title>Overview of Impala Programming Interfaces</title>
+ <prolog>
+ <metadata>
+ <data name="Category" value="JDBC"/>
+ <data name="Category" value="ODBC"/>
+ <data name="Category" value="Hue"/>
+ </metadata>
+ </prolog>
+
+ <conbody>
+
+ <p>
+ You can connect and submit requests to the Impala daemons through:
+ </p>
+
+ <ul>
+ <li>
+ The <codeph><xref href="impala_impala_shell.xml#impala_shell">impala-shell</xref></codeph> interactive
+ command interpreter.
+ </li>
+
+ <li>
+ The <xref href="http://gethue.com/" scope="external" format="html">Hue</xref> web-based user interface.
+ </li>
+
+ <li>
+ <xref href="impala_jdbc.xml#impala_jdbc">JDBC</xref>.
+ </li>
+
+ <li>
+ <xref href="impala_odbc.xml#impala_odbc">ODBC</xref>.
+ </li>
+ </ul>
+
+ <p>
+ With these options, you can use Impala in heterogeneous environments, with JDBC or ODBC applications
+ running on non-Linux platforms. You can also use Impala in combination with various Business Intelligence
+ tools that use the JDBC and ODBC interfaces.
+ </p>
+
+ <p>
+ Each <codeph>impalad</codeph> daemon process, running on separate nodes in a cluster, listens to
+ <xref href="impala_ports.xml#ports">several ports</xref> for incoming requests. Requests from
+ <codeph>impala-shell</codeph> and Hue are routed to the <codeph>impalad</codeph> daemons through the same
+ port. The <codeph>impalad</codeph> daemons listen on separate ports for JDBC and ODBC requests.
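+ </p>
+
+ <p>
+ As a quick illustration (the hostname and table name here are hypothetical), you might connect
+ <cmdname>impala-shell</cmdname> to a particular coordinator daemon and issue queries:
+ </p>
+
+<codeblock>$ impala-shell -i impala-host.example.com:21000
+[impala-host.example.com:21000] > select version();
+[impala-host.example.com:21000] > select count(*) from sample_table;</codeblock>
+
+ <p>
+ JDBC and ODBC applications connect to a different port on the same daemons; 21050 is the default port for JDBC connections.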
+ </p> + </conbody> + </concept> +</concept> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3be0f122/docs/topics/impala_disable_cached_reads.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_disable_cached_reads.xml b/docs/topics/impala_disable_cached_reads.xml new file mode 100644 index 0000000..bce6091 --- /dev/null +++ b/docs/topics/impala_disable_cached_reads.xml @@ -0,0 +1,36 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd"> +<concept id="disable_cached_reads" rev="1.4.0"> + + <title>DISABLE_CACHED_READS Query Option</title> + <prolog> + <metadata> + <data name="Category" value="Impala"/> + <data name="Category" value="Impala Query Options"/> + <data name="Category" value="HDFS"/> + <data name="Category" value="HDFS Caching"/> + <data name="Category" value="Querying"/> + <data name="Category" value="Performance"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> + </metadata> + </prolog> + + <conbody> + + <p> + <indexterm audience="Cloudera">DISABLE_CACHED_READS query option</indexterm> + Prevents Impala from reading data files that are <q>pinned</q> in memory + through the HDFS caching feature. Primarily a debugging option for + cases where processing of HDFS cached data is concentrated on a single + host, leading to excessive CPU usage on that host. + </p> + + <p conref="../shared/impala_common.xml#common/type_boolean"/> + + <p conref="../shared/impala_common.xml#common/default_false"/> + + <p conref="../shared/impala_common.xml#common/added_in_140"/> + + </conbody> +</concept> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3be0f122/docs/topics/impala_disable_codegen.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_disable_codegen.xml b/docs/topics/impala_disable_codegen.xml new file mode 100644 index 0000000..ab306e8 --- /dev/null +++ b/docs/topics/impala_disable_codegen.xml @@ -0,0 +1,38 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd"> +<concept id="disable_codegen"> + + <title>DISABLE_CODEGEN Query Option</title> + <titlealts audience="PDF"><navtitle>DISABLE_CODEGEN</navtitle></titlealts> + <prolog> + <metadata> + <data name="Category" value="Impala"/> + <data name="Category" value="Impala Query Options"/> + <data name="Category" value="Troubleshooting"/> + <data name="Category" value="Performance"/> + </metadata> + </prolog> + + <conbody> + + <p> + <indexterm audience="Cloudera">DISABLE_CODEGEN query option</indexterm> + This is a debug option, intended for diagnosing and working around issues that cause crashes. If a query + fails with an <q>illegal instruction</q> or other hardware-specific message, try setting + <codeph>DISABLE_CODEGEN=true</codeph> and running the query again. If the query succeeds only when the + <codeph>DISABLE_CODEGEN</codeph> option is turned on, submit the problem to <keyword keyref="support_org"/> and include that + detail in the problem report. Do not otherwise run with this setting turned on, because it results in lower + overall performance. + </p> + + <p> + Because the code generation phase adds a small amount of overhead for each query, you might turn on the + <codeph>DISABLE_CODEGEN</codeph> option to achieve maximum throughput when running many short-lived queries + against small tables. 
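+ </p>
+
+ <p>
+ For example, the following hypothetical session (the table and column names are placeholders)
+ brackets a short-lived query with the option:
+ </p>
+
+<codeblock>set disable_codegen=true;
+select count(*) from small_lookup_table where code = 'A';
+set disable_codegen=false;</codeblock>
+
+ <p>
+ This technique only pays off when the code generation overhead is a significant fraction of the total query time, as with trivial queries against small tables.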
+ </p> + + <p conref="../shared/impala_common.xml#common/type_boolean"/> + <p conref="../shared/impala_common.xml#common/default_false_0"/> + + </conbody> +</concept> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3be0f122/docs/topics/impala_disable_outermost_topn.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_disable_outermost_topn.xml b/docs/topics/impala_disable_outermost_topn.xml new file mode 100644 index 0000000..24fcd2a --- /dev/null +++ b/docs/topics/impala_disable_outermost_topn.xml @@ -0,0 +1,29 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd"> +<concept id="disable_outermost_topn" rev="2.5.0"> + + <title>DISABLE_OUTERMOST_TOPN Query Option</title> + <prolog> + <metadata> + <data name="Category" value="Impala"/> + <data name="Category" value="Impala Query Options"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> + </metadata> + </prolog> + + <conbody> + + <p rev="2.5.0"> + <indexterm audience="Cloudera">DISABLE_OUTERMOST_TOPN query option</indexterm> + </p> + + <p> + <b>Type:</b> + </p> + + <p> + <b>Default:</b> + </p> + </conbody> +</concept> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3be0f122/docs/topics/impala_disable_row_runtime_filtering.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_disable_row_runtime_filtering.xml b/docs/topics/impala_disable_row_runtime_filtering.xml new file mode 100644 index 0000000..a8c07cb --- /dev/null +++ b/docs/topics/impala_disable_row_runtime_filtering.xml @@ -0,0 +1,65 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd"> +<concept id="disable_row_runtime_filtering" rev="2.5.0"> + + <title>DISABLE_ROW_RUNTIME_FILTERING Query Option (<keyword keyref="impala25"/> or higher only)</title> + <titlealts audience="PDF"><navtitle>DISABLE_ROW_RUNTIME_FILTERING</navtitle></titlealts> + <prolog> + <metadata> + <data name="Category" value="Impala"/> + <data name="Category" value="Impala Query Options"/> + <data name="Category" value="Performance"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> + </metadata> + </prolog> + + <conbody> + + <p rev="2.5.0"> + <indexterm audience="Cloudera">DISABLE_ROW_RUNTIME_FILTERING query option</indexterm> + The <codeph>DISABLE_ROW_RUNTIME_FILTERING</codeph> query option + reduces the scope of the runtime filtering feature. Queries still dynamically prune + partitions, but do not apply the filtering logic to individual rows within partitions. + </p> + + <p> + Only applies to queries against Parquet tables. For other file formats, Impala + only prunes at the level of partitions, not individual rows. + </p> + + <p conref="../shared/impala_common.xml#common/type_boolean"/> + <p conref="../shared/impala_common.xml#common/default_false"/> + + <p conref="../shared/impala_common.xml#common/added_in_250"/> + + <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/> + + <p> + Impala automatically evaluates whether the per-row filters are being + effective at reducing the amount of intermediate data. Therefore, + this option is typically only needed for the rare case where Impala + cannot accurately determine how effective the per-row filtering is + for a query. 
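+ </p>
+
+ <p>
+ A minimal sketch of applying the option to a single query (the tables are hypothetical Parquet tables):
+ </p>
+
+<codeblock>set disable_row_runtime_filtering=true;
+-- Runtime filters still prune entire partitions;
+-- only the per-row filtering within the scans is skipped.
+select count(*) from parquet_facts f join dims d on f.dim_id = d.id;
+set disable_row_runtime_filtering=false;</codeblock>
+
+ <p>
+ Resetting the option afterward keeps the reduced filtering from affecting unrelated queries in the same session.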
+ </p> + + <p conref="../shared/impala_common.xml#common/runtime_filtering_option_caveat"/> + + <p> + Because this setting only improves query performance in very specific + circumstances, depending on the query characteristics and data distribution, + only use it when you determine through benchmarking that it improves + performance of specific expensive queries. + Consider setting this query option immediately before the expensive query and + unsetting it immediately afterward. + </p> + + <p conref="../shared/impala_common.xml#common/related_info"/> + <p> + <xref href="impala_runtime_filtering.xml"/>, + <xref href="impala_runtime_filter_mode.xml#runtime_filter_mode"/> + <!-- , <xref href="impala_partitioning.xml#dynamic_partition_pruning"/> --> + </p> + + </conbody> +</concept> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3be0f122/docs/topics/impala_disable_streaming_preaggregations.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_disable_streaming_preaggregations.xml b/docs/topics/impala_disable_streaming_preaggregations.xml new file mode 100644 index 0000000..a3919af --- /dev/null +++ b/docs/topics/impala_disable_streaming_preaggregations.xml @@ -0,0 +1,45 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd"> +<concept id="disable_streaming_preaggregations" rev="2.5.0 IMPALA-1305"> + + <title>DISABLE_STREAMING_PREAGGREGATIONS Query Option (<keyword keyref="impala25"/> or higher only)</title> + <titlealts audience="PDF"><navtitle>DISABLE_STREAMING_PREAGGREGATIONS</navtitle></titlealts> + <prolog> + <metadata> + <data name="Category" value="Impala"/> + <data name="Category" value="Impala Query Options"/> + <data name="Category" value="Performance"/> + <data name="Category" value="Aggregate Functions"/> + <data name="Category" value="Troubleshooting"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> + </metadata> + </prolog> + + <conbody> + + <p rev="2.5.0 IMPALA-1305"> + <indexterm audience="Cloudera">DISABLE_STREAMING_PREAGGREGATIONS query option</indexterm> + Turns off the <q>streaming preaggregation</q> optimization that is available in <keyword keyref="impala25_full"/> + and higher. This optimization reduces unnecessary work performed by queries that perform aggregation + operations on columns with few or no duplicate values, for example <codeph>DISTINCT <varname>id_column</varname></codeph> + or <codeph>GROUP BY <varname>unique_column</varname></codeph>. If the optimization causes regressions in + existing queries that use aggregation functions, you can turn it off as needed by setting this query option. + </p> + + <p conref="../shared/impala_common.xml#common/type_boolean"/> + <p conref="../shared/impala_common.xml#common/default_false_0"/> + + <note conref="../shared/impala_common.xml#common/one_but_not_true"/> + + <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/> + <p> + Typically, queries that would require enabling this option involve very large numbers of + aggregated values, such as a billion or more distinct keys being processed on each + worker node. 
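+ </p>
+
+ <p>
+ For example, you might benchmark such an aggregation query with and without the optimization
+ (the table and column names are hypothetical):
+ </p>
+
+<codeblock>set disable_streaming_preaggregations=true;
+select count(distinct session_id) from huge_events;
+set disable_streaming_preaggregations=false;</codeblock>
+
+ <p>
+ If the query is not measurably faster with the option turned on, leave the option at its default value.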
+ </p> + + <p conref="../shared/impala_common.xml#common/added_in_250"/> + + </conbody> +</concept> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3be0f122/docs/topics/impala_disable_unsafe_spills.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_disable_unsafe_spills.xml b/docs/topics/impala_disable_unsafe_spills.xml new file mode 100644 index 0000000..14b687c --- /dev/null +++ b/docs/topics/impala_disable_unsafe_spills.xml @@ -0,0 +1,53 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd"> +<concept rev="2.0.0" id="disable_unsafe_spills"> + + <title>DISABLE_UNSAFE_SPILLS Query Option (<keyword keyref="impala20"/> or higher only)</title> + <titlealts audience="PDF"><navtitle>DISABLE_UNSAFE_SPILLS</navtitle></titlealts> + <prolog> + <metadata> + <data name="Category" value="Impala"/> + <data name="Category" value="Impala Query Options"/> + <data name="Category" value="Performance"/> + <data name="Category" value="Scalability"/> + <data name="Category" value="Memory"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> + </metadata> + </prolog> + + <conbody> + + <p rev="2.0.0"> + <indexterm audience="Cloudera">DISABLE_UNSAFE_SPILLS query option</indexterm> + Enable this option if you prefer to have queries fail when they exceed the Impala memory limit, rather than + write temporary data to disk. + </p> + + <p> + Queries that <q>spill</q> to disk typically complete successfully, when in earlier Impala releases they would have failed. + However, queries with exorbitant memory requirements due to missing statistics or inefficient join clauses could + become so slow as a result that you would rather have them cancelled automatically and reduce the memory + usage through standard Impala tuning techniques. + </p> + + <p> + This option prevents only <q>unsafe</q> spill operations, meaning that one or more tables are missing + statistics or the query does not include a hint to set the most efficient mechanism for a join or + <codeph>INSERT ... SELECT</codeph> into a partitioned table. These are the tables most likely to result in + suboptimal execution plans that could cause unnecessary spilling. Therefore, leaving this option enabled is a + good way to find tables on which to run the <codeph>COMPUTE STATS</codeph> statement. + </p> + + <p> + See <xref href="impala_scalability.xml#spill_to_disk"/> for information about the <q>spill to disk</q> + feature for queries processing large result sets with joins, <codeph>ORDER BY</codeph>, <codeph>GROUP + BY</codeph>, <codeph>DISTINCT</codeph>, aggregation functions, or analytic functions. 
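+ </p>
+
+ <p>
+ The following sketch (with placeholder table names) shows the intended workflow: let an
+ <q>unsafe</q> query fail fast, then compute statistics so that later runs are planned accurately:
+ </p>
+
+<codeblock>set disable_unsafe_spills=true;
+-- If BIG_T or OTHER_T is missing statistics, a memory-intensive join
+-- is cancelled when it exceeds the memory limit, rather than spilling.
+select * from big_t join other_t on big_t.id = other_t.id;
+compute stats big_t;
+compute stats other_t;</codeblock>
+
+ <p>
+ Once statistics are in place, the same query is no longer considered unsafe, so it is allowed to spill to disk if it genuinely needs to.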
+ </p> + + <p conref="../shared/impala_common.xml#common/type_boolean"/> + <p conref="../shared/impala_common.xml#common/default_false_0"/> + + <p conref="../shared/impala_common.xml#common/added_in_20"/> + </conbody> +</concept> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3be0f122/docs/topics/impala_disk_space.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_disk_space.xml b/docs/topics/impala_disk_space.xml new file mode 100644 index 0000000..b6daaeb --- /dev/null +++ b/docs/topics/impala_disk_space.xml @@ -0,0 +1,129 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd"> +<concept id="disk_space"> + + <title>Managing Disk Space for Impala Data</title> + <titlealts audience="PDF"><navtitle>Managing Disk Space</navtitle></titlealts> + <prolog> + <metadata> + <data name="Category" value="Impala"/> + <data name="Category" value="Disk Storage"/> + <data name="Category" value="Administrators"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> + <data name="Category" value="Tables"/> + <data name="Category" value="Compression"/> + </metadata> + </prolog> + + <conbody> + + <p> + Although Impala typically works with many large files in an HDFS storage system with plenty of capacity, + there are times when you might perform some file cleanup to reclaim space, or advise developers on techniques + to minimize space consumption and file duplication. + </p> + + <ul> + <li> + <p> + Use compact binary file formats where practical. Numeric and time-based data in particular can be stored + in more compact form in binary data files. Depending on the file format, various compression and encoding + features can reduce file size even further. You can specify the <codeph>STORED AS</codeph> clause as part + of the <codeph>CREATE TABLE</codeph> statement, or <codeph>ALTER TABLE</codeph> with the <codeph>SET + FILEFORMAT</codeph> clause for an existing table or partition within a partitioned table. See + <xref href="impala_file_formats.xml#file_formats"/> for details about file formats, especially + <xref href="impala_parquet.xml#parquet"/>. See <xref href="impala_create_table.xml#create_table"/> and + <xref href="impala_alter_table.xml#alter_table"/> for syntax details. + </p> + </li> + + <li> + <p> + You manage underlying data files differently depending on whether the corresponding Impala table is + defined as an <xref href="impala_tables.xml#internal_tables">internal</xref> or + <xref href="impala_tables.xml#external_tables">external</xref> table: + </p> + <ul> + <li> + Use the <codeph>DESCRIBE FORMATTED</codeph> statement to check if a particular table is internal + (managed by Impala) or external, and to see the physical location of the data files in HDFS. See + <xref href="impala_describe.xml#describe"/> for details. + </li> + + <li> + For Impala-managed (<q>internal</q>) tables, use <codeph>DROP TABLE</codeph> statements to remove + data files. See <xref href="impala_drop_table.xml#drop_table"/> for details. + </li> + + <li> + For tables not managed by Impala (<q>external</q> tables), use appropriate HDFS-related commands such + as <codeph>hadoop fs</codeph>, <codeph>hdfs dfs</codeph>, or <codeph>distcp</codeph>, to create, move, + copy, or delete files within HDFS directories that are accessible by the <codeph>impala</codeph> user. 
+ Issue a <codeph>REFRESH <varname>table_name</varname></codeph> statement after adding or removing any + files from the data directory of an external table. See <xref href="impala_refresh.xml#refresh"/> for + details. + </li> + + <li> + Use external tables to reference HDFS data files in their original location. With this technique, you + avoid copying the files, and you can map more than one Impala table to the same set of data files. When + you drop the Impala table, the data files are left undisturbed. See + <xref href="impala_tables.xml#external_tables"/> for details. + </li> + + <li> + Use the <codeph>LOAD DATA</codeph> statement to move HDFS files into the data directory for an Impala + table from inside Impala, without the need to specify the HDFS path of the destination directory. This + technique works for both internal and external tables. See + <xref href="impala_load_data.xml#load_data"/> for details. + </li> + </ul> + </li> + + <li> + <p> + Make sure that the HDFS trashcan is configured correctly. When you remove files from HDFS, the space + might not be reclaimed for use by other files until sometime later, when the trashcan is emptied. See + <xref href="impala_drop_table.xml#drop_table"/> and the FAQ entry + <xref href="impala_faq.xml#faq_sql/faq_drop_table_space"/> for details. See + <xref href="impala_prereqs.xml#prereqs_account"/> for permissions needed for the HDFS trashcan to operate + correctly. + </p> + </li> + + <li> + <p> + Drop all tables in a database before dropping the database itself. See + <xref href="impala_drop_database.xml#drop_database"/> for details. + </p> + </li> + + <li> + <p> + Clean up temporary files after failed <codeph>INSERT</codeph> statements. If an <codeph>INSERT</codeph> + statement encounters an error, and you see a directory named <filepath>.impala_insert_staging</filepath> + or <filepath>_impala_insert_staging</filepath> left behind in the data directory for the table, it might + contain temporary data files taking up space in HDFS. You might be able to salvage these data files, for + example if they are complete but could not be moved into place due to a permission error. Or, you might + delete those files through commands such as <codeph>hadoop fs</codeph> or <codeph>hdfs dfs</codeph>, to + reclaim space before re-trying the <codeph>INSERT</codeph>. Issue <codeph>DESCRIBE FORMATTED + <varname>table_name</varname></codeph> to see the HDFS path where you can check for temporary files. + </p> + </li> + + <li rev="1.4.0"> + <p rev="obwl" conref="../shared/impala_common.xml#common/order_by_scratch_dir"/> + </li> + + <li rev="2.2.0"> + <p> + If you use the Amazon Simple Storage Service (S3) as a place to offload + data to reduce the volume of local storage, Impala 2.2.0 and higher + can query the data directly from S3. + See <xref href="impala_s3.xml#s3"/> for details. 
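+ </p>
+
+ <p>
+ For example (the bucket name and path are hypothetical), an external table can point Impala
+ directly at data that was offloaded to S3:
+ </p>
+
+<codeblock>create external table logs_s3 (event_time string, message string)
+ row format delimited fields terminated by ','
+ location 's3a://example-bucket/logs/';</codeblock>
+
+ <p>
+ Because the table is external, dropping it later removes only the metadata and leaves the S3 objects in place.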
+ </p> + </li> + </ul> + </conbody> +</concept> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3be0f122/docs/topics/impala_distinct.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_distinct.xml b/docs/topics/impala_distinct.xml new file mode 100644 index 0000000..8661831 --- /dev/null +++ b/docs/topics/impala_distinct.xml @@ -0,0 +1,61 @@ +<?xml version="1.0" encoding="UTF-8"?> +<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd"> +<concept id="distinct"> + + <title>DISTINCT Operator</title> + <prolog> + <metadata> + <data name="Category" value="Impala"/> + <data name="Category" value="SQL"/> + <data name="Category" value="Querying"/> + <data name="Category" value="Aggregate Functions"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> + </metadata> + </prolog> + + <conbody> + + <p> + <indexterm audience="Cloudera">DISTINCT operator</indexterm> + The <codeph>DISTINCT</codeph> operator in a <codeph>SELECT</codeph> statement filters the result set to + remove duplicates: + </p> + +<codeblock>-- Returns the unique values from one column. +-- NULL is included in the set of values if any rows have a NULL in this column. +select distinct c_birth_country from customer; +-- Returns the unique combinations of values from multiple columns. +select distinct c_salutation, c_last_name from customer;</codeblock> + + <p> + You can use <codeph>DISTINCT</codeph> in combination with an aggregation function, typically + <codeph>COUNT()</codeph>, to find how many different values a column contains: + </p> + +<codeblock>-- Counts the unique values from one column. +-- NULL is not included as a distinct value in the count. +select count(distinct c_birth_country) from customer; +-- Counts the unique combinations of values from multiple columns. +select count(distinct c_salutation, c_last_name) from customer;</codeblock> + + <p> + One construct that Impala SQL does <i>not</i> support is using <codeph>DISTINCT</codeph> in more than one + aggregation function in the same query. For example, you could not have a single query with both + <codeph>COUNT(DISTINCT c_first_name)</codeph> and <codeph>COUNT(DISTINCT c_last_name)</codeph> in the + <codeph>SELECT</codeph> list. + </p> + + <p conref="../shared/impala_common.xml#common/zero_length_strings"/> + + <note conref="../shared/impala_common.xml#common/multiple_count_distinct"/> + + <note> + <p> + In contrast with some database systems that always return <codeph>DISTINCT</codeph> values in sorted order, + Impala does not do any ordering of <codeph>DISTINCT</codeph> values. Always include an <codeph>ORDER + BY</codeph> clause if you need the values in alphabetical or numeric sorted order. 
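+ </p>
+
+ <p>
+ For example, reusing the <codeph>CUSTOMER</codeph> table from the earlier examples, spell out the
+ ordering explicitly when it matters:
+ </p>
+
+<codeblock>select distinct c_birth_country from customer
+ order by c_birth_country;</codeblock>
+
+ <p>
+ Without the <codeph>ORDER BY</codeph> clause, the same query could return the values in a different order on each run.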
+
+ </conbody>
+</concept>

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3be0f122/docs/topics/impala_dml.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_dml.xml b/docs/topics/impala_dml.xml
new file mode 100644
index 0000000..b6c6399
--- /dev/null
+++ b/docs/topics/impala_dml.xml
@@ -0,0 +1,91 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<concept id="dml">
+
+ <title>DML Statements</title>
+ <prolog>
+ <metadata>
+ <data name="Category" value="Impala"/>
+ <data name="Category" value="SQL"/>
+ <data name="Category" value="DML"/>
+ <data name="Category" value="Data Analysts"/>
+ <data name="Category" value="Developers"/>
+ <data name="Category" value="Tables"/>
+ <data name="Category" value="ETL"/>
+ <data name="Category" value="Ingest"/>
+ </metadata>
+ </prolog>
+
+ <conbody>
+
+ <p>
+ DML refers to <q>Data Manipulation Language</q>, the subset of SQL statements that modify the data stored in
+ tables. Because Impala focuses on query performance and leverages the append-only nature of HDFS storage,
+ it currently supports only a small set of DML statements:
+ </p>
+
+ <ul>
+ <li>
+ <xref keyref="delete"/>. Works for Kudu tables only.
+ </li>
+
+ <li>
+ <xref keyref="insert"/>.
+ </li>
+
+ <li>
+ <xref keyref="load_data"/>. Does not apply to HBase or Kudu tables.
+ </li>
+
+ <li>
+ <xref keyref="update"/>. Works for Kudu tables only.
+ </li>
+
+ <li>
+ <xref keyref="upsert"/>. Works for Kudu tables only.
+ </li>
+ </ul>
+
+ <p>
+ <codeph>INSERT</codeph> in Impala is primarily optimized for inserting large volumes of data in a single
+ statement, to make effective use of the multi-megabyte HDFS blocks. This is the way in Impala to create new
+ data files. If you intend to insert one or a few rows at a time, such as with the <codeph>INSERT ...
+ VALUES</codeph> syntax, that technique is much more efficient for Impala tables stored in HBase than for
+ tables backed by HDFS data files. See
+ <xref href="impala_hbase.xml#impala_hbase"/> for details.
+ </p>
+
+ <p>
+ <codeph>LOAD DATA</codeph> moves existing data files into the directory for an Impala table, making them
+ immediately available for Impala queries. This is one way in Impala to work with data files produced by other
+ Hadoop components. (<codeph>CREATE EXTERNAL TABLE</codeph> is the other alternative; with external tables,
+ you can query existing data files while the files remain in their original location.)
+ </p>
+
+ <p>
+ In <keyword keyref="impala28_full"/> and higher, Impala does support the <codeph>UPDATE</codeph>, <codeph>DELETE</codeph>,
+ and <codeph>UPSERT</codeph> statements for Kudu tables.
+ For HDFS or S3 tables, to simulate the effects of an <codeph>UPDATE</codeph> or <codeph>DELETE</codeph> statement
+ in other database systems, typically you use <codeph>INSERT</codeph> or <codeph>CREATE TABLE AS SELECT</codeph> to copy data
+ from one table to another, filtering out or changing the appropriate rows during the copy operation,
+ as in the sketch below.
+ </p>
+
+ <p>
+ You can also achieve a result similar to <codeph>UPDATE</codeph> by using Impala tables stored in HBase.
+ When you insert a row into an HBase table, and the table
+ already contains a row with the same value for the key column, the older row is hidden, effectively the same
+ as a single-row <codeph>UPDATE</codeph>.
+ </p>
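+
+ <p>
+ The following is a minimal sketch of simulating a <codeph>DELETE</codeph> and an <codeph>UPDATE</codeph>
+ on HDFS-backed tables by copying data. All table and column names here are hypothetical, and the
+ destination table of the <codeph>INSERT</codeph> is assumed to already exist:
+ </p>
+
+<codeblock>-- "Delete" the rows for discontinued products by copying every other row.
+create table products_current as
+ select * from products where status != 'DISCONTINUED';
+
+-- "Update" the price of one product during the copy, leaving other rows unchanged.
+insert overwrite table products_repriced
+ select id, name, case when id = 1234 then 9.99 else price end as price
+ from products;
+</codeblock>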
+
+ <p rev="2.6.0">
+ Impala can perform DML operations for tables or partitions stored in the Amazon S3 filesystem
+ with <keyword keyref="impala26_full"/> and higher. See <xref href="impala_s3.xml#s3"/> for details.
+ </p>
+
+ <p conref="../shared/impala_common.xml#common/related_info"/>
+
+ <p>
+ The other major classifications of SQL statements are data definition language (see
+ <xref href="impala_ddl.xml#ddl"/>) and queries (see <xref href="impala_select.xml#select"/>).
+ </p>
+ </conbody>
+</concept>

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3be0f122/docs/topics/impala_double.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_double.xml b/docs/topics/impala_double.xml
new file mode 100644
index 0000000..c69eae2
--- /dev/null
+++ b/docs/topics/impala_double.xml
@@ -0,0 +1,100 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<concept id="double">
+
+ <title>DOUBLE Data Type</title>
+ <titlealts audience="PDF"><navtitle>DOUBLE</navtitle></titlealts>
+ <prolog>
+ <metadata>
+ <data name="Category" value="Impala"/>
+ <data name="Category" value="Impala Data Types"/>
+ <data name="Category" value="SQL"/>
+ <data name="Category" value="Data Analysts"/>
+ <data name="Category" value="Developers"/>
+ <data name="Category" value="Schemas"/>
+ </metadata>
+ </prolog>
+
+ <conbody>
+
+ <p>
+ A double precision floating-point data type, used in <codeph>CREATE TABLE</codeph> and <codeph>ALTER
+ TABLE</codeph> statements.
+ </p>
+
+ <p conref="../shared/impala_common.xml#common/syntax_blurb"/>
+
+ <p>
+ In the column definition of a <codeph>CREATE TABLE</codeph> statement:
+ </p>
+
+<codeblock><varname>column_name</varname> DOUBLE</codeblock>
+
+ <p>
+ <b>Range:</b> 4.94065645841246544e-324d .. 1.79769313486231570e+308, positive or negative
+ </p>
+
+ <p>
+ <b>Precision:</b> 15 to 17 significant digits, depending on usage. The number of significant digits does
+ not depend on the position of the decimal point.
+ </p>
+
+ <p>
+ <b>Representation:</b> The values are stored in 8 bytes, using
+ <xref href="https://en.wikipedia.org/wiki/Double-precision_floating-point_format" scope="external" format="html">IEEE 754 Double Precision Binary Floating Point</xref> format.
+ </p>
+
+ <p>
+ <b>Conversions:</b> Impala does not automatically convert <codeph>DOUBLE</codeph> to any other type. You can
+ use <codeph>CAST()</codeph> to convert <codeph>DOUBLE</codeph> values to <codeph>FLOAT</codeph>,
+ <codeph>TINYINT</codeph>, <codeph>SMALLINT</codeph>, <codeph>INT</codeph>, <codeph>BIGINT</codeph>,
+ <codeph>STRING</codeph>, <codeph>TIMESTAMP</codeph>, or <codeph>BOOLEAN</codeph>. You can use exponential
+ notation in <codeph>DOUBLE</codeph> literals or when casting from <codeph>STRING</codeph>, for example
+ <codeph>1.0e6</codeph> to represent one million.
+ <ph conref="../shared/impala_common.xml#common/cast_int_to_timestamp"/>
+ </p>
+
+ <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/>
+
+ <p>
+ The data type <codeph>REAL</codeph> is an alias for <codeph>DOUBLE</codeph>.
+ </p>
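+
+ <p>
+ For instance, a brief sketch of the conversion and literal rules described above; the table name is
+ hypothetical:
+ </p>
+
+<codeblock>-- REAL is accepted as a column type and is treated as DOUBLE.
+create table measurements (reading real);
+
+-- Exponential notation in a DOUBLE literal, and explicit casts to other types.
+select cast(1.0e6 as bigint);
+select cast('2.5e-3' as double);
+</codeblock>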
+
+ <p conref="../shared/impala_common.xml#common/example_blurb"/>
+
+<codeblock>CREATE TABLE t1 (x DOUBLE);
+SELECT CAST(1000.5 AS DOUBLE);
+</codeblock>
+
+ <p conref="../shared/impala_common.xml#common/partitioning_imprecise"/>
+
+ <p conref="../shared/impala_common.xml#common/hbase_ok"/>
+
+ <p conref="../shared/impala_common.xml#common/parquet_ok"/>
+
+ <p conref="../shared/impala_common.xml#common/text_bulky"/>
+
+<!-- <p conref="../shared/impala_common.xml#common/compatibility_blurb"/> -->
+
+ <p conref="../shared/impala_common.xml#common/internals_8_bytes"/>
+
+<!-- <p conref="../shared/impala_common.xml#common/added_in_20"/> -->
+
+ <p conref="../shared/impala_common.xml#common/column_stats_constant"/>
+
+ <p conref="../shared/impala_common.xml#common/restrictions_blurb"/>
+
+<!-- This conref appears under SUM(), AVG(), FLOAT, and DOUBLE topics. -->
+
+ <p conref="../shared/impala_common.xml#common/sum_double"/>
+
+ <p conref="../shared/impala_common.xml#common/float_double_decimal_caveat"/>
+
+ <p conref="../shared/impala_common.xml#common/related_info"/>
+
+ <p>
+ <xref href="impala_literals.xml#numeric_literals"/>, <xref href="impala_math_functions.xml#math_functions"/>,
+ <xref href="impala_float.xml#float"/>
+ </p>
+ </conbody>
+</concept>

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3be0f122/docs/topics/impala_drop_data_source.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_drop_data_source.xml b/docs/topics/impala_drop_data_source.xml
new file mode 100644
index 0000000..be6167c
--- /dev/null
+++ b/docs/topics/impala_drop_data_source.xml
@@ -0,0 +1,35 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<concept audience="Cloudera" rev="1.4.0" id="drop_data_source">

+ <title>DROP DATA SOURCE Statement</title>
+ <titlealts audience="PDF"><navtitle>DROP DATA SOURCE</navtitle></titlealts>
+ <prolog>
+ <metadata>
+ <data name="Category" value="Impala"/>
+ <data name="Category" value="SQL"/>
+ <data name="Category" value="DDL"/>
+ <data name="Category" value="Developers"/>
+ <data name="Category" value="Data Analysts"/>
+ </metadata>
+ </prolog>
+
+ <conbody>
+
+ <p>
+ <indexterm audience="Cloudera">DROP DATA SOURCE statement</indexterm>
+ </p>
+
+ <p conref="../shared/impala_common.xml#common/syntax_blurb"/>
+
+ <p conref="../shared/impala_common.xml#common/ddl_blurb"/>
+
+ <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/>
+
+ <p conref="../shared/impala_common.xml#common/example_blurb"/>
+
+ <p conref="../shared/impala_common.xml#common/cancel_blurb_no"/>
+
+ <p conref="../shared/impala_common.xml#common/related_info"/>
+ </conbody>
+</concept>

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3be0f122/docs/topics/impala_drop_database.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_drop_database.xml b/docs/topics/impala_drop_database.xml
new file mode 100644
index 0000000..fb36dca
--- /dev/null
+++ b/docs/topics/impala_drop_database.xml
@@ -0,0 +1,130 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<concept id="drop_database">
+
+ <title>DROP DATABASE Statement</title>
+ <titlealts audience="PDF"><navtitle>DROP DATABASE</navtitle></titlealts>
+ <prolog>
+ <metadata>
+ <data name="Category" value="Impala"/>
+ <data name="Category" value="SQL"/>
+ <data name="Category" value="Databases"/>
value="Databases"/> + <data name="Category" value="DDL"/> + <data name="Category" value="Schemas"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> + </metadata> + </prolog> + + <conbody> + + <p> + <indexterm audience="Cloudera">DROP DATABASE statement</indexterm> + Removes a database from the system. The physical operations involve removing the metadata for the database + from the metastore, and deleting the corresponding <codeph>*.db</codeph> directory from HDFS. + </p> + + <p conref="../shared/impala_common.xml#common/syntax_blurb"/> + +<codeblock>DROP (DATABASE|SCHEMA) [IF EXISTS] <varname>database_name</varname> <ph rev="2.3.0">[RESTRICT | CASCADE]</ph>;</codeblock> + + <p conref="../shared/impala_common.xml#common/ddl_blurb"/> + + <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/> + + <p> + By default, the database must be empty before it can be dropped, to avoid losing any data. + </p> + + <p rev="2.3.0"> + In <keyword keyref="impala23_full"/> and higher, you can include the <codeph>CASCADE</codeph> + clause to make Impala drop all tables and other objects in the database before dropping the database itself. + The <codeph>RESTRICT</codeph> clause enforces the original requirement that the database be empty + before being dropped. Because the <codeph>RESTRICT</codeph> behavior is still the default, this + clause is optional. + </p> + + <p rev="2.3.0"> + The automatic dropping resulting from the <codeph>CASCADE</codeph> clause follows the same rules as the + corresponding <codeph>DROP TABLE</codeph>, <codeph>DROP VIEW</codeph>, and <codeph>DROP FUNCTION</codeph> statements. + In particular, the HDFS directories and data files for any external tables are left behind when the + tables are removed. + </p> + + <p> + When you do not use the <codeph>CASCADE</codeph> clause, drop or move all the objects inside the database manually + before dropping the database itself: + </p> + + <ul> + <li> + <p> + Use the <codeph>SHOW TABLES</codeph> statement to locate all tables and views in the database, + and issue <codeph>DROP TABLE</codeph> and <codeph>DROP VIEW</codeph> statements to remove them all. + </p> + </li> + <li> + <p> + Use the <codeph>SHOW FUNCTIONS</codeph> and <codeph>SHOW AGGREGATE FUNCTIONS</codeph> statements + to locate all user-defined functions in the database, and issue <codeph>DROP FUNCTION</codeph> + and <codeph>DROP AGGREGATE FUNCTION</codeph> statements to remove them all. + </p> + </li> + <li> + <p> + To keep tables or views contained by a database while removing the database itself, use + <codeph>ALTER TABLE</codeph> and <codeph>ALTER VIEW</codeph> to move the relevant + objects to a different database before dropping the original database. + </p> + </li> + </ul> + + <p> + You cannot drop the current database, that is, the database your session connected to + either through the <codeph>USE</codeph> statement or the <codeph>-d</codeph> option of <cmdname>impala-shell</cmdname>. + Issue a <codeph>USE</codeph> statement to switch to a different database first. + Because the <codeph>default</codeph> database is always available, issuing + <codeph>USE default</codeph> is a convenient way to leave the current database + before dropping it. + </p> + + <p conref="../shared/impala_common.xml#common/hive_blurb"/> + + <p> + When you drop a database in Impala, the database can no longer be used by Hive. 
+
+ <p conref="../shared/impala_common.xml#common/s3_blurb"/>
+
+ <p conref="../shared/impala_common.xml#common/s3_ddl"/>
+
+ <p conref="../shared/impala_common.xml#common/cancel_blurb_no"/>
+
+ <p conref="../shared/impala_common.xml#common/permissions_blurb"/>
+ <p rev="CDH-19187">
+ The user ID that the <cmdname>impalad</cmdname> daemon runs under,
+ typically the <codeph>impala</codeph> user, must have write
+ permission for the directory associated with the database.
+ </p>
+
+ <p conref="../shared/impala_common.xml#common/example_blurb"/>
+
+<!-- Better to conref the same examples in both places. -->
+
+ <p>
+ See <xref href="impala_create_database.xml#create_database"/> for examples covering <codeph>CREATE
+ DATABASE</codeph>, <codeph>USE</codeph>, and <codeph>DROP DATABASE</codeph>.
+ </p>
+
+ <codeblock conref="../shared/impala_common.xml#common/create_drop_db_example"/>
+
+ <p conref="../shared/impala_common.xml#common/related_info"/>
+
+ <p>
+ <xref href="impala_databases.xml#databases"/>, <xref href="impala_create_database.xml#create_database"/>,
+ <xref href="impala_use.xml#use"/>, <xref href="impala_show.xml#show_databases"/>, <xref href="impala_drop_table.xml#drop_table"/>
+ </p>
+ </conbody>
+</concept>