http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_jdbc.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_jdbc.xml b/docs/topics/impala_jdbc.xml index 8a7a955..ef5e9db 100644 --- a/docs/topics/impala_jdbc.xml +++ b/docs/topics/impala_jdbc.xml @@ -2,7 +2,18 @@ <concept id="impala_jdbc"> <title id="jdbc">Configuring Impala to Work with JDBC</title> - + <prolog> + <metadata> + <data name="Category" value="Impala"/> + <data name="Category" value="JDBC"/> + <data name="Category" value="Java"/> + <data name="Category" value="SQL"/> + <data name="Category" value="Querying"/> + <data name="Category" value="Configuring"/> + <data name="Category" value="Starting and Stopping"/> + <data name="Category" value="Developers"/> + </metadata> + </prolog> <conbody> @@ -14,8 +25,366 @@ with various database products. </p> - + <p> + Setting up a JDBC connection to Impala involves the following steps: + </p> + + <ul> + <li> + Verifying the communication port where the Impala daemons in your cluster are listening for incoming JDBC + requests. + </li> + + <li> + Installing the JDBC driver on every system that runs the JDBC-enabled application. + </li> + + <li> + Specifying a connection string for the JDBC application to access one of the servers running the + <cmdname>impalad</cmdname> daemon, with the appropriate security settings. + </li> + </ul> + + <p outputclass="toc inpage"/> + </conbody> + + <concept id="jdbc_port"> + + <title>Configuring the JDBC Port</title> + + <conbody> + + <p> + The default port used by JDBC 2.0 and later (as well as ODBC 2.x) is 21050. Impala server accepts JDBC + connections through this same port 21050 by default. Make sure this port is available for communication + with other hosts on your network, for example, that it is not blocked by firewall software. If your JDBC + client software connects to a different port, specify that alternative port number with the + <codeph>--hs2_port</codeph> option when starting <codeph>impalad</codeph>. See + <xref href="impala_processes.xml#processes"/> for details about Impala startup options. See + <xref href="impala_ports.xml#ports"/> for information about all ports used for communication between Impala + and clients or between Impala components. + </p> + </conbody> + </concept> + + <concept id="jdbc_driver_choice"> + + <title>Choosing the JDBC Driver</title> + <prolog> + <metadata> + <data name="Category" value="Planning"/> + </metadata> + </prolog> + + <conbody> + + <p> + In Impala 2.0 and later, you have the choice between the Cloudera JDBC Connector and the Hive 0.13 JDBC driver. + Cloudera recommends using the Cloudera JDBC Connector where practical. + </p> + + <p> + If you are already using JDBC applications with an earlier Impala release, you must update your JDBC driver + to one of these choices, because the Hive 0.12 driver that was formerly the only choice is not compatible + with Impala 2.0 and later. + </p> + + <p> + Both the Cloudera JDBC 2.5 Connector and the Hive JDBC driver provide a substantial speed increase for JDBC + applications with Impala 2.0 and higher, for queries that return large result sets. 
+ </p> + + <p conref="../shared/impala_common.xml#common/complex_types_blurb"/> + + <p conref="../shared/impala_common.xml#common/jdbc_odbc_complex_types"/> + <p conref="../shared/impala_common.xml#common/jdbc_odbc_complex_types_views"/> + </conbody> </concept> + <concept id="jdbc_setup"> + + <title>Enabling Impala JDBC Support on Client Systems</title> + <prolog> + <metadata> + <data name="Category" value="Installing"/> + </metadata> + </prolog> + + <conbody> + + <section id="install_jdbc_connector"> + <title>Using the Cloudera JDBC Connector (recommended)</title> + + <p> + You download and install the Cloudera JDBC 2.5 connector on any Linux, Windows, or Mac system where you + intend to run JDBC-enabled applications. From the + <xref href="http://go.cloudera.com/odbc-driver-hive-impala.html" scope="external" format="html">Cloudera + Connectors download page</xref>, you choose the appropriate protocol (JDBC or ODBC) and target product + (Impala or Hive). The ease of downloading and installing on non-CDH systems makes this connector a + convenient choice for organizations with heterogeneous environments. + </p> + + </section> + + <section id="install_hive_driver"> + <title>Using the Hive JDBC Driver</title> + <p> + You install the Hive JDBC driver (<codeph>hive-jdbc</codeph> package) through the Linux package manager, on + hosts within the CDH cluster. The driver consists of several Java JAR files. The same driver can be used by Impala and Hive. + </p> + + <p> + To get the JAR files, install the Hive JDBC driver on each CDH-enabled host in the cluster that will run + JDBC applications. Follow the instructions for + <xref href="http://www.cloudera.com/documentation/enterprise/latest/topics/cdh_ig_hive_jdbc_install.html" scope="external" format="html">CDH + 5</xref> or + <xref href="http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/CDH4-Installation-Guide/cdh4ig_Installing_hive_JDBC.html" scope="external" format="html">CDH + 4</xref>. + </p> + + <note> + The latest JDBC driver, corresponding to Hive 0.13, provides substantial performance improvements for + Impala queries that return large result sets. Impala 2.0 and later are compatible with the Hive 0.13 + driver. If you already have an older JDBC driver installed, and are running Impala 2.0 or higher, consider + upgrading to the latest Hive JDBC driver for best performance with JDBC applications. + </note> + + <p> + If you are using JDBC-enabled applications on hosts outside the CDH cluster, you cannot use the CDH install + procedure on the non-CDH hosts. Install the JDBC driver on at least one CDH host using the preceding + procedure. Then download the JAR files to each client machine that will use JDBC with Impala: + </p> + + <codeblock>commons-logging-X.X.X.jar + hadoop-common.jar + hive-common-X.XX.X-cdhX.X.X.jar + hive-jdbc-X.XX.X-cdhX.X.X.jar + hive-metastore-X.XX.X-cdhX.X.X.jar + hive-service-X.XX.X-cdhX.X.X.jar + httpclient-X.X.X.jar + httpcore-X.X.X.jar + libfb303-X.X.X.jar + libthrift-X.X.X.jar + log4j-X.X.XX.jar + slf4j-api-X.X.X.jar + slf4j-logXjXX-X.X.X.jar + </codeblock> + + <p> + <b>To enable JDBC support for Impala on the system where you run the JDBC application:</b> + </p> + + <ol> + <li> + Download the JAR files listed above to each client machine. + <!-- + Download the + <xref href="https://downloads.cloudera.com/impala-jdbc/impala-jdbc-0.5-2.zip" scope="external" format="zip">Impala + JDBC zip file</xref> to the client machine that you will use to connect to Impala servers. 
+ --> + <note> + For Maven users, see + <xref href="https://github.com/onefoursix/Cloudera-Impala-JDBC-Example" scope="external" format="html">this + sample github page</xref> for an example of the dependencies you could add to a <codeph>pom</codeph> + file instead of downloading the individual JARs. + </note> + </li> + + <li> + Store the JAR files in a location of your choosing, ideally a directory already referenced in your + <codeph>CLASSPATH</codeph> setting. For example: + <ul> + <li> + On Linux, you might use a location such as + <codeph>/</codeph><codeph>opt</codeph><codeph>/jars/</codeph>. + </li> + + <li> + On Windows, you might use a subdirectory underneath <filepath>C:\Program Files</filepath>. + </li> + </ul> + </li> + + <li> + To successfully load the Impala JDBC driver, client programs must be able to locate the associated JAR + files. This often means setting the <codeph>CLASSPATH</codeph> for the client process to include the + JARs. Consult the documentation for your JDBC client for more details on how to install new JDBC drivers, + but some examples of how to set <codeph>CLASSPATH</codeph> variables include: + <ul> + <li> + On Linux, if you extracted the JARs to <codeph>/opt/jars/</codeph>, you might issue the following + command to prepend the JAR files path to an existing classpath: + <codeblock>export CLASSPATH=/opt/jars/*.jar:$CLASSPATH</codeblock> + </li> + + <li> + On Windows, use the <b>System Properties</b> control panel item to modify the <b>Environment + Variables</b> for your system. Modify the environment variables to include the path to which you + extracted the files. + <note> + If the existing <codeph>CLASSPATH</codeph> on your client machine refers to some older version of + the Hive JARs, ensure that the new JARs are the first ones listed. Either put the new JAR files + earlier in the listings, or delete the other references to Hive JAR files. + </note> + </li> + </ul> + </li> + </ol> + </section> + + </conbody> + </concept> + + <concept id="jdbc_connect"> + + <title>Establishing JDBC Connections</title> + + <conbody> + + <p> + The JDBC driver class depends on which driver you select. + </p> + + <note conref="../shared/impala_common.xml#common/proxy_jdbc_caveat"/> + + <section id="class_jdbc_connector"> + + <title>Using the Cloudera JDBC Connector (recommended)</title> + + <p> + Depending on the level of the JDBC API your application is targeting, you can use + the following fully-qualified class names (FQCNs): + </p> + + <ul> + <li><codeph>com.cloudera.impala.jdbc41.Driver</codeph></li> + <li><codeph>com.cloudera.impala.jdbc41.DataSource</codeph></li> + </ul> + + <ul> + <li><codeph>com.cloudera.impala.jdbc4.Driver</codeph></li> + <li><codeph>com.cloudera.impala.jdbc4.DataSource</codeph></li> + </ul> + + <ul> + <li><codeph>com.cloudera.impala.jdbc3.Driver</codeph></li> + <li><codeph>com.cloudera.impala.jdbc3.DataSource</codeph></li> + </ul> + + <p> + The connection string has the following format: + </p> + +<codeblock>jdbc:impala://<varname>Host</varname>:<varname>Port</varname>[/<varname>Schema</varname>];<varname>Property1</varname>=<varname>Value</varname>;<varname>Property2</varname>=<varname>Value</varname>;...</codeblock> + + <p> + The <codeph>port</codeph> value is typically 21050 for Impala. 
+ </p> + + <p> + For full details about the classes and the connection string (especially the property values available + for the connection string), download the appropriate driver documentation for your platform from + <xref href="http://www.cloudera.com/content/cloudera/en/downloads/connectors/impala/jdbc/impala-jdbc-v2-5-5.html" scope="external" format="html">the Impala JDBC Connector download page</xref>. + </p> + + </section> + + <section id="class_hive_driver"> + <title>Using the Hive JDBC Driver</title> + + <p> + For example, with the Hive JDBC driver, the class name is <codeph>org.apache.hive.jdbc.HiveDriver</codeph>. + Once you have configured Impala to work with JDBC, you can establish connections between Impala and your JDBC application. + To do so for a cluster that does not use + Kerberos authentication, use a connection string of the form + <codeph>jdbc:hive2://<varname>host</varname>:<varname>port</varname>/;auth=noSasl</codeph>. +<!-- + Include the <codeph>auth=noSasl</codeph> argument + only when connecting to a non-Kerberos cluster; if Kerberos is enabled, omit the <codeph>auth</codeph> argument. +--> + For example, you might use: + </p> + +<codeblock>jdbc:hive2://myhost.example.com:21050/;auth=noSasl</codeblock> + + <p> + To connect to an instance of Impala that requires Kerberos authentication, use a connection string of the + form + <codeph>jdbc:hive2://<varname>host</varname>:<varname>port</varname>/;principal=<varname>principal_name</varname></codeph>. + The principal must be the same user principal you used when starting Impala. For example, you might use: + </p> + +<codeblock>jdbc:hive2://myhost.example.com:21050/;principal=impala/myhost.example.com@EXAMPLE.COM</codeblock> + + <p> + To connect to an instance of Impala that requires LDAP authentication, use a connection string of the form + <codeph>jdbc:hive2://<varname>host</varname>:<varname>port</varname>/<varname>db_name</varname>;user=<varname>ldap_userid</varname>;password=<varname>ldap_password</varname></codeph>. + For example, you might use: + </p> + +<codeblock>jdbc:hive2://myhost.example.com:21050/test_db;user=fred;password=xyz123</codeblock> + + <note> + <p conref="../shared/impala_common.xml#common/hive_jdbc_ssl_kerberos_caveat"/> + </note> + + </section> + + </conbody> + </concept> + + <concept rev="2.3.0" id="jdbc_odbc_notes"> + <title>Notes about JDBC and ODBC Interaction with Impala SQL Features</title> + <conbody> + <p> + Most Impala SQL features work equivalently through the <cmdname>impala-shell</cmdname> interpreter + or the JDBC or ODBC APIs. The following are some exceptions to keep in mind when switching between + the interactive shell and applications using the APIs: + </p> + <ul> + <li> + <p conref="../shared/impala_common.xml#common/complex_types_blurb"/> + <ul> + <li> + <p> + Queries involving the complex types (<codeph>ARRAY</codeph>, <codeph>STRUCT</codeph>, and <codeph>MAP</codeph>) + require notation that might not be available in all levels of JDBC and ODBC drivers. + If you have trouble querying such a table due to the driver level or + inability to edit the queries used by the application, you can create a view that exposes + a <q>flattened</q> version of the complex columns and point the application at the view. + See <xref href="impala_complex_types.xml#complex_types"/> for details. + </p> + </li> + <li> + <p> + The complex types available in CDH 5.5 / Impala 2.3 and higher are supported by the + JDBC <codeph>getColumns()</codeph> API.
+ Both <codeph>MAP</codeph> and <codeph>ARRAY</codeph> are reported as the JDBC SQL Type <codeph>ARRAY</codeph>, + because this is the closest matching Java SQL type. This behavior is consistent with Hive. + <codeph>STRUCT</codeph> types are reported as the JDBC SQL Type <codeph>STRUCT</codeph>. + </p> + <p> + To be consistent with Hive's behavior, the TYPE_NAME field is populated + with the primitive type name for scalar types, and with the full <codeph>toSql()</codeph> output + for complex types. The resulting type names are somewhat inconsistent, + because nested types are printed differently than top-level types. For example, + the following list shows how the <codeph>toSql()</codeph> output for Impala types is + translated to <codeph>TYPE_NAME</codeph> values: +<codeblock><![CDATA[DECIMAL(10,10) becomes DECIMAL +CHAR(10) becomes CHAR +VARCHAR(10) becomes VARCHAR +ARRAY<DECIMAL(10,10)> becomes ARRAY<DECIMAL(10,10)> +ARRAY<CHAR(10)> becomes ARRAY<CHAR(10)> +ARRAY<VARCHAR(10)> becomes ARRAY<VARCHAR(10)> +]]> +</codeblock> + </p> + </li> + </ul> + </li> + </ul> + </conbody> + </concept> +</concept>
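To make the connection workflow in the impala_jdbc.xml change above concrete, here is a minimal Java sketch using the Hive JDBC driver class and the non-Kerberos connection string documented in that file. It assumes the driver JARs are already on the CLASSPATH; the hostname is the placeholder used throughout the section, and the query is an arbitrary example.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ImpalaJdbcExample {
    public static void main(String[] args) throws Exception {
        // Load the Hive JDBC driver class named in the section above.
        // This fails if the driver JARs are not on the CLASSPATH.
        Class.forName("org.apache.hive.jdbc.HiveDriver");

        // Non-Kerberos connection string format from the section above;
        // for Kerberos clusters, omit auth=noSasl and add principal=... instead.
        String url = "jdbc:hive2://myhost.example.com:21050/;auth=noSasl";

        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT version()")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}

The same pattern works with the Cloudera JDBC Connector by loading com.cloudera.impala.jdbc4.Driver (or the jdbc3/jdbc41 variant) and using a jdbc:impala:// URL instead.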
http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_joins.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_joins.xml b/docs/topics/impala_joins.xml index 011a488..0e807e8 100644 --- a/docs/topics/impala_joins.xml +++ b/docs/topics/impala_joins.xml @@ -3,7 +3,7 @@ <concept id="joins"> <title>Joins in Impala SELECT Statements</title> - <titlealts><navtitle>Joins</navtitle></titlealts> + <titlealts audience="PDF"><navtitle>Joins</navtitle></titlealts> <prolog> <metadata> <data name="Category" value="Impala"/> @@ -473,6 +473,20 @@ Returned 1 row(s) in 1.00s</codeblock> <xref href="impala_hints.xml#hints"/>. </p> + <p rev="2.5.0"> + <b>Handling NULLs in Join Columns:</b> + </p> + + <p rev="2.5.0"> + By default, join key columns do not match if either one contains a <codeph>NULL</codeph> value. + To treat such columns as equal if both contain <codeph>NULL</codeph>, you can use an expression + such as <codeph>A = B OR (A IS NULL AND B IS NULL)</codeph>. + In CDH 5.7 / Impala 2.5 and higher, the <codeph><=></codeph> operator (shorthand for + <codeph>IS NOT DISTINCT FROM</codeph>) performs the same comparison in a concise and efficient form. + The <codeph><=></codeph> operator is more efficient for comparing join keys in a <codeph>NULL</codeph>-safe + manner, because the operator can use a hash join while the <codeph>OR</codeph> expression cannot. + </p> + <p conref="../shared/impala_common.xml#common/example_blurb"/> <p> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_kudu.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_kudu.xml b/docs/topics/impala_kudu.xml index 5b8e87c..c530cc1 100644 --- a/docs/topics/impala_kudu.xml +++ b/docs/topics/impala_kudu.xml @@ -4,7 +4,16 @@ <title>Using Impala to Query Kudu Tables</title> - + <prolog> + <metadata> + <data name="Category" value="Impala"/> + <data name="Category" value="Kudu"/> + <data name="Category" value="Querying"/> + <data name="Category" value="Data Analysts"/> + <data name="Category" value="Developers"/> + </metadata> + </prolog> + <conbody> <p> @@ -16,9 +25,143 @@ workloads (with key-based lookups for single rows or small ranges of values). </p> - + <p> + Certain Impala SQL statements, such as <codeph>UPDATE</codeph> and <codeph>DELETE</codeph>, only work with + Kudu tables. These operations were impractical from a performance perspective to perform at large scale on + HDFS data or on HBase tables. + </p> + + </conbody> + + <concept id="kudu_benefits"> + + <title>Benefits of Using Kudu Tables with Impala</title> + + <conbody> + + <p> + The combination of Kudu and Impala works best for tables where scan performance is important, but data + arrives continuously, in small batches, or needs to be updated without being completely replaced. In these + scenarios (such as for streaming data), it might be impractical to use Parquet tables because Parquet works + best with multi-megabyte data files, requiring substantial overhead to replace or reorganize data files to + accommodate frequent additions or changes to data. Impala can query Kudu tables with scan performance close + to that of Parquet, and Impala can also perform update or delete operations without replacing the entire + table contents. You can also use the Kudu API to do ingestion or transformation operations outside of + Impala, and Impala can query the current data at any time.
+ </p> + </conbody> </concept> + <concept id="kudu_primary_key"> + + <title>Primary Key Columns for Kudu Tables</title> + + <conbody> + + <p> + Kudu tables introduce the notion of primary keys to Impala for the first time. The primary key is made up + of one or more columns, whose values are combined and used as a lookup key during queries. These columns + cannot contain any <codeph>NULL</codeph> values or any duplicate values, and can never be updated. For a + partitioned Kudu table, all the partition key columns must come from the set of primary key columns. + </p> + + <p> + Impala itself still does not have the notion of unique or non-<codeph>NULL</codeph> constraints. These + restrictions on the primary key columns are enforced on the Kudu side. + </p> + + <p> + The primary key columns must be the first ones specified in the <codeph>CREATE TABLE</codeph> statement. + You specify which column or columns make up the primary key in the table properties, rather than through + attributes in the column list. + </p> + + <p> + Kudu can do extra optimizations for queries that refer to the primary key columns in the + <codeph>WHERE</codeph> clause. It is not crucial though to include the primary key columns in the + <codeph>WHERE</codeph> clause of every query. The benefit is mainly for partitioned tables, + which divide the data among various tablet servers based on the distribution of + data values in some or all of the primary key columns. + </p> + + </conbody> + + </concept> + + <concept id="kudu_dml"> + + <title>Impala DML Support for Kudu Tables</title> + + <conbody> + + <p> + Impala supports certain DML statements for Kudu tables only. The <codeph>UPDATE</codeph> and + <codeph>DELETE</codeph> statements let you modify data within Kudu tables without rewriting substantial + amounts of table data. + </p> + + <p> + The <codeph>INSERT</codeph> statement for Kudu tables honors the unique and non-<codeph>NULL</codeph> + requirements for the primary key columns. + </p> + + <p> + Because Impala and Kudu do not support transactions, the effects of any <codeph>INSERT</codeph>, + <codeph>UPDATE</codeph>, or <codeph>DELETE</codeph> statement are immediately visible. For example, you + cannot do a sequence of <codeph>UPDATE</codeph> statements and only make the change visible after all the + statements are finished. Also, if a DML statement fails partway through, any rows that were already + inserted, deleted, or changed remain in the table; there is no rollback mechanism to undo the changes. + </p> + + </conbody> + + </concept> + + <concept id="kudu_partitioning"> + + <title>Partitioning for Kudu Tables</title> + + <conbody> + + <p> + Kudu tables use special mechanisms to evenly distribute data among the underlying tablet servers. Although + we refer to such tables as partitioned tables, they are distinguished from traditional Impala partitioned + tables by use of different clauses on the <codeph>CREATE TABLE</codeph> statement. Partitioned Kudu tables + use <codeph>DISTRIBUTE BY</codeph>, <codeph>HASH</codeph>, <codeph>RANGE</codeph>, and <codeph>SPLIT + ROWS</codeph> clauses rather than the traditional <codeph>PARTITIONED BY</codeph> clause. All of the + columns involved in these clauses must be primary key columns. These clauses let you specify different ways + to divide the data for each column, or even for different value ranges within a column. 
This flexibility + lets you avoid problems with uneven distribution of data, where the partitioning scheme for HDFS tables + might result in some partitions being much larger than others. By setting up an effective partitioning + scheme for a Kudu table, you can ensure that the work for a query can be parallelized evenly across the + hosts in a cluster. + </p> + + </conbody> + + </concept> + + <concept id="kudu_performance"> + + <title>Impala Query Performance for Kudu Tables</title> + + <conbody> + + <p> + For queries involving Kudu tables, Impala can delegate much of the work of filtering the result set to + Kudu, avoiding some of the I/O involved in full table scans of tables containing HDFS data files. This type + of optimization is especially effective for partitioned Kudu tables, where the Impala query + <codeph>WHERE</codeph> clause refers to one or more primary key columns that are also used as partition key + columns. For example, if a partitioned Kudu table uses a <codeph>HASH</codeph> clause for + <codeph>col1</codeph> and a <codeph>RANGE</codeph> clause for <codeph>col2</codeph>, a query using a clause + such as <codeph>WHERE col1 IN (1,2,3) AND col2 > 100</codeph> can determine exactly which tablet servers + contain relevant data, and therefore parallelize the query very efficiently. + </p> + + </conbody> + + </concept> +</concept> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_langref.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_langref.xml b/docs/topics/impala_langref.xml index aaa76aa..f81b76f 100644 --- a/docs/topics/impala_langref.xml +++ b/docs/topics/impala_langref.xml @@ -2,8 +2,8 @@ <!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd"> <concept id="langref"> - <title><ph audience="PDF">Impala SQL Language Reference</ph><ph audience="HTML">Overview of Impala SQL</ph></title> - + <title>Impala SQL Language Reference</title> + <titlealts audience="PDF"><navtitle>SQL Reference</navtitle></titlealts> <prolog> <metadata> <data name="Category" value="Impala"/> @@ -17,163 +17,58 @@ <conbody> <p> - Impala uses SQL as its query language. Impala interprets SQL statements and performs the - full end-to-end processing for each statement. (As opposed to acting as a translation - layer for some other Hadoop subsystem.) + Impala uses SQL as its query language. To protect user investment in skills development and query + design, Impala provides a high degree of compatibility with the Hive Query Language (HiveQL): + </p> + + <ul> + <li> + Because Impala uses the same metadata store as Hive to record information about table structure and + properties, Impala can access tables defined through the native Impala <codeph>CREATE TABLE</codeph> + command, or tables created using the Hive data definition language (DDL). + </li> + + <li> + Impala supports data manipulation (DML) statements similar to the DML component of HiveQL. + </li> + + <li> + Impala provides many <xref href="impala_functions.xml#builtins">built-in functions</xref> with the same + names and parameter types as their HiveQL equivalents. 
+ </li> + </ul> + + <p> + Impala supports most of the same <xref href="impala_langref_sql.xml#langref_sql">statements and + clauses</xref> as HiveQL, including, but not limited to, <codeph>JOIN</codeph>, <codeph>AGGREGATE</codeph>, + <codeph>DISTINCT</codeph>, <codeph>UNION ALL</codeph>, <codeph>ORDER BY</codeph>, <codeph>LIMIT</codeph>, and + (uncorrelated) subquery in the <codeph>FROM</codeph> clause. Impala also supports <codeph>INSERT + INTO</codeph> and <codeph>INSERT OVERWRITE</codeph>. + </p> + + <p> + Impala supports data types with the same names and semantics as the equivalent Hive data types: + <codeph>STRING</codeph>, <codeph>TINYINT</codeph>, <codeph>SMALLINT</codeph>, <codeph>INT</codeph>, + <codeph>BIGINT</codeph>, <codeph>FLOAT</codeph>, <codeph>DOUBLE</codeph>, <codeph>BOOLEAN</codeph>, + <codeph>TIMESTAMP</codeph>. </p> <p> - Impala implements many familiar statements, such as <codeph>CREATE TABLE</codeph>, - <codeph>INSERT</codeph>, and <codeph>SELECT</codeph>. Currently, the DML statements - <codeph>UPDATE</codeph> and <codeph>DELETE</codeph> are not available in the production - level of Impala, because big data analytics with Hadoop and HDFS typically involves - unchanging data. <codeph>UPDATE</codeph> and <codeph>DELETE</codeph> <i>are</i> available - in beta form in the version of Impala used with the Kudu storage layer. For full details - about Impala SQL syntax and semantics, see + For full details about Impala SQL syntax and semantics, see <xref href="impala_langref_sql.xml#langref_sql"/>. </p> <p> - Queries include clauses such as <codeph>WHERE</codeph>, <codeph>GROUP BY</codeph>, - <codeph>ORDER BY</codeph>, and <codeph>JOIN</codeph>. For information about query syntax, - see <xref href="impala_select.xml#select"/>. + Most HiveQL <codeph>SELECT</codeph> and <codeph>INSERT</codeph> statements run unmodified with Impala. For + information about Hive syntax not available in Impala, see + <xref href="impala_langref_unsupported.xml#langref_hiveql_delta"/>. + </p> + <p> - Queries can also include function calls, to scalar functions such as - <codeph>sin()</codeph> and <codeph>substr()</codeph>, aggregate functions such as - <codeph>count()</codeph> and <codeph>avg()</codeph>, and analytic functions such as - <codeph>lag()</codeph> and <codeph>rank()</codeph>. For a list of the built-in functions - available in Impala queries, see <xref href="impala_functions.xml#builtins"/>. + For a list of the built-in functions available in Impala queries, see + <xref href="impala_functions.xml#builtins"/>. </p> <p outputclass="toc"/> - </conbody> - - <concept id="langref_performance"> - - <title>Performance Features</title> - - <conbody> - - <p> - The main performance-related SQL features for Impala are: - </p> - - <ul> - <li> - <p> - The <codeph>COMPUTE STATS</codeph> statement, and the underlying table statistics - and column statistics used in query planning. The statistics are used to estimate - the number of rows and size of the result set for queries, subqueries, and the - different <q>sides</q> of a join query. - </p> - </li> - - <li> - <p> - The output of the <codeph>EXPLAIN</codeph> statement. It outlines the ways in which - the query is parallelized, and how much I/O, memory, and so on the query expects to - use. You can control the level of detail in the output through a query option. - </p> - </li> - - <li> - <p> - Partitioning for tables.
By organizing the data for efficient access along one or - more dimensions, this technique lets queries read only the relevant data. - </p> - </li> - - <li> - <p> - Query hints, especially for join queries. Impala selects from different join - algorithms based on the relative sizes of the result sets for each side of the join. - In cases where you know the most effective technique for a particular query, you can - override the estimates that Impala uses to make that choice, and select the join - technique directly. - </p> - </li> - - <li> - <p> - Query options. These options control settings that can influence the performance of - individual queries when you know the special considerations based on your workload, - hardware configuration, or data distribution. - </p> - </li> - </ul> - - <p> - Because analytic queries against high volumes of data tend to require full scans against - large portions of data from each table, Impala does not include index-related SQL - statements such as <codeph>CREATE INDEX</codeph>. The <codeph>COMPUTE STATS</codeph> - serves the purpose of analyzing the distribution of data within each column and the - overall table. Partitioning optimizes the physical layout of the data for queries that - filter on one or more crucial columns. - </p> - - </conbody> - - </concept> - - <concept id="hive_interoperability"> - - <title>Sharing Tables, Data, and Queries Between Impala and Hive</title> - - <conbody> - - <p> - To protect user investment in skills development and query design, Impala provides a - high degree of compatibility with the Hive Query Language (HiveQL): - </p> - - <ul> - <li> - Because Impala uses the same metadata store as Hive to record information about table - structure and properties, Impala can access tables defined through the native Impala - <codeph>CREATE TABLE</codeph> command, or tables created using the Hive data - definition language (DDL). - </li> - - <li> - Impala supports data manipulation (DML) statements similar to the DML component of - HiveQL. - </li> - - <li> - Impala provides many <xref href="impala_functions.xml#builtins">built-in - functions</xref> with the same names and parameter types as their HiveQL equivalents. - </li> - </ul> - - <p> - Impala supports most of the same - <xref href="impala_langref_sql.xml#langref_sql">statements and clauses</xref> as HiveQL, - including, but not limited to <codeph>JOIN</codeph>, <codeph>AGGREGATE</codeph>, - <codeph>DISTINCT</codeph>, <codeph>UNION ALL</codeph>, <codeph>ORDER BY</codeph>, - <codeph>LIMIT</codeph> and (uncorrelated) subquery in the <codeph>FROM</codeph> clause. - Impala also supports <codeph>INSERT INTO</codeph> and <codeph>INSERT OVERWRITE</codeph>. - </p> - - <p> - Impala supports data types with the same names and semantics as the equivalent Hive data - types: <codeph>STRING</codeph>, <codeph>TINYINT</codeph>, <codeph>SMALLINT</codeph>, - <codeph>INT</codeph>, <codeph>BIGINT</codeph>, <codeph>FLOAT</codeph>, - <codeph>DOUBLE</codeph>, <codeph>BOOLEAN</codeph>, <codeph>STRING</codeph>, - <codeph>TIMESTAMP</codeph>. CDH 5.5 / Impala 2.3 and higher also include the complex - types <codeph>ARRAY</codeph>, <codeph>STRUCT</codeph>, and <codeph>MAP</codeph>. - </p> - - <p> - Most HiveQL <codeph>SELECT</codeph> and <codeph>INSERT</codeph> statements run - unmodified with Impala. For information about Hive syntax not available in Impala, see - <xref href="impala_langref_unsupported.xml#langref_hiveql_delta"/>. 
- </p> - - </conbody> - - </concept> - </concept> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_langref_sql.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_langref_sql.xml b/docs/topics/impala_langref_sql.xml index d759e76..18b6726 100644 --- a/docs/topics/impala_langref_sql.xml +++ b/docs/topics/impala_langref_sql.xml @@ -3,7 +3,7 @@ <concept id="langref_sql"> <title>Impala SQL Statements</title> - <titlealts><navtitle>SQL Statements</navtitle></titlealts> + <titlealts audience="PDF"><navtitle>SQL Statements</navtitle></titlealts> <prolog> <metadata> <data name="Category" value="Impala"/> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_langref_unsupported.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_langref_unsupported.xml b/docs/topics/impala_langref_unsupported.xml index f2b0560..39043f3 100644 --- a/docs/topics/impala_langref_unsupported.xml +++ b/docs/topics/impala_langref_unsupported.xml @@ -43,12 +43,12 @@ from HiveQL: </p> - <draft-comment translate="no"> -Yeesh, too many separate lists of unsupported Hive syntax. -Here, the FAQ, and in some of the intro topics. -Some discussion in IMP-1061 about how best to reorg. -Lots of opportunities for conrefs. -</draft-comment> + <!-- To do: + Yeesh, too many separate lists of unsupported Hive syntax. + Here, the FAQ, and in some of the intro topics. + Some discussion in IMP-1061 about how best to reorg. + Lots of opportunities for conrefs. + --> <ul> <!-- Now supported in CDH 5.5 / Impala 2.3 and higher. Find places on this page (like already done under lateral views) to note the new data type support. @@ -61,6 +61,10 @@ Lots of opportunities for conrefs. Extensibility mechanisms such as <codeph>TRANSFORM</codeph>, custom file formats, or custom SerDes. </li> + <li rev="CDH-41376"> + The <codeph>DATE</codeph> data type. + </li> + <li> XML and JSON functions. </li> @@ -96,16 +100,26 @@ Lots of opportunities for conrefs. for full details on Impala UDFs. <ul> <li> - Impala supports high-performance UDFs written in C++, as well as reusing some Java-based Hive UDFs. + <p> + Impala supports high-performance UDFs written in C++, as well as reusing some Java-based Hive UDFs. + </p> + </li> + + <li> + <p> + Impala supports scalar UDFs and user-defined aggregate functions (UDAFs). Impala does not currently + support user-defined table generating functions (UDTFs). + </p> </li> <li> - Impala supports scalar UDFs and user-defined aggregate functions (UDAFs). Impala does not currently - support user-defined table generating functions (UDTFs). + <p> + Only Impala-supported column types are supported in Java-based UDFs. + </p> </li> <li> - Only Impala-supported column types are supported in Java-based UDFs. + <p conref="../shared/impala_common.xml#common/current_user_caveat"/> </li> </ul> </p> @@ -146,6 +160,12 @@ Lots of opportunities for conrefs. <li> <codeph>SHOW COLUMNS</codeph> </li> + + <li rev="DOCS-656"> + <codeph>INSERT OVERWRITE DIRECTORY</codeph>; use <codeph>INSERT OVERWRITE <varname>table_name</varname></codeph> + or <codeph>CREATE TABLE AS SELECT</codeph> to materialize query results into the HDFS directory associated + with an Impala table. + </li> </ul> </conbody> </concept> @@ -167,7 +187,7 @@ Lots of opportunities for conrefs. 
<p> Impala utilizes the <xref href="http://sentry.incubator.apache.org/" scope="external" format="html">Apache - Sentry (incubating)</xref> authorization framework, which provides fine-grained role-based access control + Sentry </xref> authorization framework, which provides fine-grained role-based access control to protect data against unauthorized access or tampering. </p> @@ -265,13 +285,9 @@ Lots of opportunities for conrefs. </li> <li> - Impala does not return column overflows as <codeph>NULL</codeph>, so that customers can distinguish - between <codeph>NULL</codeph> data and overflow conditions similar to how they do so with traditional - database systems. Impala returns the largest or smallest value in the range for the type. For example, - valid values for a <codeph>tinyint</codeph> range from -128 to 127. In Impala, a <codeph>tinyint</codeph> - with a value of -200 returns -128 rather than <codeph>NULL</codeph>. A <codeph>tinyint</codeph> with a - value of 200 returns 127. + <p conref="../shared/impala_common.xml#common/int_overflow_behavior"/> </li> + </ul> <p> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_limit.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_limit.xml b/docs/topics/impala_limit.xml index c186cd4..ec12271 100644 --- a/docs/topics/impala_limit.xml +++ b/docs/topics/impala_limit.xml @@ -9,6 +9,8 @@ <data name="Category" value="SQL"/> <data name="Category" value="Querying"/> <data name="Category" value="Reports"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> </metadata> </prolog> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_literals.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_literals.xml b/docs/topics/impala_literals.xml index 3c53796..d84d84c 100644 --- a/docs/topics/impala_literals.xml +++ b/docs/topics/impala_literals.xml @@ -357,7 +357,7 @@ insert into t1 partition(x=NULL, y) select c1, c3 from some_other_table;</codeb <li rev="1.2.1"> <p conref="../shared/impala_common.xml#common/null_sorting_change"/> <note> - <draft-comment translate="no"> Probably a bunch of similar view-related restrictions like this that should be collected, reused, or cross-referenced under the Views topic. </draft-comment> + <!-- To do: Probably a bunch of similar view-related restrictions like this that should be collected, reused, or cross-referenced under the Views topic. --> Because the <codeph>NULLS FIRST</codeph> and <codeph>NULLS LAST</codeph> keywords are not currently available in Hive queries, any views you create using those keywords will not be available through Hive. 
http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_live_progress.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_live_progress.xml b/docs/topics/impala_live_progress.xml index f58cdcb..ef8e8c4 100644 --- a/docs/topics/impala_live_progress.xml +++ b/docs/topics/impala_live_progress.xml @@ -2,7 +2,8 @@ <!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd"> <concept rev="2.3.0" id="live_progress"> - <title>LIVE_PROGRESS Query Option</title> + <title>LIVE_PROGRESS Query Option (CDH 5.5 or higher only)</title> + <titlealts audience="PDF"><navtitle>LIVE_PROGRESS</navtitle></titlealts> <prolog> <metadata> <data name="Category" value="Impala"/> @@ -11,12 +12,14 @@ <data name="Category" value="Performance"/> <data name="Category" value="Reports"/> <data name="Category" value="impala-shell"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> </metadata> </prolog> <conbody> - <p> + <p rev="2.3.0"> <indexterm audience="Cloudera">LIVE_PROGRESS query option</indexterm> For queries submitted through the <cmdname>impala-shell</cmdname> command, displays an interactive progress bar showing roughly what percentage of @@ -59,6 +62,8 @@ <p conref="../shared/impala_common.xml#common/impala_shell_progress_reports_compute_stats_caveat"/> <p conref="../shared/impala_common.xml#common/impala_shell_progress_reports_shell_only_caveat"/> + <p conref="../shared/impala_common.xml#common/added_in_230"/> + <p conref="../shared/impala_common.xml#common/example_blurb"/> <codeblock><![CDATA[[localhost:21000] > set live_progress=true; LIVE_PROGRESS set to true @@ -69,8 +74,8 @@ LIVE_PROGRESS set to true | 150000 | +----------+ [localhost:21000] > select count(*) from customer t1 cross join customer t2; -[################################################## ] 50% -[####################################################################################################] 100% +[################################### ] 50% +[######################################################################] 100% ]]> </codeblock> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_live_summary.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_live_summary.xml b/docs/topics/impala_live_summary.xml index bfe71bf..42fe484 100644 --- a/docs/topics/impala_live_summary.xml +++ b/docs/topics/impala_live_summary.xml @@ -2,7 +2,8 @@ <!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd"> <concept rev="2.3.0" id="live_summary"> - <title>LIVE_SUMMARY Query Option</title> + <title>LIVE_SUMMARY Query Option (CDH 5.5 or higher only)</title> + <titlealts audience="PDF"><navtitle>LIVE_SUMMARY</navtitle></titlealts> <prolog> <metadata> <data name="Category" value="Impala"/> @@ -11,12 +12,14 @@ <data name="Category" value="Performance"/> <data name="Category" value="Reports"/> <data name="Category" value="impala-shell"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> </metadata> </prolog> <conbody> - <p> + <p rev="2.3.0"> <indexterm audience="Cloudera">LIVE_SUMMARY query option</indexterm> For queries submitted through the <cmdname>impala-shell</cmdname> command, displays the same output as the <codeph>SUMMARY</codeph> command, @@ -67,6 +70,8 @@ <p conref="../shared/impala_common.xml#common/impala_shell_progress_reports_compute_stats_caveat"/> <p 
conref="../shared/impala_common.xml#common/impala_shell_progress_reports_shell_only_caveat"/> + <p conref="../shared/impala_common.xml#common/added_in_230"/> + <p conref="../shared/impala_common.xml#common/example_blurb"/> <p> @@ -197,7 +202,6 @@ Query: select count(*) from customer t1 cross join customer t2 [####################################################################################################] 100% +---------------------+--------+----------+----------+---------+------------+----------+---------------+-----------------------+ | Operator | #Hosts | Avg Time | Max Time | #Rows | Est. #Rows | Peak Mem | Est. Peak Mem | Detail | -[localhost:21000] > ]]> </codeblock> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_load_data.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_load_data.xml b/docs/topics/impala_load_data.xml index e3517f0..e9d94b5 100644 --- a/docs/topics/impala_load_data.xml +++ b/docs/topics/impala_load_data.xml @@ -3,7 +3,7 @@ <concept rev="1.1" id="load_data"> <title>LOAD DATA Statement</title> - <titlealts><navtitle>LOAD DATA</navtitle></titlealts> + <titlealts audience="PDF"><navtitle>LOAD DATA</navtitle></titlealts> <prolog> <metadata> <data name="Category" value="Impala"/> @@ -15,6 +15,7 @@ <data name="Category" value="Developers"/> <data name="Category" value="HDFS"/> <data name="Category" value="Tables"/> + <data name="Category" value="S3"/> </metadata> </prolog> @@ -74,6 +75,14 @@ directory. </li> + <li rev="2.5.0 IMPALA-2867"> + The operation fails if the source directory contains any non-hidden directories. + Prior to CDH 5.7 / Impala 2.5, if the source directory contained any subdirectory, even a hidden one such as + <filepath>_impala_insert_staging</filepath>, the <codeph>LOAD DATA</codeph> statement would fail. + In CDH 5.7 / Impala 2.5 and higher, <codeph>LOAD DATA</codeph> ignores hidden subdirectories in the + source directory, and only fails if any of the subdirectories are non-hidden. + </li> + <li> The loaded data files retain their original names in the new location, unless a name conflicts with an existing data file, in which case the name of the new file is modified slightly to be unique. (The @@ -209,6 +218,8 @@ Returned 1 row(s) in 0.62s</codeblock> <p conref="../shared/impala_common.xml#common/s3_blurb"/> <p conref="../shared/impala_common.xml#common/s3_dml"/> + <p conref="../shared/impala_common.xml#common/s3_dml_performance"/> + <p>See <xref href="../topics/impala_s3.xml#s3"/> for details about reading and writing S3 data with Impala.</p> <p conref="../shared/impala_common.xml#common/cancel_blurb_no"/> @@ -223,7 +234,8 @@ Returned 1 row(s) in 0.62s</codeblock> <p conref="../shared/impala_common.xml#common/related_info"/> <p> The <codeph>LOAD DATA</codeph> statement is an alternative to the - <codeph>INSERT</codeph> statement. Use <codeph>LOAD DATA</codeph> + <codeph><xref href="impala_insert.xml#insert">INSERT</xref></codeph> statement. + Use <codeph>LOAD DATA</codeph> when you have the data files in HDFS but outside of any Impala table. </p> <p> @@ -231,7 +243,8 @@ Returned 1 row(s) in 0.62s</codeblock> to the <codeph>CREATE EXTERNAL TABLE</codeph> statement. Use <codeph>LOAD DATA</codeph> when it is appropriate to move the data files under Impala control rather than querying them - from their original location. + from their original location. 
See <xref href="impala_tables.xml#external_tables"/> + for information about working with external tables. </p> </conbody> </concept> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_logging.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_logging.xml b/docs/topics/impala_logging.xml index 9430178..0767818 100644 --- a/docs/topics/impala_logging.xml +++ b/docs/topics/impala_logging.xml @@ -4,7 +4,16 @@ <title>Using Impala Logging</title> <titlealts audience="PDF"><navtitle>Logging</navtitle></titlealts> - + <prolog> + <metadata> + <data name="Category" value="Impala"/> + <data name="Category" value="Logs"/> + <data name="Category" value="Troubleshooting"/> + <data name="Category" value="Administrators"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> + </metadata> + </prolog> <conbody> @@ -12,10 +21,457 @@ The Impala logs record information about: </p> - + <ul> + <li> + Any errors Impala encountered. If Impala experienced a serious error during startup, you must diagnose and + troubleshoot that problem before you can do anything further with Impala. + </li> + + <li> + How Impala is configured. + </li> + + <li> + Jobs Impala has completed. + </li> + </ul> + + <note> + <p> + Formerly, the logs contained the query profile for each query, showing low-level details of how the work is + distributed among nodes and how intermediate and final results are transmitted across the network. To save + space, those query profiles are now stored in zlib-compressed files in + <filepath>/var/log/impala/profiles</filepath>. You can access them through the Impala web user interface. + For example, at <codeph>http://<varname>impalad-node-hostname</varname>:25000/queries</codeph>, each query + is followed by a <codeph>Profile</codeph> link leading to a page showing extensive analytical data for the + query execution. + </p> + + <p rev="1.1.1"> + The auditing feature introduced in Impala 1.1.1 produces a separate set of audit log files when + enabled. See <xref href="impala_auditing.xml#auditing"/> for details. + </p> + + <p rev="2.2.0"> + The lineage feature introduced in Impala 2.2.0 produces a separate lineage log file when + enabled. See <xref href="impala_lineage.xml#lineage"/> for details. + </p> + </note> + + <p outputclass="toc inpage"/> + + </conbody> + + <concept id="logs_details"> + + <title>Locations and Names of Impala Log Files</title> + + <conbody> + + <ul> + <li> + By default, the log files are under the directory <filepath>/var/log/impala</filepath>. +<!-- TK: split this task out and state CM and non-CM ways. --> + To change log file locations, modify the defaults file described in + <xref href="impala_processes.xml#processes"/>. + </li> + + <li> + The significant files for the <codeph>impalad</codeph> process are <filepath>impalad.INFO</filepath>, + <filepath>impalad.WARNING</filepath>, and <filepath>impalad.ERROR</filepath>. You might also see a file + <filepath>impalad.FATAL</filepath>, although this is only present in rare conditions. + </li> + + <li> + The significant files for the <codeph>statestored</codeph> process are + <filepath>statestored.INFO</filepath>, <filepath>statestored.WARNING</filepath>, and + <filepath>statestored.ERROR</filepath>. You might also see a file <filepath>statestored.FATAL</filepath>, + although this is only present in rare conditions. 
+ </li> + + <li rev="1.2"> + The significant files for the <codeph>catalogd</codeph> process are <filepath>catalogd.INFO</filepath>, + <filepath>catalogd.WARNING</filepath>, and <filepath>catalogd.ERROR</filepath>. You might also see a file + <filepath>catalogd.FATAL</filepath>, although this is only present in rare conditions. + </li> + + <li> + Examine the <codeph>.INFO</codeph> files to see configuration settings for the processes. + </li> + + <li> + Examine the <codeph>.WARNING</codeph> files to see all kinds of problem information, including such + things as suboptimal settings and also serious runtime errors. + </li> + + <li> + Examine the <codeph>.ERROR</codeph> and/or <codeph>.FATAL</codeph> files to see only the most serious + errors, if the processes crash, or queries fail to complete. These messages are also in the + <codeph>.WARNING</codeph> file. + </li> + + <li> + A new set of log files is produced each time the associated daemon is restarted. These log files have + long names including a timestamp. The <codeph>.INFO</codeph>, <codeph>.WARNING</codeph>, and + <codeph>.ERROR</codeph> files are physically represented as symbolic links to the latest applicable log + files. + </li> + + <li> + The init script for the <codeph>impala-server</codeph> service also produces a consolidated log file + <codeph>/var/log/impala/impala-server.log</codeph>, with all the same information as the + corresponding <codeph>.INFO</codeph>, <codeph>.WARNING</codeph>, and <codeph>.ERROR</codeph> files. + </li> + + <li> + The init script for the <codeph>impala-state-store</codeph> service also produces a consolidated log file + <codeph>/var/log/impala/impala-state-store.log</codeph>, with all the same information as the + corresponding <codeph>.INFO</codeph>, <codeph>.WARNING</codeph>, and <codeph>.ERROR</codeph> files. + </li> + </ul> + + <p> + Impala stores information using the <codeph>glog_v</codeph> logging system. You will see some messages + referring to C++ file names. Logging is affected by: + </p> + + <ul> + <li> + The <codeph>GLOG_v</codeph> environment variable specifies which types of messages are logged. See + <xref href="#log_levels"/> for details. + </li> + + <li> + The <codeph>-logbuflevel</codeph> startup flag for the <cmdname>impalad</cmdname> daemon specifies how + often the log information is written to disk. The default is 0, meaning that the log is immediately + flushed to disk when Impala outputs an important message such as a warning or an error, but less + important messages such as informational ones are buffered in memory rather than being flushed to disk + immediately. + </li> + + <li> + Cloudera Manager has an Impala configuration setting that sets the <codeph>-logbuflevel</codeph> startup + option. + </li> + </ul> </conbody> </concept> + <concept id="logs_cm_noncm"> + + <title>Managing Impala Logs through Cloudera Manager or Manually</title> + <prolog> + <metadata> + <data name="Category" value="Administrators"/> + <data name="Category" value="Cloudera Manager"/> + </metadata> + </prolog> + + <conbody> + + <p> + Cloudera recommends installing Impala through the Cloudera Manager administration interface. To assist with + troubleshooting, Cloudera Manager collects front-end and back-end logs together into a single view, and lets + you do a search across log data for all the managed nodes rather than examining the logs on each node + separately.
If you installed Impala using Cloudera Manager, refer to the topics on Monitoring Services + (<xref href="http://www.cloudera.com/documentation/enterprise/latest/topics/cm_dg_service_monitoring.html" scope="external" format="html">CDH + 5</xref>, + <xref href="http://www.cloudera.com/content/cloudera/en/documentation/cloudera-manager/v4-latest/Cloudera-Manager-Diagnostics-Guide/Cloudera-Manager-Diagnostics-Guide.html" scope="external" format="html">CDH + 4</xref>) or Logs + (<xref href="http://www.cloudera.com/documentation/enterprise/latest/topics/cm_dg_logs.html" scope="external" format="html">CDH + 5</xref>, + <xref href="http://www.cloudera.com/content/cloudera/en/documentation/cloudera-manager/v4-latest/Cloudera-Manager-Diagnostics-Guide/cmdg_logs.html" scope="external" format="html">CDH + 4</xref>). + </p> + + <p> + If you are using Impala in an environment not managed by Cloudera Manager, review Impala log files on each + host when you have traced an issue back to a specific system. + </p> + + </conbody> + + </concept> + + <concept id="logs_rotate"> + + <title>Rotating Impala Logs</title> + <prolog> + <metadata> + <data name="Category" value="Disk Storage"/> + </metadata> + </prolog> + + <conbody> + + <p> + Impala periodically switches the physical files representing the current log files, after which it is safe + to remove the old files if they are no longer needed. + </p> + + <p> + Impala can automatically remove older unneeded log files, a feature known as <term>log rotation</term>. +<!-- Another instance of the text also used in impala_new_features.xml + and impala_fixed_issues.xml. (Just took out the word "new" + and added the reference to the starting release.) + At this point, a conref is definitely in the cards. --> + </p> + + <p> + In Impala 2.2 and higher, the <codeph>-max_log_files</codeph> configuration option specifies how many log + files to keep at each severity level. You can specify an appropriate setting for each Impala-related daemon + (<cmdname>impalad</cmdname>, <cmdname>statestored</cmdname>, and <cmdname>catalogd</cmdname>). The default + value is 10, meaning that Impala preserves the latest 10 log files for each severity level + (<codeph>INFO</codeph>, <codeph>WARNING</codeph>, <codeph>ERROR</codeph>, and <codeph>FATAL</codeph>). + Impala checks to see if any old logs need to be removed based on the interval specified in the + <codeph>logbufsecs</codeph> setting, every 5 seconds by default. + </p> + +<!-- This extra detail only appears here. Consider if it's worth including it + in the conref so people don't need to follow a link just for a couple of + minor factoids. --> + + <p> + A value of 0 preserves all log files, in which case you would set up manual log rotation using your + Linux tool or technique of choice. A value of 1 preserves only the very latest log file. + </p> + + <p> + To set up log rotation on a system managed by Cloudera Manager 5.4.0 and higher, search for the + <codeph>max_log_files</codeph> option name and set the appropriate value for the <userinput>Maximum Log + Files</userinput> field for each Impala configuration category (Impala, Catalog Server, and StateStore). + Then restart the Impala service. In earlier Cloudera Manager releases, specify the + <codeph>-max_log_files=<varname>maximum</varname></codeph> option in the <uicontrol>Command Line Argument + Advanced Configuration Snippet (Safety Valve)</uicontrol> field for each Impala configuration category.
+ </p> + + </conbody> + + </concept> + + <concept id="logs_debug"> + + <title>Reviewing Impala Logs</title> + + <conbody> + + <p> + By default, the Impala log is stored at <codeph>/var/log/impala/</codeph>. The most comprehensive log, + showing informational, warning, and error messages, is in the file named <filepath>impalad.INFO</filepath>. + View log file contents by using the web interface or by examining the contents of the log file. (When you + examine the logs through the file system, you can troubleshoot problems by reading the + <filepath>impalad.WARNING</filepath> and/or <filepath>impalad.ERROR</filepath> files, which contain the + subsets of messages indicating potential problems.) + </p> + + <p> + On a machine named <codeph>impala.example.com</codeph> with default settings, you could view the Impala + logs on that machine by using a browser to access <codeph>http://impala.example.com:25000/logs</codeph>. + </p> + + <note> + <p> + The web interface limits the amount of logging information displayed. To view every log entry, access the + log files directly through the file system. + </p> + </note> + + <p> + You can view the contents of the <codeph>impalad.INFO</codeph> log file in the file system. With the + default configuration settings, the start of the log file appears as follows: + </p> + +<codeblock>[user@example impalad]$ pwd +/var/log/impalad +[user@example impalad]$ more impalad.INFO +Log file created at: 2013/01/07 08:42:12 +Running on machine: impala.example.com +Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg +I0107 08:42:12.292155 14876 daemon.cc:34] impalad version 0.4 RELEASE (build 9d7fadca0461ab40b9e9df8cdb47107ec6b27cff) +Built on Fri, 21 Dec 2012 12:55:19 PST +I0107 08:42:12.292484 14876 daemon.cc:35] Using hostname: impala.example.com +I0107 08:42:12.292706 14876 logging.cc:76] Flags (see also /varz are on debug webserver): +--dump_ir=false +--module_output= +--be_port=22000 +--classpath= +--hostname=impala.example.com</codeblock> + + <note> + The preceding example shows only a small part of the log file. Impala log files are often several megabytes + in size. + </note> + + </conbody> + + </concept> + + <concept id="log_format"> + + <title>Understanding Impala Log Contents</title> + + <conbody> + + <p> + The logs store information about Impala startup options. This information appears once for each time Impala + is started and may include: + </p> + + <ul> + <li> + Machine name. + </li> + + <li> + Impala version number. + </li> + + <li> + Flags used to start Impala. + </li> + + <li> + CPU information. + </li> + + <li> + The number of available disks. + </li> + </ul> + + <p> + There is information about each job Impala has run. Because each Impala job creates an additional set of + data about queries, the amount of job-specific data may be very large. Logs may contain detailed + information on jobs. These detailed log entries may include: + </p> + + <ul> + <li> + The composition of the query. + </li> + + <li> + The degree of data locality. + </li> + + <li> + Statistics on data throughput and response times. + </li> + </ul> + + </conbody> + + </concept> + + <concept id="log_levels"> + + <title>Setting Logging Levels</title> + + <conbody> + + <p> + Impala uses the GLOG system, which supports three logging levels. You can adjust the logging levels using + the Cloudera Manager Admin Console, or without Cloudera Manager by exporting variable settings.
To change logging settings manually, use a command + similar to the following on each node before starting <codeph>impalad</codeph>: + </p> + +<codeblock>export GLOG_v=1</codeblock> + + <note> + For performance reasons, Cloudera strongly recommends against enabling level 3, the most verbose logging level. + </note> + + <p> + For more information on how to configure GLOG, including how to set variable logging levels for different + system components, see + <xref href="http://google-glog.googlecode.com/svn/trunk/doc/glog.html" scope="external" format="html">How + To Use Google Logging Library (glog)</xref>. + </p> + + <section id="loglevels_details"> + + <title>Understanding What Is Logged at Different Logging Levels</title> + + <p> + As logging levels increase, the categories of information logged are cumulative. For example, GLOG_v=2 + records everything GLOG_v=1 records, as well as additional information. + </p> + + <p> + Increasing logging levels imposes performance overhead and increases log size. Cloudera recommends using + GLOG_v=1 for most cases: this level has minimal performance impact but still captures useful + troubleshooting information. + </p> + + <p> + Additional information logged at each level is as follows: + </p> + + <ul> + <li> + GLOG_v=1 - The default level. Logs information about each connection and query that is initiated against an + <codeph>impalad</codeph> instance, including runtime profiles. + </li> + + <li> + GLOG_v=2 - Everything from the previous level plus information for each RPC initiated. This level also + records query execution progress information, including details on each file that is read. + </li> + + <li> + GLOG_v=3 - Everything from the previous level plus logging of every row that is read. This level is + applicable only to the most serious troubleshooting and tuning scenarios, because it can produce + exceptionally large and detailed log files, potentially leading to its own set of performance and + capacity problems. + </li> + </ul> + + </section> + + </conbody> + + </concept> + + <concept id="redaction" rev="2.2.0"> + + <title>Redacting Sensitive Information from Impala Log Files</title> + <prolog> + <metadata> + <data name="Category" value="Redaction"/> + </metadata> + </prolog> + + <conbody> + + <p> + <indexterm audience="Cloudera">redaction</indexterm> + <term>Log redaction</term> is a security feature that prevents sensitive information from being displayed in + locations used by administrators for monitoring and troubleshooting, such as log files, the Cloudera Manager + user interface, and the Impala debug web user interface. You configure regular expressions that match + sensitive types of information processed by your system, such as credit card numbers or tax IDs, and literals + matching these patterns are obfuscated wherever they would normally be recorded in log files or displayed in + administration or debugging user interfaces. + </p> + + <p> + In a security context, the log redaction feature is complementary to the Sentry authorization framework. + Sentry prevents unauthorized users from being able to directly access table data. Redaction prevents + administrators or support personnel from seeing the smaller amounts of sensitive or personally identifying + information (PII) that might appear in queries issued by those authorized users.
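+ </p> + + <p> + For example, a rule might use a regular expression such as the following to match 16-digit credit card + numbers (a hypothetical pattern shown only for illustration; the exact rule syntax depends on how redaction + is configured in your deployment): + </p> + +<codeblock>\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}</codeblock> + + <p> + Matched literals are replaced with a placeholder before the statement text is written to logs or displayed + in monitoring interfaces.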
+ </p> + + <p> + See + <xref audience="integrated" href="sg_redaction.xml#log_redact"/><xref audience="standalone" href="http://www.cloudera.com/documentation/enterprise/latest/topics/sg_redaction.html" scope="external" format="html"/> + for details about how to enable this feature and set + up the regular expressions to detect and redact sensitive information within SQL statement text. + </p> + + </conbody> + + </concept> +</concept> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_map.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_map.xml b/docs/topics/impala_map.xml index 41e4754..64851e9 100644 --- a/docs/topics/impala_map.xml +++ b/docs/topics/impala_map.xml @@ -7,6 +7,9 @@ <metadata> <data name="Category" value="Impala"/> <data name="Category" value="Impala Data Types"/> + <data name="Category" value="SQL"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> </metadata> </prolog> http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_math_functions.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_math_functions.xml b/docs/topics/impala_math_functions.xml index fd16b37..c82a29b 100644 --- a/docs/topics/impala_math_functions.xml +++ b/docs/topics/impala_math_functions.xml @@ -2,7 +2,7 @@ <concept id="math_functions"> <title>Impala Mathematical Functions</title> - <titlealts><navtitle>Mathematical Functions</navtitle></titlealts> + <titlealts audience="PDF"><navtitle>Mathematical Functions</navtitle></titlealts> <prolog> <metadata> <data name="Category" value="Impala"/> @@ -53,12 +53,12 @@ <dl> <dlentry rev="1.4.0" id="abs"> - <dt rev="2.0.1"> + <dt rev="1.4.0 2.0.1"> <codeph>abs(numeric_type a)</codeph> <!-- <codeph>abs(double a), abs(decimal(p,s) a)</codeph> --> </dt> - <dd> + <dd rev="1.4.0"> <indexterm audience="Cloudera">abs() function</indexterm> <b>Purpose:</b> Returns the absolute value of the argument. <p rev="2.0.1" conref="../shared/impala_common.xml#common/return_type_same"/> @@ -119,6 +119,23 @@ </dlentry> + <dlentry id="atan2" rev="2.3.0 IMPALA-1771"> + + <dt rev="2.3.0 IMPALA-1771"> + <codeph>atan2(double a, double b)</codeph> + </dt> + + <dd rev="2.3.0 IMPALA-1771"> + <indexterm audience="Cloudera">atan2() function</indexterm> + <b>Purpose:</b> Returns the arctangent of the two arguments, with the signs of the arguments used to determine the + quadrant of the result. + <p> + <b>Return type:</b> <codeph>double</codeph> + </p> + </dd> + + </dlentry> + <dlentry id="bin"> <dt> @@ -138,7 +155,7 @@ <dlentry rev="1.4.0" id="ceil"> - <dt> + <dt rev="1.4.0"> <codeph>ceil(double a)</codeph>, <codeph>ceil(decimal(p,s) a)</codeph>, <codeph id="ceiling">ceiling(double a)</codeph>, @@ -147,7 +164,7 @@ <codeph rev="2.3.0">dceil(decimal(p,s) a)</codeph> </dt> - <dd> + <dd rev="1.4.0"> <indexterm audience="Cloudera">ceil() function</indexterm> <b>Purpose:</b> Returns the smallest integer that is greater than or equal to the argument. <p> @@ -194,13 +211,29 @@ </dlentry> - <dlentry id="cot" rev="2.3.0"> + <dlentry id="cosh" rev="2.3.0 IMPALA-1771"> - <dt> + <dt rev="2.3.0 IMPALA-1771"> + <codeph>cosh(double a)</codeph> + </dt> + + <dd rev="2.3.0 IMPALA-1771"> + <indexterm audience="Cloudera">cosh() function</indexterm> + <b>Purpose:</b> Returns the hyperbolic cosine of the argument.
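+ For example, <codeph>cosh(0)</codeph> returns 1.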
+ <p> + <b>Return type:</b> <codeph>double</codeph> + </p> + </dd> + + </dlentry> + + <dlentry id="cot" rev="2.3.0 IMPALA-1771"> + + <dt rev="2.3.0 IMPALA-1771"> <codeph>cot(double a)</codeph> </dt> - <dd> + <dd rev="2.3.0 IMPALA-1771"> <indexterm audience="Cloudera">cot() function</indexterm> <b>Purpose:</b> Returns the cotangent of the argument. <p> @@ -236,7 +269,7 @@ <dd> <indexterm audience="Cloudera">e() function</indexterm> <b>Purpose:</b> Returns the - <xref href="http://en.wikipedia.org/wiki/E_(mathematical_constant)" scope="external" format="html">mathematical + <xref href="https://en.wikipedia.org/wiki/E_(mathematical_constant)" scope="external" format="html">mathematical constant e</xref>. <p> <b>Return type:</b> <codeph>double</codeph> @@ -255,7 +288,7 @@ <dd> <indexterm audience="Cloudera">exp() function</indexterm> <b>Purpose:</b> Returns the - <xref href="http://en.wikipedia.org/wiki/E_(mathematical_constant)" scope="external" format="html">mathematical + <xref href="https://en.wikipedia.org/wiki/E_(mathematical_constant)" scope="external" format="html">mathematical constant e</xref> raised to the power of the argument. <p> <b>Return type:</b> <codeph>double</codeph> @@ -266,10 +299,10 @@ <dlentry rev="2.3.0" id="factorial"> - <dt> + <dt rev="2.3.0"> <codeph>factorial(integer_type a)</codeph> </dt> - <dd> + <dd rev="2.3.0"> <indexterm audience="Cloudera">factorial() function</indexterm> <b>Purpose:</b> Computes the <xref href="https://en.wikipedia.org/wiki/Factorial" scope="external" format="html">factorial</xref> of an integer value. It works with any integer type. @@ -421,11 +454,11 @@ select fmod(9.9,3.3); <dlentry rev="1.2.2" id="fnv_hash"> - <dt> + <dt rev="1.2.2"> <codeph>fnv_hash(type v)</codeph>, </dt> - <dd> + <dd rev="1.2.2"> <indexterm audience="Cloudera">fnv_hash() function</indexterm> <b>Purpose:</b> Returns a consistent 64-bit value derived from the input argument, for convenience of implementing hashing logic in an application. @@ -509,13 +542,13 @@ select fmod(9.9,3.3); <dlentry rev="1.4.0" id="greatest"> - <dt> + <dt rev="1.4.0"> <codeph>greatest(bigint a[, bigint b ...])</codeph>, <codeph>greatest(double a[, double b ...])</codeph>, <codeph>greatest(decimal(p,s) a[, decimal(p,s) b ...])</codeph>, <codeph>greatest(string a[, string b ...])</codeph>, <codeph>greatest(timestamp a[, timestamp b ...])</codeph> </dt> - <dd> + <dd rev="1.4.0"> <indexterm audience="Cloudera">greatest() function</indexterm> <b>Purpose:</b> Returns the largest value from a list of expressions. <p conref="../shared/impala_common.xml#common/return_same_type"/> @@ -542,35 +575,29 @@ select fmod(9.9,3.3); <dlentry rev="1.4.0" id="is_inf"> - <dt> + <dt rev="1.4.0"> <codeph>is_inf(double a)</codeph>, </dt> - <dd> + <dd rev="1.4.0"> <indexterm audience="Cloudera">is_inf() function</indexterm> <b>Purpose:</b> Tests whether a value is equal to the special value <q>inf</q>, signifying infinity. <p> <b>Return type:</b> <codeph>boolean</codeph> </p> <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/> - <p> - Infinity and NaN can be specified in text data files as <codeph>inf</codeph> and <codeph>nan</codeph> - respectively, and Impala interprets them as these special values. They can also be produced by certain - arithmetic expressions; for example, <codeph>pow(-1, 0.5)</codeph> returns infinity and - <codeph>1/0</codeph> returns NaN. Or you can cast the literal values, such as <codeph>CAST('nan' AS - DOUBLE)</codeph> or <codeph>CAST('inf' AS DOUBLE)</codeph>.
- </p> + <p conref="../shared/impala_common.xml#common/infinity_and_nan"/> </dd> </dlentry> <dlentry rev="1.4.0" id="is_nan"> - <dt> + <dt rev="1.4.0"> <codeph>is_nan(double a)</codeph>, </dt> - <dd> + <dd rev="1.4.0"> <indexterm audience="Cloudera">is_nan() function</indexterm> <b>Purpose:</b> Tests whether a value is equal to the special value <q>NaN</q>, signifying <q>not a number</q>. @@ -578,26 +605,20 @@ select fmod(9.9,3.3); <b>Return type:</b> <codeph>boolean</codeph> </p> <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/> - <p> - Infinity and NaN can be specified in text data files as <codeph>inf</codeph> and <codeph>nan</codeph> - respectively, and Impala interprets them as these special values. They can also be produced by certain - arithmetic expressions; for example, <codeph>pow(-1, 0.5)</codeph> returns infinity and - <codeph>1/0</codeph> returns NaN. Or you can cast the literal values, such as <codeph>CAST('nan' AS - DOUBLE)</codeph> or <codeph>CAST('inf' AS DOUBLE)</codeph>. - </p> + <p conref="../shared/impala_common.xml#common/infinity_and_nan"/> </dd> </dlentry> <dlentry rev="1.4.0" id="least"> - <dt> + <dt rev="1.4.0"> <codeph>least(bigint a[, bigint b ...])</codeph>, <codeph>least(double a[, double b ...])</codeph>, <codeph>least(decimal(p,s) a[, decimal(p,s) b ...])</codeph>, <codeph>least(string a[, string b ...])</codeph>, <codeph>least(timestamp a[, timestamp b ...])</codeph> </dt> - <dd> + <dd rev="1.4.0"> <indexterm audience="Cloudera">least() function</indexterm> <b>Purpose:</b> Returns the smallest value from a list of expressions. <p conref="../shared/impala_common.xml#common/return_same_type"/> @@ -677,12 +698,12 @@ select fmod(9.9,3.3); <dlentry rev="1.4.0" id="max_int"> - <dt> + <dt rev="1.4.0"> <codeph>max_int(), <ph id="max_tinyint">max_tinyint()</ph>, <ph id="max_smallint">max_smallint()</ph>, <ph id="max_bigint">max_bigint()</ph></codeph> </dt> - <dd> + <dd rev="1.4.0"> <indexterm audience="Cloudera">max_int() function</indexterm> <indexterm audience="Cloudera">max_tinyint() function</indexterm> <indexterm audience="Cloudera">max_smallint() function</indexterm> @@ -704,12 +725,12 @@ select fmod(9.9,3.3); <dlentry rev="1.4.0" id="min_int"> - <dt> + <dt rev="1.4.0"> <codeph>min_int(), <ph id="min_tinyint">min_tinyint()</ph>, <ph id="min_smallint">min_smallint()</ph>, <ph id="min_bigint">min_bigint()</ph></codeph> </dt> - <dd> + <dd rev="1.4.0"> <indexterm audience="Cloudera">min_int() function</indexterm> <indexterm audience="Cloudera">min_tinyint() function</indexterm> <indexterm audience="Cloudera">min_smallint() function</indexterm> @@ -730,11 +751,11 @@ select fmod(9.9,3.3); <dlentry id="mod" rev="2.2.0"> - <dt> + <dt rev="2.2.0"> <codeph>mod(<varname>numeric_type</varname> a, <varname>same_type</varname> b)</codeph> </dt> - <dd> + <dd rev="2.2.0"> <indexterm audience="Cloudera">mod() function</indexterm> <b>Purpose:</b> Returns the modulus of a number. Equivalent to the <codeph>%</codeph> arithmetic operator. Works with any size integer type, any size floating-point type, and <codeph>DECIMAL</codeph> @@ -848,11 +869,11 @@ select mod(9.9,3.0); <dlentry id="pi"> - <dt> + <dt rev="1.4.0"> <codeph>pi()</codeph> </dt> - <dd> + <dd rev="1.4.0"> <indexterm audience="Cloudera">pi() function</indexterm> <b>Purpose:</b> Returns the constant pi.
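For example, <codeph>pi()</codeph> returns 3.141592653589793, the closest <codeph>double</codeph> value to pi.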
<p> @@ -954,14 +975,14 @@ select pmod(5,-2); <dlentry id="pow"> - <dt> + <dt rev="1.4.0"> <codeph>pow(double a, double p)</codeph>, <codeph id="power">power(double a, double p)</codeph>, <codeph rev="2.3.0" id="dpow">dpow(double a, double p)</codeph>, <codeph rev="2.3.0" id="fpow">fpow(double a, double p)</codeph> </dt> - <dd> + <dd rev="1.4.0"> <indexterm audience="Cloudera">pow() function</indexterm> <indexterm audience="Cloudera">power() function</indexterm> <indexterm audience="Cloudera">dpow() function</indexterm> @@ -976,11 +997,11 @@ select pmod(5,-2); <dlentry rev="1.4.0" id="precision"> - <dt> + <dt rev="1.4.0"> <codeph>precision(<varname>numeric_expression</varname>)</codeph> </dt> - <dd> + <dd rev="1.4.0"> <indexterm audience="Cloudera">precision() function</indexterm> <b>Purpose:</b> Computes the precision (number of decimal digits) needed to represent the type of the argument expression as a <codeph>DECIMAL</codeph> value. @@ -1160,11 +1181,11 @@ select x, unix_timestamp(now()), rand(unix_timestamp(now())) <dlentry rev="1.4.0" id="scale"> - <dt> + <dt rev="1.4.0"> <codeph>scale(<varname>numeric_expression</varname>)</codeph> </dt> - <dd> + <dd rev="1.4.0"> <indexterm audience="Cloudera">scale() function</indexterm> <b>Purpose:</b> Computes the scale (number of decimal digits to the right of the decimal point) needed to represent the type of the argument expression as a <codeph>DECIMAL</codeph> value. @@ -1215,6 +1236,22 @@ select x, unix_timestamp(now()), rand(unix_timestamp(now())) </dlentry> + <dlentry id="sinh" rev="2.3.0 IMPALA-1771"> + + <dt rev="2.3.0 IMPALA-1771"> + <codeph>sinh(double a)</codeph> + </dt> + + <dd rev="2.3.0 IMPALA-1771"> + <indexterm audience="Cloudera">sinh() function</indexterm> + <b>Purpose:</b> Returns the hyperbolic sine of the argument. + <p> + <b>Return type:</b> <codeph>double</codeph> + </p> + </dd> + + </dlentry> + <dlentry id="sqrt"> <dt> @@ -1249,14 +1286,30 @@ select x, unix_timestamp(now()), rand(unix_timestamp(now())) </dlentry> + <dlentry id="tanh" rev="2.3.0 IMPALA-1771"> + + <dt rev="2.3.0 IMPALA-1771"> + <codeph>tanh(double a)</codeph> + </dt> + + <dd rev="2.3.0 IMPALA-1771"> + <indexterm audience="Cloudera">tanh() function</indexterm> + <b>Purpose:</b> Returns the hyperbolic tangent of the argument. + <p> + <b>Return type:</b> <codeph>double</codeph> + </p> + </dd> + + </dlentry> + <dlentry rev="2.3.0" id="truncate"> - <dt> + <dt rev="2.3.0"> <codeph>truncate(double_or_decimal a[, digits_to_leave])</codeph>, <ph id="dtrunc"><codeph>dtrunc(double_or_decimal a[, digits_to_leave])</codeph></ph> </dt> - <dd> + <dd rev="2.3.0"> <indexterm audience="Cloudera">truncate() function</indexterm> <indexterm audience="Cloudera">dtrunc() function</indexterm> <b>Purpose:</b> Removes some or all fractional digits from a numeric value.
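For example, based on the behavior described here, <codeph>truncate(3.456, 1)</codeph> would return 3.4, removing the remaining fractional digits without rounding.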
http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_max.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_max.xml b/docs/topics/impala_max.xml index b989785..3f7b827 100644 --- a/docs/topics/impala_max.xml +++ b/docs/topics/impala_max.xml @@ -2,7 +2,7 @@ <concept id="max"> <title>MAX Function</title> - <titlealts><navtitle>MAX</navtitle></titlealts> + <titlealts audience="PDF"><navtitle>MAX</navtitle></titlealts> <prolog> <metadata> <data name="Category" value="Impala"/> @@ -11,6 +11,8 @@ <data name="Category" value="Analytic Functions"/> <data name="Category" value="Aggregate Functions"/> <data name="Category" value="Querying"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> </metadata> </prolog> @@ -38,10 +40,14 @@ <p conref="../shared/impala_common.xml#common/return_type_same_except_string"/> + <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/> + + <p conref="../shared/impala_common.xml#common/partition_key_optimization"/> + <p conref="../shared/impala_common.xml#common/complex_types_blurb"/> <p conref="../shared/impala_common.xml#common/complex_types_aggregation_explanation"/> - + <p conref="../shared/impala_common.xml#common/complex_types_aggregation_example"/> <p conref="../shared/impala_common.xml#common/example_blurb"/> @@ -111,7 +117,7 @@ select x, property, ( <b>order by property, x desc</b> <b>rows between unbounded preceding and current row</b> - ) as 'maximum to this point' + ) as 'maximum to this point' from int_t where property in ('prime','square'); +---+----------+-----------------------+ | x | property | maximum to this point | @@ -130,7 +136,7 @@ select x, property, ( <b>order by property, x desc</b> <b>range between unbounded preceding and current row</b> - ) as 'maximum to this point' + ) as 'maximum to this point' from int_t where property in ('prime','square'); +---+----------+-----------------------+ | x | property | maximum to this point | @@ -156,7 +162,7 @@ analytic context, the lower bound must be <codeph>UNBOUNDED PRECEDING</codeph>. ( <b>order by property, x</b> <b>rows between unbounded preceding and 1 following</b> - ) as 'local maximum' + ) as 'local maximum' from int_t where property in ('prime','square'); +---+----------+---------------+ | x | property | local maximum | http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3c2c8f12/docs/topics/impala_max_errors.xml ---------------------------------------------------------------------- diff --git a/docs/topics/impala_max_errors.xml b/docs/topics/impala_max_errors.xml index 86f3618..c6eb971 100644 --- a/docs/topics/impala_max_errors.xml +++ b/docs/topics/impala_max_errors.xml @@ -3,12 +3,15 @@ <concept id="max_errors"> <title>MAX_ERRORS Query Option</title> + <titlealts audience="PDF"><navtitle>MAX_ERRORS</navtitle></titlealts> <prolog> <metadata> <data name="Category" value="Impala"/> <data name="Category" value="Impala Query Options"/> <data name="Category" value="Troubleshooting"/> <data name="Category" value="Logs"/> + <data name="Category" value="Developers"/> + <data name="Category" value="Data Analysts"/> </metadata> </prolog>