http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/75c46918/docs/build/html/topics/impala_tables.html ---------------------------------------------------------------------- diff --git a/docs/build/html/topics/impala_tables.html b/docs/build/html/topics/impala_tables.html new file mode 100644 index 0000000..f636f20 --- /dev/null +++ b/docs/build/html/topics/impala_tables.html @@ -0,0 +1,446 @@ +<!DOCTYPE html + SYSTEM "about:legacy-compat"> +<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) Copyright 2017"><meta name="DC.rights.owner" content="(C) Copyright 2017"><meta name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" content="../topics/impala_schema_objects.html"><meta name="prodname" content="Impala"><meta name="prodname" content="Impala"><meta name="prodname" content="Impala"><meta name="version" content="Impala 2.8.x"><meta name="version" content="Impala 2.8.x"><meta name="version" content="Impala 2.8.x"><meta name="DC.Format" content="XHTML"><meta name="DC.Identifier" content="tables"><link rel="stylesheet" type="text/css" href="../commonltr.css"><title>Overview of Impala Tables</title></head><body id="tables"><main role="main"><article role="article" aria-labelledby="ariaid-title1"> + + <h1 class="title topictitle1" id="ariaid-title1">Overview of Impala Tables</h1> + + + + <div class="body conbody"> + + <p class="p"></p> + + <p class="p"> + Tables are the primary containers for data in Impala. They have the familiar row and column layout similar to + other database systems, plus some features such as partitioning often associated with higher-end data + warehouse systems. + </p> + + <p class="p"> + Logically, each table has a structure based on the definition of its columns, partitions, and other + properties. + </p> + + <p class="p"> + Physically, each table that uses HDFS storage is associated with a directory in HDFS. 
The table data consists of all the data files + underneath that directory: + </p> + + <ul class="ul"> + <li class="li"> + <a class="xref" href="impala_tables.html#internal_tables">Internal tables</a> are managed by Impala, and use directories + inside the designated Impala work area. + </li> + + <li class="li"> + <a class="xref" href="impala_tables.html#external_tables">External tables</a> use arbitrary HDFS directories, where + the data files are typically shared between different Hadoop components. + </li> + + <li class="li"> + Large-scale data is usually handled by partitioned tables, where the data files are divided among different + HDFS subdirectories. + </li> + </ul> + + <p class="p"> + Impala tables can also represent data that is stored in HBase, or in the Amazon S3 filesystem (<span class="keyword">Impala 2.2</span> or higher), + or on Isilon storage devices (<span class="keyword">Impala 2.2.3</span> or higher). See <a class="xref" href="impala_hbase.html#impala_hbase">Using Impala to Query HBase Tables</a>, + <a class="xref" href="impala_s3.html#s3">Using Impala with the Amazon S3 Filesystem</a>, and <a class="xref" href="impala_isilon.html#impala_isilon">Using Impala with Isilon Storage</a> + for details about those special kinds of tables. + </p> + + <p class="p"> + Impala queries ignore files with extensions commonly used for temporary work files by Hadoop tools. Any + files with extensions <code class="ph codeph">.tmp</code> or <code class="ph codeph">.copying</code> are not considered part of the + Impala table. The suffix matching is case-insensitive, so for example Impala ignores both + <code class="ph codeph">.copying</code> and <code class="ph codeph">.COPYING</code> suffixes. 
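+      </p>
+
+      <p class="p">
+        The partitioning mentioned above is declared when a table is created. For example, the following
+        statement (the table and column names are illustrative) divides the data files into one HDFS
+        subdirectory for each value of the <code class="ph codeph">YEAR</code> column:
+      </p>
+
+<pre class="pre codeblock"><code>
+-- Illustrative example: each value of the YEAR column corresponds
+-- to a separate HDFS subdirectory of data files.
+CREATE TABLE sales (id BIGINT, amount DOUBLE)
+  PARTITIONED BY (year INT)
+  STORED AS PARQUET;
+</code></pre>
+
+      <p class="p">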
+ </p> + + <p class="p toc inpage"></p> + + <p class="p"> + <strong class="ph b">Related statements:</strong> <a class="xref" href="impala_create_table.html#create_table">CREATE TABLE Statement</a>, + <a class="xref" href="impala_drop_table.html#drop_table">DROP TABLE Statement</a>, <a class="xref" href="impala_alter_table.html#alter_table">ALTER TABLE Statement</a> + <a class="xref" href="impala_insert.html#insert">INSERT Statement</a>, <a class="xref" href="impala_load_data.html#load_data">LOAD DATA Statement</a>, + <a class="xref" href="impala_select.html#select">SELECT Statement</a> + </p> + </div> + + <nav role="navigation" class="related-links"><div class="familylinks"><div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_schema_objects.html">Impala Schema Objects and Object Names</a></div></div></nav><article class="topic concept nested1" aria-labelledby="ariaid-title2" id="tables__internal_tables"> + + <h2 class="title topictitle2" id="ariaid-title2">Internal Tables</h2> + + <div class="body conbody"> + + <p class="p"> + + The default kind of table produced by the <code class="ph codeph">CREATE TABLE</code> statement is known as an internal + table. (Its counterpart is the external table, produced by the <code class="ph codeph">CREATE EXTERNAL TABLE</code> + syntax.) + </p> + + <ul class="ul"> + <li class="li"> + <p class="p"> + Impala creates a directory in HDFS to hold the data files. + </p> + </li> + + <li class="li"> + <p class="p"> + You can create data in internal tables by issuing <code class="ph codeph">INSERT</code> or <code class="ph codeph">LOAD DATA</code> + statements. + </p> + </li> + + <li class="li"> + <p class="p"> + If you add or replace data using HDFS operations, issue the <code class="ph codeph">REFRESH</code> command in + <span class="keyword cmdname">impala-shell</span> so that Impala recognizes the changes in data files, block locations, + and so on. 
+ </p> + </li> + + <li class="li"> + <p class="p"> + When you issue a <code class="ph codeph">DROP TABLE</code> statement, Impala physically removes all the data files + from the directory. + </p> + </li> + + <li class="li"> + <p class="p"> + To see whether a table is internal or external, and its associated HDFS location, issue the statement + <code class="ph codeph">DESCRIBE FORMATTED <var class="keyword varname">table_name</var></code>. The <code class="ph codeph">Table Type</code> field + displays <code class="ph codeph">MANAGED_TABLE</code> for internal tables and <code class="ph codeph">EXTERNAL_TABLE</code> for + external tables. The <code class="ph codeph">Location</code> field displays the path of the table directory as an HDFS + URI. + </p> + </li> + + <li class="li"> + <p class="p"> + When you issue an <code class="ph codeph">ALTER TABLE</code> statement to rename an internal table, all data files + are moved into the new HDFS directory for the table. The files are moved even if they were formerly in + a directory outside the Impala data directory, for example in an internal table with a + <code class="ph codeph">LOCATION</code> attribute pointing to an outside HDFS directory. + </p> + </li> + </ul> + + <p class="p"> + <strong class="ph b">Examples:</strong> + </p> + + <div class="p"> + You can switch a table from internal to external, or from external to internal, by using the <code class="ph codeph">ALTER + TABLE</code> statement: +<pre class="pre codeblock"><code> +-- Switch a table from internal to external. +ALTER TABLE <var class="keyword varname">table_name</var> SET TBLPROPERTIES('EXTERNAL'='TRUE'); + +-- Switch a table from external to internal. 
+ALTER TABLE <var class="keyword varname">table_name</var> SET TBLPROPERTIES('EXTERNAL'='FALSE'); +</code></pre> + </div> + + <p class="p"> + <strong class="ph b">Related information:</strong> + </p> + + <p class="p"> + <a class="xref" href="impala_tables.html#external_tables">External Tables</a>, <a class="xref" href="impala_create_table.html#create_table">CREATE TABLE Statement</a>, + <a class="xref" href="impala_drop_table.html#drop_table">DROP TABLE Statement</a>, <a class="xref" href="impala_alter_table.html#alter_table">ALTER TABLE Statement</a>, + <a class="xref" href="impala_describe.html#describe">DESCRIBE Statement</a> + </p> + </div> + </article> + + <article class="topic concept nested1" aria-labelledby="ariaid-title3" id="tables__external_tables"> + + <h2 class="title topictitle2" id="ariaid-title3">External Tables</h2> + + <div class="body conbody"> + + <p class="p"> + + The syntax <code class="ph codeph">CREATE EXTERNAL TABLE</code> sets up an Impala table that points at existing data + files, potentially in HDFS locations outside the normal Impala data directories. This operation saves the + expense of importing the data into a new table when you already have the data files in a known location in + HDFS, in the desired file format. + </p> + + <ul class="ul"> + <li class="li"> + <p class="p"> + You can use Impala to query the data in this table. + </p> + </li> + + <li class="li"> + <p class="p"> + You can create data in external tables by issuing <code class="ph codeph">INSERT</code> or <code class="ph codeph">LOAD DATA</code> + statements. + </p> + </li> + + <li class="li"> + <p class="p"> + If you add or replace data using HDFS operations, issue the <code class="ph codeph">REFRESH</code> command in + <span class="keyword cmdname">impala-shell</span> so that Impala recognizes the changes in data files, block locations, + and so on. 
+ </p> + </li> + + <li class="li"> + <p class="p"> + When you issue a <code class="ph codeph">DROP TABLE</code> statement in Impala, that removes the connection that + Impala has with the associated data files, but does not physically remove the underlying data. You can + continue to use the data files with other Hadoop components and HDFS operations. + </p> + </li> + + <li class="li"> + <p class="p"> + To see whether a table is internal or external, and its associated HDFS location, issue the statement + <code class="ph codeph">DESCRIBE FORMATTED <var class="keyword varname">table_name</var></code>. The <code class="ph codeph">Table Type</code> field + displays <code class="ph codeph">MANAGED_TABLE</code> for internal tables and <code class="ph codeph">EXTERNAL_TABLE</code> for + external tables. The <code class="ph codeph">Location</code> field displays the path of the table directory as an HDFS + URI. + </p> + </li> + + <li class="li"> + <p class="p"> + When you issue an <code class="ph codeph">ALTER TABLE</code> statement to rename an external table, all data files + are left in their original locations. + </p> + </li> + + <li class="li"> + <p class="p"> + You can point multiple external tables at the same HDFS directory by using the same + <code class="ph codeph">LOCATION</code> attribute for each one. The tables could have different column definitions, + as long as the number and types of columns are compatible with the schema evolution considerations for + the underlying file type. For example, for text data files, one table might define a certain column as + a <code class="ph codeph">STRING</code> while another defines the same column as a <code class="ph codeph">BIGINT</code>. 
+ </p> + </li> + </ul> + + <p class="p"> + <strong class="ph b">Examples:</strong> + </p> + + <div class="p"> + You can switch a table from internal to external, or from external to internal, by using the <code class="ph codeph">ALTER + TABLE</code> statement: +<pre class="pre codeblock"><code> +-- Switch a table from internal to external. +ALTER TABLE <var class="keyword varname">table_name</var> SET TBLPROPERTIES('EXTERNAL'='TRUE'); + +-- Switch a table from external to internal. +ALTER TABLE <var class="keyword varname">table_name</var> SET TBLPROPERTIES('EXTERNAL'='FALSE'); +</code></pre> + </div> + + <p class="p"> + <strong class="ph b">Related information:</strong> + </p> + + <p class="p"> + <a class="xref" href="impala_tables.html#internal_tables">Internal Tables</a>, <a class="xref" href="impala_create_table.html#create_table">CREATE TABLE Statement</a>, + <a class="xref" href="impala_drop_table.html#drop_table">DROP TABLE Statement</a>, <a class="xref" href="impala_alter_table.html#alter_table">ALTER TABLE Statement</a>, + <a class="xref" href="impala_describe.html#describe">DESCRIBE Statement</a> + </p> + </div> + </article> + + <article class="topic concept nested1" aria-labelledby="ariaid-title4" id="tables__table_file_formats"> + <h2 class="title topictitle2" id="ariaid-title4">File Formats</h2> + + <div class="body conbody"> + <p class="p"> + Each table has an associated file format, which determines how Impala interprets the + associated data files. See <a class="xref" href="impala_file_formats.html#file_formats">How Impala Works with Hadoop File Formats</a> for details. + </p> + <p class="p"> + You set the file format during the <code class="ph codeph">CREATE TABLE</code> statement, + or change it later using the <code class="ph codeph">ALTER TABLE</code> statement. 
+ Partitioned tables can have a different file format for individual partitions, + allowing you to change the file format used in your ETL process for new data + without going back and reconverting all the existing data in the same table. + </p> + <p class="p"> + Any <code class="ph codeph">INSERT</code> statements produce new data files with the current file format of the table. + For existing data files, changing the file format of the table does not automatically do any data conversion. + You must use <code class="ph codeph">TRUNCATE TABLE</code> or <code class="ph codeph">INSERT OVERWRITE</code> to remove any previous data + files that use the old file format. + Then you use the <code class="ph codeph">LOAD DATA</code> statement, <code class="ph codeph">INSERT ... SELECT</code>, or other mechanism + to put data files of the correct format into the table. + </p> + <p class="p"> + The default file format, text, is the most flexible and easy to produce when you are just getting started with + Impala. The Parquet file format offers the highest query performance and uses compression to reduce storage + requirements; therefore, where practical, use Parquet for Impala tables with substantial amounts of data. + <span class="ph">Also, the complex types (<code class="ph codeph">ARRAY</code>, <code class="ph codeph">STRUCT</code>, and <code class="ph codeph">MAP</code>) + available in <span class="keyword">Impala 2.3</span> and higher are currently only supported with the Parquet file type.</span> + Based on your existing ETL workflow, you might use other file formats such as Avro, possibly doing a final + conversion step to Parquet to take advantage of its performance for analytic queries. 
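+      </p>
+
+      <p class="p">
+        For example, you might create a table with the default text format, then switch it to Parquet for
+        newly added data. The table name here is illustrative:
+      </p>
+
+<pre class="pre codeblock"><code>
+-- Create a table that uses the default text format.
+CREATE TABLE t1 (id BIGINT, s STRING);
+
+-- Switch the table to the Parquet file format for new data.
+-- Existing data files are not converted automatically.
+ALTER TABLE t1 SET FILEFORMAT PARQUET;
+</code></pre>
+
+      <p class="p">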
+ </p> + </div> + </article> + + <article class="topic concept nested1" aria-labelledby="ariaid-title5" id="tables__kudu_tables"> + <h2 class="title topictitle2" id="ariaid-title5">Kudu Tables</h2> + + + <div class="body conbody"> + <p class="p"> + Tables stored in Apache Kudu are treated specially, because Kudu manages its data independently of HDFS files. + Some information about the table is stored in the metastore database for use by Impala. Other table metadata is + managed internally by Kudu. + </p> + + <p class="p"> + When you create a Kudu table through Impala, it is assigned an internal Kudu table name of the form + <code class="ph codeph">impala::<var class="keyword varname">db_name</var>.<var class="keyword varname">table_name</var></code>. You can see the Kudu-assigned name + in the output of <code class="ph codeph">DESCRIBE FORMATTED</code>, in the <code class="ph codeph">kudu.table_name</code> field of the table properties. + The Kudu-assigned name remains the same even if you use <code class="ph codeph">ALTER TABLE</code> to rename the Impala table + or move it to a different Impala database. If you issue the statement + <code class="ph codeph">ALTER TABLE <var class="keyword varname">impala_name</var> SET TBLPROPERTIES('kudu.table_name' = '<var class="keyword varname">different_kudu_table_name</var>')</code>, + the effect is different depending on whether the Impala table was created with a regular <code class="ph codeph">CREATE TABLE</code> + statement (that is, if it is an internal or managed table), or if it was created with a + <code class="ph codeph">CREATE EXTERNAL TABLE</code> statement (and therefore is an external table). Changing the <code class="ph codeph">kudu.table_name</code> + property of an internal table physically renames the underlying Kudu table to match the new name. 
+ Changing the <code class="ph codeph">kudu.table_name</code> property of an external table switches which underlying Kudu table + the Impala table refers to; the underlying Kudu table must already exist. + </p> + + <p class="p"> + The following example shows what happens with both internal and external Kudu tables as the <code class="ph codeph">kudu.table_name</code> + property is changed. In practice, external tables are typically used to access underlying Kudu tables that were created + outside of Impala, that is, through the Kudu API. + </p> + +<pre class="pre codeblock"><code> +-- This is an internal table that we will create and then rename. +create table old_name (id bigint primary key, s string) + partition by hash(id) partitions 2 stored as kudu; + +-- Initially, the name OLD_NAME is the same on the Impala and Kudu sides. +describe formatted old_name; +... +| Location: | hdfs://host.example.com:8020/path/user.db/old_name +| Table Type: | MANAGED_TABLE | NULL +| Table Parameters: | NULL | NULL +| | DO_NOT_UPDATE_STATS | true +| | kudu.master_addresses | vd0342.example.com +| | kudu.table_name | impala::user.old_name + +-- ALTER TABLE RENAME TO changes the Impala name but not the underlying Kudu name. +alter table old_name rename to new_name; + +describe formatted new_name; +| Location: | hdfs://host.example.com:8020/path/user.db/new_name +| Table Type: | MANAGED_TABLE | NULL +| Table Parameters: | NULL | NULL +| | DO_NOT_UPDATE_STATS | true +| | kudu.master_addresses | vd0342.example.com +| | kudu.table_name | impala::user.old_name + +-- Setting TBLPROPERTIES changes the underlying Kudu name. 
+alter table new_name + set tblproperties('kudu.table_name' = 'impala::user.new_name'); + +describe formatted new_name; +| Location: | hdfs://host.example.com:8020/path/user.db/new_name +| Table Type: | MANAGED_TABLE | NULL +| Table Parameters: | NULL | NULL +| | DO_NOT_UPDATE_STATS | true +| | kudu.master_addresses | vd0342.example.com +| | kudu.table_name | impala::user.new_name + +-- Put some data in the table to demonstrate how external tables can map to +-- different underlying Kudu tables. +insert into new_name values (0, 'zero'), (1, 'one'), (2, 'two'); + +-- This external table points to the same underlying Kudu table, NEW_NAME, +-- as we created above. No need to declare columns or other table aspects. +create external table kudu_table_alias stored as kudu + tblproperties('kudu.table_name' = 'impala::user.new_name'); + +-- The external table can fetch data from the NEW_NAME table that already +-- existed and already had data. +select * from kudu_table_alias limit 100; ++----+------+ +| id | s | ++----+------+ +| 1 | one | +| 0 | zero | +| 2 | two | ++----+------+ + +-- We cannot re-point the external table at a different underlying Kudu table +-- unless that other underlying Kudu table already exists. +alter table kudu_table_alias + set tblproperties('kudu.table_name' = 'impala::user.yet_another_name'); +ERROR: +TableLoadingException: Error opening Kudu table 'impala::user.yet_another_name', + Kudu error: The table does not exist: table_name: "impala::user.yet_another_name" + +-- Once the underlying Kudu table exists, we can re-point the external table to it. +create table yet_another_name (id bigint primary key, x int, y int, s string) + partition by hash(id) partitions 2 stored as kudu; + +alter table kudu_table_alias + set tblproperties('kudu.table_name' = 'impala::user.yet_another_name'); + +-- Now no data is returned because this other table is empty. 
+select * from kudu_table_alias limit 100; + +-- The Impala table automatically recognizes the table schema of the new table, +-- for example the extra X and Y columns not present in the original table. +describe kudu_table_alias; ++------+--------+---------+-------------+----------+... +| name | type | comment | primary_key | nullable |... ++------+--------+---------+-------------+----------+... +| id | bigint | | true | false |... +| x | int | | false | true |... +| y | int | | false | true |... +| s | string | | false | true |... ++------+--------+---------+-------------+----------+... +</code></pre> + + <p class="p"> + The <code class="ph codeph">SHOW TABLE STATS</code> output for a Kudu table shows Kudu-specific details about the layout of the table. + Instead of information about the number and sizes of files, the information is divided by the Kudu tablets. + For each tablet, the output includes the fields + <code class="ph codeph"># Rows</code> (although this number is not currently computed), <code class="ph codeph">Start Key</code>, <code class="ph codeph">Stop Key</code>, <code class="ph codeph">Leader Replica</code>, and <code class="ph codeph"># Replicas</code>. + The output of <code class="ph codeph">SHOW COLUMN STATS</code>, illustrating the distribution of values within each column, is the same for Kudu tables + as for HDFS-backed tables. + </p> + + <div class="p"> + The distinction between internal and external tables has some special + details for Kudu tables. Tables created entirely through Impala are + internal tables. The table name as represented within Kudu includes + notation such as an <code class="ph codeph">impala::</code> prefix and the Impala + database name. External Kudu tables are those created by a non-Impala + mechanism, such as a user application calling the Kudu APIs. 
For + these tables, the <code class="ph codeph">CREATE EXTERNAL TABLE</code> syntax lets + you establish a mapping from Impala to the existing Kudu table: +<pre class="pre codeblock"><code> +CREATE EXTERNAL TABLE impala_name STORED AS KUDU + TBLPROPERTIES('kudu.table_name' = 'original_kudu_name'); +</code></pre> + External Kudu tables differ in one important way from other external + tables: adding or dropping a column or range partition changes the + data in the underlying Kudu table, in contrast to an HDFS-backed + external table where existing data files are left untouched. + </div> + </div> + </article> + +</article></main></body></html> \ No newline at end of file
http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/75c46918/docs/build/html/topics/impala_timeouts.html ---------------------------------------------------------------------- diff --git a/docs/build/html/topics/impala_timeouts.html b/docs/build/html/topics/impala_timeouts.html new file mode 100644 index 0000000..2005c7d --- /dev/null +++ b/docs/build/html/topics/impala_timeouts.html @@ -0,0 +1,168 @@ +<!DOCTYPE html + SYSTEM "about:legacy-compat"> +<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) Copyright 2017"><meta name="DC.rights.owner" content="(C) Copyright 2017"><meta name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" content="../topics/impala_admin.html"><meta name="prodname" content="Impala"><meta name="prodname" content="Impala"><meta name="version" content="Impala 2.8.x"><meta name="version" content="Impala 2.8.x"><meta name="DC.Format" content="XHTML"><meta name="DC.Identifier" content="timeouts"><link rel="stylesheet" type="text/css" href="../commonltr.css"><title>Setting Timeout Periods for Daemons, Queries, and Sessions</title></head><body id="timeouts"><main role="main"><article role="article" aria-labelledby="ariaid-title1"> + + <h1 class="title topictitle1" id="ariaid-title1">Setting Timeout Periods for Daemons, Queries, and Sessions</h1> + + + + + + <div class="body conbody"> + + <p class="p"> + Depending on how busy your <span class="keyword"></span> cluster is, you might increase or decrease various timeout + values. Increase timeouts if Impala is cancelling operations prematurely, when the system + is responding slower than usual but the operations are still successful if given extra + time. Decrease timeouts if operations are idle or hanging for long periods, and the idle + or hung operations are consuming resources and reducing concurrency. 
+ </p> + + <p class="p toc inpage"></p> + + </div> + + <nav role="navigation" class="related-links"><div class="familylinks"><div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_admin.html">Impala Administration</a></div></div></nav><article class="topic concept nested1" aria-labelledby="ariaid-title2" id="timeouts__statestore_timeout"> + + <h2 class="title topictitle2" id="ariaid-title2">Increasing the Statestore Timeout</h2> + + <div class="body conbody"> + + <p class="p"> + If you have an extensive Impala schema, for example with hundreds of databases, tens of + thousands of tables, and so on, you might encounter timeout errors during startup as the + Impala catalog service broadcasts metadata to all the Impala nodes using the statestore + service. To avoid such timeout errors on startup, increase the statestore timeout value + from its default of 10 seconds. Specify the timeout value using the + <code class="ph codeph">-statestore_subscriber_timeout_seconds</code> option for the statestore + service, using the configuration instructions in + <a class="xref" href="impala_config_options.html#config_options">Modifying Impala Startup Options</a>. The symptom of this problem is + messages in the <code class="ph codeph">impalad</code> log such as: + </p> + +<pre class="pre codeblock"><code>Connection with state-store lost +Trying to re-register with state-store</code></pre> + + <p class="p"> + See <a class="xref" href="impala_scalability.html#statestore_scalability">Scalability Considerations for the Impala Statestore</a> for more details about + statestore operation and settings on clusters with a large number of Impala-related + objects such as tables and partitions. 
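+      </p>
+
+      <p class="p">
+        For example, to raise the timeout from the default 10 seconds to 100 seconds, you would include a
+        flag such as the following among the startup options (the value of 100 is illustrative; choose a
+        value based on the size of your schema):
+      </p>
+
+<pre class="pre codeblock"><code>
+-statestore_subscriber_timeout_seconds=100
+</code></pre>
+
+      <p class="p">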
+ </p> + + </div> + + </article> + + <article class="topic concept nested1" aria-labelledby="ariaid-title3" id="timeouts__impalad_timeout"> + + <h2 class="title topictitle2" id="ariaid-title3">Setting the Idle Query and Idle Session Timeouts for impalad</h2> + + <div class="body conbody"> + + <p class="p"> + To keep long-running queries or idle sessions from tying up cluster resources, you can + set timeout intervals for both individual queries, and entire sessions. + </p> + + <div class="note note note_note"><span class="note__title notetitle">Note:</span> + <p class="p"> + The timeout clock for queries and sessions only starts ticking when the query or session is idle. + For queries, this means the query has results ready but is waiting for a client to fetch the data. A + query can run for an arbitrary time without triggering a timeout, because the query is computing results + rather than sitting idle waiting for the results to be fetched. The timeout period is intended to prevent + unclosed queries from consuming resources and taking up slots in the admission count of running queries, + potentially preventing other queries from starting. + </p> + <p class="p"> + For sessions, this means that no query has been submitted for some period of time. + </p> + </div> + + <p class="p"> + Specify the following startup options for the <span class="keyword cmdname">impalad</span> daemon: + </p> + + <ul class="ul"> + <li class="li"> + The <code class="ph codeph">--idle_query_timeout</code> option specifies the time in seconds after + which an idle query is cancelled. This could be a query whose results were all fetched + but was never closed, or one whose results were partially fetched and then the client + program stopped requesting further results. This condition is most likely to occur in + a client program using the JDBC or ODBC interfaces, rather than in the interactive + <span class="keyword cmdname">impala-shell</span> interpreter. 
Once the query is cancelled, the client + program cannot retrieve any further results. + </li> + + <li class="li"> + The <code class="ph codeph">--idle_session_timeout</code> option specifies the time in seconds after + which an idle session is expired. A session is idle when no activity is occurring for + any of the queries in that session, and the session has not started any new queries. + Once a session is expired, you cannot issue any new query requests to it. The session + remains open, but the only operation you can perform is to close it. The default value + of 0 means that sessions never expire. + </li> + </ul> + + <p class="p"> + For instructions on changing <span class="keyword cmdname">impalad</span> startup options, see + <a class="xref" href="impala_config_options.html#config_options">Modifying Impala Startup Options</a>. + </p> + + <p class="p"> + You can reduce the idle query timeout by using the <code class="ph codeph">QUERY_TIMEOUT_S</code> + query option. Any value specified for the <code class="ph codeph">--idle_query_timeout</code> startup + option serves as an upper limit for the <code class="ph codeph">QUERY_TIMEOUT_S</code> query option. + See <a class="xref" href="impala_query_timeout_s.html#query_timeout_s">QUERY_TIMEOUT_S Query Option (Impala 2.0 or higher only)</a> for details. + </p> + + </div> + + </article> + <article class="topic concept nested1" aria-labelledby="ariaid-title4" id="timeouts__concept_rfy_jl1_rx"> + <h2 class="title topictitle2" id="ariaid-title4">Setting Timeout and Retries for Thrift Connections to the Backend + Client</h2> + <div class="body conbody"> + <p class="p">Impala connections to the backend client are subject to failure in + cases when the network is momentarily overloaded. 
To avoid failed + queries due to transient network problems, you can configure the number + of Thrift connection retries using the following option: </p> + <ul class="ul" id="concept_rfy_jl1_rx__ul_bj3_ql1_rx"> + <li class="li">The <code class="ph codeph">--backend_client_connection_num_retries</code> option + specifies the number of times Impala will try connecting to the + backend client after the first connection attempt fails. By default, + <span class="keyword cmdname">impalad</span> will attempt three re-connections before + it returns a failure. </li> + </ul> + <p class="p">You can configure timeouts for sending and receiving data from the + backend client. Therefore, if for some reason a query hangs, instead of + waiting indefinitely for a response, Impala will terminate the + connection after a configurable timeout.</p> + <ul class="ul" id="concept_rfy_jl1_rx__ul_vm2_2v1_rx"> + <li class="li">The <code class="ph codeph">--backend_client_rpc_timeout_ms</code> option can be + used to specify the number of milliseconds Impala should wait for a + response from the backend client before it terminates the connection + and signals a failure. The default value for this property is 300000 + milliseconds, or 5 minutes. </li> + </ul> + </div> + </article> + + <article class="topic concept nested1" aria-labelledby="ariaid-title5" id="timeouts__cancel_query"> + + <h2 class="title topictitle2" id="ariaid-title5">Cancelling a Query</h2> + + <div class="body conbody"> + + <p class="p"> + Sometimes, an Impala query might run for an unexpectedly long time, tying up resources + in the cluster. You can cancel the query explicitly, independent of the timeout period, + by going into the web UI for the <span class="keyword cmdname">impalad</span> host (on port 25000 by + default), and using the link on the <code class="ph codeph">/queries</code> tab to cancel the running + query. 
You can also cancel a query from the client application that submitted it; for example, by pressing <code class="ph codeph">^C</code> in <span class="keyword cmdname">impala-shell</span>. + </p> + + </div> + + </article> + +</article></main></body></html> \ No newline at end of file http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/75c46918/docs/build/html/topics/impala_timestamp.html ---------------------------------------------------------------------- diff --git a/docs/build/html/topics/impala_timestamp.html b/docs/build/html/topics/impala_timestamp.html new file mode 100644 index 0000000..02f86fc --- /dev/null +++ b/docs/build/html/topics/impala_timestamp.html @@ -0,0 +1,514 @@ +<!DOCTYPE html + SYSTEM "about:legacy-compat"> +<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) Copyright 2017"><meta name="DC.rights.owner" content="(C) Copyright 2017"><meta name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" content="../topics/impala_datatypes.html"><meta name="prodname" content="Impala"><meta name="prodname" content="Impala"><meta name="version" content="Impala 2.8.x"><meta name="version" content="Impala 2.8.x"><meta name="DC.Format" content="XHTML"><meta name="DC.Identifier" content="timestamp"><link rel="stylesheet" type="text/css" href="../commonltr.css"><title>TIMESTAMP Data Type</title></head><body id="timestamp"><main role="main"><article role="article" aria-labelledby="ariaid-title1"> + + <h1 class="title topictitle1" id="ariaid-title1">TIMESTAMP Data Type</h1> + + + + <div class="body conbody"> + + <p class="p"> + A data type used in <code class="ph codeph">CREATE TABLE</code> and <code class="ph codeph">ALTER TABLE</code> statements, representing a + point in time. 
+ </p> + + <p class="p"> + <strong class="ph b">Syntax:</strong> + </p> + + <p class="p"> + In the column definition of a <code class="ph codeph">CREATE TABLE</code> statement: + </p> + +<pre class="pre codeblock"><code><var class="keyword varname">column_name</var> TIMESTAMP</code></pre> + + <p class="p"> + <strong class="ph b">Range:</strong> Allowed date values range from 1400-01-01 to 9999-12-31; this range is different from the Hive + <code class="ph codeph">TIMESTAMP</code> type. Internally, the resolution of the time portion of a + <code class="ph codeph">TIMESTAMP</code> value is in nanoseconds. + </p> + + <p class="p"> + <strong class="ph b">INTERVAL expressions:</strong> + </p> + + <p class="p"> + You can perform date arithmetic by adding or subtracting a specified number of time units, using the + <code class="ph codeph">INTERVAL</code> keyword and the <code class="ph codeph">+</code> and <code class="ph codeph">-</code> operators or + <code class="ph codeph">date_add()</code> and <code class="ph codeph">date_sub()</code> functions. You can specify units as + <code class="ph codeph">YEAR[S]</code>, <code class="ph codeph">MONTH[S]</code>, <code class="ph codeph">WEEK[S]</code>, <code class="ph codeph">DAY[S]</code>, + <code class="ph codeph">HOUR[S]</code>, <code class="ph codeph">MINUTE[S]</code>, <code class="ph codeph">SECOND[S]</code>, + <code class="ph codeph">MILLISECOND[S]</code>, <code class="ph codeph">MICROSECOND[S]</code>, and <code class="ph codeph">NANOSECOND[S]</code>. You can + only specify one time unit in each interval expression, for example <code class="ph codeph">INTERVAL 3 DAYS</code> or + <code class="ph codeph">INTERVAL 25 HOURS</code>, but you can produce any granularity by adding together successive + <code class="ph codeph">INTERVAL</code> values, such as <code class="ph codeph"><var class="keyword varname">timestamp_value</var> + INTERVAL 3 WEEKS - + INTERVAL 1 DAY + INTERVAL 10 MICROSECONDS</code>. 
+    </p>
+
+    <p class="p">
+      For example:
+    </p>
+
+<pre class="pre codeblock"><code>select now() + interval 1 day;
+select date_sub(now(), interval 5 minutes);
+insert into auction_details
+  select auction_id, auction_start_time, auction_start_time + interval 2 days + interval 12 hours
+  from new_auctions;</code></pre>
+
+    <p class="p">
+      <strong class="ph b">Time zones:</strong>
+    </p>
+
+    <p class="p">
+      By default, Impala does not store timestamps using the local timezone, to avoid undesired results from
+      unexpected time zone issues. Timestamps are stored and interpreted relative to UTC, both when written to or
+      read from data files and when converted to or from Unix time values through functions such as
+      <code class="ph codeph">from_unixtime()</code> or <code class="ph codeph">unix_timestamp()</code>. To convert such a
+      <code class="ph codeph">TIMESTAMP</code> value to one that represents the date and time in a specific time zone, convert
+      the original value with the <code class="ph codeph">from_utc_timestamp()</code> function.
+    </p>
+
+    <p class="p">
+      Because Impala does not assume that <code class="ph codeph">TIMESTAMP</code> values are in any particular time zone, you
+      must be conscious of the time zone aspects of data that you query, insert, or convert.
+    </p>
+
+    <p class="p">
+      For consistency with Unix system calls, the <code class="ph codeph">TIMESTAMP</code> returned by the <code class="ph codeph">now()</code>
+      function represents the local time in the system time zone, rather than in UTC. To store values relative to
+      the current time in a portable way, convert any <code class="ph codeph">now()</code> return values using the
+      <code class="ph codeph">to_utc_timestamp()</code> function first. The following example shows that the current
+      time in California (where this Impala cluster is located) is shortly after 2 PM. 
If that value was written to a data + file, and shipped off to a distant server to be analyzed alongside other data from far-flung locations, the + dates and times would not match up precisely because of time zone differences. Therefore, the + <code class="ph codeph">to_utc_timestamp()</code> function converts it using a common reference point, the UTC time zone + (descended from the old Greenwich Mean Time standard). The <code class="ph codeph">'PDT'</code> argument indicates that the + original value is from the Pacific time zone with Daylight Saving Time in effect. When servers in all + geographic locations run the same transformation on any local date and time values (with the appropriate time + zone argument), the stored data uses a consistent representation. Impala queries can use functions such as + <code class="ph codeph">EXTRACT()</code>, <code class="ph codeph">MIN()</code>, <code class="ph codeph">AVG()</code>, and so on to do time-series + analysis on those timestamps. + </p> + +<pre class="pre codeblock"><code>[localhost:21000] > select now(); ++-------------------------------+ +| now() | ++-------------------------------+ +| 2015-04-09 14:07:46.580465000 | ++-------------------------------+ +[localhost:21000] > select to_utc_timestamp(now(), 'PDT'); ++--------------------------------+ +| to_utc_timestamp(now(), 'pdt') | ++--------------------------------+ +| 2015-04-09 21:08:07.664547000 | ++--------------------------------+ +</code></pre> + + <p class="p"> + The converse function, <code class="ph codeph">from_utc_timestamp()</code>, lets you take stored <code class="ph codeph">TIMESTAMP</code> + data or calculated results and convert back to local date and time for processing on the application side. + The following example shows how you might represent some future date (such as the ending date and time of an + auction) in UTC, and then convert back to local time when convenient for reporting or other processing. 
The + final query in the example tests whether this arbitrary UTC date and time has passed yet, by converting it + back to the local time zone and comparing it against the current date and time. + </p> + +<pre class="pre codeblock"><code>[localhost:21000] > select to_utc_timestamp(now() + interval 2 weeks, 'PDT'); ++---------------------------------------------------+ +| to_utc_timestamp(now() + interval 2 weeks, 'pdt') | ++---------------------------------------------------+ +| 2015-04-23 21:08:34.152923000 | ++---------------------------------------------------+ +[localhost:21000] > select from_utc_timestamp('2015-04-23 21:08:34.152923000','PDT'); ++------------------------------------------------------------+ +| from_utc_timestamp('2015-04-23 21:08:34.152923000', 'pdt') | ++------------------------------------------------------------+ +| 2015-04-23 14:08:34.152923000 | ++------------------------------------------------------------+ +[localhost:21000] > select from_utc_timestamp('2015-04-23 21:08:34.152923000','PDT') < now(); ++--------------------------------------------------------------------+ +| from_utc_timestamp('2015-04-23 21:08:34.152923000', 'pdt') < now() | ++--------------------------------------------------------------------+ +| false | ++--------------------------------------------------------------------+ +</code></pre> + + <p class="p"> + If you have data files written by Hive, those <code class="ph codeph">TIMESTAMP</code> values represent the local timezone + of the host where the data was written, potentially leading to inconsistent results when processed by Impala. + To avoid compatibility problems or having to code workarounds, you can specify one or both of these + <span class="keyword cmdname">impalad</span> startup flags: <code class="ph codeph">-use_local_tz_for_unix_timestamp_conversions=true</code> + <code class="ph codeph">-convert_legacy_hive_parquet_utc_timestamps=true</code>. 
Although
+      <code class="ph codeph">-convert_legacy_hive_parquet_utc_timestamps</code> is turned off by default to avoid performance overhead,
+      turn it on where practical when processing <code class="ph codeph">TIMESTAMP</code> columns in Parquet files written by Hive, to avoid unexpected behavior.
+    </p>
+
+    <p class="p">
+      The <code class="ph codeph">-use_local_tz_for_unix_timestamp_conversions</code> setting affects conversions from
+      <code class="ph codeph">TIMESTAMP</code> to <code class="ph codeph">BIGINT</code>, or from <code class="ph codeph">BIGINT</code>
+      to <code class="ph codeph">TIMESTAMP</code>. By default, Impala treats all <code class="ph codeph">TIMESTAMP</code> values as UTC,
+      to simplify analysis of time-series data from different geographic regions. When you enable the
+      <code class="ph codeph">-use_local_tz_for_unix_timestamp_conversions</code> setting, these operations
+      treat the input values as if they are in the local time zone of the host doing the processing.
+      See <a class="xref" href="impala_datetime_functions.html#datetime_functions">Impala Date and Time Functions</a> for the list of functions
+      affected by the <code class="ph codeph">-use_local_tz_for_unix_timestamp_conversions</code> setting.
+    </p>
+
+    <p class="p">
+      The following sequence of examples shows how the interpretation of <code class="ph codeph">TIMESTAMP</code> values in
+      Parquet tables is affected by the <code class="ph codeph">-convert_legacy_hive_parquet_utc_timestamps</code>
+      setting. 
+ </p> + + <p class="p"> + Regardless of the <code class="ph codeph">-convert_legacy_hive_parquet_utc_timestamps</code> setting, + <code class="ph codeph">TIMESTAMP</code> columns in text tables can be written and read interchangeably by Impala and Hive: + </p> + +<pre class="pre codeblock"><code>Impala DDL and queries for text table: + +[localhost:21000] > create table t1 (x timestamp); +[localhost:21000] > insert into t1 values (now()), (now() + interval 1 day); +[localhost:21000] > select x from t1; ++-------------------------------+ +| x | ++-------------------------------+ +| 2015-04-07 15:43:02.892403000 | +| 2015-04-08 15:43:02.892403000 | ++-------------------------------+ +[localhost:21000] > select to_utc_timestamp(x, 'PDT') from t1; ++-------------------------------+ +| to_utc_timestamp(x, 'pdt') | ++-------------------------------+ +| 2015-04-07 22:43:02.892403000 | +| 2015-04-08 22:43:02.892403000 | ++-------------------------------+ + +Hive query for text table: + +hive> select * from t1; +OK +2015-04-07 15:43:02.892403 +2015-04-08 15:43:02.892403 +Time taken: 1.245 seconds, Fetched: 2 row(s) +</code></pre> + + <p class="p"> + When the table uses Parquet format, Impala expects any time zone adjustment to be applied prior to writing, + while <code class="ph codeph">TIMESTAMP</code> values written by Hive are adjusted to be in the UTC time zone. When Hive + queries Parquet data files that it wrote, it adjusts the <code class="ph codeph">TIMESTAMP</code> values back to the local + time zone, while Impala does no conversion. Hive does no time zone conversion when it queries Impala-written + Parquet files. 
+ </p> + +<pre class="pre codeblock"><code>Impala DDL and queries for Parquet table: + +[localhost:21000] > create table p1 stored as parquet as select x from t1; ++-------------------+ +| summary | ++-------------------+ +| Inserted 2 row(s) | ++-------------------+ +[localhost:21000] > select x from p1; ++-------------------------------+ +| x | ++-------------------------------+ +| 2015-04-07 15:43:02.892403000 | +| 2015-04-08 15:43:02.892403000 | ++-------------------------------+ + +Hive DDL and queries for Parquet table: + +hive> create table h1 (x timestamp) stored as parquet; +OK +hive> insert into h1 select * from p1; +... +OK +Time taken: 35.573 seconds +hive> select x from p1; +OK +2015-04-07 15:43:02.892403 +2015-04-08 15:43:02.892403 +Time taken: 0.324 seconds, Fetched: 2 row(s) +hive> select x from h1; +OK +2015-04-07 15:43:02.892403 +2015-04-08 15:43:02.892403 +Time taken: 0.197 seconds, Fetched: 2 row(s) +</code></pre> + + <p class="p"> + The discrepancy arises when Impala queries the Hive-created Parquet table. The underlying values in the + <code class="ph codeph">TIMESTAMP</code> column are different from the ones written by Impala, even though they were copied + from one table to another by an <code class="ph codeph">INSERT ... SELECT</code> statement in Hive. Hive did an implicit + conversion from the local time zone to UTC as it wrote the values to Parquet. 
+ </p> + +<pre class="pre codeblock"><code>Impala query for TIMESTAMP values from Impala-written and Hive-written data: + +[localhost:21000] > select * from p1; ++-------------------------------+ +| x | ++-------------------------------+ +| 2015-04-07 15:43:02.892403000 | +| 2015-04-08 15:43:02.892403000 | ++-------------------------------+ +Fetched 2 row(s) in 0.29s +[localhost:21000] > select * from h1; ++-------------------------------+ +| x | ++-------------------------------+ +| 2015-04-07 22:43:02.892403000 | +| 2015-04-08 22:43:02.892403000 | ++-------------------------------+ +Fetched 2 row(s) in 0.41s + +Underlying integer values for Impala-written and Hive-written data: + +[localhost:21000] > select cast(x as bigint) from p1; ++-------------------+ +| cast(x as bigint) | ++-------------------+ +| 1428421382 | +| 1428507782 | ++-------------------+ +Fetched 2 row(s) in 0.38s +[localhost:21000] > select cast(x as bigint) from h1; ++-------------------+ +| cast(x as bigint) | ++-------------------+ +| 1428446582 | +| 1428532982 | ++-------------------+ +Fetched 2 row(s) in 0.20s +</code></pre> + + <p class="p"> + When the <code class="ph codeph">-convert_legacy_hive_parquet_utc_timestamps</code> setting is enabled, Impala recognizes + the Parquet data files written by Hive, and applies the same UTC-to-local-timezone conversion logic during + the query as Hive uses, making the contents of the Impala-written <code class="ph codeph">P1</code> table and the + Hive-written <code class="ph codeph">H1</code> table appear identical, whether represented as <code class="ph codeph">TIMESTAMP</code> + values or the underlying <code class="ph codeph">BIGINT</code> integers: + </p> + +<pre class="pre codeblock"><code>[localhost:21000] > select x from p1; ++-------------------------------+ +| x | ++-------------------------------+ +| 2015-04-07 15:43:02.892403000 | +| 2015-04-08 15:43:02.892403000 | ++-------------------------------+ +Fetched 2 row(s) in 0.37s 
+[localhost:21000] > select x from h1; ++-------------------------------+ +| x | ++-------------------------------+ +| 2015-04-07 15:43:02.892403000 | +| 2015-04-08 15:43:02.892403000 | ++-------------------------------+ +Fetched 2 row(s) in 0.19s +[localhost:21000] > select cast(x as bigint) from p1; ++-------------------+ +| cast(x as bigint) | ++-------------------+ +| 1428446582 | +| 1428532982 | ++-------------------+ +Fetched 2 row(s) in 0.29s +[localhost:21000] > select cast(x as bigint) from h1; ++-------------------+ +| cast(x as bigint) | ++-------------------+ +| 1428446582 | +| 1428532982 | ++-------------------+ +Fetched 2 row(s) in 0.22s +</code></pre> + + <p class="p"> + <strong class="ph b">Conversions:</strong> + </p> + + <p class="p"> + Impala automatically converts <code class="ph codeph">STRING</code> literals of the correct format into + <code class="ph codeph">TIMESTAMP</code> values. Timestamp values are accepted in the format + <code class="ph codeph">"yyyy-MM-dd HH:mm:ss.SSSSSS"</code>, and can consist of just the date, or just the time, with or + without the fractional second portion. For example, you can specify <code class="ph codeph">TIMESTAMP</code> values such as + <code class="ph codeph">'1966-07-30'</code>, <code class="ph codeph">'08:30:00'</code>, or <code class="ph codeph">'1985-09-25 17:45:30.005'</code>. + <span class="ph">Casting an integer or floating-point value <code class="ph codeph">N</code> to + <code class="ph codeph">TIMESTAMP</code> produces a value that is <code class="ph codeph">N</code> seconds past the start of the epoch + date (January 1, 1970). By default, the result value represents a date and time in the UTC time zone. 
+ If the setting <code class="ph codeph">-use_local_tz_for_unix_timestamp_conversions=true</code> is in effect, + the resulting <code class="ph codeph">TIMESTAMP</code> represents a date and time in the local time zone.</span> + </p> + + <p class="p"> + In Impala 1.3 and higher, the <code class="ph codeph">FROM_UNIXTIME()</code> and <code class="ph codeph">UNIX_TIMESTAMP()</code> + functions allow a wider range of format strings, with more flexibility in element order, repetition of letter + placeholders, and separator characters. In <span class="keyword">Impala 2.3</span> and higher, the <code class="ph codeph">UNIX_TIMESTAMP()</code> + function also allows a numeric timezone offset to be specified as part of the input string. + See <a class="xref" href="impala_datetime_functions.html#datetime_functions">Impala Date and Time Functions</a> for details. + </p> + + <p class="p"> + In Impala 2.2.0 and higher, built-in functions that accept or return integers representing <code class="ph codeph">TIMESTAMP</code> values + use the <code class="ph codeph">BIGINT</code> type for parameters and return values, rather than <code class="ph codeph">INT</code>. + This change lets the date and time functions avoid an overflow error that would otherwise occur + on January 19th, 2038 (known as the + <a class="xref" href="http://en.wikipedia.org/wiki/Year_2038_problem" target="_blank"><span class="q">"Year 2038 problem"</span> or <span class="q">"Y2K38 problem"</span></a>). + This change affects the <code class="ph codeph">from_unixtime()</code> and <code class="ph codeph">unix_timestamp()</code> functions. + You might need to change application code that interacts with these functions, change the types of + columns that store the return values, or add <code class="ph codeph">CAST()</code> calls to SQL statements that + call these functions. 
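+    </p>
+
+    <p class="p">
+      The following queries sketch the integer-to-<code class="ph codeph">TIMESTAMP</code> conversion described
+      earlier in this section, assuming the default behavior where such values are interpreted relative to UTC:
+    </p>
+
+<pre class="pre codeblock"><code>select cast(0 as timestamp);   -- Returns 1970-01-01 00:00:00, the start of the epoch.
+select cast(60 as timestamp);  -- Returns 1970-01-01 00:01:00, 60 seconds later.
+select unix_timestamp('1970-01-01 00:01:00');  -- Returns 60 as a BIGINT value.
+</code></pre>
+
+    <p class="p">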
+ </p> + + <p class="p"> + <strong class="ph b">Partitioning:</strong> + </p> + + <p class="p"> + Although you cannot use a <code class="ph codeph">TIMESTAMP</code> column as a partition key, you can extract the + individual years, months, days, hours, and so on and partition based on those columns. Because the partition + key column values are represented in HDFS directory names, rather than as fields in the data files + themselves, you can also keep the original <code class="ph codeph">TIMESTAMP</code> values if desired, without duplicating + data or wasting storage space. See <a class="xref" href="impala_partitioning.html#partition_key_columns">Partition Key Columns</a> for more + details on partitioning with date and time values. + </p> + +<pre class="pre codeblock"><code>[localhost:21000] > create table timeline (event string) partitioned by (happened timestamp); +ERROR: AnalysisException: Type 'TIMESTAMP' is not supported as partition-column type in column: happened +</code></pre> + + <p class="p"> + <strong class="ph b">Examples:</strong> + </p> + +<pre class="pre codeblock"><code>select cast('1966-07-30' as timestamp); +select cast('1985-09-25 17:45:30.005' as timestamp); +select cast('08:30:00' as timestamp); +select hour('1970-01-01 15:30:00'); -- Succeeds, returns 15. +select hour('1970-01-01 15:30'); -- Returns NULL because seconds field required. +select hour('1970-01-01 27:30:00'); -- Returns NULL because hour value out of range. +select dayofweek('2004-06-13'); -- Returns 1, representing Sunday. +select dayname('2004-06-13'); -- Returns 'Sunday'. +select date_add('2004-06-13', 365); -- Returns 2005-06-13 with zeros for hh:mm:ss fields. +select day('2004-06-13'); -- Returns 13. +select datediff('1989-12-31','1984-09-01'); -- How many days between these 2 dates? +select now(); -- Returns current date and time in local timezone. 
+ +create table dates_and_times (t timestamp); +insert into dates_and_times values + ('1966-07-30'), ('1985-09-25 17:45:30.005'), ('08:30:00'), (now()); +</code></pre> + + <p class="p"> + <strong class="ph b">NULL considerations:</strong> Casting any unrecognized <code class="ph codeph">STRING</code> value to this type produces a + <code class="ph codeph">NULL</code> value. + </p> + + <p class="p"> + <strong class="ph b">Partitioning:</strong> Because this type potentially has so many distinct values, it is often not a sensible + choice for a partition key column. For example, events 1 millisecond apart would be stored in different + partitions. Consider using the <code class="ph codeph">TRUNC()</code> function to condense the number of distinct values, + and partition on a new column with the truncated values. + </p> + + <p class="p"> + <strong class="ph b">HBase considerations:</strong> This data type is fully compatible with HBase tables. + </p> + + <p class="p"> + <strong class="ph b">Parquet considerations:</strong> This type is fully compatible with Parquet tables. + </p> + + <p class="p"> + <strong class="ph b">Text table considerations:</strong> Values of this type are potentially larger in text tables than in tables + using Parquet or other binary formats. + </p> + + + + <p class="p"> + <strong class="ph b">Internal details:</strong> Represented in memory as a 16-byte value. + </p> + + <p class="p"> + <strong class="ph b">Added in:</strong> Available in all versions of Impala. + </p> + + <p class="p"> + <strong class="ph b">Column statistics considerations:</strong> Because this type has a fixed size, the maximum and average size + fields are always filled in for column statistics, even before you run the <code class="ph codeph">COMPUTE STATS</code> + statement. 
+    </p>
+
+    <p class="p">
+      <strong class="ph b">Sqoop considerations:</strong>
+    </p>
+
+    <p class="p"> If you use Sqoop to
+      convert RDBMS data to Parquet, be careful when interpreting any
+      resulting values from <code class="ph codeph">DATE</code>, <code class="ph codeph">DATETIME</code>,
+      or <code class="ph codeph">TIMESTAMP</code> columns. The underlying values are
+      represented as the Parquet <code class="ph codeph">INT64</code> type, which appears
+      as <code class="ph codeph">BIGINT</code> in the Impala table. The Parquet
+      values represent the time in milliseconds, while Impala interprets
+      <code class="ph codeph">BIGINT</code> as the time in seconds. Therefore, if you have
+      a <code class="ph codeph">BIGINT</code> column in a Parquet table that was imported
+      this way from Sqoop, divide the values by 1000 when interpreting as the
+      <code class="ph codeph">TIMESTAMP</code> type.</p>
+
+    <p class="p">
+      <strong class="ph b">Restrictions:</strong>
+    </p>
+
+    <p class="p">
+      If you cast a <code class="ph codeph">STRING</code> with an unrecognized format to a <code class="ph codeph">TIMESTAMP</code>, the result
+      is <code class="ph codeph">NULL</code> rather than an error. Test your data pipeline to be sure any textual
+      date and time values are in a format that Impala <code class="ph codeph">TIMESTAMP</code> can recognize.
+    </p>
+
+    <p class="p">
+      Currently, Avro tables cannot contain <code class="ph codeph">TIMESTAMP</code> columns. If you need to store date and
+      time values in Avro tables, as a workaround you can use a <code class="ph codeph">STRING</code> representation of the
+      values, convert the values to <code class="ph codeph">BIGINT</code> with the <code class="ph codeph">UNIX_TIMESTAMP()</code> function,
+      or create separate numeric columns for individual date and time fields using the <code class="ph codeph">EXTRACT()</code>
+      function. 
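+    </p>
+
+    <p class="p">
+      The <code class="ph codeph">STRING</code> workaround can be sketched as follows; the table and column names
+      here are hypothetical:
+    </p>
+
+<pre class="pre codeblock"><code>-- Hypothetical Avro table that stores timestamps as strings.
+create table avro_events (event_name string, happened string) stored as avro;
+-- Convert the STRING representation on the fly during queries.
+select event_name, cast(happened as timestamp) from avro_events;
+select event_name, unix_timestamp(happened) from avro_events;
+</code></pre>
+
+    <p class="p">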
+ </p> + + <p class="p"> + <strong class="ph b">Kudu considerations:</strong> + </p> + <p class="p"> + Currently, the data types <code class="ph codeph">DECIMAL</code>, <code class="ph codeph">TIMESTAMP</code>, <code class="ph codeph">CHAR</code>, <code class="ph codeph">VARCHAR</code>, + <code class="ph codeph">ARRAY</code>, <code class="ph codeph">MAP</code>, and <code class="ph codeph">STRUCT</code> cannot be used with Kudu tables. + </p> + + <p class="p"> + <strong class="ph b">Related information:</strong> + </p> + + <ul class="ul"> + <li class="li"> + + <a class="xref" href="impala_literals.html#timestamp_literals">Timestamp Literals</a>. + </li> + + <li class="li"> + To convert to or from different date formats, or perform date arithmetic, use the date and time functions + described in <a class="xref" href="impala_datetime_functions.html#datetime_functions">Impala Date and Time Functions</a>. In particular, the + <code class="ph codeph">from_unixtime()</code> function requires a case-sensitive format string such as + <code class="ph codeph">"yyyy-MM-dd HH:mm:ss.SSSS"</code>, matching one of the allowed variations of a + <code class="ph codeph">TIMESTAMP</code> value (date plus time, only date, only time, optional fractional seconds). + </li> + + <li class="li"> + See <a class="xref" href="impala_langref_unsupported.html#langref_hiveql_delta">SQL Differences Between Impala and Hive</a> for details about differences in + <code class="ph codeph">TIMESTAMP</code> handling between Impala and Hive. 
+ </li> + </ul> + + </div> + +<nav role="navigation" class="related-links"><div class="familylinks"><div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_datatypes.html">Data Types</a></div></div></nav></article></main></body></html> \ No newline at end of file http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/75c46918/docs/build/html/topics/impala_tinyint.html ---------------------------------------------------------------------- diff --git a/docs/build/html/topics/impala_tinyint.html b/docs/build/html/topics/impala_tinyint.html new file mode 100644 index 0000000..9efc098 --- /dev/null +++ b/docs/build/html/topics/impala_tinyint.html @@ -0,0 +1,131 @@ +<!DOCTYPE html + SYSTEM "about:legacy-compat"> +<html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="UTF-8"><meta name="copyright" content="(C) Copyright 2017"><meta name="DC.rights.owner" content="(C) Copyright 2017"><meta name="DC.Type" content="concept"><meta name="DC.Relation" scheme="URI" content="../topics/impala_datatypes.html"><meta name="prodname" content="Impala"><meta name="prodname" content="Impala"><meta name="version" content="Impala 2.8.x"><meta name="version" content="Impala 2.8.x"><meta name="DC.Format" content="XHTML"><meta name="DC.Identifier" content="tinyint"><link rel="stylesheet" type="text/css" href="../commonltr.css"><title>TINYINT Data Type</title></head><body id="tinyint"><main role="main"><article role="article" aria-labelledby="ariaid-title1"> + + <h1 class="title topictitle1" id="ariaid-title1">TINYINT Data Type</h1> + + + + <div class="body conbody"> + + <p class="p"> + A 1-byte integer data type used in <code class="ph codeph">CREATE TABLE</code> and <code class="ph codeph">ALTER TABLE</code> statements. 
+    </p>
+
+    <p class="p">
+      <strong class="ph b">Syntax:</strong>
+    </p>
+
+    <p class="p">
+      In the column definition of a <code class="ph codeph">CREATE TABLE</code> statement:
+    </p>
+
+<pre class="pre codeblock"><code><var class="keyword varname">column_name</var> TINYINT</code></pre>
+
+    <p class="p">
+      <strong class="ph b">Range:</strong> -128 .. 127. There is no <code class="ph codeph">UNSIGNED</code> subtype.
+    </p>
+
+    <p class="p">
+      <strong class="ph b">Conversions:</strong> Impala automatically converts to a larger integer type (<code class="ph codeph">SMALLINT</code>,
+      <code class="ph codeph">INT</code>, or <code class="ph codeph">BIGINT</code>) or a floating-point type (<code class="ph codeph">FLOAT</code> or
+      <code class="ph codeph">DOUBLE</code>). Use <code class="ph codeph">CAST()</code> to convert to <code class="ph codeph">STRING</code> or
+      <code class="ph codeph">TIMESTAMP</code>.
+      <span class="ph">Casting an integer or floating-point value <code class="ph codeph">N</code> to
+      <code class="ph codeph">TIMESTAMP</code> produces a value that is <code class="ph codeph">N</code> seconds past the start of the epoch
+      date (January 1, 1970). By default, the result value represents a date and time in the UTC time zone.
+      If the setting <code class="ph codeph">-use_local_tz_for_unix_timestamp_conversions=true</code> is in effect,
+      the resulting <code class="ph codeph">TIMESTAMP</code> represents a date and time in the local time zone.</span>
+    </p>
+
+    <p class="p">
+      Impala does not return column overflows as <code class="ph codeph">NULL</code>, so that you can distinguish
+      <code class="ph codeph">NULL</code> data from overflow conditions, as you can with traditional
+      database systems. Impala returns the largest or smallest value in the range for the type. For example,
+      valid values for a <code class="ph codeph">tinyint</code> range from -128 to 127. 
In Impala, a <code class="ph codeph">tinyint</code> + with a value of -200 returns -128 rather than <code class="ph codeph">NULL</code>. A <code class="ph codeph">tinyint</code> with a + value of 200 returns 127. + </p> + + <p class="p"> + <strong class="ph b">Usage notes:</strong> + </p> + + <p class="p"> + For a convenient and automated way to check the bounds of the <code class="ph codeph">TINYINT</code> type, call the + functions <code class="ph codeph">MIN_TINYINT()</code> and <code class="ph codeph">MAX_TINYINT()</code>. + </p> + + <p class="p"> + If an integer value is too large to be represented as a <code class="ph codeph">TINYINT</code>, use a + <code class="ph codeph">SMALLINT</code> instead. + </p> + + <p class="p"> + <strong class="ph b">NULL considerations:</strong> Casting any non-numeric value to this type produces a <code class="ph codeph">NULL</code> + value. + </p> + + <p class="p"> + <strong class="ph b">Examples:</strong> + </p> + +<pre class="pre codeblock"><code>CREATE TABLE t1 (x TINYINT); +SELECT CAST(100 AS TINYINT); +</code></pre> + + <p class="p"> + <strong class="ph b">Parquet considerations:</strong> + </p> + + + + <p class="p"> + Physically, Parquet files represent <code class="ph codeph">TINYINT</code> and <code class="ph codeph">SMALLINT</code> values as 32-bit + integers. Although Impala rejects attempts to insert out-of-range values into such columns, if you create a + new table with the <code class="ph codeph">CREATE TABLE ... LIKE PARQUET</code> syntax, any <code class="ph codeph">TINYINT</code> or + <code class="ph codeph">SMALLINT</code> columns in the original table turn into <code class="ph codeph">INT</code> columns in the new + table. + </p> + + + + <p class="p"> + <strong class="ph b">HBase considerations:</strong> This data type is fully compatible with HBase tables. 
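+    </p>
+
+    <p class="p">
+      As a quick check of the documented range, you can call the <code class="ph codeph">MIN_TINYINT()</code> and
+      <code class="ph codeph">MAX_TINYINT()</code> functions described in the usage notes above:
+    </p>
+
+<pre class="pre codeblock"><code>select min_tinyint(), max_tinyint();  -- Returns -128 and 127.
+</code></pre>
+
+    <p class="p">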
+ </p> + + <p class="p"> + <strong class="ph b">Text table considerations:</strong> Values of this type are potentially larger in text tables than in tables + using Parquet or other binary formats. + </p> + + + + <p class="p"> + <strong class="ph b">Internal details:</strong> Represented in memory as a 1-byte value. + </p> + + <p class="p"> + <strong class="ph b">Added in:</strong> Available in all versions of Impala. + </p> + + <p class="p"> + <strong class="ph b">Column statistics considerations:</strong> Because this type has a fixed size, the maximum and average size + fields are always filled in for column statistics, even before you run the <code class="ph codeph">COMPUTE STATS</code> + statement. + </p> + + + + <p class="p"> + <strong class="ph b">Related information:</strong> + </p> + + <p class="p"> + <a class="xref" href="impala_literals.html#numeric_literals">Numeric Literals</a>, <a class="xref" href="impala_tinyint.html#tinyint">TINYINT Data Type</a>, + <a class="xref" href="impala_smallint.html#smallint">SMALLINT Data Type</a>, <a class="xref" href="impala_int.html#int">INT Data Type</a>, + <a class="xref" href="impala_bigint.html#bigint">BIGINT Data Type</a>, <a class="xref" href="impala_decimal.html#decimal">DECIMAL Data Type (Impala 1.4 or higher only)</a>, + <a class="xref" href="impala_math_functions.html#math_functions">Impala Mathematical Functions</a> + </p> + </div> +<nav role="navigation" class="related-links"><div class="familylinks"><div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_datatypes.html">Data Types</a></div></div></nav></article></main></body></html> \ No newline at end of file
