[08/22] incubator-impala git commit: First try at porting over the source files necessary for the Impala SQL Reference.

jrussell Tue, 26 Jul 2016 16:06:13 -0700

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/463ddf92/docs/topics/impala_max.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_max.xml b/docs/topics/impala_max.xml
new file mode 100644
index 0000000..b989785
--- /dev/null
+++ b/docs/topics/impala_max.xml
@@ -0,0 +1,192 @@
+<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE concept PUBLIC "-//OASIS//DTD 
DITA Concept//EN" "concept.dtd">
+<concept id="max">
+
+  <title>MAX Function</title>
+  <titlealts><navtitle>MAX</navtitle></titlealts>
+  <prolog>
+    <metadata>
+      <data name="Category" value="Impala"/>
+      <data name="Category" value="SQL"/>
+      <data name="Category" value="Impala Functions"/>
+      <data name="Category" value="Analytic Functions"/>
+      <data name="Category" value="Aggregate Functions"/>
+      <data name="Category" value="Querying"/>
+    </metadata>
+  </prolog>
+
+  <conbody>
+
+    <p>
+      <indexterm audience="Cloudera">max() function</indexterm>
+      An aggregate function that returns the maximum value from a set of 
numbers. Opposite of the
+      <codeph>MIN</codeph> function. Its single argument can be numeric 
column, or the numeric result of a function
+      or expression applied to the column value. Rows with a 
<codeph>NULL</codeph> value for the specified column
+      are ignored. If the table is empty, or all the values supplied to 
<codeph>MAX</codeph> are
+      <codeph>NULL</codeph>, <codeph>MAX</codeph> returns 
<codeph>NULL</codeph>.
+    </p>
+
+    <p conref="../shared/impala_common.xml#common/syntax_blurb"/>
+
+<codeblock>MAX([DISTINCT | ALL] <varname>expression</varname>) [OVER 
(<varname>analytic_clause</varname>)]</codeblock>
+
+    <p>
+      When the query contains a <codeph>GROUP BY</codeph> clause, returns one 
value for each combination of
+      grouping values.
+    </p>
+
+    <p 
conref="../shared/impala_common.xml#common/restrictions_sliding_window"/>
+
+    <p 
conref="../shared/impala_common.xml#common/return_type_same_except_string"/>
+
+    <p conref="../shared/impala_common.xml#common/complex_types_blurb"/>
+
+    <p 
conref="../shared/impala_common.xml#common/complex_types_aggregation_explanation"/>
+    
+    <p 
conref="../shared/impala_common.xml#common/complex_types_aggregation_example"/>
+
+    <p conref="../shared/impala_common.xml#common/example_blurb"/>
+
+<codeblock>-- Find the largest value for this column in the table.
+select max(c1) from t1;
+-- Find the largest value for this column from a subset of the table.
+select max(c1) from t1 where month = 'January' and year = '2013';
+-- Find the largest value from a set of numeric function results.
+select max(length(s)) from t1;
+-- Can also be used in combination with DISTINCT and/or GROUP BY.
+-- Return more than one result.
+select month, year, max(purchase_price) from store_stats group by month, year;
+-- Filter the input to eliminate duplicates before performing the calculation.
+select max(distinct x) from t1;
+</codeblock>
+
+    <p rev="2.0.0">
+      The following examples show how to use <codeph>MAX()</codeph> in an 
analytic context. They use a table
+      containing integers from 1 to 10. Notice how the <codeph>MAX()</codeph> 
is reported for each input value, as
+      opposed to the <codeph>GROUP BY</codeph> clause which condenses the 
result set.
+<codeblock>select x, property, max(x) over (partition by property) as max from 
int_t where property in ('odd','even');
++----+----------+-----+
+| x  | property | max |
++----+----------+-----+
+| 2  | even     | 10  |
+| 4  | even     | 10  |
+| 6  | even     | 10  |
+| 8  | even     | 10  |
+| 10 | even     | 10  |
+| 1  | odd      | 9   |
+| 3  | odd      | 9   |
+| 5  | odd      | 9   |
+| 7  | odd      | 9   |
+| 9  | odd      | 9   |
++----+----------+-----+
+</codeblock>
+
+Adding an <codeph>ORDER BY</codeph> clause lets you experiment with results 
that are cumulative or apply to a moving
+set of rows (the <q>window</q>). The following examples use 
<codeph>MAX()</codeph> in an analytic context
+(that is, with an <codeph>OVER()</codeph> clause) to display the smallest 
value of <codeph>X</codeph>
+encountered up to each row in the result set. The examples use two columns in 
the <codeph>ORDER BY</codeph>
+clause to produce a sequence of values that rises and falls, to illustrate how 
the <codeph>MAX()</codeph>
+result only increases or stays the same throughout each partition within the 
result set.
+The basic <codeph>ORDER BY x</codeph> clause implicitly
+activates a window clause of <codeph>RANGE BETWEEN UNBOUNDED PRECEDING AND 
CURRENT ROW</codeph>,
+which is effectively the same as <codeph>ROWS BETWEEN UNBOUNDED PRECEDING AND 
CURRENT ROW</codeph>,
+therefore all of these examples produce the same results:
+
+<codeblock>select x, property,
+  max(x) <b>over (order by property, x desc)</b> as 'maximum to this point'
+from int_t where property in ('prime','square');
++---+----------+-----------------------+
+| x | property | maximum to this point |
++---+----------+-----------------------+
+| 7 | prime    | 7                     |
+| 5 | prime    | 7                     |
+| 3 | prime    | 7                     |
+| 2 | prime    | 7                     |
+| 9 | square   | 9                     |
+| 4 | square   | 9                     |
+| 1 | square   | 9                     |
++---+----------+-----------------------+
+
+select x, property,
+  max(x) over
+  (
+    <b>order by property, x desc</b>
+    <b>rows between unbounded preceding and current row</b>
+  ) as 'maximum to this point'        
+from int_t where property in ('prime','square');
++---+----------+-----------------------+
+| x | property | maximum to this point |
++---+----------+-----------------------+
+| 7 | prime    | 7                     |
+| 5 | prime    | 7                     |
+| 3 | prime    | 7                     |
+| 2 | prime    | 7                     |
+| 9 | square   | 9                     |
+| 4 | square   | 9                     |
+| 1 | square   | 9                     |
++---+----------+-----------------------+
+
+select x, property,
+  max(x) over
+  (
+    <b>order by property, x desc</b>
+    <b>range between unbounded preceding and current row</b>
+  ) as 'maximum to this point'        
+from int_t where property in ('prime','square');
++---+----------+-----------------------+
+| x | property | maximum to this point |
++---+----------+-----------------------+
+| 7 | prime    | 7                     |
+| 5 | prime    | 7                     |
+| 3 | prime    | 7                     |
+| 2 | prime    | 7                     |
+| 9 | square   | 9                     |
+| 4 | square   | 9                     |
+| 1 | square   | 9                     |
++---+----------+-----------------------+
+</codeblock>
+
+The following examples show how to construct a moving window, with a running 
maximum taking into account all rows before
+and 1 row after the current row.
+Because of a restriction in the Impala <codeph>RANGE</codeph> syntax, this 
type of
+moving window is possible with the <codeph>ROWS BETWEEN</codeph> clause but 
not the <codeph>RANGE BETWEEN</codeph> clause.
+Because of an extra Impala restriction on the <codeph>MAX()</codeph> and 
<codeph>MIN()</codeph> functions in an
+analytic context, the lower bound must be <codeph>UNBOUNDED PRECEDING</codeph>.
+<codeblock>select x, property,
+  max(x) over
+  (
+    <b>order by property, x</b>
+    <b>rows between unbounded preceding and 1 following</b>
+  ) as 'local maximum'             
+from int_t where property in ('prime','square');
++---+----------+---------------+
+| x | property | local maximum |
++---+----------+---------------+
+| 2 | prime    | 3             |
+| 3 | prime    | 5             |
+| 5 | prime    | 7             |
+| 7 | prime    | 7             |
+| 1 | square   | 7             |
+| 4 | square   | 9             |
+| 9 | square   | 9             |
++---+----------+---------------+
+
+-- Doesn't work because of syntax restriction on RANGE clause.
+select x, property,
+  max(x) over
+  (
+    <b>order by property, x</b>
+    <b>range between unbounded preceding and 1 following</b>
+  ) as 'local maximum'
+from int_t where property in ('prime','square');
+ERROR: AnalysisException: RANGE is only supported with both the lower and 
upper bounds UNBOUNDED or one UNBOUNDED and the other CURRENT ROW.
+</codeblock>
+    </p>
+
+    <p conref="../shared/impala_common.xml#common/related_info"/>
+
+    <p>
+      <xref href="impala_analytic_functions.xml#analytic_functions"/>, <xref 
href="impala_min.xml#min"/>,
+      <xref href="impala_avg.xml#avg"/>
+    </p>
+  </conbody>
+</concept>


http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/463ddf92/docs/topics/impala_max_errors.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_max_errors.xml 
b/docs/topics/impala_max_errors.xml
new file mode 100644
index 0000000..86f3618
--- /dev/null
+++ b/docs/topics/impala_max_errors.xml
@@ -0,0 +1,44 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<concept id="max_errors">
+
+  <title>MAX_ERRORS Query Option</title>
+  <prolog>
+    <metadata>
+      <data name="Category" value="Impala"/>
+      <data name="Category" value="Impala Query Options"/>
+      <data name="Category" value="Troubleshooting"/>
+      <data name="Category" value="Logs"/>
+    </metadata>
+  </prolog>
+
+  <conbody>
+
+    <p>
+      <indexterm audience="Cloudera">MAX_ERRORS query option</indexterm>
+      Maximum number of non-fatal errors for any particular query that are 
recorded in the Impala log file. For
+      example, if a billion-row table had a non-fatal data error in every row, 
you could diagnose the problem
+      without all billion errors being logged. Unspecified or 0 indicates the 
built-in default value of 1000.
+    </p>
+
+    <p>
+      This option only controls how many errors are reported. To specify 
whether Impala continues or halts when it
+      encounters such errors, use the <codeph>ABORT_ON_ERROR</codeph> option.
+    </p>
+
+    <p>
+      <b>Type:</b> numeric
+    </p>
+
+    <p>
+      <b>Default:</b> 0 (meaning 1000 errors)
+    </p>
+
+    <p conref="../shared/impala_common.xml#common/related_info"/>
+    <p>
+      <xref href="impala_abort_on_error.xml#abort_on_error"/>,
+      <xref href="impala_logging.xml#logging"/>
+    </p>
+
+  </conbody>
+</concept>

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/463ddf92/docs/topics/impala_max_io_buffers.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_max_io_buffers.xml 
b/docs/topics/impala_max_io_buffers.xml
new file mode 100644
index 0000000..b08c57e
--- /dev/null
+++ b/docs/topics/impala_max_io_buffers.xml
@@ -0,0 +1,28 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<concept id="max_io_buffers">
+
+  <title>MAX_IO_BUFFERS Query Option</title>
+  <prolog>
+    <metadata>
+      <data name="Category" value="Impala"/>
+      <data name="Category" value="Impala Query Options"/>
+      <data name="Category" value="Deprecated Features"/>
+    </metadata>
+  </prolog>
+
+  <conbody>
+
+    <p>
+      Deprecated query option. Currently has no effect.
+    </p>
+
+    <p>
+      <b>Type:</b> numeric
+    </p>
+
+    <p>
+      <b>Default:</b> 0
+    </p>
+  </conbody>
+</concept>

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/463ddf92/docs/topics/impala_max_scan_range_length.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_max_scan_range_length.xml 
b/docs/topics/impala_max_scan_range_length.xml
new file mode 100644
index 0000000..a790fc7
--- /dev/null
+++ b/docs/topics/impala_max_scan_range_length.xml
@@ -0,0 +1,45 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<concept id="max_scan_range_length">
+
+  <title>MAX_SCAN_RANGE_LENGTH Query Option</title>
+  <prolog>
+    <metadata>
+      <data name="Category" value="Impala"/>
+      <data name="Category" value="Impala Query Options"/>
+    </metadata>
+  </prolog>
+
+  <conbody>
+
+    <p>
+      <indexterm audience="Cloudera">MAX_SCAN_RANGE_LENGTH query 
option</indexterm>
+      Maximum length of the scan range. Interacts with the number of HDFS 
blocks in the table to determine how many
+      CPU cores across the cluster are involved with the processing for a 
query. (Each core processes one scan
+      range.)
+    </p>
+
+    <p>
+      Lowering the value can sometimes increase parallelism if you have unused 
CPU capacity, but a too-small value
+      can limit query performance because each scan range involves extra 
overhead.
+    </p>
+
+    <p>
+      Only applicable to HDFS tables. Has no effect on Parquet tables. 
Unspecified or 0 indicates backend default,
+      which is the same as the HDFS block size for each table.
+    </p>
+
+    <p>
+      Although the scan range can be arbitrarily long, Impala internally uses 
an 8 MB read buffer so that it can
+      query tables with huge block sizes without allocating equivalent blocks 
of memory.
+    </p>
+
+    <p>
+      <b>Type:</b> numeric
+    </p>
+
+    <p>
+      <b>Default:</b> 0
+    </p>
+  </conbody>
+</concept>

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/463ddf92/docs/topics/impala_mem_limit.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_mem_limit.xml b/docs/topics/impala_mem_limit.xml
new file mode 100644
index 0000000..fd12953
--- /dev/null
+++ b/docs/topics/impala_mem_limit.xml
@@ -0,0 +1,208 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<concept id="mem_limit">
+
+  <title>MEM_LIMIT Query Option</title>
+  <prolog>
+    <metadata>
+      <data name="Category" value="Impala"/>
+      <data name="Category" value="Impala Query Options"/>
+      <data name="Category" value="Scalability"/>
+      <data name="Category" value="Memory"/>
+    </metadata>
+  </prolog>
+
+  <conbody>
+
+    <p>
+      <indexterm audience="Cloudera">MEM_LIMIT query option</indexterm>
+      When resource management is not enabled, defines the maximum amount of 
memory a query can allocate on each node.
+      Therefore, the total memory that can be used by a query is the 
<codeph>MEM_LIMIT</codeph> times the number of nodes.
+    </p>
+
+    <p rev="CDH-32135">
+      There are two levels of memory limit for Impala.
+      The <codeph>-mem_limit</codeph> startup option sets an overall limit for 
the <cmdname>impalad</cmdname> process
+      (which handles multiple queries concurrently).
+      That limit is typically expressed in terms of a percentage of the RAM 
available on the host, such as <codeph>-mem_limit=70%</codeph>.
+      The <codeph>MEM_LIMIT</codeph> query option, which you set through 
<cmdname>impala-shell</cmdname>
+      or the <codeph>SET</codeph> statement in a JDBC or ODBC application, 
applies to each individual query.
+      The <codeph>MEM_LIMIT</codeph> query option is usually expressed as a 
fixed size such as <codeph>10gb</codeph>,
+      and must always be less than the <cmdname>impalad</cmdname> memory limit.
+    </p>
+
+    <p rev="CDH-32135">
+      If query processing exceeds the specified memory limit on any node, 
either the per-query limit or the
+      <cmdname>impalad</cmdname> limit, Impala cancels the query automatically.
+      Memory limits are checked periodically during query processing, so the 
actual memory in use
+      might briefly exceed the limit without the query being cancelled.
+    </p>
+
+    <p>
+      When resource management is enabled in CDH 5, the mechanism for this 
option changes. If set, it overrides the
+      automatic memory estimate from Impala. Impala requests this amount of 
memory from YARN on each node, and the
+      query does not proceed until that much memory is available. The actual 
memory used by the query could be
+      lower, since some queries use much less memory than others. With 
resource management, the
+      <codeph>MEM_LIMIT</codeph> setting acts both as a hard limit on the 
amount of memory a query can use on any
+      node (enforced by YARN) and a guarantee that that much memory will be 
available on each node while the query
+      is being executed. When resource management is enabled but no 
<codeph>MEM_LIMIT</codeph> setting is
+      specified, Impala estimates the amount of memory needed on each node for 
each query, requests that much
+      memory from YARN before starting the query, and then internally sets the 
<codeph>MEM_LIMIT</codeph> on each
+      node to the requested amount of memory during the query. Thus, if the 
query takes more memory than was
+      originally estimated, Impala detects that the <codeph>MEM_LIMIT</codeph> 
is exceeded and cancels the query
+      itself.
+    </p>
+
+    <p>
+      <b>Type:</b> numeric
+    </p>
+
+    <p rev="CDH-32135">
+      <b>Units:</b> A numeric argument represents memory size in bytes; you 
can also use a suffix of <codeph>m</codeph> or <codeph>mb</codeph>
+      for megabytes, or more commonly <codeph>g</codeph> or 
<codeph>gb</codeph> for gigabytes. If you specify a value with unrecognized
+      formats, subsequent queries fail with an error.
+    </p>
+
+    <p rev="CDH-32135">
+      <b>Default:</b> 0 (unlimited)
+    </p>
+
+    <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/>
+
+    <p rev="CDH-32135">
+      The <codeph>MEM_LIMIT</codeph> setting is primarily useful in a 
high-concurrency setting,
+      or on a cluster with a workload shared between Impala and other data 
processing components.
+      You can prevent any query from accidentally using much more memory than 
expected,
+      which could negatively impact other Impala queries.
+    </p>
+
+    <p rev="CDH-32135">
+      Use the output of the <codeph>SUMMARY</codeph> command in 
<cmdname>impala-shell</cmdname>
+      to get a report of memory used for each phase of your most heavyweight 
queries on each node, 
+      and then set a <codeph>MEM_LIMIT</codeph> somewhat higher than that.
+      See <xref href="impala_explain_plan.xml#perf_summary"/> for usage 
information about
+      the <codeph>SUMMARY</codeph> command.
+    </p>
+
+    <p conref="../shared/impala_common.xml#common/example_blurb" 
rev="CDH-32135"/>
+
+    <p rev="CDH-32135">
+      The following examples show how to set the <codeph>MEM_LIMIT</codeph> 
query option
+      using a fixed number of bytes, or suffixes representing gigabytes or 
megabytes.
+    </p>
+
+<codeblock rev="CDH-32135">
+[localhost:21000] > set mem_limit=3000000000;
+MEM_LIMIT set to 3000000000
+[localhost:21000] > select 5;
+Query: select 5
++---+
+| 5 |
++---+
+| 5 |
++---+
+
+[localhost:21000] > set mem_limit=3g;
+MEM_LIMIT set to 3g
+[localhost:21000] > select 5;
+Query: select 5
++---+
+| 5 |
++---+
+| 5 |
++---+
+
+[localhost:21000] > set mem_limit=3gb;
+MEM_LIMIT set to 3gb
+[localhost:21000] > select 5;
++---+
+| 5 |
++---+
+| 5 |
++---+
+
+[localhost:21000] > set mem_limit=3m;
+MEM_LIMIT set to 3m
+[localhost:21000] > select 5;
++---+
+| 5 |
++---+
+| 5 |
++---+
+[localhost:21000] > set mem_limit=3mb;
+MEM_LIMIT set to 3mb
+[nightly55-2.vpc.cloudera.com:21000] > select 5;
++---+
+| 5 |
++---+
+</codeblock>
+
+    <p rev="CDH-32135">
+      The following examples show how unrecognized <codeph>MEM_LIMIT</codeph>
+      values lead to errors for subsequent queries.
+    </p>
+
+<codeblock rev="CDH-32135">
+[localhost:21000] > set mem_limit=3tb;
+MEM_LIMIT set to 3tb
+[localhost:21000] > select 5;
+ERROR: Failed to parse query memory limit from '3tb'.
+
+[localhost:21000] > set mem_limit=xyz;
+MEM_LIMIT set to xyz
+[localhost:21000] > select 5;
+Query: select 5
+ERROR: Failed to parse query memory limit from 'xyz'.
+</codeblock>
+
+    <p rev="CDH-32135">
+      The following examples shows the automatic query cancellation
+      when the <codeph>MEM_LIMIT</codeph> value is exceeded
+      on any host involved in the Impala query. First it runs a
+      successful query and checks the largest amount of memory
+      used on any node for any stage of the query.
+      Then it sets an artificially low <codeph>MEM_LIMIT</codeph>
+      setting so that the same query cannot run.
+    </p>
+
+<codeblock rev="CDH-32135">
+[localhost:21000] > select count(*) from customer;
+Query: select count(*) from customer
++----------+
+| count(*) |
++----------+
+| 150000   |
++----------+
+
+[localhost:21000] > select count(distinct c_name) from customer;
+Query: select count(distinct c_name) from customer
++------------------------+
+| count(distinct c_name) |
++------------------------+
+| 150000                 |
++------------------------+
+
+[localhost:21000] > summary;
++--------------+--------+----------+----------+---------+------------+----------+---------------+---------------+
+| Operator     | #Hosts | Avg Time | Max Time | #Rows   | Est. #Rows | Peak 
Mem | Est. Peak Mem | Detail        |
++--------------+--------+----------+----------+---------+------------+----------+---------------+---------------+
+| 06:AGGREGATE | 1      | 230.00ms | 230.00ms | 1       | 1          | 16.00 
KB | -1 B          | FINALIZE      |
+| 05:EXCHANGE  | 1      | 43.44us  | 43.44us  | 1       | 1          | 0 B     
 | -1 B          | UNPARTITIONED |
+| 02:AGGREGATE | 1      | 227.14ms | 227.14ms | 1       | 1          | 12.00 
KB | 10.00 MB      |               |
+| 04:AGGREGATE | 1      | 126.27ms | 126.27ms | 150.00K | 150.00K    | 15.17 
MB | 10.00 MB      |               |
+| 03:EXCHANGE  | 1      | 44.07ms  | 44.07ms  | 150.00K | 150.00K    | 0 B     
 | 0 B           | HASH(c_name)  |
+<b>| 01:AGGREGATE | 1      | 361.94ms | 361.94ms | 150.00K | 150.00K    | 
23.04 MB | 10.00 MB      |               |</b>
+| 00:SCAN HDFS | 1      | 43.64ms  | 43.64ms  | 150.00K | 150.00K    | 24.19 
MB | 64.00 MB      | tpch.customer |
++--------------+--------+----------+----------+---------+------------+----------+---------------+---------------+
+
+[localhost:21000] > set mem_limit=15mb;
+MEM_LIMIT set to 15mb
+[localhost:21000] > select count(distinct c_name) from customer;
+Query: select count(distinct c_name) from customer
+ERROR: 
+Memory limit exceeded
+Query did not have enough memory to get the minimum required buffers in the 
block manager.
+</codeblock>
+
+  </conbody>
+</concept>

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/463ddf92/docs/topics/impala_min.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_min.xml b/docs/topics/impala_min.xml
new file mode 100644
index 0000000..a63fc4c
--- /dev/null
+++ b/docs/topics/impala_min.xml
@@ -0,0 +1,191 @@
+<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE concept PUBLIC "-//OASIS//DTD 
DITA Concept//EN" "concept.dtd">
+<concept id="min">
+
+  <title>MIN Function</title>
+  <titlealts><navtitle>MIN</navtitle></titlealts>
+  <prolog>
+    <metadata>
+      <data name="Category" value="Impala"/>
+      <data name="Category" value="SQL"/>
+      <data name="Category" value="Impala Functions"/>
+      <data name="Category" value="Analytic Functions"/>
+      <data name="Category" value="Aggregate Functions"/>
+      <data name="Category" value="Querying"/>
+    </metadata>
+  </prolog>
+
+  <conbody>
+
+    <p>
+      <indexterm audience="Cloudera">min() function</indexterm>
+      An aggregate function that returns the minimum value from a set of 
numbers. Opposite of the
+      <codeph>MAX</codeph> function. Its single argument can be numeric 
column, or the numeric result of a function
+      or expression applied to the column value. Rows with a 
<codeph>NULL</codeph> value for the specified column
+      are ignored. If the table is empty, or all the values supplied to 
<codeph>MIN</codeph> are
+      <codeph>NULL</codeph>, <codeph>MIN</codeph> returns 
<codeph>NULL</codeph>.
+    </p>
+
+    <p conref="../shared/impala_common.xml#common/syntax_blurb"/>
+
+<codeblock>MIN([DISTINCT | ALL] <varname>expression</varname>) [OVER 
(<varname>analytic_clause</varname>)]</codeblock>
+
+    <p>
+      When the query contains a <codeph>GROUP BY</codeph> clause, returns one 
value for each combination of
+      grouping values.
+    </p>
+
+    <p 
conref="../shared/impala_common.xml#common/restrictions_sliding_window"/>
+
+    <p 
conref="../shared/impala_common.xml#common/return_type_same_except_string"/>
+
+    <p conref="../shared/impala_common.xml#common/complex_types_blurb"/>
+
+    <p 
conref="../shared/impala_common.xml#common/complex_types_aggregation_explanation"/>
+    
+    <p 
conref="../shared/impala_common.xml#common/complex_types_aggregation_example"/>
+
+    <p conref="../shared/impala_common.xml#common/example_blurb"/>
+
+<codeblock>-- Find the smallest value for this column in the table.
+select min(c1) from t1;
+-- Find the smallest value for this column from a subset of the table.
+select min(c1) from t1 where month = 'January' and year = '2013';
+-- Find the smallest value from a set of numeric function results.
+select min(length(s)) from t1;
+-- Can also be used in combination with DISTINCT and/or GROUP BY.
+-- Return more than one result.
+select month, year, min(purchase_price) from store_stats group by month, year;
+-- Filter the input to eliminate duplicates before performing the calculation.
+select min(distinct x) from t1;
+</codeblock>
+
+    <p rev="2.0.0">
+      The following examples show how to use <codeph>MIN()</codeph> in an 
analytic context. They use a table
+      containing integers from 1 to 10. Notice how the <codeph>MIN()</codeph> 
is reported for each input value, as
+      opposed to the <codeph>GROUP BY</codeph> clause which condenses the 
result set.
+<codeblock>select x, property, min(x) over (partition by property) as min from 
int_t where property in ('odd','even');
++----+----------+-----+
+| x  | property | min |
++----+----------+-----+
+| 2  | even     | 2   |
+| 4  | even     | 2   |
+| 6  | even     | 2   |
+| 8  | even     | 2   |
+| 10 | even     | 2   |
+| 1  | odd      | 1   |
+| 3  | odd      | 1   |
+| 5  | odd      | 1   |
+| 7  | odd      | 1   |
+| 9  | odd      | 1   |
++----+----------+-----+
+</codeblock>
+
+Adding an <codeph>ORDER BY</codeph> clause lets you experiment with results 
that are cumulative or apply to a moving
+set of rows (the <q>window</q>). The following examples use 
<codeph>MIN()</codeph> in an analytic context
+(that is, with an <codeph>OVER()</codeph> clause) to display the smallest 
value of <codeph>X</codeph>
+encountered up to each row in the result set. The examples use two columns in 
the <codeph>ORDER BY</codeph>
+clause to produce a sequence of values that rises and falls, to illustrate how 
the <codeph>MIN()</codeph>
+result only decreases or stays the same throughout each partition within the 
result set.
+The basic <codeph>ORDER BY x</codeph> clause implicitly
+activates a window clause of <codeph>RANGE BETWEEN UNBOUNDED PRECEDING AND 
CURRENT ROW</codeph>,
+which is effectively the same as <codeph>ROWS BETWEEN UNBOUNDED PRECEDING AND 
CURRENT ROW</codeph>,
+therefore all of these examples produce the same results:
+
+<codeblock>select x, property, min(x) <b>over (order by property, x desc)</b> 
as 'minimum to this point'
+  from int_t where property in ('prime','square');
++---+----------+-----------------------+
+| x | property | minimum to this point |
++---+----------+-----------------------+
+| 7 | prime    | 7                     |
+| 5 | prime    | 5                     |
+| 3 | prime    | 3                     |
+| 2 | prime    | 2                     |
+| 9 | square   | 2                     |
+| 4 | square   | 2                     |
+| 1 | square   | 1                     |
++---+----------+-----------------------+
+
+select x, property,
+  min(x) over
+  (
+    <b>order by property, x desc</b>
+    <b>range between unbounded preceding and current row</b>
+  ) as 'minimum to this point'
+from int_t where property in ('prime','square');
++---+----------+-----------------------+
+| x | property | minimum to this point |
++---+----------+-----------------------+
+| 7 | prime    | 7                     |
+| 5 | prime    | 5                     |
+| 3 | prime    | 3                     |
+| 2 | prime    | 2                     |
+| 9 | square   | 2                     |
+| 4 | square   | 2                     |
+| 1 | square   | 1                     |
++---+----------+-----------------------+
+
+select x, property,
+  min(x) over
+  (
+    <b>order by property, x desc</b>
+    <b>rows between unbounded preceding and current row</b>
+  ) as 'minimum to this point'
+from int_t where property in ('prime','square');
++---+----------+-----------------------+
+| x | property | minimum to this point |
++---+----------+-----------------------+
+| 7 | prime    | 7                     |
+| 5 | prime    | 5                     |
+| 3 | prime    | 3                     |
+| 2 | prime    | 2                     |
+| 9 | square   | 2                     |
+| 4 | square   | 2                     |
+| 1 | square   | 1                     |
++---+----------+-----------------------+
+</codeblock>
+
+The following examples show how to construct a moving window, with a running 
minimum taking into account all rows before
+and 1 row after the current row.
+Because of a restriction in the Impala <codeph>RANGE</codeph> syntax, this 
type of
+moving window is possible with the <codeph>ROWS BETWEEN</codeph> clause but 
not the <codeph>RANGE BETWEEN</codeph> clause.
+Because of an extra Impala restriction on the <codeph>MAX()</codeph> and 
<codeph>MIN()</codeph> functions in an
+analytic context, the lower bound must be <codeph>UNBOUNDED PRECEDING</codeph>.
+<codeblock>select x, property,
+  min(x) over
+  (
+    <b>order by property, x desc</b>
+    <b>rows between unbounded preceding and 1 following</b>
+  ) as 'local minimum'         
+from int_t where property in ('prime','square');
++---+----------+---------------+
+| x | property | local minimum |
++---+----------+---------------+
+| 7 | prime    | 5             |
+| 5 | prime    | 3             |
+| 3 | prime    | 2             |
+| 2 | prime    | 2             |
+| 9 | square   | 2             |
+| 4 | square   | 1             |
+| 1 | square   | 1             |
++---+----------+---------------+
+
+-- Doesn't work because of syntax restriction on RANGE clause.
+select x, property,
+  min(x) over
+  (
+    <b>order by property, x desc</b>
+    <b>range between unbounded preceding and 1 following</b>
+  ) as 'local minimum'
+from int_t where property in ('prime','square');
+ERROR: AnalysisException: RANGE is only supported with both the lower and 
upper bounds UNBOUNDED or one UNBOUNDED and the other CURRENT ROW.
+</codeblock>
+    </p>
+
+    <p conref="../shared/impala_common.xml#common/related_info"/>
+
+    <p>
+      <xref href="impala_analytic_functions.xml#analytic_functions"/>, <xref 
href="impala_max.xml#max"/>,
+      <xref href="impala_avg.xml#avg"/>
+    </p>
+  </conbody>
+</concept>

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/463ddf92/docs/topics/impala_misc_functions.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_misc_functions.xml 
b/docs/topics/impala_misc_functions.xml
new file mode 100644
index 0000000..bb9f062
--- /dev/null
+++ b/docs/topics/impala_misc_functions.xml
@@ -0,0 +1,148 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<concept id="misc_functions">
+
+  <title>Impala Miscellaneous Functions</title>
+  <titlealts><navtitle>Miscellaneous Functions</navtitle></titlealts>
+  <prolog>
+    <metadata>
+      <data name="Category" value="Impala"/>
+      <data name="Category" value="Impala Functions"/>
+      <data name="Category" value="SQL"/>
+      <data name="Category" value="Data Analysts"/>
+      <data name="Category" value="Developers"/>
+      <data name="Category" value="Querying"/>
+    </metadata>
+  </prolog>
+
+  <conbody>
+
+    <p>
+      Impala supports the following utility functions that do not operate on a 
particular column or data type:
+    </p>
+
+    <dl>
+      <dlentry rev="1.3.0" id="current_database">
+
+        <dt>
+          <codeph>current_database()</codeph>
+        </dt>
+
+        <dd>
+          <indexterm audience="Cloudera">current_database() 
function</indexterm>
+          <b>Purpose:</b> Returns the database that the session is currently 
using, either <codeph>default</codeph>
+          if no database has been selected, or whatever database the session 
switched to through a
+          <codeph>USE</codeph> statement or the 
<cmdname>impalad</cmdname><codeph>-d</codeph> option.
+          <p>
+            <b>Return type:</b> <codeph>string</codeph>
+          </p>
+        </dd>
+
+      </dlentry>
+
+      <dlentry rev="5.4.5" id="effective_user">
+
+        <dt>
+          <codeph>effective_user()</codeph>
+        </dt>
+
+        <dd>
+          <indexterm audience="Cloudera">effective_user() function</indexterm>
+          <b>Purpose:</b> Typically returns the same value as 
<codeph>user()</codeph>,
+          except if delegation is enabled, in which case it returns the ID of 
the delegated user.
+          <p>
+            <b>Return type:</b> <codeph>string</codeph>
+          </p>
+          <p>
+            <b>Added in:</b> CDH 5.4.5
+          </p>
+        </dd>
+
+      </dlentry>
+
+      <dlentry rev="1.3.0" id="pid">
+
+        <dt>
+          <codeph>pid()</codeph>
+        </dt>
+
+        <dd>
+          <indexterm audience="Cloudera">pid() function</indexterm>
+          <b>Purpose:</b> Returns the process ID of the 
<cmdname>impalad</cmdname> daemon that the session is
+          connected to. You can use it during low-level debugging, to issue 
Linux commands that trace, show the
+          arguments, and so on the <cmdname>impalad</cmdname> process.
+          <p>
+            <b>Return type:</b> <codeph>int</codeph>
+          </p>
+        </dd>
+
+      </dlentry>
+
+      <dlentry audience="Cloudera" id="sleep">
+
+        <dt>
+          <codeph>sleep(int ms)</codeph>
+        </dt>
+
+        <dd>
+          <indexterm audience="Cloudera">sleep() function</indexterm>
+          <b>Purpose:</b> Pauses the query for a specified number of 
milliseconds. For slowing down queries with
+          small result sets enough to monitor runtime execution, memory usage, 
or other factors that otherwise
+          would be difficult to capture during the brief interval of query 
execution. When used in the
+          <codeph>SELECT</codeph> list, it is called once for each row in the 
result set; adjust the number of
+          milliseconds accordingly. For example, a query <codeph>SELECT *, 
sleep(5) FROM
+          table_with_1000_rows</codeph> would take at least 5 seconds to 
complete (5 milliseconds * 1000 rows in
+          result set). To avoid an excessive number of concurrent queries, use 
this function for troubleshooting on
+          test and development systems, not for production queries.
+          <p>
+            <b>Return type:</b> N/A
+          </p>
+        </dd>
+
+      </dlentry>
+
+      <dlentry rev="1.1" id="user">
+
+        <dt>
+          <codeph>user()</codeph>
+        </dt>
+
+        <dd>
+          <indexterm audience="Cloudera">user() function</indexterm>
+          <b>Purpose:</b> Returns the username of the Linux user who is 
connected to the <cmdname>impalad</cmdname>
+          daemon. Typically called a single time, in a query without any 
<codeph>FROM</codeph> clause, to
+          understand how authorization settings apply in a security context; 
once you know the logged-in user name,
+          you can check which groups that user belongs to, and from the list 
of groups you can check which roles
+          are available to those groups through the authorization policy file.
+          <p conref="../shared/impala_common.xml#common/user_kerberized"/>
+          <p>
+            When delegation is enabled, consider calling the 
<codeph>effective_user()</codeph> function instead.
+          </p>
+          <p>
+            <b>Return type:</b> <codeph>string</codeph>
+          </p>
+        </dd>
+
+      </dlentry>
+
+      <dlentry id="version">
+
+        <dt>
+          <codeph>version()</codeph>
+        </dt>
+
+        <dd>
+          <indexterm audience="Cloudera">version() function</indexterm>
+          <b>Purpose:</b> Returns information such as the precise version 
number and build date for the
+          <codeph>impalad</codeph> daemon that you are currently connected to. 
Typically used to confirm that you
+          are connected to the expected level of Impala to use a particular 
feature, or to connect to several nodes
+          and confirm they are all running the same level of 
<cmdname>impalad</cmdname>.
+          <p>
+            <b>Return type:</b> <codeph>string</codeph> (with one or more 
embedded newlines)
+          </p>
+        </dd>
+
+      </dlentry>
+    </dl>
+  </conbody>
+</concept>

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/463ddf92/docs/topics/impala_ndv.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_ndv.xml b/docs/topics/impala_ndv.xml
new file mode 100644
index 0000000..a1e5527
--- /dev/null
+++ b/docs/topics/impala_ndv.xml
@@ -0,0 +1,133 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<concept rev="1.2.1" id="ndv">
+
+  <title>NDV Function</title>
+  <titlealts><navtitle>NDV</navtitle></titlealts>
+  <prolog>
+    <metadata>
+      <data name="Category" value="Impala"/>
+      <data name="Category" value="SQL"/>
+      <data name="Category" value="Impala Functions"/>
+      <data name="Category" value="Aggregate Functions"/>
+      <data name="Category" value="Querying"/>
+    </metadata>
+  </prolog>
+
+  <conbody>
+
+    <p>
+      <indexterm audience="Cloudera">ndv() function</indexterm>
+      An aggregate function that returns an approximate value similar to the 
result of <codeph>COUNT(DISTINCT
+      <varname>col</varname>)</codeph>, the <q>number of distinct values</q>. 
It is much faster than the
+      combination of <codeph>COUNT</codeph> and <codeph>DISTINCT</codeph>, and 
uses a constant amount of memory and
+      thus is less memory-intensive for columns with high cardinality.
+    </p>
+
+    <p conref="../shared/impala_common.xml#common/syntax_blurb"/>
+
+<codeblock>NDV([DISTINCT | ALL] <varname>expression</varname>)</codeblock>
+
+    <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/>
+
+    <p rev="1.2.2">
+      This is the mechanism used internally by the <codeph>COMPUTE 
STATS</codeph> statement for computing the
+      number of distinct values in a column.
+    </p>
+
+    <p>
+      Because this number is an estimate, it might not reflect the precise 
number of different values in the
+      column, especially if the cardinality is very low or very high. If the 
estimated number is higher than the
+      number of rows in the table, Impala adjusts the value internally during 
query planning.
+    </p>
+
+    <p 
conref="../shared/impala_common.xml#common/former_odd_return_type_string"/>
+
+<!-- <p 
conref="/Content/impala_common_xi44078.xml#common/restrictions_sliding_window"/>
 -->
+
+    <p conref="../shared/impala_common.xml#common/complex_types_blurb"/>
+
+    <p 
conref="../shared/impala_common.xml#common/complex_types_aggregation_explanation"/>
+    
+    <p 
conref="../shared/impala_common.xml#common/complex_types_aggregation_example"/>
+
+    <p conref="../shared/impala_common.xml#common/restrictions_blurb"/>
+
+    <p 
conref="../shared/impala_common.xml#common/analytic_not_allowed_caveat"/>
+
+    <p conref="../shared/impala_common.xml#common/example_blurb"/>
+
+    <p>
+      The following example queries a billion-row table to illustrate the 
relative performance of
+      <codeph>COUNT(DISTINCT)</codeph> and <codeph>NDV()</codeph>. It shows 
how <codeph>COUNT(DISTINCT)</codeph>
+      gives a precise answer, but is inefficient for large-scale data where an 
approximate result is sufficient.
+      The <codeph>NDV()</codeph> function gives an approximate result but is 
much faster.
+    </p>
+
+<codeblock>select count(distinct col1) from sample_data;
++---------------------+
+| count(distinct col1)|
++---------------------+
+| 100000              |
++---------------------+
+Fetched 1 row(s) in 20.13s
+
+select cast(ndv(col1) as bigint) as col1 from sample_data;
++----------+
+| col1     |
++----------+
+| 139017   |
++----------+
+Fetched 1 row(s) in 8.91s
+</codeblock>
+
+    <p>
+      The following example shows how you can code multiple 
<codeph>NDV()</codeph> calls in a single query, to
+      easily learn which columns have substantially more or fewer distinct 
values. This technique is faster than
+      running a sequence of queries with <codeph>COUNT(DISTINCT)</codeph> 
calls.
+    </p>
+
+<codeblock>select cast(ndv(col1) as bigint) as col1, cast(ndv(col2) as bigint) 
as col2,
+    cast(ndv(col3) as bigint) as col3, cast(ndv(col4) as bigint) as col4
+  from sample_data;
++----------+-----------+------------+-----------+
+| col1     | col2      | col3       | col4      |
++----------+-----------+------------+-----------+
+| 139017   | 282       | 46         | 145636240 |
++----------+-----------+------------+-----------+
+Fetched 1 row(s) in 34.97s
+
+select count(distinct col1) from sample_data;
++---------------------+
+| count(distinct col1)|
++---------------------+
+| 100000              |
++---------------------+
+Fetched 1 row(s) in 20.13s
+
+select count(distinct col2) from sample_data;
++----------------------+
+| count(distinct col2) |
++----------------------+
+| 278                  |
++----------------------+
+Fetched 1 row(s) in 20.09s
+
+select count(distinct col3) from sample_data;
++-----------------------+
+| count(distinct col3)  |
++-----------------------+
+| 46                    |
++-----------------------+
+Fetched 1 row(s) in 19.12s
+
+select count(distinct col4) from sample_data;
++----------------------+
+| count(distinct col4) |
++----------------------+
+| 147135880            |
++----------------------+
+Fetched 1 row(s) in 266.95s
+</codeblock>
+  </conbody>
+</concept>

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/463ddf92/docs/topics/impala_num_nodes.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_num_nodes.xml b/docs/topics/impala_num_nodes.xml
new file mode 100644
index 0000000..75ae8e8
--- /dev/null
+++ b/docs/topics/impala_num_nodes.xml
@@ -0,0 +1,45 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<concept id="num_nodes">
+
+  <title>NUM_NODES Query Option</title>
+  <prolog>
+    <metadata>
+      <data name="Category" value="Impala"/>
+      <data name="Category" value="Impala Query Options"/>
+    </metadata>
+  </prolog>
+
+  <conbody>
+
+    <p>
+      <indexterm audience="Cloudera">NUM_NODES query option</indexterm>
+      Limit the number of nodes that process a query, typically during 
debugging.
+
+    </p>
+
+    <p>
+      <b>Type:</b> numeric
+    </p>
+
+<p>
+      <b>Allowed values:</b> Only accepts the values 0
+      (meaning all nodes) or 1 (meaning all work is done on the coordinator 
node).
+</p>
+
+    <p>
+      <b>Default:</b> 0
+    </p>
+
+     <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/>
+
+     <p>
+       If you are diagnosing a problem that you suspect is due to a timing 
issue due to
+       distributed query processing, you can set <codeph>NUM_NODES=1</codeph> 
to verify
+       if the problem still occurs when all the work is done on a single node.
+     </p>
+
+    <p conref="../shared/impala_common.xml#common/num_nodes_tip"/>
+
+  </conbody>
+</concept>

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/463ddf92/docs/topics/impala_num_scanner_threads.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_num_scanner_threads.xml 
b/docs/topics/impala_num_scanner_threads.xml
new file mode 100644
index 0000000..27cf883
--- /dev/null
+++ b/docs/topics/impala_num_scanner_threads.xml
@@ -0,0 +1,32 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<concept id="num_scanner_threads">
+
+  <title>NUM_SCANNER_THREADS Query Option</title>
+  <prolog>
+    <metadata>
+      <data name="Category" value="Impala"/>
+      <data name="Category" value="Impala Query Options"/>
+    </metadata>
+  </prolog>
+
+  <conbody>
+
+    <p>
+      <indexterm audience="Cloudera">NUM_SCANNER_THREADS query 
option</indexterm>
+      Maximum number of scanner threads (on each node) used for each query. By 
default, Impala uses as many cores
+      as are available (one thread per core). You might lower this value if 
queries are using excessive resources
+      on a busy cluster. Impala imposes a maximum value automatically, so a 
high value has no practical effect.
+    </p>
+
+    <p>
+      <b>Type:</b> numeric
+    </p>
+
+    <p>
+      <b>Default:</b> 0
+    </p>
+
+    <note conref="../shared/impala_common.xml#common/compute_stats_parquet"/>
+  </conbody>
+</concept>

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/463ddf92/docs/topics/impala_offset.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_offset.xml b/docs/topics/impala_offset.xml
new file mode 100644
index 0000000..c9c073d
--- /dev/null
+++ b/docs/topics/impala_offset.xml
@@ -0,0 +1,64 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<concept rev="1.2.1" id="offset">
+
+  <title>OFFSET Clause</title>
+  <prolog>
+    <metadata>
+      <data name="Category" value="Impala"/>
+      <data name="Category" value="SQL"/>
+      <data name="Category" value="Querying"/>
+      <data name="Category" value="Reports"/>
+    </metadata>
+  </prolog>
+
+  <conbody>
+
+    <p>
+      The <codeph>OFFSET</codeph> clause in a <codeph>SELECT</codeph> query 
causes the result set to start some
+      number of rows after the logical first item. The result set is numbered 
starting from zero, so <codeph>OFFSET
+      0</codeph> produces the same result as leaving out the 
<codeph>OFFSET</codeph> clause. Always use this clause
+      in combination with <codeph>ORDER BY</codeph> (so that it is clear which 
item should be first, second, and so
+      on) and <codeph>LIMIT</codeph> (so that the result set covers a bounded 
range, such as items 0-9, 100-199,
+      and so on).
+    </p>
+
+    <p conref="../shared/impala_common.xml#common/limit_and_offset"/>
+
+    <p conref="../shared/impala_common.xml#common/example_blurb"/>
+
+    <p>
+      The following example shows how you could run a <q>paging</q> query 
originally written for a traditional
+      database application. Because typical Impala queries process megabytes 
or gigabytes of data and read large
+      data files from disk each time, it is inefficient to run a separate 
query to retrieve each small group of
+      items. Use this technique only for compatibility while porting older 
applications, then rewrite the
+      application code to use a single query with a large result set, and 
display pages of results from the cached
+      result set.
+    </p>
+
+<codeblock>[localhost:21000] &gt; create table numbers (x int);
+[localhost:21000] &gt; insert into numbers select x from very_long_sequence;
+Inserted 1000000 rows in 1.34s
+[localhost:21000] &gt; select x from numbers order by x limit 5 offset 0;
++----+
+| x  |
++----+
+| 1  |
+| 2  |
+| 3  |
+| 4  |
+| 5  |
++----+
+[localhost:21000] &gt; select x from numbers order by x limit 5 offset 5;
++----+
+| x  |
++----+
+| 6  |
+| 7  |
+| 8  |
+| 9  |
+| 10 |
++----+
+</codeblock>
+  </conbody>
+</concept>

[08/22] incubator-impala git commit: First try at porting over the source files necessary for the Impala SQL Reference.

Reply via email to