Repository: incubator-trafodion
Updated Branches:
  refs/heads/master 66827b053 -> b6b9f43b6


[TRAFODION-2164] Incremental Update Statistics documentation


Project: http://git-wip-us.apache.org/repos/asf/incubator-trafodion/repo
Commit: 
http://git-wip-us.apache.org/repos/asf/incubator-trafodion/commit/36446cea
Tree: http://git-wip-us.apache.org/repos/asf/incubator-trafodion/tree/36446cea
Diff: http://git-wip-us.apache.org/repos/asf/incubator-trafodion/diff/36446cea

Branch: refs/heads/master
Commit: 36446cead6c0961384bbc402f8f0fcc772d7de4c
Parents: 1cb7a1a
Author: Dave Birdsall <dbirds...@apache.org>
Authored: Thu Sep 15 18:42:28 2016 +0000
Committer: Dave Birdsall <dbirds...@apache.org>
Committed: Thu Sep 15 18:42:28 2016 +0000

----------------------------------------------------------------------
 docs/shared/revisions.txt                       |  1 +
 .../src/asciidoc/_chapters/about.adoc           | 15 +---
 .../src/asciidoc/_chapters/sql_statements.adoc  |  4 +
 .../src/asciidoc/_chapters/sql_utilities.adoc   | 86 +++++++++++++++++++-
 4 files changed, 89 insertions(+), 17 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-trafodion/blob/36446cea/docs/shared/revisions.txt
----------------------------------------------------------------------
diff --git a/docs/shared/revisions.txt b/docs/shared/revisions.txt
index 6afa98d..f58a61d 100644
--- a/docs/shared/revisions.txt
+++ b/docs/shared/revisions.txt
@@ -26,6 +26,7 @@
 [cols="2",options="header"]
 |===
 | Version    | Date
+| 2.1.0      | TBD
 | 2.0.1      | July 7, 2016
 | 2.0.0      | June 6, 2016
 | 1.3.0      | January, 2016  

http://git-wip-us.apache.org/repos/asf/incubator-trafodion/blob/36446cea/docs/sql_reference/src/asciidoc/_chapters/about.adoc
----------------------------------------------------------------------
diff --git a/docs/sql_reference/src/asciidoc/_chapters/about.adoc 
b/docs/sql_reference/src/asciidoc/_chapters/about.adoc
index 247e70d..b3b9ef6 100644
--- a/docs/sql_reference/src/asciidoc/_chapters/about.adoc
+++ b/docs/sql_reference/src/asciidoc/_chapters/about.adoc
@@ -51,20 +51,7 @@ This edition includes updates for these new features:
 [cols="50%,50%",options="header"]
 |===
 | New Feature                                           | Location in the 
Manual
-| On Line Analytical Process (OLAP) window functions    | 
<<olap_functions,OLAP Functions>>
-| Ability to cancel DDL, update statistics, and
-additional child query operations in addition to
-DML statements                                          | 
<<control_query_cancel_statement,CONTROL QUERY CANCEL Statement>> 
-| Authorization required to run the CONTROL QUERY
-CANCEL Statement                                        | 
<<control_query_cancel_statement,CONTROL QUERY CANCEL Statement>>
-| Ability to grant privileges on behalf of a role
-using the GRANTED BY clause.                            | 
<<grant_component_privilege_statement,GRANT COMPONENT PRIVILEGE Statement>>
-| Authorization required for all SHOWDDL commands       | 
<<showddl_statement,SHOWDDL Statement>> +
-<<SHOWDDL_SCHEMA_Statement,SHOWDDL SCHEMA Statement>>
-| Ability to display the DDL syntax of a library object
-using the SHOWDDL LIBRARY command                       | 
<<showddl_statement,SHOWDDL Statement>>
-| Listing of HBase objects using the GET HBASE OBJECTS
-command through an SQL interface                        | 
<<get_hbase_objects_statement,GET HBASE OBJECTS Statement>>
+| Incremental UPDATE STATISTICS                         | 
<<update_statistics_statement,UPDATE STATISTICS Statement>>
 |===
 
 <<<

http://git-wip-us.apache.org/repos/asf/incubator-trafodion/blob/36446cea/docs/sql_reference/src/asciidoc/_chapters/sql_statements.adoc
----------------------------------------------------------------------
diff --git a/docs/sql_reference/src/asciidoc/_chapters/sql_statements.adoc 
b/docs/sql_reference/src/asciidoc/_chapters/sql_statements.adoc
index c0c218b..8e2fbb6 100644
--- a/docs/sql_reference/src/asciidoc/_chapters/sql_statements.adoc
+++ b/docs/sql_reference/src/asciidoc/_chapters/sql_statements.adoc
@@ -2344,6 +2344,10 @@ creates an HBase table if it does not already exist when 
the table is created. T
 +
 specifies the ANSI logical name of the table. See 
<<database_object_names,Database Object Names>>. This name must be
 unique among names of tables and views within its schema.
++
+TIP: Avoid using table names that begin with "TRAF_SAMPLE_", as {project-name} 
SQL assumes that
+such tables are persistent sample tables. See also 
<<update_statistics_incremental_update_statistics,
+Incremental Update Statistics>>.
 
 * `SALT USING _num_ PARTITIONS [ON (_column_[, _column_]&#8230;)]`
 +

http://git-wip-us.apache.org/repos/asf/incubator-trafodion/blob/36446cea/docs/sql_reference/src/asciidoc/_chapters/sql_utilities.adoc
----------------------------------------------------------------------
diff --git a/docs/sql_reference/src/asciidoc/_chapters/sql_utilities.adoc 
b/docs/sql_reference/src/asciidoc/_chapters/sql_utilities.adoc
index d386e64..248f994 100644
--- a/docs/sql_reference/src/asciidoc/_chapters/sql_utilities.adoc
+++ b/docs/sql_reference/src/asciidoc/_chapters/sql_utilities.adoc
@@ -686,11 +686,12 @@ devise optimized access plans.
 UPDATE STATISTICS is a {project-name} SQL extension.
 
 ```
-UPDATE STATISTICS FOR TABLE table [CLEAR | on-clause]
+UPDATE STATISTICS FOR TABLE table [CLEAR | on-clause | sample-table-clause ]
 
 on-clause is:
     ON column-group-list CLEAR
   | ON column-group-list [histogram-option]...
+  | ON column-group-list INCREMENTAL WHERE predicate
 
 column-group-list is:
     column-list [,column-list]...
@@ -716,8 +717,12 @@ histogram-option is:
 
 sample-option is:
     [r ROWS]
-  | RANDOM percent PERCENT
+  | RANDOM percent PERCENT [PERSISTENT]
   | PERIODIC size ROWS EVERY period ROWS
+
+sample-table-clause is:
+    CREATE SAMPLE RANDOM percent PERCENT
+  | REMOVE SAMPLE
 ```
 
 [[update_statistics_syntax]]
@@ -872,7 +877,6 @@ than specified, or if the sample size is greater than the 
system limit,
 A row sample is used to read _r_ rows from the table. The value _r_ must
 be an integer that is greater than zero 
 
-<<<
 **** `RANDOM _percent_ PERCENT`
 +
 directs {project-name} SQL to choose rows randomly from the table. The value
@@ -881,6 +885,12 @@ addition, only the first four digits to the right of the 
decimal point
 are significant. For example, value 0.00001 is considered to be 0.0000,
 Value 1.23456 is considered to be 1.2345.
 
+***** `PERSISTENT`
++
+directs {project-name} SQL to create a persistent sample table and store the
+random sample in it. This table can then be used later for updating statistics
+incrementally.
+
 **** `PERIODIC _size_ ROWS EVERY _period_ ROW`
 +
 directs {project-name} SQL to choose the first _size_ number of rows from
@@ -890,6 +900,28 @@ _size_ &#60;= _period_). The size of the _period_ is 
defined by the number
 of rows specified for _period_. The value _period_ must be an integer
 that is greater than zero (_period_ > 0).
 
+* `INCREMENTAL WHERE _predicate_`
++
+directs {project-name} SQL to update statistics incrementally. That is, instead
+of taking a fresh sample of the entire table, {project-name} SQL will use a 
previously
+created persistent sample table. {project-name} SQL will update the persistent 
sample
+by replacing any rows satisfying the _predicate_ with a fresh sample of rows 
from
+the original table satisfying the _predicate_. The sampling rate used is the
+_percent_ specified when the persistent sample table was created. Statistics
+are then generated from this updated sample. See also
+<<update_statistics_incremental_update_statistics,
+Incremental Update Statistics>>.
+
+* `CREATE SAMPLE RANDOM _percent_ PERCENT`
++
+Creates a persistent sample table associated with this table. The sample is
+created using a random sample of _percent_ percent of the rows. The table
+can then be used for later incremental statistics update.
+
+* `REMOVE SAMPLE`
++
+Drops the persistent sample table associated with this table.
+
 [[update_statistics_considerations]]
 === Considerations for UPDATE STATISTICS
 
@@ -1046,6 +1078,35 @@ NECESSARY COLUMN[S] to generate the needed histograms:
 update statistics for table _table-name_ on necessary columns sample;
 ```
 
+[[update_statistics_incremental_update_statistics]]
+==== Incremental Update Statistics
+
+UPDATE STATISTICS processing time can be lengthy for very large tables.
+One strategy for reducing the time is to create histograms only for
+columns that actually need them (for example, using the ON NECESSARY COLUMNS 
+column group). Another strategy is to update statistics incrementally. These
+strategies can be used together if desired.
+
+To use the incremental update statistics feature, you must first create
+statistics for the table and create a persistent sample table. One way to
+do this is to perform a normal update statistics command, adding the
+PERSISTENT keyword to the _sample-option_. Another way to do this if you
+already have reasonably up-to-date statistics for the table, is to create
+a persistent sample table separately using the CREATE SAMPLE option.
+
+You can then perform update statistics incrementally by using the INCREMENTAL
+WHERE _predicate_ syntax in the on-clause. The _predicate_ should be chosen
+to describe the set of rows that have changed since the last statistics update
+was performed. For example, if your table contains a column with a timestamp
+giving the date and time of last update, this is a particularly useful column
+to use in the _predicate_.
+
+If you decide later that you wish to change the _percent_ sampling rate used
+for the persistent sample table, you can do so by dropping the persistent
+sample table (using REMOVE SAMPLE) and creating a new one (by using the
+CREATE SAMPLE option). Using a higher _percent_ results in more accurate
+histograms, but at the price of a longer-running operation.
+
 <<<
 [[update_statistics_examples]]
 === Examples of UPDATE STATISTICS
@@ -1108,4 +1169,23 @@ table ADDRESS:
 UPDATE STATISTICS FOR TABLE address ON street CLEAR;
 ```
 
+* This example generates statistics but also creates a persistent 
+sample table for use when updating statistics incrementally:
++
+```
+UPDATE STATISTICS FOR TABLE address
+ON (street), (city), (state), (zip, type)
+SAMPLE RANDOM 5 PERCENT PERSISTENT;
+```
 
+* This example updates statistics incrementally. It assumes that
+a persistent sample table has already been created. The predicate
+in the WHERE clause describes the set of rows that have changed
+since statistics were last updated. Here we assume that rows
+with a state of California are the only rows that have changed:
++
+```
+UPDATE STATISTICS FOR TABLE address
+ON EXISTING COLUMNS
+INCREMENTAL WHERE state = 'CA';
+```

Reply via email to