Add docs for non-covering range partitioning Change-Id: I3b0fd7500c5399db9dcad617ae67fea247307353 Reviewed-on: http://gerrit.cloudera.org:8080/3796 Reviewed-by: Dan Burkert <[email protected]> Tested-by: Misty Stanley-Jones <[email protected]>
Project: http://git-wip-us.apache.org/repos/asf/kudu/repo Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/274dfb05 Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/274dfb05 Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/274dfb05 Branch: refs/heads/master Commit: 274dfb05944edef1fac3be78ea0c699b048c5b31 Parents: 0fb4409 Author: Misty Stanley-Jones <[email protected]> Authored: Wed Jul 27 12:19:52 2016 -0700 Committer: Misty Stanley-Jones <[email protected]> Committed: Mon Aug 15 22:10:56 2016 +0000 ---------------------------------------------------------------------- docs/kudu_impala_integration.adoc | 55 ++++++++++ docs/schema_design.adoc | 193 +++++++++++++++++++-------------- 2 files changed, 168 insertions(+), 80 deletions(-) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/kudu/blob/274dfb05/docs/kudu_impala_integration.adoc ---------------------------------------------------------------------- diff --git a/docs/kudu_impala_integration.adoc b/docs/kudu_impala_integration.adoc index 793b402..fb29bd8 100755 --- a/docs/kudu_impala_integration.adoc +++ b/docs/kudu_impala_integration.adoc @@ -795,6 +795,61 @@ The example creates 16 buckets. You could also use `HASH (id, sku) INTO 16 BUCKE However, a scan for `sku` values would almost always impact all 16 buckets, rather than possibly being limited to 4. +// Not ready for 0.10 but don't want to lose the work +//// + +.Non-Covering Range Partitions +Kudu TODO:VERSION and higher supports the use of non-covering range partitions, +which address scenarios like the following: + +- Without non-covering range partitions, in the case of time-series data or other + schemas which need to account for constantly-increasing primary keys, tablets + serving old data will be relatively fixed in size, while tablets receiving new + data will grow without bounds. 
+
+- In cases where you want to partition data based on its category, such as sales
+  region or product type, without non-covering range partitions you must know all
+  of the partitions ahead of time or manually recreate your table if partitions
+  need to be added or removed, such as the introduction or elimination of a product
+  type.
+
+Non-covering range partitions have some caveats. Be sure to read the
+link:/docs/schema_design.html[Schema Design guide].
+
+This example creates a tablet per year (6 tablets total), for storing log data. It uses `RANGE BOUND`
+to ensure that the table only accepts data from 2011 (inclusive) to 2017 (exclusive). Keys outside
+of this range will be rejected.
+
+[source,sql]
+----
+CREATE TABLE sales_by_year (year INT32, sale_id INT32, amount INT32)
+PRIMARY KEY (sale_id, year)
+DISTRIBUTE BY RANGE (year)
+  RANGE BOUND ((2011), (2017))
+  SPLIT ROWS ((2012), (2013), (2014), (2015), (2016));
+----
+
+When records start coming in for 2017, they will be rejected. At that point, the `2017`
+range should be added. An `alter table add range partition` or `alter table drop range
+partition` operation allows you to add or drop a range partition, respectively.
+
+The next example creates a table per sales region. Data for regions other than `North
+America`, `Europe`, or `Asia` will be rejected. This example does not use explicit
+split rows, but the range bounds provide implicit split rows, so three tablets would
+be created. If a new range is added, a new tablet is created.
+ +[source,sql] +---- +CREATE TABLE sales_by_region (region STRING, sale_id INT32, amount INT32) +PRIMARY KEY (sale_id, region) +DISTRIBUTE BY RANGE (region) + RANGE BOUND (("North America"), ("North America\0")), + RANGE BOUND (("Europe"), ("Europe\0")), + RANGE BOUND (("Asia"), ("Asia\0")); +---- + +//// + [[partitioning_rules_of_thumb]] ==== Partitioning Rules of Thumb http://git-wip-us.apache.org/repos/asf/kudu/blob/274dfb05/docs/schema_design.adoc ---------------------------------------------------------------------- diff --git a/docs/schema_design.adoc b/docs/schema_design.adoc index 0851e44..1d3078f 100644 --- a/docs/schema_design.adoc +++ b/docs/schema_design.adoc @@ -41,89 +41,21 @@ be a new concept for those familiar with traditional relational databases. The next sections discuss <<alter-schema,altering the schema>> of an existing table, and <<known-limitations,known limitations>> with regard to schema design. -[[column-design]] -== Column Design - -A Kudu Table consists of one or more columns, each with a predefined type. -Columns that are not part of the primary key may optionally be nullable. -Supported column types include: - -* boolean -* 8-bit signed integer -* 16-bit signed integer -* 32-bit signed integer -* 64-bit signed integer -* timestamp -* single-precision (32-bit) IEEE-754 floating-point number -* double-precision (64-bit) IEEE-754 floating-point number -* UTF-8 encoded string -* binary - -Kudu takes advantage of strongly-typed columns and a columnar on-disk storage -format to provide efficient encoding and serialization. To make the most of these -features, columns must be specified as the appropriate type, rather than -simulating a 'schemaless' table using string or binary columns for data which -may otherwise be structured. In addition to encoding, Kudu optionally allows -compression to be specified on a per-column basis. 
- -[[encoding]] -=== Column Encoding - -Each column in a Kudu table can be created with an encoding, based on the type -of the column. Columns use plain encoding by default. - -.Encoding Types -[options="header"] -|=== -| Column Type | Encoding -| integer, timestamp | plain, bitshuffle, run length -| float | plain, bitshuffle -| bool | plain, dictionary, run length -| string, binary | plain, prefix, dictionary -|=== - -[[plain]] -Plain Encoding:: Data is stored in its natural format. For example, `int32` values -are stored as fixed-size 32-bit little-endian integers. - -[[bitshuffle]] -Bitshuffle Encoding:: Data is rearranged to store the most significant bit of -every value, followed by the second most significant bit of every value, and so -on. Finally, the result is LZ4 compressed. Bitshuffle encoding is a good choice for -columns that have many repeated values, or values that change by small amounts -when sorted by primary key. The -https://github.com/kiyo-masui/bitshuffle[bitshuffle] project has a good -overview of performance and use cases. - -[[run-length]] -Run Length Encoding:: _Runs_ (consecutive repeated values) are compressed in a -column by storing only the value and the count. Run length encoding is effective -for columns with many consecutive repeated values when sorted by primary key. - -[[dictionary]] -Dictionary Encoding:: A dictionary of unique values is built, and each column value -is encoded as its corresponding index in the dictionary. Dictionary encoding -is effective for columns with low cardinality. If the column values of a given row set -are unable to be compressed because the number of unique values is too high, Kudu will -transparently fall back to plain encoding for that row set. This is evaluated during -flush. +== The Perfect Schema -[[prefix]] -Prefix Encoding:: Common prefixes are compressed in consecutive column values. 
Prefix -encoding can be effective for values that share common prefixes, or the first -column of the primary key, since rows are sorted by primary key within tablets. +The perfect schema would accomplish the following: -[[compression]] -=== Column Compression +- Data would be distributed in such a way that reads and writes are spread evenly + across tablet servers. This is impacted by the partition schema. +- Tablets would grow at an even, predictable rate and load across tablets would remain + steady over time. This is most impacted by the partition schema. +- Scans would read the minimum amount of data necessary to fulfill a query. This + is impacted mostly by primary key design, but partition design also plays a + role via partition pruning. -Kudu allows per-column compression using LZ4, `snappy`, or `zlib` compression -codecs. By default, columns are stored uncompressed. Consider using compression -if reducing storage space is more important than raw scan performance. - -Every data set will compress differently, but in general LZ4 has the least effect on -performance, while `zlib` will compress to the smallest data sizes. -Bitshuffle-encoded columns are inherently compressed using LZ4, so it is not -typically beneficial to apply additional compression on top of this encoding. +The perfect schema depends on the characteristics of your data, what you need to do +with it, and the topology of your cluster. Schema design is the single most important +thing within your control to maximize the performance of your Kudu cluster. [[primary-keys]] == Primary Keys @@ -209,6 +141,23 @@ should only include the `last_name` column. In that case, Kudu would guarantee t customers with the same last name would fall into the same tablet, regardless of the provided split rows. 
+[[range-partition-management]]
+==== Range Partition Management
+
+Kudu 0.10 introduces the ability to specify bounded range partitions during
+table creation, and the ability to add and drop range partitions on the fly. This is
+a good strategy for data which is always increasing, such as timestamps, or for
+categorical data, such as geographic regions.
+
+For example, during table creation, bounded range partitions can be
+added for the regions 'US-EAST', 'US-WEST', and 'EUROPE'. If you attempt to insert a
+row with a region that does not match an existing range partition, the insertion will
+fail. Later, when a new region is needed it can be efficiently added as part of an
+`ALTER TABLE` operation. This feature is particularly useful for time-series data,
+since it allows new range partitions for the current period to be added as
+needed, and old partitions covering historical periods to be dropped if
+necessary.
+
 [[hash-bucketing]]
 === Hash Bucketing
 
@@ -272,6 +221,90 @@ You can alter a table's schema in the following ways:
 
 You cannot modify the partition schema after table creation.
 
+[[column-design]]
+== Column Design
+
+A Kudu Table consists of one or more columns, each with a predefined type.
+Columns that are not part of the primary key may optionally be nullable.
+Supported column types include:
+
+* boolean
+* 8-bit signed integer
+* 16-bit signed integer
+* 32-bit signed integer
+* 64-bit signed integer
+* timestamp
+* single-precision (32-bit) IEEE-754 floating-point number
+* double-precision (64-bit) IEEE-754 floating-point number
+* UTF-8 encoded string
+* binary
+
+Kudu takes advantage of strongly-typed columns and a columnar on-disk storage
+format to provide efficient encoding and serialization. To make the most of these
+features, columns must be specified as the appropriate type, rather than
+simulating a 'schemaless' table using string or binary columns for data which
+may otherwise be structured.
In addition to encoding, Kudu optionally allows +compression to be specified on a per-column basis. + +[[encoding]] +=== Column Encoding + +Each column in a Kudu table can be created with an encoding, based on the type +of the column. Columns use plain encoding by default. + +.Encoding Types +[options="header"] +|=== +| Column Type | Encoding +| integer, timestamp | plain, bitshuffle, run length +| float | plain, bitshuffle +| bool | plain, dictionary, run length +| string, binary | plain, prefix, dictionary +|=== + +[[plain]] +Plain Encoding:: Data is stored in its natural format. For example, `int32` values +are stored as fixed-size 32-bit little-endian integers. + +[[bitshuffle]] +Bitshuffle Encoding:: Data is rearranged to store the most significant bit of +every value, followed by the second most significant bit of every value, and so +on. Finally, the result is LZ4 compressed. Bitshuffle encoding is a good choice for +columns that have many repeated values, or values that change by small amounts +when sorted by primary key. The +https://github.com/kiyo-masui/bitshuffle[bitshuffle] project has a good +overview of performance and use cases. + +[[run-length]] +Run Length Encoding:: _Runs_ (consecutive repeated values) are compressed in a +column by storing only the value and the count. Run length encoding is effective +for columns with many consecutive repeated values when sorted by primary key. + +[[dictionary]] +Dictionary Encoding:: A dictionary of unique values is built, and each column value +is encoded as its corresponding index in the dictionary. Dictionary encoding +is effective for columns with low cardinality. If the column values of a given row set +are unable to be compressed because the number of unique values is too high, Kudu will +transparently fall back to plain encoding for that row set. This is evaluated during +flush. + +[[prefix]] +Prefix Encoding:: Common prefixes are compressed in consecutive column values. 
Prefix +encoding can be effective for values that share common prefixes, or the first +column of the primary key, since rows are sorted by primary key within tablets. + +[[compression]] +=== Column Compression + +Kudu allows per-column compression using LZ4, `snappy`, or `zlib` compression +codecs. By default, columns are stored uncompressed. Consider using compression +if reducing storage space is more important than raw scan performance. + +Every data set will compress differently, but in general LZ4 has the least effect on +performance, while `zlib` will compress to the smallest data sizes. +Bitshuffle-encoded columns are inherently compressed using LZ4, so it is not +typically beneficial to apply additional compression on top of this encoding. + [[known-limitations]] == Known Limitations

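The add/drop workflow described in the Range Partition Management section could look roughly like the following for the `sales_by_year` example table. This is a hypothetical sketch only: it assumes an `ALTER TABLE ... RANGE PARTITION` grammar implied by the prose, and the exact SQL syntax had not been finalized at the time of this commit.

[source,sql]
----
-- Hypothetical syntax: once 2017 data starts arriving, add a new
-- range partition so that inserts for 2017 are no longer rejected.
ALTER TABLE sales_by_year ADD RANGE PARTITION ((2017), (2018));

-- Hypothetical syntax: drop the oldest historical range partition.
ALTER TABLE sales_by_year DROP RANGE PARTITION ((2011), (2012));
----

Adding a range partition creates a new tablet covering that range; dropping one removes the corresponding tablet, which suits retention policies for constantly-increasing time-series keys as described above.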