new range partitioning features blog post

Change-Id: I53504d849c2aca9ff613b11e67d1533536283931
Reviewed-on: http://gerrit.cloudera.org:8080/4012
Reviewed-by: Todd Lipcon <t...@apache.org>
Tested-by: Todd Lipcon <t...@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/bb0fb419
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/bb0fb419
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/bb0fb419

Branch: refs/heads/gh-pages
Commit: bb0fb4190859f174da02d61e7059662bb856f089
Parents: d4a5115
Author: Dan Burkert <d...@cloudera.com>
Authored: Tue Aug 16 19:23:07 2016 -0700
Committer: Todd Lipcon <t...@apache.org>
Committed: Tue Aug 23 05:57:32 2016 +0000

----------------------------------------------------------------------
 ...016-08-23-new-range-partitioning-features.md |  99 +++++++++++++++++++
 .../range-and-hash-partitioning.png             | Bin 0 -> 85316 bytes
 .../range-partitioning-on-time.png              | Bin 0 -> 58003 bytes
 3 files changed, 99 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kudu/blob/bb0fb419/_posts/2016-08-23-new-range-partitioning-features.md
----------------------------------------------------------------------
diff --git a/_posts/2016-08-23-new-range-partitioning-features.md 
b/_posts/2016-08-23-new-range-partitioning-features.md
new file mode 100644
index 0000000..fe382b9
--- /dev/null
+++ b/_posts/2016-08-23-new-range-partitioning-features.md
@@ -0,0 +1,99 @@
+---
+layout: post
+title: "New Range Partitioning Features in Kudu 0.10"
+author: Dan Burkert
+---
+
+Kudu 0.10 is shipping with a few important new features for range partitioning.
+These features are designed to make Kudu easier to scale for certain workloads,
+like time series. This post will introduce these features, and discuss how to 
use
+them to effectively design tables for scalability and performance.
+
+<!--more-->
+
+Since Kudu's initial release, tables have had the constraint that once created,
+the set of partitions is static. This forces users to plan ahead and create
+enough partitions for the expected size of the table, because once the table is
+created no further partitions can be added. When using hash partitioning,
+creating more partitions is as straightforward as specifying more buckets. For
+range partitioning, however, knowing where to put the extra partitions ahead of
+time can be difficult or impossible.
+
+The common solution to this problem in other distributed databases is to allow
+range partitions to split into smaller child range partitions. Unfortunately,
+range splitting typically has a large performance impact on running tables,
+since child partitions need to eventually be recompacted and rebalanced to a
+remote server. Range splitting is particularly thorny with Kudu, because rows
+are stored in tablets in primary key sorted order, which does not necessarily
+match the range partitioning order. If the range partition key is different 
than
+the primary key, then splitting requires inspecting and shuffling each
+individual row, instead of splitting the tablet in half.
+
+## Adding and Dropping Range Partitions
+
+As an alternative to range partition splitting, Kudu now allows range 
partitions
+to be added and dropped on the fly, without locking the table or otherwise
+affecting concurrent operations on other partitions. This solution is not
+strictly as powerful as full range partition splitting, but it strikes a good
+balance between flexibility, performance, and operational overhead.
+Additionally, this feature does not preclude range splitting in the future if
+there is a push to implement it. To support adding and dropping range
+partitions, Kudu had to remove an even more fundamental restriction when using
+range partitions.
+
+Previously, range partitions could only be created by specifying split points.
+Split points divide an implicit partition covering the entire range into
+contiguous and disjoint partitions. When using split points, the first and last
+partitions are always unbounded below and above, respectively. A consequence of
+the final partition being unbounded is that datasets which are 
range-partitioned
+on a column that increases in value over time will eventually have far more 
rows
+in the last partition than in any other. Unbalanced partitions are commonly
+referred to as hotspots, and until Kudu 0.10 they have been difficult to avoid
+when storing time series data in Kudu.
+
+![png]({{ site.github.url 
}}/img/2016-08-23-new-range-partitioning-features/range-partitioning-on-time.png){:
 .img-responsive}
+
+The figure above shows the tablets created by two different attempts to
+partition a table by range on a timestamp column. The first, above in blue, 
uses
+split points. The second, below in green, uses bounded range partitions
+specified during table creation. With bounded range partitions, there is no
+longer a guarantee that every possible row has a corresponding range partition.
+As a result, Kudu will now reject writes which fall in a 'non-covered' range.
+
+Now that tables are no longer required to have range partitions covering all
+possible rows, Kudu can support adding range partitions to cover the otherwise
+unoccupied space. Dropping a range partition will result in unoccupied space
+where the range partition was previously. In the example above, we may want to
+add a range partition covering 2017 at the end of the year, so that we can
+continue collecting data in the future. By lazily adding range partitions we
+avoid hotspotting, avoid the need to specify range partitions up front for time
+periods far in the future, and avoid the downsides of splitting. Additionally,
+historical data which is no longer useful can be efficiently deleted by 
dropping
+the entire range partition.
+
+## What About Hash Partitioning?
+
+Since Kudu's hash partitioning feature originally shipped in version 0.6, it 
has
+been possible to create tables which combine hash partitioning with range
+partitioning. The new range partitioning features continue to work seamlessly
+when combined with hash partitioning. Just as before, the number of tablets
+which comprise a table will be the product of the number of range partitions 
and
+the number of hash partition buckets. Adding or dropping a range partition will
+result in the creation or deletion of one tablet per hash bucket.
+
+![png]({{ site.github.url 
}}/img/2016-08-23-new-range-partitioning-features/range-and-hash-partitioning.png){:
 .img-responsive}
+
+The diagram above shows a time series table range-partitioned on the timestamp
+and hash-partitioned with two buckets. The hash partitioning could be on the
+timestamp column, or it could be on any other column or columns in the primary
+key. In this example only two years of historical data is needed, so at the end
+of 2016 a new range partition is added for 2017 and the historical 2014 range
+partition is dropped. This causes two new tablets to be created for 2017, and
+the two existing tablets for 2014 to be deleted.
+
+## Getting Started
+
+Beginning with the Kudu 0.10 release, users can add and drop range partitions
+through the Java and C++ client APIs. Range partitions on existing tables can 
be
+dropped and replacements added, but it requires the servers and all clients to
+be updated to 0.10.

http://git-wip-us.apache.org/repos/asf/kudu/blob/bb0fb419/img/2016-08-23-new-range-partitioning-features/range-and-hash-partitioning.png
----------------------------------------------------------------------
diff --git 
a/img/2016-08-23-new-range-partitioning-features/range-and-hash-partitioning.png
 
b/img/2016-08-23-new-range-partitioning-features/range-and-hash-partitioning.png
new file mode 100644
index 0000000..1d61ae4
Binary files /dev/null and 
b/img/2016-08-23-new-range-partitioning-features/range-and-hash-partitioning.png
 differ

http://git-wip-us.apache.org/repos/asf/kudu/blob/bb0fb419/img/2016-08-23-new-range-partitioning-features/range-partitioning-on-time.png
----------------------------------------------------------------------
diff --git 
a/img/2016-08-23-new-range-partitioning-features/range-partitioning-on-time.png 
b/img/2016-08-23-new-range-partitioning-features/range-partitioning-on-time.png
new file mode 100644
index 0000000..f8894a1
Binary files /dev/null and 
b/img/2016-08-23-new-range-partitioning-features/range-partitioning-on-time.png 
differ

Reply via email to