[05/50] hbase git commit: HBASE-15907 updates for HBase Shell pre-splitting docs

2016-06-10 Thread syuanjiang
HBASE-15907 updates for HBase Shell pre-splitting docs

(cherry picked from commit 01adec574d9ccbdd6183466cb8ee6b43935d69ca)


Project: http://git-wip-us.apache.org/repos/asf/hbase/repo
Commit: http://git-wip-us.apache.org/repos/asf/hbase/commit/73ec3385
Tree: http://git-wip-us.apache.org/repos/asf/hbase/tree/73ec3385
Diff: http://git-wip-us.apache.org/repos/asf/hbase/diff/73ec3385

Branch: refs/heads/hbase-12439
Commit: 73ec33856d0ee2ac1e058c6f7e1ccffa4476fbc0
Parents: eb64cd9
Author: Ronan Stokes 
Authored: Mon May 30 23:52:43 2016 -0700
Committer: Misty Stanley-Jones 
Committed: Tue May 31 13:52:46 2016 -0700

--
 src/main/asciidoc/_chapters/performance.adoc | 19 ++-
 src/main/asciidoc/_chapters/shell.adoc   | 62 +++
 2 files changed, 79 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/hbase/blob/73ec3385/src/main/asciidoc/_chapters/performance.adoc
--
diff --git a/src/main/asciidoc/_chapters/performance.adoc b/src/main/asciidoc/_chapters/performance.adoc
index a0c00ae..5f27640 100644
--- a/src/main/asciidoc/_chapters/performance.adoc
+++ b/src/main/asciidoc/_chapters/performance.adoc
@@ -499,7 +499,7 @@ For bulk imports, this means that all clients will write to the same region unti
 A useful pattern to speed up the bulk import process is to pre-create empty regions.
 Be somewhat conservative in this, because too-many regions can actually degrade performance.
 
-There are two different approaches to pre-creating splits.
+There are two different approaches to pre-creating splits using the HBase API.
 The first approach is to rely on the default `Admin` strategy (which is implemented in `Bytes.split`)...
 
 [source,java]
@@ -511,7 +511,7 @@ int numberOfRegions = ...;  // # of regions to create
 admin.createTable(table, startKey, endKey, numberOfRegions);
 
 
-And the other approach is to define the splits yourself...
+And the other approach, using the HBase API, is to define the splits yourself...
 
 [source,java]
 
@@ -519,8 +519,23 @@ byte[][] splits = ...;   // create your own splits
 admin.createTable(table, splits);
 
 
+You can achieve a similar effect using the HBase Shell to create tables by specifying split options.
+
+[source]
+----
+# create table with specific split points
+hbase>create 't1','f1',SPLITS => ['\x10\x00', '\x20\x00', '\x30\x00', '\x40\x00']
+
+# create table with four regions based on random bytes keys
+hbase>create 't2','f1', { NUMREGIONS => 4 , SPLITALGO => 'UniformSplit' }
+
+# create table with five regions based on hex keys
+create 't3','f1', { NUMREGIONS => 5, SPLITALGO => 'HexStringSplit' }
+----
+
 See <> for issues related to understanding your keyspace and pre-creating regions.
 See <> for discussion on manually pre-splitting regions.
+See <> for more details of using the HBase Shell to pre-split tables.
 
 [[def.log.flush]]
 ===  Table Creation: Deferred Log Flush
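
The split-point approach in the performance.adoc hunk above (`admin.createTable(table, splits)` and the Shell `SPLITS` option) can be illustrated with a small standalone sketch. It does not use the HBase client library and is not how HBase's `UniformSplit` or `HexStringSplit` are implemented; the class `SplitPointsSketch` and its methods `evenSplits` and `regionIndex` are hypothetical names chosen for this illustration. It assumes a two-byte key space, evenly spaced split points, and unsigned byte-wise key ordering.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of HBase.
public class SplitPointsSketch {

    /** Returns numRegions - 1 two-byte split points that evenly divide 0x0000..0xFFFF. */
    public static byte[][] evenSplits(int numRegions) {
        if (numRegions < 2 || numRegions > 0x10000) {
            throw new IllegalArgumentException("numRegions must be in [2, 65536]");
        }
        byte[][] splits = new byte[numRegions - 1][];
        for (int i = 1; i < numRegions; i++) {
            int point = (int) ((long) i * 0x10000 / numRegions);
            splits[i - 1] = new byte[] { (byte) (point >>> 8), (byte) point };
        }
        return splits;
    }

    /**
     * Returns the zero-based index of the region that would hold the key:
     * the number of split points less than or equal to the key. A key equal
     * to a split point therefore belongs to the region that starts there.
     */
    public static int regionIndex(byte[][] splits, byte[] key) {
        int index = 0;
        for (byte[] split : splits) {
            if (compareUnsigned(key, split) >= 0) {
                index++;
            }
        }
        return index;
    }

    /** Lexicographic comparison treating each byte as unsigned. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[][] splits = evenSplits(4); // three split points, four regions
        for (byte[] s : splits) {
            System.out.printf("\\x%02X\\x%02X%n", s[0] & 0xFF, s[1] & 0xFF);
        }
        // A string literal is compared by its underlying bytes.
        byte[] literal = "10".getBytes(StandardCharsets.US_ASCII);
        System.out.printf("'10' is stored as \\x%02X\\x%02X and falls in region %d%n",
                literal[0] & 0xFF, literal[1] & 0xFF, regionIndex(splits, literal));
    }
}
```

With `n` split points the sketch yields `n+1` regions, and a key that equals a split point is placed in the region that begins at that point.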

http://git-wip-us.apache.org/repos/asf/hbase/blob/73ec3385/src/main/asciidoc/_chapters/shell.adoc
--
diff --git a/src/main/asciidoc/_chapters/shell.adoc b/src/main/asciidoc/_chapters/shell.adoc
index a4237fd..8f1f59b 100644
--- a/src/main/asciidoc/_chapters/shell.adoc
+++ b/src/main/asciidoc/_chapters/shell.adoc
@@ -352,6 +352,68 @@ hbase(main):022:0> Date.new(1218920189000).toString() => "Sat Aug 16 20:56:29 UT
 
 To output in a format that is exactly like that of the HBase log format will take a little messing with link:http://download.oracle.com/javase/6/docs/api/java/text/SimpleDateFormat.html[SimpleDateFormat].
 
+[[tricks.pre-split]]
+=== Pre-splitting tables with the HBase Shell
+You can use a variety of options to pre-split tables when creating them via the HBase Shell `create` command.
+
+The simplest approach is to specify an array of split points when creating the table. Note that when specifying string literals as split points, these will create split points based on the underlying byte representation of the string. So when specifying a split point of '10', we are actually specifying the byte split point '\x31\x30'.
+
+The split points will define `n+1` regions where `n` is the number of split points. The lowest region will contain all keys from the lowest possible key up to but not including the first split point key.
+The next region will contain keys from the first split point up to, but not including the next split point key.
+This will continue for all split points up to the last. The last region will be defined from the last split point up to the maximum possible key.
+
+[source]
+
+hbase>create 't1','f',SPLITS 

hbase git commit: HBASE-15907 updates for HBase Shell pre-splitting docs

2016-05-31 Thread misty
Repository: hbase
Updated Branches:
  refs/heads/master eb64cd9dd -> 73ec33856


HBASE-15907 updates for HBase Shell pre-splitting docs

(cherry picked from commit 01adec574d9ccbdd6183466cb8ee6b43935d69ca)


Project: http://git-wip-us.apache.org/repos/asf/hbase/repo
Commit: http://git-wip-us.apache.org/repos/asf/hbase/commit/73ec3385
Tree: http://git-wip-us.apache.org/repos/asf/hbase/tree/73ec3385
Diff: http://git-wip-us.apache.org/repos/asf/hbase/diff/73ec3385

Branch: refs/heads/master
Commit: 73ec33856d0ee2ac1e058c6f7e1ccffa4476fbc0
Parents: eb64cd9
Author: Ronan Stokes 
Authored: Mon May 30 23:52:43 2016 -0700
Committer: Misty Stanley-Jones 
Committed: Tue May 31 13:52:46 2016 -0700

--
 src/main/asciidoc/_chapters/performance.adoc | 19 ++-
 src/main/asciidoc/_chapters/shell.adoc   | 62 +++
 2 files changed, 79 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/hbase/blob/73ec3385/src/main/asciidoc/_chapters/performance.adoc
--
diff --git a/src/main/asciidoc/_chapters/performance.adoc b/src/main/asciidoc/_chapters/performance.adoc
index a0c00ae..5f27640 100644
--- a/src/main/asciidoc/_chapters/performance.adoc
+++ b/src/main/asciidoc/_chapters/performance.adoc
@@ -499,7 +499,7 @@ For bulk imports, this means that all clients will write to the same region unti
 A useful pattern to speed up the bulk import process is to pre-create empty regions.
 Be somewhat conservative in this, because too-many regions can actually degrade performance.
 
-There are two different approaches to pre-creating splits.
+There are two different approaches to pre-creating splits using the HBase API.
 The first approach is to rely on the default `Admin` strategy (which is implemented in `Bytes.split`)...
 
 [source,java]
@@ -511,7 +511,7 @@ int numberOfRegions = ...;  // # of regions to create
 admin.createTable(table, startKey, endKey, numberOfRegions);
 
 
-And the other approach is to define the splits yourself...
+And the other approach, using the HBase API, is to define the splits yourself...
 
 [source,java]
 
@@ -519,8 +519,23 @@ byte[][] splits = ...;   // create your own splits
 admin.createTable(table, splits);
 
 
+You can achieve a similar effect using the HBase Shell to create tables by specifying split options.
+
+[source]
+----
+# create table with specific split points
+hbase>create 't1','f1',SPLITS => ['\x10\x00', '\x20\x00', '\x30\x00', '\x40\x00']
+
+# create table with four regions based on random bytes keys
+hbase>create 't2','f1', { NUMREGIONS => 4 , SPLITALGO => 'UniformSplit' }
+
+# create table with five regions based on hex keys
+create 't3','f1', { NUMREGIONS => 5, SPLITALGO => 'HexStringSplit' }
+----
+
 See <> for issues related to understanding your keyspace and pre-creating regions.
 See <> for discussion on manually pre-splitting regions.
+See <> for more details of using the HBase Shell to pre-split tables.
 
 [[def.log.flush]]
 ===  Table Creation: Deferred Log Flush

http://git-wip-us.apache.org/repos/asf/hbase/blob/73ec3385/src/main/asciidoc/_chapters/shell.adoc
--
diff --git a/src/main/asciidoc/_chapters/shell.adoc b/src/main/asciidoc/_chapters/shell.adoc
index a4237fd..8f1f59b 100644
--- a/src/main/asciidoc/_chapters/shell.adoc
+++ b/src/main/asciidoc/_chapters/shell.adoc
@@ -352,6 +352,68 @@ hbase(main):022:0> Date.new(1218920189000).toString() => "Sat Aug 16 20:56:29 UT
 
 To output in a format that is exactly like that of the HBase log format will take a little messing with link:http://download.oracle.com/javase/6/docs/api/java/text/SimpleDateFormat.html[SimpleDateFormat].
 
+[[tricks.pre-split]]
+=== Pre-splitting tables with the HBase Shell
+You can use a variety of options to pre-split tables when creating them via the HBase Shell `create` command.
+
+The simplest approach is to specify an array of split points when creating the table. Note that when specifying string literals as split points, these will create split points based on the underlying byte representation of the string. So when specifying a split point of '10', we are actually specifying the byte split point '\x31\x30'.
+
+The split points will define `n+1` regions where `n` is the number of split points. The lowest region will contain all keys from the lowest possible key up to but not including the first split point key.
+The next region will contain keys from the first split point up to, but not including the next split point key.
+This will continue for all split points up to the last. The last region will be defined from the last split point up to