updating docs from master

Project: http://git-wip-us.apache.org/repos/asf/hbase/repo
Commit: http://git-wip-us.apache.org/repos/asf/hbase/commit/2e9a55be
Tree: http://git-wip-us.apache.org/repos/asf/hbase/tree/2e9a55be
Diff: http://git-wip-us.apache.org/repos/asf/hbase/diff/2e9a55be

Branch: refs/heads/branch-1.1
Commit: 2e9a55befc308b4892ea5a083412e4f36178ed1a
Parents: b6ff374
Author: Nick Dimiduk <[email protected]>
Authored: Thu Nov 30 19:53:20 2017 -0800
Committer: Nick Dimiduk <[email protected]>
Committed: Thu Nov 30 19:53:20 2017 -0800

----------------------------------------------------------------------
 .../asciidoc/_chapters/appendix_acl_matrix.adoc |   1 +
 .../appendix_contributing_to_documentation.adoc |  42 +-
 src/main/asciidoc/_chapters/architecture.adoc   | 269 ++++--
 src/main/asciidoc/_chapters/asf.adoc            |   4 +-
 src/main/asciidoc/_chapters/backup_restore.adoc | 912 +++++++++++++++++++
 src/main/asciidoc/_chapters/community.adoc      |   6 +-
 src/main/asciidoc/_chapters/compression.adoc    |  10 +-
 src/main/asciidoc/_chapters/configuration.adoc  |  40 +-
 src/main/asciidoc/_chapters/cp.adoc             |  12 +-
 src/main/asciidoc/_chapters/datamodel.adoc      |  34 +-
 src/main/asciidoc/_chapters/developer.adoc      | 351 ++++---
 src/main/asciidoc/_chapters/external_apis.adoc  |  27 +-
 src/main/asciidoc/_chapters/faq.adoc            |   4 +-
 .../asciidoc/_chapters/getting_started.adoc     |   6 +-
 src/main/asciidoc/_chapters/hbase-default.adoc  |  52 +-
 src/main/asciidoc/_chapters/hbase_apis.adoc     |   2 +-
 src/main/asciidoc/_chapters/mapreduce.adoc      | 112 ++-
 src/main/asciidoc/_chapters/ops_mgt.adoc        | 159 +++-
 src/main/asciidoc/_chapters/other_info.adoc     |  14 +-
 src/main/asciidoc/_chapters/performance.adoc    |  41 +-
 src/main/asciidoc/_chapters/preface.adoc        |   4 +-
 src/main/asciidoc/_chapters/protobuf.adoc       |   2 +-
 src/main/asciidoc/_chapters/rpc.adoc            |   2 +-
 src/main/asciidoc/_chapters/schema_design.adoc  |  41 +-
 src/main/asciidoc/_chapters/security.adoc       |  12 +-
 src/main/asciidoc/_chapters/spark.adoc          |   4 +-
 src/main/asciidoc/_chapters/sql.adoc            |   4 +-
 .../_chapters/thrift_filter_language.adoc       |   2 +-
 src/main/asciidoc/_chapters/tracing.adoc        |   8 +-
 .../asciidoc/_chapters/troubleshooting.adoc     |  19 +-
 src/main/asciidoc/_chapters/unit_testing.adoc   |   8 +-
 src/main/asciidoc/_chapters/upgrading.adoc      | 166 +++-
 src/main/asciidoc/_chapters/zookeeper.adoc      |   8 +-
 src/main/asciidoc/book.adoc                     |   9 +-
 34 files changed, 1855 insertions(+), 532 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/hbase/blob/2e9a55be/src/main/asciidoc/_chapters/appendix_acl_matrix.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/appendix_acl_matrix.adoc 
b/src/main/asciidoc/_chapters/appendix_acl_matrix.adoc
index 1d7c748..0c99b1f 100644
--- a/src/main/asciidoc/_chapters/appendix_acl_matrix.adoc
+++ b/src/main/asciidoc/_chapters/appendix_acl_matrix.adoc
@@ -123,6 +123,7 @@ In case the table goes out of date, the unit tests which 
check for accuracy of p
 |        | getReplicationPeerConfig | superuser\|global(A)
 |        | updateReplicationPeerConfig | superuser\|global(A)
 |        | listReplicationPeers | superuser\|global(A)
+|        | getClusterStatus | superuser\|global(A)
 | Region | openRegion | superuser\|global(A)
 |        | closeRegion | superuser\|global(A)
 |        | flush | 
superuser\|global(A)\|global\(C)\|TableOwner\|table(A)\|table\(C)

http://git-wip-us.apache.org/repos/asf/hbase/blob/2e9a55be/src/main/asciidoc/_chapters/appendix_contributing_to_documentation.adoc
----------------------------------------------------------------------
diff --git 
a/src/main/asciidoc/_chapters/appendix_contributing_to_documentation.adoc 
b/src/main/asciidoc/_chapters/appendix_contributing_to_documentation.adoc
index 0337182..a603c16 100644
--- a/src/main/asciidoc/_chapters/appendix_contributing_to_documentation.adoc
+++ b/src/main/asciidoc/_chapters/appendix_contributing_to_documentation.adoc
@@ -35,9 +35,9 @@ including the documentation.
 
 In HBase, documentation includes the following areas, and probably some others:
 
-* The link:http://hbase.apache.org/book.html[HBase Reference
+* The link:https://hbase.apache.org/book.html[HBase Reference
   Guide] (this book)
-* The link:http://hbase.apache.org/[HBase website]
+* The link:https://hbase.apache.org/[HBase website]
 * API documentation
 * Command-line utility output and help text
 * Web UI strings, explicit help text, context-sensitive strings, and others
@@ -119,14 +119,14 @@ JIRA and add a version number to the name of the new 
patch.
 
 === Editing the HBase Website
 
-The source for the HBase website is in the HBase source, in the 
_src/main/site/_ directory.
+The source for the HBase website is in the HBase source, in the _src/site/_ 
directory.
 Within this directory, source for the individual pages is in the _xdocs/_ 
directory,
 and images referenced in those pages are in the _resources/images/_ directory.
 This directory also stores images used in the HBase Reference Guide.
 
 The website's pages are written in an HTML-like XML dialect called xdoc, which
 has a reference guide at
-http://maven.apache.org/archives/maven-1.x/plugins/xdoc/reference/xdocs.html.
+https://maven.apache.org/archives/maven-1.x/plugins/xdoc/reference/xdocs.html.
 You can edit these files in a plain-text editor, an IDE, or an XML editor such
 as XML Mind XML Editor (XXE) or Oxygen XML Author.
 
@@ -138,23 +138,23 @@ When you are satisfied with your changes, follow the 
procedure in
 [[website_publish]]
 === Publishing the HBase Website and Documentation
 
-HBase uses the ASF's `gitpubsub` mechanism.
-. After generating the website and documentation
-artifacts using `mvn clean site site:stage`, check out the `asf-site` 
repository.
+HBase uses the ASF's `gitpubsub` mechanism. A Jenkins job runs the
+`dev-support/jenkins-scripts/generate-hbase-website.sh` script, which runs
+`mvn clean site site:stage` against the `master` branch of the `hbase`
+repository and commits the built artifacts to the `asf-site` branch of the
+`hbase-site` repository. When the commit is pushed, the website is redeployed
+automatically. If the script encounters an error, an email is sent to the
+developer mailing list. You can run the script manually or examine it to see 
the
+steps involved.
 
-. Remove previously-generated content using the following command:
-+
-----
-rm -rf rm -rf *apidocs* *book* *.html *.pdf* css js
-----
-+
-WARNING: Do not remove the `0.94/` directory. To regenerate them, you must 
check out
-the 0.94 branch and run `mvn clean site site:stage` from there, and then copy 
the
-artifacts to the 0.94/ directory of the `asf-site` branch.
-
-. Copy the contents of `target/staging` to the branch.
+[[website_check_links]]
+=== Checking the HBase Website for Broken Links
 
-. Add and commit your changes, and submit a patch for review.
+A Jenkins job runs periodically to check the HBase website for broken links, using
+the `dev-support/jenkins-scripts/check-website-links.sh` script. This script
+uses a tool called `linklint` to check for bad links and create a report. If
+broken links are found, an email is sent to the developer mailing list. You can
+run the script manually or examine it to see the steps involved.
 
 === HBase Reference Guide Style Guide and Cheat Sheet
 
@@ -216,7 +216,7 @@ link:http://www.google.com[Google]
 ----
 image::sunset.jpg[Alt Text]
 ----
-(put the image in the src/main/site/resources/images directory)
+(put the image in the src/site/resources/images directory)
 | An inline image | The image with alt text, as part of the text flow |
 ----
 image:sunset.jpg [Alt Text]
@@ -389,7 +389,7 @@ Inline images cannot have titles. They are generally small 
images like GUI butto
 image:sunset.jpg[Alt Text]
 ----
 
-When doing a local build, save the image to the 
_src/main/site/resources/images/_ directory.
+When doing a local build, save the image to the _src/site/resources/images/_ 
directory.
 When you link to the image, do not include the directory portion of the path.
 The image will be copied to the appropriate target location during the build 
of the output.
 

http://git-wip-us.apache.org/repos/asf/hbase/blob/2e9a55be/src/main/asciidoc/_chapters/architecture.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/architecture.adoc 
b/src/main/asciidoc/_chapters/architecture.adoc
index 930fa60..9a3cbd9 100644
--- a/src/main/asciidoc/_chapters/architecture.adoc
+++ b/src/main/asciidoc/_chapters/architecture.adoc
@@ -76,7 +76,7 @@ HBase can run quite well stand-alone on a laptop - but this 
should be considered
 [[arch.overview.hbasehdfs]]
 === What Is The Difference Between HBase and Hadoop/HDFS?
 
-link:http://hadoop.apache.org/hdfs/[HDFS] is a distributed file system that is 
well suited for the storage of large files.
+link:https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html[HDFS]
 is a distributed file system that is well suited for the storage of large 
files.
 Its documentation states that it is not, however, a general purpose file 
system, and does not provide fast individual record lookups in files.
 HBase, on the other hand, is built on top of HDFS and provides fast record 
lookups (and updates) for large tables.
 This can sometimes be a point of conceptual confusion.
@@ -88,31 +88,10 @@ See the <<datamodel>> and the rest of this chapter for more 
information on how H
 
 The catalog table `hbase:meta` exists as an HBase table and is filtered out of 
the HBase shell's `list` command, but is in fact a table just like any other.
 
-[[arch.catalog.root]]
-=== -ROOT-
-
-NOTE: The `-ROOT-` table was removed in HBase 0.96.0.
-Information here should be considered historical.
-
-The `-ROOT-` table kept track of the location of the `.META` table (the 
previous name for the table now called `hbase:meta`) prior to HBase 0.96.
-The `-ROOT-` table structure was as follows:
-
-.Key
-
-* .META.
-  region key (`.META.,,1`)
-
-.Values
-
-* `info:regioninfo` (serialized 
link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HRegionInfo.html[HRegionInfo]
 instance of `hbase:meta`)
-* `info:server` (server:port of the RegionServer holding `hbase:meta`)
-* `info:serverstartcode` (start-time of the RegionServer process holding 
`hbase:meta`)
-
 [[arch.catalog.meta]]
 === hbase:meta
 
-The `hbase:meta` table (previously called `.META.`) keeps a list of all 
regions in the system.
-The location of `hbase:meta` was previously tracked within the `-ROOT-` table, 
but is now stored in ZooKeeper.
+The `hbase:meta` table (previously called `.META.`) keeps a list of all 
regions in the system, and the location of `hbase:meta` is stored in ZooKeeper.
 
 The `hbase:meta` table structure is as follows:
 
@@ -122,7 +101,7 @@ The `hbase:meta` table structure is as follows:
 
 .Values
 
-* `info:regioninfo` (serialized 
link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HRegionInfo.html[HRegionInfo]
 instance for this region)
+* `info:regioninfo` (serialized 
link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HRegionInfo.html[HRegionInfo]
 instance for this region)
 * `info:server` (server:port of the RegionServer containing this region)
 * `info:serverstartcode` (start-time of the RegionServer process containing 
this region)
 
@@ -140,9 +119,7 @@ If a region has both an empty start and an empty end key, 
it is the only region
 ====
 
 In the (hopefully unlikely) event that programmatic processing of catalog 
metadata
-is required, see the
-+++<a 
href="http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/util/Writables.html#getHRegionInfo%28byte%5B%5D%29";>Writables</a>+++
-utility.
+is required, see the 
link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/client/RegionInfo.html#parseFrom-byte:A-[RegionInfo.parseFrom]
 utility.
 
 [[arch.catalog.startup]]
 === Startup Sequencing
@@ -164,7 +141,7 @@ Should a region be reassigned either by the master load 
balancer or because a Re
 
 See <<master.runtime>> for more information about the impact of the Master on 
HBase Client communication.
 
-Administrative functions are done via an instance of 
link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Admin.html[Admin]
+Administrative functions are done via an instance of 
link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Admin.html[Admin]
 
 [[client.connections]]
 === Cluster Connections
@@ -180,12 +157,12 @@ Finally, be sure to cleanup your `Connection` instance 
before exiting.
 `Connections` are heavyweight objects but thread-safe so you can create one 
for your application and keep the instance around.
 `Table`, `Admin` and `RegionLocator` instances are lightweight.
 Create as you go and then let go as soon as you are done by closing them.
-See the 
link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/package-summary.html[Client
 Package Javadoc Description] for example usage of the new HBase 1.0 API.
+See the 
link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/package-summary.html[Client
 Package Javadoc Description] for example usage of the new HBase 1.0 API.
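
For illustration, a minimal sketch of this lifecycle (the table name `myTable` 
and the row key are placeholders):

[source,java]
----
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

Configuration conf = HBaseConfiguration.create();
// One heavyweight, thread-safe Connection for the whole application.
try (Connection connection = ConnectionFactory.createConnection(conf);
     // Lightweight Table instance: create as you go, close when done.
     Table table = connection.getTable(TableName.valueOf("myTable"))) {
  Result result = table.get(new Get(Bytes.toBytes("row1")));
  System.out.println("Cell count: " + result.size());
}
----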
 
 ==== API before HBase 1.0.0
 
-Instances of `HTable` are the way to interact with an HBase cluster earlier 
than 1.0.0. 
_link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html[Table]
 instances are not thread-safe_. Only one thread can use an instance of Table 
at any given time.
-When creating Table instances, it is advisable to use the same 
link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HBaseConfiguration[HBaseConfiguration]
 instance.
+Instances of `HTable` are the way to interact with an HBase cluster earlier 
than 1.0.0. 
_link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html[Table]
 instances are not thread-safe_. Only one thread can use an instance of Table 
at any given time.
+When creating Table instances, it is advisable to use the same 
link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HBaseConfiguration[HBaseConfiguration]
 instance.
 This will ensure sharing of ZooKeeper and socket instances to the 
RegionServers which is usually what you want.
 For example, this is preferred:
 
@@ -206,7 +183,7 @@ HBaseConfiguration conf2 = HBaseConfiguration.create();
 HTable table2 = new HTable(conf2, "myTable");
 ----
 
-For more information about how connections are handled in the HBase client, 
see 
link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/ConnectionFactory.html[ConnectionFactory].
+For more information about how connections are handled in the HBase client, 
see 
link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/ConnectionFactory.html[ConnectionFactory].
 
 [[client.connection.pooling]]
 ===== Connection Pooling
@@ -230,19 +207,19 @@ try (Connection connection = 
ConnectionFactory.createConnection(conf);
 [WARNING]
 ====
 Previous versions of this guide discussed `HTablePool`, which was deprecated 
in HBase 0.94, 0.95, and 0.96, and removed in 0.98.1, by 
link:https://issues.apache.org/jira/browse/HBASE-6580[HBASE-6580], or 
`HConnection`, which is deprecated in HBase 1.0 by `Connection`.
-Please use 
link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Connection.html[Connection]
 instead.
+Please use 
link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Connection.html[Connection]
 instead.
 ====
 
 [[client.writebuffer]]
 === WriteBuffer and Batch Methods
 
-In HBase 1.0 and later, 
link:http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/client/HTable.html[HTable]
 is deprecated in favor of 
link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html[Table].
 `Table` does not use autoflush. To do buffered writes, use the BufferedMutator 
class.
+In HBase 1.0 and later, 
link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/client/HTable.html[HTable]
 is deprecated in favor of 
link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html[Table].
 `Table` does not use autoflush. To do buffered writes, use the BufferedMutator 
class.
 
-Before a `Table` or `HTable` instance is discarded, invoke either `close()` or 
`flushCommits()`, so `Put`s will not be lost.
+In HBase 2.0 and later, 
link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/client/HTable.html[HTable]
 does not use BufferedMutator to execute the ``Put`` operation. Refer to 
link:https://issues.apache.org/jira/browse/HBASE-18500[HBASE-18500] for more 
information.
 
 For additional information on write durability, review the 
link:/acid-semantics.html[ACID semantics] page.
 
-For fine-grained control of batching of ``Put``s or ``Delete``s, see the 
link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#batch%28java.util.List%29[batch]
 methods on Table.
+For fine-grained control of batching of ``Put``s or ``Delete``s, see the 
link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Table.html#batch-java.util.List-java.lang.Object:A-[batch]
 methods on Table.
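
For illustration, a sketch of the `BufferedMutator` approach mentioned above, 
assuming a `Configuration` named `conf` is in scope (table and column names 
are placeholders):

[source,java]
----
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

try (Connection connection = ConnectionFactory.createConnection(conf);
     BufferedMutator mutator =
         connection.getBufferedMutator(TableName.valueOf("myTable"))) {
  Put put = new Put(Bytes.toBytes("row1"));
  put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v"));
  mutator.mutate(put); // buffered locally, shipped when the buffer fills
  mutator.flush();     // or push any buffered mutations out immediately
}
----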
 
 [[async.client]]
 === Asynchronous Client ===
@@ -259,11 +236,11 @@ There are several differences for scan:
 * There is a `scanAll` method which will return all the results at once. It 
aims to provide a simpler way for small scans which you want to get the whole 
results at once usually.
 * The Observer Pattern. There is a scan method which accepts a 
`ScanResultConsumer` as a parameter. It will pass the results to the consumer.
 
-Notice that there are two types of asynchronous table, one is `AsyncTable` and 
the other is `RawAsyncTable`.
+Notice that the `AsyncTable` interface is templatized. The template parameter 
specifies the type of `ScanResultConsumerBase` used by scans, which means the 
observer-style scan APIs differ. The two types of scan consumers are 
`ScanResultConsumer` and `AdvancedScanResultConsumer`.
 
-For `AsyncTable`, you need to provide a thread pool when getting it. The 
callbacks registered to the returned CompletableFuture will be executed in that 
thread pool. It is designed for normal users. You are free to do anything in 
the callbacks.
+`ScanResultConsumer` needs a separate thread pool which is used to execute the 
callbacks registered to the returned CompletableFuture. Because the use of a 
separate thread pool frees up RPC threads, callbacks are free to do anything. 
Use this if the callbacks are not quick, or when in doubt.
 
-For `RawAsyncTable`, all the callbacks are executed inside the framework 
thread so it is not allowed to do time consuming works in the callbacks 
otherwise you may block the framework thread and cause very bad performance 
impact. It is designed for advanced users who want to write high performance 
code. You can see the `org.apache.hadoop.hbase.client.example.HttpProxyExample` 
to see how to write fully asynchronous code with `RawAsyncTable`. And 
coprocessor related methods are only in `RawAsyncTable`.
+`AdvancedScanResultConsumer` executes callbacks inside the framework thread. 
You are not allowed to do time-consuming work in the callbacks, or you will 
likely block the framework threads and cause a very bad performance impact. As 
its name suggests, it is designed for advanced users who want to write high 
performance code. See `org.apache.hadoop.hbase.client.example.HttpProxyExample` 
for how to write fully asynchronous code with it.
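
A minimal sketch of the thread pool variant, assuming the HBase 2.0 async API 
described above (`myTable` and the row key are placeholders):

[source,java]
----
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

ExecutorService pool = Executors.newFixedThreadPool(4);
try (AsyncConnection conn = ConnectionFactory
         .createAsyncConnection(HBaseConfiguration.create()).get()) {
  // Supplying a thread pool yields an AsyncTable<ScanResultConsumer>;
  // callbacks run in that pool rather than in framework threads.
  AsyncTable<ScanResultConsumer> table =
      conn.getTable(TableName.valueOf("myTable"), pool);
  table.get(new Get(Bytes.toBytes("row1")))
       .thenAccept(result -> System.out.println("Got: " + result))
       .join();
} finally {
  pool.shutdown();
}
----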
 
 [[async.admin]]
 === Asynchronous Admin ===
@@ -286,7 +263,7 @@ Information on non-Java clients and custom protocols is 
covered in <<external_ap
 [[client.filter]]
 == Client Request Filters
 
-link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Get.html[Get]
 and 
link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html[Scan]
 instances can be optionally configured with 
link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/Filter.html[filters]
 which are applied on the RegionServer.
+link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Get.html[Get]
 and 
link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html[Scan]
 instances can be optionally configured with 
link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/Filter.html[filters]
 which are applied on the RegionServer.
 
 Filters can be confusing because there are many different types, and it is 
best to approach them by understanding the groups of Filter functionality.
 
@@ -298,7 +275,7 @@ Structural Filters contain other Filters.
 [[client.filter.structural.fl]]
 ==== FilterList
 
-link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/FilterList.html[FilterList]
 represents a list of Filters with a relationship of 
`FilterList.Operator.MUST_PASS_ALL` or `FilterList.Operator.MUST_PASS_ONE` 
between the Filters.
+link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/FilterList.html[FilterList]
 represents a list of Filters with a relationship of 
`FilterList.Operator.MUST_PASS_ALL` or `FilterList.Operator.MUST_PASS_ONE` 
between the Filters.
 The following example shows an 'or' between two Filters (checking for either 
'my value' or 'my other value' on the same attribute).
 
 [source,java]
@@ -307,14 +284,14 @@ FilterList list = new 
FilterList(FilterList.Operator.MUST_PASS_ONE);
 SingleColumnValueFilter filter1 = new SingleColumnValueFilter(
   cf,
   column,
-  CompareOp.EQUAL,
+  CompareOperator.EQUAL,
   Bytes.toBytes("my value")
   );
 list.add(filter1);
 SingleColumnValueFilter filter2 = new SingleColumnValueFilter(
   cf,
   column,
-  CompareOp.EQUAL,
+  CompareOperator.EQUAL,
   Bytes.toBytes("my other value")
   );
 list.add(filter2);
@@ -328,9 +305,9 @@ scan.setFilter(list);
 ==== SingleColumnValueFilter
 
 A SingleColumnValueFilter (see:
-http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/SingleColumnValueFilter.html)
-can be used to test column values for equivalence (`CompareOp.EQUAL`),
-inequality (`CompareOp.NOT_EQUAL`), or ranges (e.g., `CompareOp.GREATER`). The 
following is an
+https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/SingleColumnValueFilter.html)
+can be used to test column values for equivalence (`CompareOperator.EQUAL`),
+inequality (`CompareOperator.NOT_EQUAL`), or ranges (e.g., 
`CompareOperator.GREATER`). The following is an
 example of testing equivalence of a column to a String value "my value"...
 
 [source,java]
@@ -338,7 +315,7 @@ example of testing equivalence of a column to a String 
value "my value"...
 SingleColumnValueFilter filter = new SingleColumnValueFilter(
   cf,
   column,
-  CompareOp.EQUAL,
+  CompareOperator.EQUAL,
   Bytes.toBytes("my value")
   );
 scan.setFilter(filter);
@@ -353,7 +330,7 @@ These Comparators are used in concert with other Filters, 
such as <<client.filte
 [[client.filter.cvp.rcs]]
 ==== RegexStringComparator
 
-link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/RegexStringComparator.html[RegexStringComparator]
 supports regular expressions for value comparisons.
+link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/RegexStringComparator.html[RegexStringComparator]
 supports regular expressions for value comparisons.
 
 [source,java]
 ----
@@ -361,7 +338,7 @@ RegexStringComparator comp = new 
RegexStringComparator("my.");   // any value th
 SingleColumnValueFilter filter = new SingleColumnValueFilter(
   cf,
   column,
-  CompareOp.EQUAL,
+  CompareOperator.EQUAL,
   comp
   );
 scan.setFilter(filter);
@@ -372,7 +349,7 @@ See the Oracle JavaDoc for 
link:http://download.oracle.com/javase/6/docs/api/jav
 [[client.filter.cvp.substringcomparator]]
 ==== SubstringComparator
 
-link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/SubstringComparator.html[SubstringComparator]
 can be used to determine if a given substring exists in a value.
+link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/SubstringComparator.html[SubstringComparator]
 can be used to determine if a given substring exists in a value.
 The comparison is case-insensitive.
 
 [source,java]
@@ -382,7 +359,7 @@ SubstringComparator comp = new SubstringComparator("y 
val");   // looking for 'm
 SingleColumnValueFilter filter = new SingleColumnValueFilter(
   cf,
   column,
-  CompareOp.EQUAL,
+  CompareOperator.EQUAL,
   comp
   );
 scan.setFilter(filter);
@@ -391,12 +368,12 @@ scan.setFilter(filter);
 [[client.filter.cvp.bfp]]
 ==== BinaryPrefixComparator
 
-See 
link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/BinaryPrefixComparator.html[BinaryPrefixComparator].
+See 
link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/BinaryPrefixComparator.html[BinaryPrefixComparator].
 
 [[client.filter.cvp.bc]]
 ==== BinaryComparator
 
-See 
link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/BinaryComparator.html[BinaryComparator].
+See 
link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/BinaryComparator.html[BinaryComparator].
 
 [[client.filter.kvm]]
 === KeyValue Metadata
@@ -406,18 +383,18 @@ As HBase stores data internally as KeyValue pairs, 
KeyValue Metadata Filters eva
 [[client.filter.kvm.ff]]
 ==== FamilyFilter
 
-link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/FamilyFilter.html[FamilyFilter]
 can be used to filter on the ColumnFamily.
+link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/FamilyFilter.html[FamilyFilter]
 can be used to filter on the ColumnFamily.
 It is generally a better idea to select ColumnFamilies in the Scan than to do 
it with a Filter.
 
 [[client.filter.kvm.qf]]
 ==== QualifierFilter
 
-link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/QualifierFilter.html[QualifierFilter]
 can be used to filter based on Column (aka Qualifier) name.
+link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/QualifierFilter.html[QualifierFilter]
 can be used to filter based on Column (aka Qualifier) name.
 
 [[client.filter.kvm.cpf]]
 ==== ColumnPrefixFilter
 
-link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/ColumnPrefixFilter.html[ColumnPrefixFilter]
 can be used to filter based on the lead portion of Column (aka Qualifier) 
names.
+link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/ColumnPrefixFilter.html[ColumnPrefixFilter]
 can be used to filter based on the lead portion of Column (aka Qualifier) 
names.
 
 A ColumnPrefixFilter seeks ahead to the first column matching the prefix in 
each row and for each involved column family.
 It can be used to efficiently get a subset of the columns in very wide rows.
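
For illustration, a sketch of setting this filter on a scan (family `cf` and 
prefix `abc` are placeholders):

[source,java]
----
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.ColumnPrefixFilter;
import org.apache.hadoop.hbase.util.Bytes;

// Return only columns whose qualifier starts with "abc" in family "cf".
Scan scan = new Scan();
scan.addFamily(Bytes.toBytes("cf"));
scan.setFilter(new ColumnPrefixFilter(Bytes.toBytes("abc")));
----
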
@@ -450,7 +427,7 @@ rs.close();
 [[client.filter.kvm.mcpf]]
 ==== MultipleColumnPrefixFilter
 
-link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/MultipleColumnPrefixFilter.html[MultipleColumnPrefixFilter]
 behaves like ColumnPrefixFilter but allows specifying multiple prefixes.
+link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/MultipleColumnPrefixFilter.html[MultipleColumnPrefixFilter]
 behaves like ColumnPrefixFilter but allows specifying multiple prefixes.
 
 Like ColumnPrefixFilter, MultipleColumnPrefixFilter efficiently seeks ahead to 
the first column matching the lowest prefix and also seeks past ranges of 
columns between prefixes.
 It can be used to efficiently get discontinuous sets of columns from very wide 
rows.
@@ -480,7 +457,7 @@ rs.close();
 [[client.filter.kvm.crf]]
 ==== ColumnRangeFilter
 
-A 
link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/ColumnRangeFilter.html[ColumnRangeFilter]
 allows efficient intra row scanning.
+A 
link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/ColumnRangeFilter.html[ColumnRangeFilter]
 allows efficient intra row scanning.
 
 A ColumnRangeFilter can seek ahead to the first matching column for each 
involved column family.
 It can be used to efficiently get a 'slice' of the columns of a very wide row.
@@ -521,7 +498,7 @@ Note:  Introduced in HBase 0.92
 [[client.filter.row.rf]]
 ==== RowFilter
 
-It is generally a better idea to use the startRow/stopRow methods on Scan for 
row selection, however 
link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/RowFilter.html[RowFilter]
 can also be used.
+It is generally a better idea to use the startRow/stopRow methods on Scan for 
row selection, however 
link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/RowFilter.html[RowFilter]
 can also be used.
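
For illustration, a sketch of the preferred range-based selection (the row 
keys are placeholders):

[source,java]
----
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

// Restrict the scan to [startRow, stopRow) rather than filtering rows out.
Scan scan = new Scan();
scan.setStartRow(Bytes.toBytes("row-0100")); // inclusive
scan.setStopRow(Bytes.toBytes("row-0200"));  // exclusive
----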
 
 [[client.filter.utility]]
 === Utility
@@ -530,7 +507,7 @@ It is generally a better idea to use the startRow/stopRow 
methods on Scan for ro
 ==== FirstKeyOnlyFilter
 
 This is primarily used for rowcount jobs.
-See 
link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/FirstKeyOnlyFilter.html[FirstKeyOnlyFilter].
+See 
link:https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/FirstKeyOnlyFilter.html[FirstKeyOnlyFilter].
 
 [[architecture.master]]
 == Master
@@ -580,7 +557,7 @@ See <<regions.arch.assignment>> for more information on 
region assignment.
 ==== CatalogJanitor
 
 Periodically checks and cleans up the `hbase:meta` table.
-See <arch.catalog.meta>> for more information on the meta table.
+See <<arch.catalog.meta>> for more information on the meta table.
 
 [[regionserver.arch]]
 == RegionServer
@@ -657,7 +634,7 @@ However, latencies tend to be less erratic across time, 
because there is less ga
 If the BucketCache is deployed in off-heap mode, this memory is not managed by 
the GC at all.
 This is why you'd use BucketCache, so your latencies are less erratic and to 
mitigate GCs and heap fragmentation.
 See Nick Dimiduk's link:http://www.n10k.com/blog/blockcache-101/[BlockCache 
101] for comparisons running on-heap vs off-heap tests.
-Also see link:http://people.apache.org/~stack/bc/[Comparing BlockCache 
Deploys] which finds that if your dataset fits inside your LruBlockCache 
deploy, use it otherwise if you are experiencing cache churn (or you want your 
cache to exist beyond the vagaries of java GC), use BucketCache.
+Also see link:https://people.apache.org/~stack/bc/[Comparing BlockCache 
Deploys] which finds that if your dataset fits inside your LruBlockCache 
deploy, use it otherwise if you are experiencing cache churn (or you want your 
cache to exist beyond the vagaries of java GC), use BucketCache.
 
 When you enable BucketCache, you are enabling a two tier caching system, an L1 
cache which is implemented by an instance of LruBlockCache and an off-heap L2 
cache which is implemented by BucketCache.
 Management of these two tiers and the policy that dictates how blocks move 
between them is done by `CombinedBlockCache`.
@@ -668,7 +645,7 @@ See <<offheap.blockcache>> for more detail on going 
off-heap.
 ==== General Cache Configurations
 
 Apart from the cache implementation itself, you can set some general 
configuration options to control how the cache performs.
-See 
http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/io/hfile/CacheConfig.html.
+See 
https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/io/hfile/CacheConfig.html.
 After setting any of these options, restart or rolling restart your cluster 
for the configuration to take effect.
 Check logs for errors or unexpected behavior.
 
@@ -730,9 +707,9 @@ Your data is not the only resident of the block cache.
 Here are others that you may have to take into account:
 
 Catalog Tables::
-  The `-ROOT-` (prior to HBase 0.96, see 
<<arch.catalog.root,arch.catalog.root>>) and `hbase:meta` tables are forced 
into the block cache and have the in-memory priority which means that they are 
harder to evict.
-  The former never uses more than a few hundred bytes while the latter can 
occupy a few MBs
-  (depending on the number of regions).
+  The `hbase:meta` table is forced into the block cache and has the in-memory 
priority, which means that it is harder to evict.
+
+NOTE: The `hbase:meta` table can occupy a few MBs depending on the number of 
regions.
 
 HFiles Indexes::
   An _HFile_ is the file format that HBase uses to store data in HDFS.
@@ -778,7 +755,7 @@ Since 
link:https://issues.apache.org/jira/browse/HBASE-4683[HBASE-4683 Always ca
 ===== How to Enable BucketCache
 
 The usual deploy of BucketCache is via a managing class that sets up two 
caching tiers: an L1 on-heap cache implemented by LruBlockCache and a second L2 
cache implemented with BucketCache.
-The managing class is 
link:http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/io/hfile/CombinedBlockCache.html[CombinedBlockCache]
 by default.
+The managing class is 
link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/io/hfile/CombinedBlockCache.html[CombinedBlockCache]
 by default.
 The previous link describes the caching 'policy' implemented by 
CombinedBlockCache.
 In short, it works by keeping meta blocks -- INDEX and BLOOM in the L1, 
on-heap LruBlockCache tier -- and DATA blocks are kept in the L2, BucketCache 
tier.
 It is possible to amend this behavior in HBase since version 1.0 and ask that 
a column family have both its meta and DATA blocks hosted on-heap in the L1 
tier by setting `cacheDataInL1` via `(HColumnDescriptor.setCacheDataInL1(true)` 
or in the shell, creating or amending column families setting 
`CACHE_DATA_IN_L1` to true: e.g.
@@ -904,7 +881,7 @@ The compressed BlockCache is disabled by default. To enable 
it, set `hbase.block
 
 As write requests are handled by the region server, they accumulate in an 
in-memory storage system called the _memstore_. Once the memstore fills, its 
content are written to disk as additional store files. This event is called a 
_memstore flush_. As store files accumulate, the RegionServer will 
<<compaction,compact>> them into fewer, larger files. After each flush or 
compaction finishes, the amount of data stored in the region has changed. The 
RegionServer consults the region split policy to determine if the region has 
grown too large or should be split for another policy-specific reason. A region 
split request is enqueued if the policy recommends it.
 
-Logically, the process of splitting a region is simple. We find a suitable 
point in the keyspace of the region where we should divide the region in half, 
then split the region's data into two new regions at that point. The details of 
the process however are not simple.  When a split happens, the newly created 
_daughter regions_ do not rewrite all the data into new files immediately. 
Instead, they create small files similar to symbolic link files, named 
link:http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/io/Reference.html[Reference
 files], which point to either the top or bottom part of the parent store file 
according to the split point. The reference file is used just like a regular 
data file, but only half of the records are considered. The region can only be 
split if there are no more references to the immutable data files of the parent 
region. Those reference files are cleaned gradually by compactions, so that the 
region will stop referring to its parents files, and can be split further.
+Logically, the process of splitting a region is simple. We find a suitable 
point in the keyspace of the region where we should divide the region in half, 
then split the region's data into two new regions at that point. The details of 
the process however are not simple.  When a split happens, the newly created 
_daughter regions_ do not rewrite all the data into new files immediately. 
Instead, they create small files similar to symbolic link files, named 
link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/io/Reference.html[Reference
 files], which point to either the top or bottom part of the parent store file 
according to the split point. The reference file is used just like a regular 
data file, but only half of the records are considered. The region can only be 
split if there are no more references to the immutable data files of the parent 
region. Those reference files are cleaned gradually by compactions, so that the 
region will stop referring to its parents files, and 
 can be split further.
 
 Although splitting the region is a local decision made by the RegionServer, 
the split process itself must coordinate with many actors. The RegionServer 
notifies the Master before and after the split, updates the `.META.` table so 
that clients can discover the new daughter regions, and rearranges the 
directory structure and data files in HDFS. Splitting is a multi-task process. 
To enable rollback in case of an error, the RegionServer keeps an in-memory 
journal about the execution state. The steps taken by the RegionServer to 
execute the split are illustrated in <<regionserver_split_process_image>>. Each 
step is labeled with its step number. Actions from RegionServers or Master are 
shown in red, while actions from the clients are show in green.
 
@@ -938,7 +915,7 @@ Under normal operations, the WAL is not needed because data 
changes move from th
 However, if a RegionServer crashes or becomes unavailable before the MemStore 
is flushed, the WAL ensures that the changes to the data can be replayed.
 If writing to the WAL fails, the entire operation to modify the data fails.
 
-HBase uses an implementation of the 
link:http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/wal/WAL.html[WAL]
 interface.
+HBase uses an implementation of the 
link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/wal/WAL.html[WAL]
 interface.
 Usually, there is only one instance of a WAL per RegionServer.
 The RegionServer records Puts and Deletes to it, before recording them to the 
<<store.memstore>> for the affected <<store>>.
 
@@ -1191,30 +1168,21 @@ Due to an asynchronous implementation, in very rare 
cases, the split log manager
 For that reason, it periodically checks for remaining uncompleted task in its 
task map or ZooKeeper.
 If none are found, it throws an exception so that the log splitting can be 
retried right away instead of hanging there waiting for something that won't 
happen.
 
+[[wal.compression]]
+==== WAL Compression ====
 
-[[distributed.log.replay]]
-====== Distributed Log Replay
-
-After a RegionServer fails, its failed regions are assigned to another 
RegionServer, which are marked as "recovering" in ZooKeeper.
-A split log worker directly replays edits from the WAL of the failed 
RegionServer to the regions at its new location.
-When a region is in "recovering" state, it can accept writes but no reads 
(including Append and Increment), region splits or merges.
-
-Distributed Log Replay extends the <<distributed.log.splitting>> framework.
-It works by directly replaying WAL edits to another RegionServer instead of 
creating _recovered.edits_ files.
-It provides the following advantages over distributed log splitting alone:
-
-* It eliminates the overhead of writing and reading a large number of 
_recovered.edits_ files.
-  It is not unusual for thousands of _recovered.edits_ files to be created and 
written concurrently during a RegionServer recovery.
-  Many small random writes can degrade overall system performance.
-* It allows writes even when a region is in recovering state.
-  It only takes seconds for a recovering region to accept writes again.
+The content of the WAL can be compressed using LRU Dictionary compression.
+This can be used to speed up WAL replication to different datanodes.
+The dictionary can store up to 2^15^ elements; eviction starts after this 
number is exceeded.
 
-.Enabling Distributed Log Replay
-To enable distributed log replay, set `hbase.master.distributed.log.replay` to 
`true`.
-This will be the default for HBase 0.99 
(link:https://issues.apache.org/jira/browse/HBASE-10888[HBASE-10888]).
+To enable WAL compression, set the `hbase.regionserver.wal.enablecompression` 
property to `true`.
+The default value for this property is `false`.
+By default, WAL tag compression is turned on when WAL compression is enabled.
+You can turn off WAL tag compression by setting the 
`hbase.regionserver.wal.tags.enablecompression` property to 'false'.
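
For example, a sketch of the corresponding _hbase-site.xml_ entry:

[source,xml]
----
<property>
  <name>hbase.regionserver.wal.enablecompression</name>
  <value>true</value>
</property>
----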
 
-You must also enable HFile version 3 (which is the default HFile format 
starting in HBase 0.99.
-See link:https://issues.apache.org/jira/browse/HBASE-10855[HBASE-10855]). 
Distributed log replay is unsafe for rolling upgrades.
+A possible downside to WAL compression is that we lose more data from the last 
block in the WAL if it is ill-terminated mid-write. If entries in this last 
block were added with new dictionary entries but we failed to persist the 
amended dictionary because of an abrupt termination, a read of this last block 
may not be able to resolve the last-written entries.
 
 [[wal.disable]]
 ==== Disabling the WAL
@@ -1396,12 +1364,12 @@ The HDFS client does the following by default when 
choosing locations to write r
 . Second replica is written to a random node on another rack
 . Third replica is written on the same rack as the second, but on a different 
node chosen randomly
 . Subsequent replicas are written on random nodes on the cluster.
-  See _Replica Placement: The First Baby Steps_ on this page: 
link:http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html[HDFS
 Architecture]
+  See _Replica Placement: The First Baby Steps_ on this page: 
link:https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html[HDFS
 Architecture]
 
 Thus, HBase eventually achieves locality for a region after a flush or a 
compaction.
 In a RegionServer failover situation a RegionServer may be assigned regions 
with non-local StoreFiles (because none of the replicas are local), however as 
new data is written in the region, or the table is compacted and StoreFiles are 
re-written, they will become "local" to the RegionServer.
 
-For more information, see _Replica Placement: The First Baby Steps_ on this 
page: 
link:http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html[HDFS
 Architecture] and also Lars George's blog on 
link:http://www.larsgeorge.com/2010/05/hbase-file-locality-in-hdfs.html[HBase 
and HDFS locality].
+For more information, see _Replica Placement: The First Baby Steps_ on this 
page: 
link:https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html[HDFS
 Architecture] and also Lars George's blog on 
link:http://www.larsgeorge.com/2010/05/hbase-file-locality-in-hdfs.html[HBase 
and HDFS locality].
 
 [[arch.region.splits]]
 === Region Splits
@@ -1416,9 +1384,9 @@ See <<disable.splitting>> for how to manually manage 
splits (and for why you mig
 
 ==== Custom Split Policies
 You can override the default split policy using a custom
-link:http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/regionserver/RegionSplitPolicy.html[RegionSplitPolicy](HBase
 0.94+).
+link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/regionserver/RegionSplitPolicy.html[RegionSplitPolicy](HBase
 0.94+).
 Typically a custom split policy should extend HBase's default split policy:
-link:http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/regionserver/IncreasingToUpperBoundRegionSplitPolicy.html[IncreasingToUpperBoundRegionSplitPolicy].
+link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/regionserver/IncreasingToUpperBoundRegionSplitPolicy.html[IncreasingToUpperBoundRegionSplitPolicy].
 
 The policy can be set globally through the HBase configuration or on a per-table
 basis.
@@ -1492,13 +1460,13 @@ Using a Custom Algorithm::
   As parameters, you give it the algorithm, desired number of regions, and 
column families.
   It includes two split algorithms.
   The first is the
-  
`link:http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/util/RegionSplitter.HexStringSplit.html[HexStringSplit]`
+  
`link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/util/RegionSplitter.HexStringSplit.html[HexStringSplit]`
   algorithm, which assumes the row keys are hexadecimal strings.
   The second,
-  
`link:http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/util/RegionSplitter.UniformSplit.html[UniformSplit]`,
+  
`link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/util/RegionSplitter.UniformSplit.html[UniformSplit]`,
   assumes the row keys are random byte arrays.
   You will probably need to develop your own
-  
`link:http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/util/RegionSplitter.SplitAlgorithm.html[SplitAlgorithm]`,
+  
`link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/util/RegionSplitter.SplitAlgorithm.html[SplitAlgorithm]`,
   using the provided ones as models.
 
 === Online Region Merges
@@ -1574,7 +1542,7 @@ StoreFiles are where your data lives.
 
 ===== HFile Format
 
-The _HFile_ file format is based on the SSTable file described in the 
link:http://research.google.com/archive/bigtable.html[BigTable [2006]] paper 
and on Hadoop's 
link:http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/file/tfile/TFile.html[TFile]
 (The unit test suite and the compression harness were taken directly from 
TFile). Schubert Zhang's blog post on 
link:http://cloudepr.blogspot.com/2009/09/hfile-block-indexed-file-format-to.html[HFile:
 A Block-Indexed File Format to Store Sorted Key-Value Pairs] makes for a 
thorough introduction to HBase's HFile.
+The _HFile_ file format is based on the SSTable file described in the 
link:http://research.google.com/archive/bigtable.html[BigTable [2006]] paper 
and on Hadoop's 
link:https://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/file/tfile/TFile.html[TFile]
 (The unit test suite and the compression harness were taken directly from 
TFile). Schubert Zhang's blog post on 
link:http://cloudepr.blogspot.com/2009/09/hfile-block-indexed-file-format-to.html[HFile:
 A Block-Indexed File Format to Store Sorted Key-Value Pairs] makes for a 
thorough introduction to HBase's HFile.
 Matteo Bertozzi has also put up a helpful description, 
link:http://th30z.blogspot.com/2011/02/hbase-io-hfile.html?spref=tw[HBase I/O: 
HFile].
 
 For more information, see the HFile source code.
@@ -2086,6 +2054,107 @@ Why?
 
 NOTE: This information is now included in the configuration parameter table in 
<<compaction.parameters>>.
 
+[[ops.date.tiered]]
+===== Date Tiered Compaction
+
+Date tiered compaction is a date-aware store file compaction strategy that is 
beneficial for time-range scans for time-series data.
+
+[[ops.date.tiered.when]]
+===== When To Use Date Tiered Compactions
+
+Consider using Date Tiered Compaction for reads for limited time ranges, 
especially scans of recent data.
+
+Don't use it for:
+
+* random gets without a limited time range
+* frequent deletes and updates
+* frequent out-of-order data writes creating long tails, especially writes 
with future timestamps
+* frequent bulk loads with heavily overlapping time ranges
+
+.Performance Improvements
+Performance testing has shown that the performance of time-range scans improve 
greatly for limited time ranges, especially scans of recent data.
+
+[[ops.date.tiered.enable]]
+====== Enabling Date Tiered Compaction
+
+You can enable Date Tiered compaction for a table or a column family, by 
setting its `hbase.hstore.engine.class` to 
`org.apache.hadoop.hbase.regionserver.DateTieredStoreEngine`.
+
+You also need to set `hbase.hstore.blockingStoreFiles` to a high number, such 
as 60, if using all default settings, rather than the default value of 12. Use 
1.5~2 x the projected file count if changing the parameters. Projected file 
count = windows per tier x tier count + incoming window min + files older than 
max age.
+
+You also need to set `hbase.hstore.compaction.max` to the same value as 
`hbase.hstore.blockingStoreFiles` to unblock major compaction.
+
+.Procedure: Enable Date Tiered Compaction
+. Run one of following commands in the HBase shell.
+  Replace the table name `orders_table` with the name of your table.
++
+[source,sql]
+----
+alter 'orders_table', CONFIGURATION => {'hbase.hstore.engine.class' => 
'org.apache.hadoop.hbase.regionserver.DateTieredStoreEngine', 
'hbase.hstore.blockingStoreFiles' => '60', 'hbase.hstore.compaction.min'=>'2', 
'hbase.hstore.compaction.max'=>'60'}
+alter 'orders_table', {NAME => 'blobs_cf', CONFIGURATION => 
{'hbase.hstore.engine.class' => 
'org.apache.hadoop.hbase.regionserver.DateTieredStoreEngine', 
'hbase.hstore.blockingStoreFiles' => '60', 'hbase.hstore.compaction.min'=>'2', 
'hbase.hstore.compaction.max'=>'60'}}
+create 'orders_table', 'blobs_cf', CONFIGURATION => 
{'hbase.hstore.engine.class' => 
'org.apache.hadoop.hbase.regionserver.DateTieredStoreEngine', 
'hbase.hstore.blockingStoreFiles' => '60', 'hbase.hstore.compaction.min'=>'2', 
'hbase.hstore.compaction.max'=>'60'}
+----
+
+. Configure other options if needed.
+  See <<ops.date.tiered.config>> for more information.
+
+.Procedure: Disable Date Tiered Compaction
+. Set the `hbase.hstore.engine.class` option to either nil or 
`org.apache.hadoop.hbase.regionserver.DefaultStoreEngine`.
+  Either option has the same effect.
+  Make sure you set the other options you changed to the original settings too.
++
+[source,sql]
+----
+alter 'orders_table', CONFIGURATION => {'hbase.hstore.engine.class' => 
'org.apache.hadoop.hbase.regionserver.DefaultStoreEngine', 
'hbase.hstore.blockingStoreFiles' => '12', 'hbase.hstore.compaction.min'=>'6', 
'hbase.hstore.compaction.max'=>'12'}
+----
+
+When you change the store engine either way, a major compaction will likely be 
performed on most regions.
+This is not necessary on new tables.
+
+[[ops.date.tiered.config]]
+====== Configuring Date Tiered Compaction
+
+Each of the settings for date tiered compaction should be configured at the 
table or column family level.
+If you use HBase shell, the general command pattern is as follows:
+
+[source,sql]
+----
+alter 'orders_table', CONFIGURATION => {'key' => 'value', ..., 'key' => 
'value'}
+----
+
+[[ops.date.tiered.config.parameters]]
+.Tier Parameters
+
+You can configure your date tiers by changing the settings for the following 
parameters:
+
+.Date Tier Parameters
+[cols="1,1a", frame="all", options="header"]
+|===
+| Setting
+| Notes
+
+|`hbase.hstore.compaction.date.tiered.max.storefile.age.millis`
+|Files with max-timestamp smaller than this will no longer be compacted. 
Default at Long.MAX_VALUE.
+
+| `hbase.hstore.compaction.date.tiered.base.window.millis`
+| Base window size in milliseconds. Default at 6 hours.
+
+| `hbase.hstore.compaction.date.tiered.windows.per.tier`
+| Number of windows per tier. Default at 4.
+
+| `hbase.hstore.compaction.date.tiered.incoming.window.min`
+| Minimal number of files to compact in the incoming window. Set it to 
expected number of files in the window to avoid wasteful compaction. Default at 
6.
+
+| `hbase.hstore.compaction.date.tiered.window.policy.class`
+| The policy to select store files within the same time window. It doesn’t 
apply to the incoming window. Default at exploring compaction. This is to avoid 
wasteful compaction.
+|===
+
+[[ops.date.tiered.config.compaction.throttler]]
+.Compaction Throttler
+
+With tiered compaction all servers in the cluster will promote windows to a 
higher tier at the same time, so using a compaction throttle is recommended: 
set `hbase.regionserver.throughput.controller` to 
`org.apache.hadoop.hbase.regionserver.compactions.PressureAwareCompactionThroughputController`.
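
A sketch of the corresponding _hbase-site.xml_ entry:

[source,xml]
----
<property>
  <name>hbase.regionserver.throughput.controller</name>
  <value>org.apache.hadoop.hbase.regionserver.compactions.PressureAwareCompactionThroughputController</value>
</property>
----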
+
+NOTE: For more information about date tiered compaction, please refer to the 
design specification at 
https://docs.google.com/document/d/1_AmlNb2N8Us1xICsTeGDLKIqL6T-oHoRLZ323MG_uy8

 [[ops.stripe]]
 ===== Experimental: Stripe Compactions
 
@@ -2299,7 +2368,7 @@ See the `LoadIncrementalHFiles` class for more 
information.
 
 As HBase runs on HDFS (and each StoreFile is written as a file on HDFS), it is 
important to have an understanding of the HDFS Architecture especially in terms 
of how it stores files, handles failovers, and replicates blocks.
 
-See the Hadoop documentation on 
link:http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html[HDFS
 Architecture] for more information.
+See the Hadoop documentation on 
link:https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html[HDFS
 Architecture] for more information.
 
 [[arch.hdfs.nn]]
 === NameNode
@@ -2703,7 +2772,7 @@ if (result.isStale()) {
 === Resources
 
 . More information about the design and implementation can be found at the 
jira issue: link:https://issues.apache.org/jira/browse/HBASE-10070[HBASE-10070]
-. HBaseCon 2014 link:http://hbasecon.com/sessions/#session15[talk] also 
contains some details and 
link:http://www.slideshare.net/enissoz/hbase-high-availability-for-reads-with-time[slides].
+. HBaseCon 2014 talk: 
link:https://hbase.apache.org/www.hbasecon.com/#2014-PresentationsRecordings[HBase
 Read High Availability Using Timeline-Consistent Region Replicas] also 
contains some details and 
link:http://www.slideshare.net/enissoz/hbase-high-availability-for-reads-with-time[slides].
 
 ifdef::backend-docbook[]
 [index]

http://git-wip-us.apache.org/repos/asf/hbase/blob/2e9a55be/src/main/asciidoc/_chapters/asf.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/asf.adoc 
b/src/main/asciidoc/_chapters/asf.adoc
index 47c29e5..18cf95a 100644
--- a/src/main/asciidoc/_chapters/asf.adoc
+++ b/src/main/asciidoc/_chapters/asf.adoc
@@ -35,13 +35,13 @@ HBase is a project in the Apache Software Foundation and as 
such there are respo
 [[asf.devprocess]]
 === ASF Development Process
 
-See the link:http://www.apache.org/dev/#committers[Apache Development Process 
page]            for all sorts of information on how the ASF is structured 
(e.g., PMC, committers, contributors), to tips on contributing and getting 
involved, and how open-source works at ASF.
+See the link:https://www.apache.org/dev/#committers[Apache Development Process 
page]            for all sorts of information on how the ASF is structured 
(e.g., PMC, committers, contributors), to tips on contributing and getting 
involved, and how open-source works at ASF.
 
 [[asf.reporting]]
 === ASF Board Reporting
 
 Once a quarter, each project in the ASF portfolio submits a report to the ASF 
board.
 This is done by the HBase project lead and the committers.
-See link:http://www.apache.org/foundation/board/reporting[ASF board reporting] 
for more information.
+See link:https://www.apache.org/foundation/board/reporting[ASF board 
reporting] for more information.
 
 :numbered:
