Repository: hbase
Updated Branches:
  refs/heads/branch-2.0 84b54f45d -> d49fca134


http://git-wip-us.apache.org/repos/asf/hbase/blob/d49fca13/src/main/asciidoc/_chapters/ops_mgt.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/ops_mgt.adoc 
b/src/main/asciidoc/_chapters/ops_mgt.adoc
index 1eeaaa1..feccb5d 100644
--- a/src/main/asciidoc/_chapters/ops_mgt.adoc
+++ b/src/main/asciidoc/_chapters/ops_mgt.adoc
@@ -898,7 +898,8 @@ $ bin/hbase pre-upgrade validate-cp -table .*
 It validates every table level co-processor where the table name matches the `.*` regular expression.
 
 ==== DataBlockEncoding validation
-HBase 2.0 removed `PREFIX_TREE` Data Block Encoding from column families.
+HBase 2.0 removed `PREFIX_TREE` Data Block Encoding from column families. For further information,
+please check <<upgrade2.0.prefix-tree.removed,_prefix-tree_ encoding removed>>.
 To verify that none of the column families are using incompatible Data Block 
Encodings in the cluster run the following command.
 
 [source, bash]
@@ -906,8 +907,127 @@ To verify that none of the column families are using 
incompatible Data Block Enc
 $ bin/hbase pre-upgrade validate-dbe
 ----
 
-This check validates all column families and print out any incompatibilities.
-To change `PREFIX_TREE` encoding to supported one check 
<<upgrade2.0.prefix-tree.removed,_prefix-tree_ encoding removed>>.
+This check validates all column families and prints out any incompatibilities. For example:
+
+----
+2018-07-13 09:58:32,028 WARN  [main] tool.DataBlockEncodingValidator: 
Incompatible DataBlockEncoding for table: t, cf: f, encoding: PREFIX_TREE
+----
+
+This means that the Data Block Encoding of table `t`, column family `f` is incompatible.
+To fix it, use the `alter` command in the HBase shell:
+
+----
+alter 't', { NAME => 'f', DATA_BLOCK_ENCODING => 'FAST_DIFF' }
+----
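+
+To double-check that the new encoding is in place, you can, for example, describe the table in the HBase shell and inspect the `DATA_BLOCK_ENCODING` attribute of column family `f`:
+
+----
+describe 't'
+----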
+
+Please also validate the HFiles, as described in the next section.
+
+==== HFile Content validation
+Even though the Data Block Encoding has been changed from `PREFIX_TREE`, it is still possible to have HFiles that contain data encoded that way.
+To verify that the HFiles are readable with HBase 2, please use the _HFile content validator_.
+
+[source, bash]
+----
+$ bin/hbase pre-upgrade validate-hfile
+----
+
+The tool will log the corrupt HFiles and details about the root cause.
+If the problem is related to `PREFIX_TREE` encoding, it is necessary to change the encodings before upgrading to HBase 2.
+
+The following log message shows an example of incorrect HFiles.
+
+----
+2018-06-05 16:20:46,976 WARN  [hfilevalidator-pool1-t3] 
hbck.HFileCorruptionChecker: Found corrupt HFile 
hdfs://example.com:8020/hbase/data/default/t/72ea7f7d625ee30f959897d1a3e2c350/prefix/7e6b3d73263c4851bf2b8590a9b3791e
+org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading HFile 
Trailer from file 
hdfs://example.com:8020/hbase/data/default/t/72ea7f7d625ee30f959897d1a3e2c350/prefix/7e6b3d73263c4851bf2b8590a9b3791e
+    ...
+Caused by: java.io.IOException: Invalid data block encoding type in file info: 
PREFIX_TREE
+    ...
+Caused by: java.lang.IllegalArgumentException: No enum constant 
org.apache.hadoop.hbase.io.encoding.DataBlockEncoding.PREFIX_TREE
+    ...
+2018-06-05 16:20:47,322 INFO  [main] tool.HFileContentValidator: Corrupted 
file: 
hdfs://example.com:8020/hbase/data/default/t/72ea7f7d625ee30f959897d1a3e2c350/prefix/7e6b3d73263c4851bf2b8590a9b3791e
+2018-06-05 16:20:47,383 INFO  [main] tool.HFileContentValidator: Corrupted 
file: 
hdfs://example.com:8020/hbase/archive/data/default/t/56be41796340b757eb7fff1eb5e2a905/f/29c641ae91c34fc3bee881f45436b6d1
+----
+
+===== Fixing PREFIX_TREE errors
+
+It's possible to get `PREFIX_TREE` errors after changing the Data Block Encoding to a supported one. This can happen
+because some HFiles are still encoded with `PREFIX_TREE`, or because snapshots still reference such HFiles.
+
+To fix the HFiles, please run a major compaction on the table (it was `default:t` according to the log message):
+
+----
+major_compact 't'
+----
+
+HFiles can also be referenced from snapshots. This is the case when the HFile is located under `archive/data`.
+The first step is to determine which snapshot references that HFile (the name of the file was `29c641ae91c34fc3bee881f45436b6d1`
+according to the logs):
+
+[source, bash]
+----
+for snapshot in $(hbase snapshotinfo -list-snapshots 2> /dev/null | tail -n -1 | cut -f 1 -d \|);
+do
+  echo "checking snapshot named '${snapshot}'";
+  hbase snapshotinfo -snapshot "${snapshot}" -files 2> /dev/null | grep 29c641ae91c34fc3bee881f45436b6d1;
+done
+----
+
+The output of this shell script is:
+
+----
+checking snapshot named 't_snap'
+   1.0 K t/56be41796340b757eb7fff1eb5e2a905/f/29c641ae91c34fc3bee881f45436b6d1 
(archive)
+----
+
+This means that the `t_snap` snapshot references the incompatible HFile. If the snapshot is still needed,
+then it has to be recreated with the HBase shell:
+
+----
+# creating a new namespace for the cleanup process
+create_namespace 'pre_upgrade_cleanup'
+
+# cloning the snapshot to a temporary table
+clone_snapshot 't_snap', 'pre_upgrade_cleanup:t'
+alter 'pre_upgrade_cleanup:t', { NAME => 'f', DATA_BLOCK_ENCODING => 'FAST_DIFF' }
+major_compact 'pre_upgrade_cleanup:t'
+
+# removing the invalid snapshot
+delete_snapshot 't_snap'
+
+# creating a new snapshot
+snapshot 'pre_upgrade_cleanup:t', 't_snap'
+
+# removing temporary table
+disable 'pre_upgrade_cleanup:t'
+drop 'pre_upgrade_cleanup:t'
+drop_namespace 'pre_upgrade_cleanup'
+----
+
+For further information, please refer to
+link:https://issues.apache.org/jira/browse/HBASE-20649?focusedCommentId=16535476#comment-16535476[HBASE-20649].
+
+=== Data Block Encoding Tool
+
+Tests various compression algorithms with different data block encoders for key compression on an existing HFile.
+Useful for testing, debugging and benchmarking.
+
+You must specify `-f`, which is the full path of the HFile.
+
+The result shows both the performance (MB/s) of compression/decompression and 
encoding/decoding, and the data savings on the HFile.
+
+----
+
+$ bin/hbase org.apache.hadoop.hbase.regionserver.DataBlockEncodingTool
+Usages: hbase org.apache.hadoop.hbase.regionserver.DataBlockEncodingTool
+Options:
+        -f HFile to analyse (REQUIRED)
+        -n Maximum number of key/value pairs to process in a single benchmark 
run.
+        -b Whether to run a benchmark to measure read throughput.
+        -c If this is specified, no correctness testing will be done.
+        -a What kind of compression algorithm use for test. Default value: GZ.
+        -t Number of times to run each benchmark. Default value: 12.
+        -omit Number of first runs of every benchmark to omit from statistics. 
Default value: 2.
+
+----
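+
+For example, a possible invocation might look like the following (the HFile path below is only illustrative; point `-f` at a real store file in your cluster):
+
+----
+$ bin/hbase org.apache.hadoop.hbase.regionserver.DataBlockEncodingTool \
+    -f hdfs://example.com:8020/hbase/data/default/t/72ea7f7d625ee30f959897d1a3e2c350/f/7e6b3d73263c4851bf2b8590a9b3791e \
+    -b -n 1000000
+----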
 
 [[ops.regionmgt]]
 == Region Management
@@ -1517,7 +1637,7 @@ How your application builds on top of the HBase API 
matters when replication is
 
 The combination of these two properties (at-least-once delivery and the lack 
of message ordering) means that some destination clusters may end up in a 
different state if your application makes use of operations that are not 
idempotent, e.g. Increments.
 
-To solve the problem, HBase now supports serial replication, which sends edits 
to destination cluster as the order of requests from client. See <<Serial 
Replication,Serial Replication>>.
+To solve the problem, HBase now supports serial replication (only in versions 2.1.0 and later, not in this version of HBase), which sends edits to the
+destination cluster in the same order as the client requests were made on the source cluster.
 
 ====
 
@@ -1559,9 +1679,6 @@ Instead of SQL statements, entire WALEdits (consisting of 
multiple cell inserts
 LOG.info("Replicating "+clusterId + " -> " + peerClusterId);
 ----
 
-.Serial Replication Configuration
-See <<Serial Replication,Serial Replication>>
-
 .Cluster Management Commands
 add_peer <ID> <CLUSTER_KEY>::
   Adds a replication relationship between two clusters. +
@@ -1583,32 +1700,6 @@ enable_table_replication <TABLE_NAME>::
 disable_table_replication <TABLE_NAME>::
   Disable the table replication switch for all its column families.
 
-=== Serial Replication
-
-Note: this feature is introduced in HBase 2.1 and is not available in 2.0.x, 
your current version.
-
-.Function of serial replication
-
-Serial replication supports to push logs to the destination cluster in the 
same order as logs reach to the source cluster.
-
-.Why need serial replication?
-In replication of HBase, we push mutations to destination cluster by reading 
WAL in each region server. We have a queue for WAL files so we can read them in 
order of creation time. However, when region-move or RS failure occurs in 
source cluster, the hlog entries that are not pushed before region-move or 
RS-failure will be pushed by original RS(for region move) or another RS which 
takes over the remained hlog of dead RS(for RS failure), and the new entries 
for the same region(s) will be pushed by the RS which now serves the region(s), 
but they push the hlog entries of a same region concurrently without 
coordination.
-
-This treatment can possibly lead to data inconsistency between source and 
destination clusters:
-
-1. there are put and then delete written to source cluster.
-
-2. due to region-move / RS-failure, they are pushed by different 
replication-source threads to peer cluster.
-
-3. if delete is pushed to peer cluster before put, and flush and major-compact 
occurs in peer cluster before put is pushed to peer cluster, the delete is 
collected and the put remains in peer cluster, but in source cluster the put is 
masked by the delete, hence data inconsistency between source and destination 
clusters.
-
-
-.Serial replication configuration
-
-. Set the serial flag to true for a repliation peer. You can either set it to 
true when creating a replication peer, or change it to true later.
-
-The serial replication feature had been done firstly in 
link:https://issues.apache.org/jira/browse/HBASE-9465[HBASE-9465] and then 
reverted and redone in 
link:https://issues.apache.org/jira/browse/HBASE-20046[HBASE-20046]. You can 
find more details in these issues.
-
 === Verifying Replicated Data
 
 The `VerifyReplication` MapReduce job, which is included in HBase, performs a 
systematic comparison of replicated data between two different clusters. Run 
the VerifyReplication job on the master cluster, supplying it with the peer ID 
and table name to use for validation. You can limit the verification further by 
specifying a time range or specific families. The job's short name is 
`verifyrep`. To run the job, use a command like the following:
@@ -2295,9 +2386,12 @@ Since the cluster is up, there is a risk that edits 
could be missed in the expor
 [[ops.snapshots]]
 == HBase Snapshots
 
-HBase Snapshots allow you to take a snapshot of a table without too much 
impact on Region Servers.
-Snapshot, Clone and restore operations don't involve data copying.
-Also, Exporting the snapshot to another cluster doesn't have impact on the 
Region Servers.
+HBase Snapshots allow you to take a copy of a table (both contents and metadata) with a very small performance impact. A Snapshot is an immutable
+collection of table metadata and a list of HFiles that comprised the table at the time the Snapshot was taken. A "clone"
+of a snapshot creates a new table from that snapshot, and a "restore" of a snapshot returns the contents of a table to
+what it was when the snapshot was created. The "clone" and "restore" operations do not require any data to be copied,
+as the underlying HFiles (the files which contain the data for an HBase table) are not modified with either action.
+Similarly, exporting a snapshot to another cluster has little impact on RegionServers of the local cluster.
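+
+For example, a minimal HBase shell session exercising these operations might look like the following (table and snapshot names are illustrative):
+
+----
+hbase> snapshot 'mytable', 'mytable_snapshot'
+hbase> clone_snapshot 'mytable_snapshot', 'mytable_clone'
+hbase> disable 'mytable'
+hbase> restore_snapshot 'mytable_snapshot'
+hbase> enable 'mytable'
+----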
 
 Prior to version 0.94.6, the only way to back up or clone a table was to use CopyTable/ExportTable, or to copy all the hfiles in HDFS after disabling the table.
 The disadvantages of these methods are that you can degrade region server performance (Copy/Export Table) or you need to disable the table, which means no reads or writes; this is usually unacceptable.

http://git-wip-us.apache.org/repos/asf/hbase/blob/d49fca13/src/main/asciidoc/_chapters/preface.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/preface.adoc 
b/src/main/asciidoc/_chapters/preface.adoc
index 280f2d8..deebdd3 100644
--- a/src/main/asciidoc/_chapters/preface.adoc
+++ b/src/main/asciidoc/_chapters/preface.adoc
@@ -68,7 +68,7 @@ Yours, the HBase Community.
 
 Please use link:https://issues.apache.org/jira/browse/hbase[JIRA] to report 
non-security-related bugs.
 
-To protect existing HBase installations from new vulnerabilities, please *do 
not* use JIRA to report security-related bugs. Instead, send your report to the 
mailing list [email protected], which allows anyone to send messages, but 
restricts who can read them. Someone on that list will contact you to follow up 
on your report.
+To protect existing HBase installations from new vulnerabilities, please *do 
not* use JIRA to report security-related bugs. Instead, send your report to the 
mailing list [email protected], which allows anyone to send messages, 
but restricts who can read them. Someone on that list will contact you to 
follow up on your report.
 
 [[hbase_supported_tested_definitions]]
 .Support and Testing Expectations

http://git-wip-us.apache.org/repos/asf/hbase/blob/d49fca13/src/main/asciidoc/_chapters/pv2.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/pv2.adoc 
b/src/main/asciidoc/_chapters/pv2.adoc
new file mode 100644
index 0000000..5ecad3f
--- /dev/null
+++ b/src/main/asciidoc/_chapters/pv2.adoc
@@ -0,0 +1,163 @@
+////
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+////
+[[pv2]]
+= Procedure Framework (Pv2): 
link:https://issues.apache.org/jira/browse/HBASE-12439[HBASE-12439]
+:doctype: book
+:numbered:
+:toc: left
+:icons: font
+:experimental:
+
+
+_Procedure v2 ...aims to provide a unified way to build...multi-step 
procedures with a rollback/roll-forward ability in case of failure (e.g. 
create/delete table) -- Matteo Bertozzi, the author of Pv2._
+
+With Pv2 you can build and run state machines. It was built by Matteo to make distributed state transitions in HBase resilient in the face of process failures.
+Prior to Pv2, state transition handling was spread about the codebase with implementation varying by transition-type and context.
+Pv2 was inspired by link:https://accumulo.apache.org/1.8/accumulo_user_manual.html#_fault_tolerant_executor_fate[FATE], of Apache Accumulo. +
+
+Early Pv2 aspects have been shipping in HBase for a good while now but it has continued to evolve as it takes on more involved scenarios.
+What we have now is powerful but intricate in operation and incomplete, in need of cleanup and hardening.
+In this doc we give an overview of the system so you can make use of it (and help with its polishing).
+
+This system has the awkward name of Pv2 because HBase already had the notion of a Procedure used in snapshots (see hbase-server
+_org.apache.hadoop.hbase.procedure_ as opposed to hbase-procedure _org.apache.hadoop.hbase.procedure2_). Pv2 supersedes and is to replace Procedure.
+
+== Procedures
+
+A Procedure is a transform made on an HBase entity. Examples of HBase entities 
would be Regions and Tables. +
+Procedures are run by a ProcedureExecutor instance. A Procedure's current state is kept in the ProcedureStore. +
+The ProcedureExecutor has but a primitive view on what goes on inside a Procedure. From its PoV, Procedures are submitted and then the
+ProcedureExecutor keeps calling _#execute(Object)_ until the Procedure is done. Execute may be called multiple times in the case of failure or restart,
+so Procedure code must be idempotent, yielding the same result each time it runs. Procedure code can also implement _rollback_ so steps can be undone
+on failure. A call to _execute()_ can result in one of the following possibilities:
+
+* _execute()_ returns
+** _null_: indicates we are done.
+** _this_: indicates there is more to do, so persist the current procedure state and re-_execute()_.
+** _Array_ of sub-procedures: indicates a set of procedures that need to be run to completion before we can proceed (after which we expect the framework to
+call our execute again).
+* _execute()_ throws exception
+** _suspend_: indicates execution of procedure is suspended and can be resumed 
due to some external event. The procedure state is persisted.
+** _yield_: procedure is added back to scheduler. The procedure state is not 
persisted.
+** _interrupted_: currently same as _yield_.
+** Any _exception_ not listed above: Procedure _state_ is changed to _FAILED_ 
(after which we expect the framework will attempt rollback).
+
+The ProcedureExecutor stamps the framework's notion of Procedure State into the Procedure itself; e.g. it marks Procedures as INITIALIZING on submit.
+It moves the state to RUNNABLE when it goes to execute. When done, a Procedure gets marked FAILED or SUCCESS depending on the outcome.
+Here is the list of all states as of this writing:
+
+* *_INITIALIZING_* Procedure in construction, not yet added to the executor
+* *_RUNNABLE_* Procedure added to the executor, and ready to be executed.
+* *_WAITING_* The procedure is waiting on children (subprocedures) to be 
completed
+* *_WAITING_TIMEOUT_* The procedure is waiting on a timeout or an external event
+* *_ROLLEDBACK_* The procedure failed and was rolled back.
+* *_SUCCESS_* The procedure execution completed successfully.
+* *_FAILED_* The procedure execution failed, may need to rollback.
+
+After each execute, the Procedure state is persisted to the ProcedureStore. 
Hooks are invoked on Procedures so they can preserve custom state. Post-fault, 
the ProcedureExecutor re-hydrates its pre-crash state by replaying the content 
of the ProcedureStore. This makes the Procedure Framework resilient against 
process failure.
+
+=== Implementation
+
+In implementation, Procedures tend to divide transforms into finer-grained tasks and while some of these work items are handed off to sub-procedures,
+the bulk are done as processing _steps_ in-Procedure; each invocation of the execute is used to perform a single step, and then the Procedure relinquishes
+control, returning to the framework. The Procedure does its own tracking of where it is in the processing.
+
+What comprises a sub-task, or _step_, in the execution is up to the Procedure author, but generally it is a small piece of work that cannot be further
+decomposed and that moves the processing forward toward its end state. Having procedures made of many small steps rather than a few large ones allows the
+Procedure framework to give insight into where we are in the processing. It also allows the framework to be more fair in its execution. As stated above,
+each step may be called multiple times (failure/restart) so steps must be implemented to be idempotent. +
+It is easy to confuse the state that the Procedure itself is keeping with that 
of the Framework itself. Try to keep them distinct. +
+
+=== Rollback
+
+Rollback is called when the procedure or one of the sub-procedures has failed. The rollback step is supposed to clean up the resources created during the
+execute() step. In case of failure and restart, rollback() may be called multiple times, so again the code must be idempotent.
+
+=== Metrics
+
+There are hooks for collecting metrics on submit of the procedure and on 
finish.
+
+* updateMetricsOnSubmit()
+* updateMetricsOnFinish()
+
+Individual procedures can override these methods to collect procedure specific metrics. The default implementations of these methods try to get an object
+implementing an interface ProcedureMetrics which encapsulates the following set of generic metrics:
+
+* SubmittedCount (Counter): Total number of procedure instances submitted of a 
type.
+* Time (Histogram): Histogram of runtime for procedure instances.
+* FailedCount (Counter): Total number of failed procedure instances.
+
+Individual procedures can implement this object and define this generic set of metrics.
+
+=== Baggage
+
+Procedures can carry baggage. One example is the _step_ the procedure last attained (see previous section); procedures persist the enum that marks where
+they currently are. Other examples might be the Region or Server name the Procedure is currently working on. After each call to execute,
+Procedure#serializeStateData is called. Procedures can persist whatever they need.
+
+=== Result/State and Queries
+
+(From Matteo’s 
https://issues.apache.org/jira/secure/attachment/12693273/Procedurev2Notification-Bus.pdf[ProcedureV2
 and Notification Bus] doc) +
+In the case of asynchronous operations, the result must be kept around until 
the client asks for it. Once we receive a “get” of the result we can 
schedule the delete of the record. For some operations the result may be 
“unnecessary” especially in case of failure (e.g. if the create table fail, 
we can query the operation result or we can just do a list table to see if it 
was created) so in some cases we can schedule the delete after a timeout. On 
the client side the operation will return a “Procedure ID”, this ID can be 
used to wait until the procedure is completed and get the result/exception. +
+
+[source]
+----
+Admin.doOperation() { long procId = master.doOperation(); master.waitCompletion(procId); }
+----
+
+If the master goes down while performing the operation, the backup master will pick up the half-in-progress operation and complete it.
+The client will not notice the failure.
+
+== Subprocedures
+
+Subprocedures are _Procedure_ instances created and returned by the _#execute(Object)_ method of a procedure instance (parent procedure). As
+subprocedures are of type _Procedure_, they can instantiate their own subprocedures. As this is recursive, a procedure stack is maintained by the
+framework. The framework makes sure that the parent procedure does not proceed until all sub-procedures and their subprocedures in a procedure stack are
+successfully finished.
+
+== ProcedureExecutor
+
+_ProcedureExecutor_ uses _ProcedureStore_ and _ProcedureScheduler_ and 
executes procedures submitted to it. Some of the basic operations supported are:
+
+* _abort(procId)_: aborts the specified procedure if it is not finished
+* _submit(Procedure)_: submits procedure for execution
+* _retrieve:_ list of get methods to get _Procedure_ instances and results
+* _register/ unregister_ listeners: for listening on Procedure related 
notifications
+
+When _ProcedureExecutor_ starts it loads procedure instances persisted in _ProcedureStore_ from the previous run. All unfinished procedures are resumed from
+the last stored state.
+
+== Nonces
+
+You can pass the nonce that came in with the RPC to the Procedure on submit at the executor. This nonce will then be serialized along w/ the Procedure on
+persist. If there is a crash, on reload the nonce will be put back into a map of nonces to pid in case a client tries to run the same procedure a second
+time (it will be rejected). See the base Procedure and how nonce is a base data member.
+
+== Wait/Wake/Suspend/Yield
+
+‘suspend’ means stop processing a procedure because we can make no more progress until a condition changes; i.e. we sent an RPC and need to wait on the
+response. The way this works is that a Procedure throws a suspend exception from down in its guts as a GOTO to the end of the current processing step.
+Suspend also puts the Procedure back on the scheduler. Problematically, we do some accounting on our way out even on suspend, making it so it can take time
+exiting (we have to update state in the WAL).
+
+RegionTransitionProcedure#reportTransition is called on receipt of a report from a RS. For Assign and Unassign, this event response from the server to which
+we sent an RPC wakes up suspended Assigns/Unassigns.
+
+== Locking
+
+Procedure Locks are not about concurrency! They are about giving a Procedure read/write access to an HBase Entity such as a Table or Region so that it is
+possible to shut out other Procedures from making modifications to an HBase Entity's state while the current one is running.
+
+Locking is optional, up to the Procedure implementor, but if an entity is being operated on by a Procedure, all transforms need to be done via Procedures
+using the same locking scheme, else havoc ensues.
+
+Two ProcedureExecutor Worker threads can actually end up both processing the same Procedure instance. If it happens, the threads are meant to be running
+different parts of the one Procedure -- changes that do not stamp on each other (this gets awkward around the procedure framework's notion of ‘suspend’;
+more on this below).
+
+Locks optionally may be held for the life of a Procedure. For example, if moving a Region, you probably want to have exclusive access to the HBase Region
+until the Region completes (or fails). This is used in conjunction with {@link #holdLock(Object)}. If {@link #holdLock(Object)} returns true, the procedure
+executor will call acquireLock() once and thereafter not call {@link #releaseLock(Object)} until the Procedure is done (normally, it calls
+release/acquire around each invocation of {@link #execute(Object)}).
+
+Locks also may live the life of a procedure; i.e. once an Assign Procedure 
starts, we do not want another procedure meddling w/ the region under 
assignment. Procedures that hold the lock for the life of the procedure set 
Procedure#holdLock to true. AssignProcedure does this as do Split and Move (If 
in the middle of a Region move, you do not want it Splitting).
+
+Locking can be for life of Procedure.
+
+Some locks have a hierarchy. For example, taking a region lock also takes a (read) lock on its containing table and namespace to prevent another Procedure
+from obtaining an exclusive lock on the hosting table (or namespace).
+
+== Procedure Types
+
+=== StateMachineProcedure
+
+One can consider each call to the _#execute(Object)_ method as transitioning from one state to another in a state machine. The abstract class
+_StateMachineProcedure_ is a wrapper around the base _Procedure_ class which provides constructs for implementing a state machine as a _Procedure_. After
+each state transition the current state is persisted so that, in case of crash/restart, the procedure can be resumed from the state it was in before the
+crash/restart. Individual procedures need to define initial and terminus states, and the hooks _executeFromState()_ and _setNextState()_ are provided for
+state transitions.
+
+=== RemoteProcedureDispatcher
+
+A new RemoteProcedureDispatcher (+ subclass RSProcedureDispatcher) primitive takes care of running the Procedure-based Assignment's ‘remote’ component.
+This dispatcher knows about ‘servers’. It aggregates assignments on a time/count basis so it can send procedures in batches rather than one
+per RPC. Procedure status comes back on the back of the RegionServer heartbeat reporting online/offline regions (no more notifications via ZK). The response
+is passed to the AMv2 to ‘process’. It will check against the in-memory state. If there is a mismatch, it fences out the RegionServer on the assumption
+that something went wrong on the RS side. Timeouts trigger retries (Not Yet Implemented!). The Procedure machine ensures only one operation at a time on
+any one Region/Table using entity _locking_ and smarts about what is serial and what can be run concurrently (locking was zk-based -- you’d put a znode in zk
+for a table -- but now has been converted to be procedure-based as part of this project).
+
+== References
+
+* Matteo had a slide deck on what the Procedure Framework would look like and the problems it addresses, initially
+link:https://issues.apache.org/jira/secure/attachment/12845124/ProcedureV2b.pdf[attached to the Pv2 issue.]
+* link:https://issues.apache.org/jira/secure/attachment/12693273/Procedurev2Notification-Bus.pdf[A good doc by Matteo] on the problem and how Pv2 addresses
+it, with a roadmap (from the Pv2 JIRA). We should go back to the roadmap to do the Notification Bus, conversion of log splitting to Pv2, etc.

http://git-wip-us.apache.org/repos/asf/hbase/blob/d49fca13/src/main/asciidoc/_chapters/security.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/security.adoc 
b/src/main/asciidoc/_chapters/security.adoc
index dae6c53..53e8dbf 100644
--- a/src/main/asciidoc/_chapters/security.adoc
+++ b/src/main/asciidoc/_chapters/security.adoc
@@ -30,7 +30,7 @@
 [IMPORTANT]
 .Reporting Security Bugs
 ====
-NOTE: To protect existing HBase installations from exploitation, please *do 
not* use JIRA to report security-related bugs. Instead, send your report to the 
mailing list [email protected], which allows anyone to send messages, but 
restricts who can read them. Someone on that list will contact you to follow up 
on your report.
+NOTE: To protect existing HBase installations from exploitation, please *do 
not* use JIRA to report security-related bugs. Instead, send your report to the 
mailing list [email protected], which allows anyone to send messages, 
but restricts who can read them. Someone on that list will contact you to 
follow up on your report.
 
 HBase adheres to the Apache Software Foundation's policy on reported 
vulnerabilities, available at http://apache.org/security/.
 
@@ -179,7 +179,25 @@ Add the following to the `hbase-site.xml` file on every 
client:
 </property>
 ----
 
-The client environment must be logged in to Kerberos from KDC or keytab via 
the `kinit` command before communication with the HBase cluster will be 
possible.
+Before version 2.2.0, the client environment must be logged in to Kerberos from KDC or keytab via the `kinit` command before communication with the HBase
+cluster is possible.
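+
+For example, a login from a keytab might look like the following (the keytab path and principal are illustrative):
+
+[source, bash]
+----
+$ kinit -kt /local/path/to/client/keytab hbaseclient@EXAMPLE.COM
+----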
+
+Since 2.2.0, a client can specify the following configurations in `hbase-site.xml`:
+[source,xml]
+----
+<property>
+  <name>hbase.client.keytab.file</name>
+  <value>/local/path/to/client/keytab</value>
+</property>
+
+<property>
+  <name>hbase.client.keytab.principal</name>
+  <value>[email protected]</value>
+</property>
+----
+Then the application can automatically do the login and credential renewal without user intervention.
+
+This is an optional feature. A client that upgrades to 2.2.0 can still keep the login and credential renewal logic it used in older versions, as long as
+`hbase.client.keytab.file` and `hbase.client.keytab.principal` are kept unset.
 
 Be advised that if the `hbase.security.authentication` in the client- and 
server-side site files do not match, the client will not be able to communicate 
with the cluster.
 

http://git-wip-us.apache.org/repos/asf/hbase/blob/d49fca13/src/main/asciidoc/_chapters/sync_replication.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/sync_replication.adoc 
b/src/main/asciidoc/_chapters/sync_replication.adoc
new file mode 100644
index 0000000..d28b9a9
--- /dev/null
+++ b/src/main/asciidoc/_chapters/sync_replication.adoc
@@ -0,0 +1,125 @@
+////
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+////
+
+[[syncreplication]]
+= Synchronous Replication
+:doctype: book
+:numbered:
+:toc: left
+:icons: font
+:experimental:
+:source-language: java
+
+== Background
+
+The current <<Cluster Replication, replication>> in HBase is asynchronous. So if the master cluster crashes, the slave cluster may not have the
+newest data. If users want strong consistency, they cannot switch to the slave cluster.
+
+== Design
+
+Please see the design doc on 
link:https://issues.apache.org/jira/browse/HBASE-19064[HBASE-19064]
+
+== Operation and maintenance
+
+Case.1 Setup two synchronous replication clusters::
+
+* Add a synchronous peer in both source cluster and peer cluster.
+
+For source cluster:
+[source,ruby]
+----
+hbase> add_peer  '1', CLUSTER_KEY => 
'lg-hadoop-tst-st01.bj:10010,lg-hadoop-tst-st02.bj:10010,lg-hadoop-tst-st03.bj:10010:/hbase/test-hbase-slave',
 
REMOTE_WAL_DIR=>'hdfs://lg-hadoop-tst-st01.bj:20100/hbase/test-hbase-slave/remoteWALs',
 TABLE_CFS => {"ycsb-test"=>[]}
+----
+
+For peer cluster:
+[source,ruby]
+----
+hbase> add_peer  '1', CLUSTER_KEY => 
'lg-hadoop-tst-st01.bj:10010,lg-hadoop-tst-st02.bj:10010,lg-hadoop-tst-st03.bj:10010:/hbase/test-hbase',
 
REMOTE_WAL_DIR=>'hdfs://lg-hadoop-tst-st01.bj:20100/hbase/test-hbase/remoteWALs',
 TABLE_CFS => {"ycsb-test"=>[]}
+----
+
+NOTE: For synchronous replication, the current implementation requires that we have the same peer id for both the source
+and peer cluster. Another thing that needs attention is that the peer does not support cluster-level, namespace-level, or
+cf-level replication; only table-level replication is supported now.
+
+* Transit the peer cluster to STANDBY state
+
+[source,ruby]
+----
+hbase> transit_peer_sync_replication_state '1', 'STANDBY'
+----
+
+* Transit the source cluster to ACTIVE state
+
+[source,ruby]
+----
+hbase> transit_peer_sync_replication_state '1', 'ACTIVE'
+----
+
+Now, synchronous replication has been set up successfully. The HBase client can only make requests to the source cluster; if it
+makes requests to the peer cluster, the peer cluster, which is in STANDBY state, will reject the read/write requests.
+
+Case.2 How to operate when standby cluster crashed::
+
+If the standby cluster has crashed, it will fail to write the remote WAL for the active cluster. So we need to transit
+the source cluster to DOWNGRADE_ACTIVE state, which means the source cluster won't write any remote WAL any more, but
+normal (asynchronous) replication still works fine: it queues the newly written WALs, but
+replication is blocked until the peer cluster comes back.
+
+[source,ruby]
+----
+hbase> transit_peer_sync_replication_state '1', 'DOWNGRADE_ACTIVE'
+----
+
+Once the peer cluster comes back, we can just transit the source cluster to ACTIVE to ensure that the replication will be
+synchronous.
+
+[source,ruby]
+----
+hbase> transit_peer_sync_replication_state '1', 'ACTIVE'
+----
+
+Case.3 How to operate when active cluster crashed::
+
+If the active cluster has crashed (it may not be reachable now), just transit the standby cluster to
+DOWNGRADE_ACTIVE state, and after that, redirect all client requests to the DOWNGRADE_ACTIVE cluster.
+
+[source,ruby]
+----
+hbase> transit_peer_sync_replication_state '1', 'DOWNGRADE_ACTIVE'
+----
+
+If the crashed cluster comes back again, we just need to transit it to STANDBY directly. Otherwise, if you transit the
+cluster to DOWNGRADE_ACTIVE, the original ACTIVE cluster may have redundant data compared to the current ACTIVE
+cluster. Because the design writes source cluster WALs and remote cluster WALs concurrently, it's possible that
+the source cluster WALs have more data than the remote cluster, which results in data inconsistency. The procedure of
+transiting ACTIVE to STANDBY has no such problem, because we skip replaying the original WALs.
+
+[source,ruby]
+----
+hbase> transit_peer_sync_replication_state '1', 'STANDBY'
+----
+
+After that, we can promote the DOWNGRADE_ACTIVE cluster to ACTIVE to ensure that the replication will be synchronous.
+
+[source,ruby]
+----
+hbase> transit_peer_sync_replication_state '1', 'ACTIVE'
+----

http://git-wip-us.apache.org/repos/asf/hbase/blob/d49fca13/src/main/asciidoc/_chapters/troubleshooting.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/troubleshooting.adoc 
b/src/main/asciidoc/_chapters/troubleshooting.adoc
index 52c0860..8da3014 100644
--- a/src/main/asciidoc/_chapters/troubleshooting.adoc
+++ b/src/main/asciidoc/_chapters/troubleshooting.adoc
@@ -850,9 +850,9 @@ Snapshots::
   When you create a snapshot, HBase retains everything it needs to recreate 
the table's
   state at that time of the snapshot. This includes deleted cells or expired 
versions.
   For this reason, your snapshot usage pattern should be well-planned, and you 
should
-  prune snapshots that you no longer need. Snapshots are stored in 
`/hbase/.snapshots`,
+  prune snapshots that you no longer need. Snapshots are stored in 
`/hbase/.hbase-snapshot`,
   and archives needed to restore snapshots are stored in
-  `/hbase/.archive/<_tablename_>/<_region_>/<_column_family_>/`.
+  `/hbase/archive/<_tablename_>/<_region_>/<_column_family_>/`.
 
   *Do not* manage snapshots or archives manually via HDFS. HBase provides APIs 
and
   HBase Shell commands for managing them. For more information, see 
<<ops.snapshots>>.
@@ -981,6 +981,57 @@ Caused by: 
org.apache.hadoop.hbase.util.CommonFSUtils$StreamLacksCapabilityExcep
 
 If you are attempting to run in standalone mode and see this error, please 
walk back through the section <<quickstart>> and ensure you have included *all* 
the given configuration settings.
 
+[[trouble.rs.startup.asyncfs]]
+==== RegionServer aborts due to being unable to initialize access to HDFS
+
+We will try to use _AsyncFSWAL_ for HBase-2.x as it has better performance while consuming fewer resources. But the problem with _AsyncFSWAL_ is that it
+hacks into the internals of the DFSClient implementation, so it can easily be broken when upgrading hadoop, even by a simple patch release.
+
+If you do not specify the wal provider, we will try to fall back to the old 
_FSHLog_ if we fail to initialize _AsyncFSWAL_, but it may not always work. The 
failure will show up in logs like this:
+
+----
+18/07/02 18:51:06 WARN concurrent.DefaultPromise: An exception was
+thrown by 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper$13.operationComplete()
+java.lang.Error: Couldn't properly initialize access to HDFS
+internals. Please update your WAL Provider to not make use of the
+'asyncfs' provider. See HBASE-16110 for more information.
+     at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputSaslHelper.<clinit>(FanOutOneBlockAsyncDFSOutputSaslHelper.java:268)
+     at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper.initialize(FanOutOneBlockAsyncDFSOutputHelper.java:661)
+     at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper.access$300(FanOutOneBlockAsyncDFSOutputHelper.java:118)
+     at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper$13.operationComplete(FanOutOneBlockAsyncDFSOutputHelper.java:720)
+     at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper$13.operationComplete(FanOutOneBlockAsyncDFSOutputHelper.java:715)
+     at 
org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
+     at 
org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:500)
+     at 
org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:479)
+     at 
org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
+     at 
org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104)
+     at 
org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:82)
+     at 
org.apache.hbase.thirdparty.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.fulfillConnectPromise(AbstractEpollChannel.java:638)
+     at 
org.apache.hbase.thirdparty.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:676)
+     at 
org.apache.hbase.thirdparty.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:552)
+     at 
org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:394)
+     at 
org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:304)
+     at 
org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
+     at 
org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
+     at java.lang.Thread.run(Thread.java:748)
+ Caused by: java.lang.NoSuchMethodException:
+org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(org.apache.hadoop.fs.FileEncryptionInfo)
+     at java.lang.Class.getDeclaredMethod(Class.java:2130)
+     at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputSaslHelper.createTransparentCryptoHelper(FanOutOneBlockAsyncDFSOutputSaslHelper.java:232)
+     at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputSaslHelper.<clinit>(FanOutOneBlockAsyncDFSOutputSaslHelper.java:262)
+     ... 18 more
+----
+
+If you hit this error, please specify _FSHLog_, i.e. _filesystem_, explicitly in your config file.
+
+[source,xml]
+----
+<property>
+  <name>hbase.wal.provider</name>
+  <value>filesystem</value>
+</property>
+----
+
+And do not forget to send an email to [email protected] or [email protected] to report the failure and also your hadoop version; we will try to
+fix the problem ASAP in the next release.
 
 [[trouble.rs.runtime]]
 === Runtime Errors

http://git-wip-us.apache.org/repos/asf/hbase/blob/d49fca13/src/main/asciidoc/_chapters/upgrading.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/_chapters/upgrading.adoc 
b/src/main/asciidoc/_chapters/upgrading.adoc
index a082014..a556123 100644
--- a/src/main/asciidoc/_chapters/upgrading.adoc
+++ b/src/main/asciidoc/_chapters/upgrading.adoc
@@ -611,6 +611,19 @@ Performance is also an area that is now under active 
review so look forward to
 improvement in coming releases (See
 link:https://issues.apache.org/jira/browse/HBASE-20188[HBASE-20188 TESTING 
Performance]).
 
+[[upgrade2.0.it.kerberos]]
+.Integration Tests and Kerberos
+Integration Tests (`IntegrationTests*`) used to rely on the Kerberos 
credential cache
+for authentication against secured clusters. This used to lead to tests 
failing due
+to authentication failures when the tickets in the credential cache expired.
+As of hbase-2.0.0 (and hbase-1.3.0+), the integration test clients will make 
use
+of the configuration properties `hbase.client.keytab.file` and
+`hbase.client.kerberos.principal`. They are required. The clients will perform 
a
+login from the configured keytab file and automatically refresh the credentials
+in the background for the process lifetime (See
+link:https://issues.apache.org/jira/browse/HBASE-16231[HBASE-16231]).
+
+
 ////
 This would be a good place to link to an appendix on migrating applications
 ////
@@ -659,7 +672,34 @@ good justification to add it back, bring it our notice 
([email protected]).
 
 [[upgrade2.0.rolling.upgrades]]
 ==== Rolling Upgrade from 1.x to 2.x
-There is no rolling upgrade from HBase 1.x+ to HBase 2.x+. In order to perform 
a zero downtime upgrade, you will need to run an additional cluster in parallel 
and handle failover in application logic.
+
+Rolling upgrades are currently an experimental feature.
+They have had limited testing. There are likely corner
+cases as yet uncovered in our
+limited experience so you should be careful if you go this
+route. The stop/upgrade/start described in the next section,
+<<upgrade2.0.process>>, is the safest route.
+
+That said, the below is a prescription for a
+rolling upgrade of a 1.4 cluster.
+
+.Pre-Requirements
+* Upgrade to the latest 1.4.x release. Pre 1.4 releases may also work but are 
not tested, so please upgrade to 1.4.3+ before upgrading to 2.x, unless you are 
an expert and familiar with the region assignment and crash processing. See the 
section <<upgrade1.4>> on how to upgrade to 1.4.x.
+* Make sure that zk-less assignment is enabled, i.e. set `hbase.assignment.usezk` to `false`. This is the most important thing. It allows the 1.x master to
+assign/unassign regions to/from 2.x region servers. See the release note section of
+link:https://issues.apache.org/jira/browse/HBASE-11059[HBASE-11059] on how to migrate from zk based assignment to zk less assignment.
+* We have tested rolling upgrades from 1.4.3 to 2.1.0, but it should also work if you want to upgrade to 2.0.x.
+
+.Instructions
+. Unload a region server and upgrade it to 2.1.0 (see the example commands after this list for one way to unload and restart a server). With
+link:https://issues.apache.org/jira/browse/HBASE-17931[HBASE-17931] in place, the meta region and regions for other system tables will be moved to this
+region server immediately. If not, please move them manually to the new region server. This is very important because
+** The schema of the meta region is hard coded. If meta is on an old region server, the new region servers cannot access it, as it does not have some
+families, for example, table state.
+** A client with a lower version can communicate with a server with a higher version, but not vice versa. If the meta region is on an old region server, the
+new region server will use a client with a higher version to communicate with a server with a lower version; this may introduce strange problems.
+. Rolling upgrade all other region servers.
+. Upgrading masters.
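+
+One way to perform the unload/restart in step 1 (a sketch only; `rs-host.example.com` is a placeholder, and this assumes the `graceful_stop.sh` and `hbase-daemon.sh` scripts shipped in HBase's _bin_ directory):
+
+----
+# move the regions off the region server and stop it
+$ ./bin/graceful_stop.sh rs-host.example.com
+
+# then, on rs-host.example.com: upgrade the HBase binaries to the 2.x release and start the region server again
+$ ./bin/hbase-daemon.sh start regionserver
+----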
+
+It is OK if region servers crash during the rolling upgrade. The 1.x master can assign regions to both 1.x and 2.x region servers, and
+link:https://issues.apache.org/jira/browse/HBASE-19166[HBASE-19166] fixed a problem so that a 1.x region server can also read the WALs written by a 2.x
+region server and split them.
+
+NOTE: please read the <<Changes of Note!,Changes of Note!>> section carefully before performing a rolling upgrade. Make sure that you do not use the
+features removed in 2.0, for example, the prefix-tree encoding or the old hfile format. They could fail the upgrade and leave the cluster in an
+intermediate state that is hard to recover from.
+
+NOTE: If you have success running this prescription, please notify the dev 
list with a note on your experience and/or update the above with any deviations 
you may have taken so others going this route can benefit from your efforts.
 
 [[upgrade2.0.process]]
 ==== Upgrade process from 1.x to 2.x
@@ -704,6 +744,11 @@ Notes:
 
 Doing a raw scan will now return results that have expired according to TTL 
settings.
 
+[[upgrade1.3]]
+=== Upgrading from pre-1.3 to 1.3+
+If running Integration Tests under Kerberos, see <<upgrade2.0.it.kerberos>>.
+
+
 [[upgrade1.0]]
 === Upgrading to 1.x
 

http://git-wip-us.apache.org/repos/asf/hbase/blob/d49fca13/src/main/asciidoc/book.adoc
----------------------------------------------------------------------
diff --git a/src/main/asciidoc/book.adoc b/src/main/asciidoc/book.adoc
index 8c3890f..764d7b4 100644
--- a/src/main/asciidoc/book.adoc
+++ b/src/main/asciidoc/book.adoc
@@ -74,6 +74,8 @@ include::_chapters/ops_mgt.adoc[]
 include::_chapters/developer.adoc[]
 include::_chapters/unit_testing.adoc[]
 include::_chapters/protobuf.adoc[]
+include::_chapters/pv2.adoc[]
+include::_chapters/amv2.adoc[]
 include::_chapters/zookeeper.adoc[]
 include::_chapters/community.adoc[]
 
@@ -93,3 +95,4 @@ include::_chapters/asf.adoc[]
 include::_chapters/orca.adoc[]
 include::_chapters/tracing.adoc[]
 include::_chapters/rpc.adoc[]
+include::_chapters/appendix_hbase_incompatibilities.adoc[]
