saintstack commented on a change in pull request #2665:
URL: https://github.com/apache/hbase/pull/2665#discussion_r524793518
##########
File path: src/main/asciidoc/_chapters/architecture.adoc
##########
@@ -2865,26 +2865,51 @@ The first mechanism is store file refresher which is
introduced in HBase-1.0+. S
For turning this feature on, you should configure
`hbase.regionserver.storefile.refresh.period` to a non-zero value. See
Configuration section below.
-==== Asnyc WAL replication
-The second mechanism for propagation of writes to secondaries is done via
“Async WAL Replication” feature and is only available in HBase-1.1+. This works
similarly to HBase’s multi-datacenter replication, but instead the data from a
region is replicated to the secondary regions. Each secondary replica always
receives and observes the writes in the same order that the primary region
committed them. In some sense, this design can be thought of as “in-cluster
replication”, where instead of replicating to a different datacenter, the data
goes to secondary regions to keep secondary region’s in-memory state up to
date. The data files are shared between the primary region and the other
replicas, so that there is no extra storage overhead. However, the secondary
regions will have recent non-flushed data in their memstores, which increases
the memory overhead. The primary region writes flush, compaction, and bulk load
events to its WAL as well, which are also replicated through wal
replication to secondaries. When they observe the flush/compaction or bulk
load event, the secondary regions replay the event to pick up the new files and
drop the old ones.
+[[async.wal.replication]]
+==== Async WAL replication
+The second mechanism for propagation of writes to secondaries is done via the
“Async WAL Replication” feature. It is only available in HBase-1.1+. This works
similarly to HBase’s multi-datacenter replication, but instead the data from a
region is replicated to the secondary regions. Each secondary replica always
receives and observes the writes in the same order that the primary region
committed them. In some sense, this design can be thought of as “in-cluster
replication”, where instead of replicating to a different datacenter, the data
goes to secondary regions to keep secondary region’s in-memory state up to
date. The data files are shared between the primary region and the other
replicas, so that there is no extra storage overhead. However, the secondary
regions will have recent non-flushed data in their memstores, which increases
the memory overhead. The primary region writes flush, compaction, and bulk load
events to its WAL as well, which are also replicated through wal
replication to secondaries. When they observe the flush/compaction or
bulk load event, the secondary regions replay the event to pick up the new
files and drop the old ones.
Committing writes in the same order as in primary ensures that the secondaries
won’t diverge from the primary regions data, but since the log replication is
asynchronous, the data might still be stale in secondary regions. Since this
feature works as a replication endpoint, the performance and latency
characteristics is expected to be similar to inter-cluster replication.
Async WAL Replication is *disabled* by default. You can enable this feature by
setting `hbase.region.replica.replication.enabled` to `true`.
-Asyn WAL Replication feature will add a new replication peer named
`region_replica_replication` as a replication peer when you create a table with
region replication > 1 for the first time. Once enabled, if you want to disable
this feature, you need to do two actions:
+The Async WAL Replication feature will add a new replication peer named
`region_replica_replication` as a replication peer when you create a table with
region replication > 1 for the first time. Once enabled, if you want to disable
this feature, you need to do two actions:
Review comment:
I specified the order documented here (before my time).