ndimiduk commented on a change in pull request #2665:
URL: https://github.com/apache/hbase/pull/2665#discussion_r524765800
##########
File path: src/main/asciidoc/_chapters/architecture.adoc
##########
@@ -2865,26 +2865,51 @@ The first mechanism is store file refresher which is
introduced in HBase-1.0+. S
For turning this feature on, you should configure
`hbase.regionserver.storefile.refresh.period` to a non-zero value. See
Configuration section below.
-==== Asnyc WAL replication
-The second mechanism for propagation of writes to secondaries is done via
“Async WAL Replication” feature and is only available in HBase-1.1+. This works
similarly to HBase’s multi-datacenter replication, but instead the data from a
region is replicated to the secondary regions. Each secondary replica always
receives and observes the writes in the same order that the primary region
committed them. In some sense, this design can be thought of as “in-cluster
replication”, where instead of replicating to a different datacenter, the data
goes to secondary regions to keep secondary region’s in-memory state up to
date. The data files are shared between the primary region and the other
replicas, so that there is no extra storage overhead. However, the secondary
regions will have recent non-flushed data in their memstores, which increases
the memory overhead. The primary region writes flush, compaction, and bulk load
events to its WAL as well, which are also replicated through wal
replication to secondaries. When they observe the flush/compaction or bulk
load event, the secondary regions replay the event to pick up the new files and
drop the old ones.
+[[async.wal.replication]]
+==== Async WAL replication
+The second mechanism for propagation of writes to secondaries is done via the
“Async WAL Replication” feature. It is only available in HBase-1.1+. This works
similarly to HBase’s multi-datacenter replication, but instead the data from a
region is replicated to the secondary regions. Each secondary replica always
receives and observes the writes in the same order that the primary region
committed them. In some sense, this design can be thought of as “in-cluster
replication”, where instead of replicating to a different datacenter, the data
goes to secondary regions to keep secondary region’s in-memory state up to
date. The data files are shared between the primary region and the other
replicas, so that there is no extra storage overhead. However, the secondary
regions will have recent non-flushed data in their memstores, which increases
the memory overhead. The primary region writes flush, compaction, and bulk load
events to its WAL as well, which are also replicated through
wal replication to secondaries. When they observe the flush/compaction or
bulk load event, the secondary regions replay the event to pick up the new
files and drop the old ones.
Review comment:
nit: mind adding some line breaks at <100 chars?
##########
File path: src/main/asciidoc/_chapters/architecture.adoc
##########
@@ -2865,26 +2865,51 @@ The first mechanism is store file refresher which is
introduced in HBase-1.0+. S
For turning this feature on, you should configure
`hbase.regionserver.storefile.refresh.period` to a non-zero value. See
Configuration section below.
-==== Asnyc WAL replication
-The second mechanism for propagation of writes to secondaries is done via
“Async WAL Replication” feature and is only available in HBase-1.1+. This works
similarly to HBase’s multi-datacenter replication, but instead the data from a
region is replicated to the secondary regions. Each secondary replica always
receives and observes the writes in the same order that the primary region
committed them. In some sense, this design can be thought of as “in-cluster
replication”, where instead of replicating to a different datacenter, the data
goes to secondary regions to keep secondary region’s in-memory state up to
date. The data files are shared between the primary region and the other
replicas, so that there is no extra storage overhead. However, the secondary
regions will have recent non-flushed data in their memstores, which increases
the memory overhead. The primary region writes flush, compaction, and bulk load
events to its WAL as well, which are also replicated through wal
replication to secondaries. When they observe the flush/compaction or bulk
load event, the secondary regions replay the event to pick up the new files and
drop the old ones.
+[[async.wal.replication]]
+==== Async WAL replication
+The second mechanism for propagation of writes to secondaries is done via the
“Async WAL Replication” feature. It is only available in HBase-1.1+. This works
similarly to HBase’s multi-datacenter replication, but instead the data from a
region is replicated to the secondary regions. Each secondary replica always
receives and observes the writes in the same order that the primary region
committed them. In some sense, this design can be thought of as “in-cluster
replication”, where instead of replicating to a different datacenter, the data
goes to secondary regions to keep secondary region’s in-memory state up to
date. The data files are shared between the primary region and the other
replicas, so that there is no extra storage overhead. However, the secondary
regions will have recent non-flushed data in their memstores, which increases
the memory overhead. The primary region writes flush, compaction, and bulk load
events to its WAL as well, which are also replicated through
wal replication to secondaries. When they observe the flush/compaction or
bulk load event, the secondary regions replay the event to pick up the new
files and drop the old ones.
Committing writes in the same order as in primary ensures that the secondaries
won’t diverge from the primary regions data, but since the log replication is
asynchronous, the data might still be stale in secondary regions. Since this
feature works as a replication endpoint, the performance and latency
characteristics is expected to be similar to inter-cluster replication.
Async WAL Replication is *disabled* by default. You can enable this feature by
setting `hbase.region.replica.replication.enabled` to `true`.
-Asyn WAL Replication feature will add a new replication peer named
`region_replica_replication` as a replication peer when you create a table with
region replication > 1 for the first time. Once enabled, if you want to disable
this feature, you need to do two actions:
+The Async WAL Replication feature will add a new replication peer named
`region_replica_replication` as a replication peer when you createa table with
region replication > 1 for the first time. Once enabled, if you want to disable
this feature, you need to do two actions:
Review comment:
Must these two actions be performed in the order specified here, or is
it okay to perform them in the opposite order?
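Also, for folks using the `Admin` API rather than the shell, it might be worth
showing both actions together. Untested sketch, assuming the standard HBase 2.x
client API (action 1 is really an hbase-site.xml change on the cluster, not
something done from client code):
[source,java]
----
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class DisableRegionReplicaReplication {
  public static void main(String[] args) throws Exception {
    // Action 1: set hbase.region.replica.replication.enabled=false in the
    // cluster's hbase-site.xml (see the Configuration section).
    // Action 2: disable the replication peer; same effect as
    //   hbase> disable_peer 'region_replica_replication'
    Configuration conf = HBaseConfiguration.create();
    try (Connection connection = ConnectionFactory.createConnection(conf);
         Admin admin = connection.getAdmin()) {
      admin.disableReplicationPeer("region_replica_replication");
    }
  }
}
----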
##########
File path: src/main/asciidoc/_chapters/architecture.adoc
##########
@@ -2865,26 +2865,51 @@ The first mechanism is store file refresher which is
introduced in HBase-1.0+. S
For turning this feature on, you should configure
`hbase.regionserver.storefile.refresh.period` to a non-zero value. See
Configuration section below.
-==== Asnyc WAL replication
-The second mechanism for propagation of writes to secondaries is done via
“Async WAL Replication” feature and is only available in HBase-1.1+. This works
similarly to HBase’s multi-datacenter replication, but instead the data from a
region is replicated to the secondary regions. Each secondary replica always
receives and observes the writes in the same order that the primary region
committed them. In some sense, this design can be thought of as “in-cluster
replication”, where instead of replicating to a different datacenter, the data
goes to secondary regions to keep secondary region’s in-memory state up to
date. The data files are shared between the primary region and the other
replicas, so that there is no extra storage overhead. However, the secondary
regions will have recent non-flushed data in their memstores, which increases
the memory overhead. The primary region writes flush, compaction, and bulk load
events to its WAL as well, which are also replicated through wal
replication to secondaries. When they observe the flush/compaction or bulk
load event, the secondary regions replay the event to pick up the new files and
drop the old ones.
+[[async.wal.replication]]
+==== Async WAL replication
+The second mechanism for propagation of writes to secondaries is done via the
“Async WAL Replication” feature. It is only available in HBase-1.1+. This works
similarly to HBase’s multi-datacenter replication, but instead the data from a
region is replicated to the secondary regions. Each secondary replica always
receives and observes the writes in the same order that the primary region
committed them. In some sense, this design can be thought of as “in-cluster
replication”, where instead of replicating to a different datacenter, the data
goes to secondary regions to keep secondary region’s in-memory state up to
date. The data files are shared between the primary region and the other
replicas, so that there is no extra storage overhead. However, the secondary
regions will have recent non-flushed data in their memstores, which increases
the memory overhead. The primary region writes flush, compaction, and bulk load
events to its WAL as well, which are also replicated through
wal replication to secondaries. When they observe the flush/compaction or
bulk load event, the secondary regions replay the event to pick up the new
files and drop the old ones.
Committing writes in the same order as in primary ensures that the secondaries
won’t diverge from the primary regions data, but since the log replication is
asynchronous, the data might still be stale in secondary regions. Since this
feature works as a replication endpoint, the performance and latency
characteristics is expected to be similar to inter-cluster replication.
Async WAL Replication is *disabled* by default. You can enable this feature by
setting `hbase.region.replica.replication.enabled` to `true`.
-Asyn WAL Replication feature will add a new replication peer named
`region_replica_replication` as a replication peer when you create a table with
region replication > 1 for the first time. Once enabled, if you want to disable
this feature, you need to do two actions:
+The Async WAL Replication feature will add a new replication peer named
`region_replica_replication` as a replication peer when you createa table with
region replication > 1 for the first time. Once enabled, if you want to disable
this feature, you need to do two actions:
Review comment:
s/createa/create a/
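While we're on this sentence: the "create a table with region replication > 1"
trigger might be easier to follow with a tiny example. Untested sketch; the
table and family names ("t1", "cf") are made up, and `admin` is an open `Admin`
handle as in the sketch on my earlier comment:
[source,java]
----
// Untested sketch: the first creation of a table with region replication > 1
// is what adds the 'region_replica_replication' peer.
// (Also needs TableName, TableDescriptor, TableDescriptorBuilder, and
// ColumnFamilyDescriptorBuilder imports.)
TableDescriptor td = TableDescriptorBuilder
    .newBuilder(TableName.valueOf("t1"))
    .setColumnFamily(ColumnFamilyDescriptorBuilder.of("cf"))
    .setRegionReplication(2)
    .build();
admin.createTable(td);
----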
##########
File path: src/main/asciidoc/_chapters/architecture.adoc
##########
@@ -2865,26 +2865,51 @@ The first mechanism is store file refresher which is
introduced in HBase-1.0+. S
For turning this feature on, you should configure
`hbase.regionserver.storefile.refresh.period` to a non-zero value. See
Configuration section below.
-==== Asnyc WAL replication
-The second mechanism for propagation of writes to secondaries is done via
“Async WAL Replication” feature and is only available in HBase-1.1+. This works
similarly to HBase’s multi-datacenter replication, but instead the data from a
region is replicated to the secondary regions. Each secondary replica always
receives and observes the writes in the same order that the primary region
committed them. In some sense, this design can be thought of as “in-cluster
replication”, where instead of replicating to a different datacenter, the data
goes to secondary regions to keep secondary region’s in-memory state up to
date. The data files are shared between the primary region and the other
replicas, so that there is no extra storage overhead. However, the secondary
regions will have recent non-flushed data in their memstores, which increases
the memory overhead. The primary region writes flush, compaction, and bulk load
events to its WAL as well, which are also replicated through wal
replication to secondaries. When they observe the flush/compaction or bulk
load event, the secondary regions replay the event to pick up the new files and
drop the old ones.
+[[async.wal.replication]]
+==== Async WAL replication
+The second mechanism for propagation of writes to secondaries is done via the
“Async WAL Replication” feature. It is only available in HBase-1.1+. This works
similarly to HBase’s multi-datacenter replication, but instead the data from a
region is replicated to the secondary regions. Each secondary replica always
receives and observes the writes in the same order that the primary region
committed them. In some sense, this design can be thought of as “in-cluster
replication”, where instead of replicating to a different datacenter, the data
goes to secondary regions to keep secondary region’s in-memory state up to
date. The data files are shared between the primary region and the other
replicas, so that there is no extra storage overhead. However, the secondary
regions will have recent non-flushed data in their memstores, which increases
the memory overhead. The primary region writes flush, compaction, and bulk load
events to its WAL as well, which are also replicated through
wal replication to secondaries. When they observe the flush/compaction or
bulk load event, the secondary regions replay the event to pick up the new
files and drop the old ones.
Committing writes in the same order as in primary ensures that the secondaries
won’t diverge from the primary regions data, but since the log replication is
asynchronous, the data might still be stale in secondary regions. Since this
feature works as a replication endpoint, the performance and latency
characteristics is expected to be similar to inter-cluster replication.
Async WAL Replication is *disabled* by default. You can enable this feature by
setting `hbase.region.replica.replication.enabled` to `true`.
-Asyn WAL Replication feature will add a new replication peer named
`region_replica_replication` as a replication peer when you create a table with
region replication > 1 for the first time. Once enabled, if you want to disable
this feature, you need to do two actions:
+The Async WAL Replication feature will add a new replication peer named
`region_replica_replication` as a replication peer when you createa table with
region replication > 1 for the first time. Once enabled, if you want to disable
this feature, you need to do two actions:
* Set configuration property `hbase.region.replica.replication.enabled` to
false in `hbase-site.xml` (see Configuration section below)
* Disable the replication peer named `region_replica_replication` in the
cluster using hbase shell or `Admin` class:
[source,bourne]
----
hbase> disable_peer 'region_replica_replication'
----
+Async WAL Replication and the `hbase:meta` table is a little more involved and
gets its own section below; see <<async.wal.replication.meta>>
+
=== Store File TTL
In both of the write propagation approaches mentioned above, store files of
the primary will be opened in secondaries independent of the primary region. So
for files that the primary compacted away, the secondaries might still be
referring to these files for reading. Both features are using HFileLinks to
refer to files, but there is no protection (yet) for guaranteeing that the file
will not be deleted prematurely. Thus, as a guard, you should set the
configuration property `hbase.master.hfilecleaner.ttl` to a larger value, such
as 1 hour to guarantee that you will not receive IOExceptions for requests
going to replicas.
+[[async.wal.replication.meta]]
=== Region replication for META table’s region
-Currently, Async WAL Replication is not done for the META table’s WAL. The
meta table’s secondary replicas still refreshes themselves from the persistent
store files. Hence the `hbase.regionserver.meta.storefile.refresh.period` needs
to be set to a certain non-zero value for refreshing the meta store files. Note
that this configuration is configured differently than
-`hbase.regionserver.storefile.refresh.period`.
+Up until hbase-2.4.0, Async WAL Replication did not work for the META table’s
WAL. The meta table’s secondary replicas refreshed themselves from the
persistent store files every `hbase.regionserver.meta.storefile.refresh.period`,
+(a non-zero value). Note how the META replication period is distinct from the
user-space `hbase.regionserver.storefile.refresh.period` value.
+
+Async WAL replication for META is a new feature in 2.4.0 still under active
development. Use with caution.
+Set `hbase.region.replica.replication.catalog.enabled` to enable async WAL
Replication for META region replicas.
Review comment:
Instead of going back and forth with instructions for < 2.4.0 and
2.4.0+, please write two sections, one for < 2.4.0 and the other for 2.4.0+.
The words might be 80% duplicated, but it makes it crystal clear what's
applicable to which versions.
##########
File path: src/main/asciidoc/_chapters/architecture.adoc
##########
@@ -2865,26 +2865,51 @@ The first mechanism is store file refresher which is
introduced in HBase-1.0+. S
For turning this feature on, you should configure
`hbase.regionserver.storefile.refresh.period` to a non-zero value. See
Configuration section below.
-==== Asnyc WAL replication
-The second mechanism for propagation of writes to secondaries is done via
“Async WAL Replication” feature and is only available in HBase-1.1+. This works
similarly to HBase’s multi-datacenter replication, but instead the data from a
region is replicated to the secondary regions. Each secondary replica always
receives and observes the writes in the same order that the primary region
committed them. In some sense, this design can be thought of as “in-cluster
replication”, where instead of replicating to a different datacenter, the data
goes to secondary regions to keep secondary region’s in-memory state up to
date. The data files are shared between the primary region and the other
replicas, so that there is no extra storage overhead. However, the secondary
regions will have recent non-flushed data in their memstores, which increases
the memory overhead. The primary region writes flush, compaction, and bulk load
events to its WAL as well, which are also replicated through wal
replication to secondaries. When they observe the flush/compaction or bulk
load event, the secondary regions replay the event to pick up the new files and
drop the old ones.
+[[async.wal.replication]]
+==== Async WAL replication
+The second mechanism for propagation of writes to secondaries is done via the
“Async WAL Replication” feature. It is only available in HBase-1.1+. This works
similarly to HBase’s multi-datacenter replication, but instead the data from a
region is replicated to the secondary regions. Each secondary replica always
receives and observes the writes in the same order that the primary region
committed them. In some sense, this design can be thought of as “in-cluster
replication”, where instead of replicating to a different datacenter, the data
goes to secondary regions to keep secondary region’s in-memory state up to
date. The data files are shared between the primary region and the other
replicas, so that there is no extra storage overhead. However, the secondary
regions will have recent non-flushed data in their memstores, which increases
the memory overhead. The primary region writes flush, compaction, and bulk load
events to its WAL as well, which are also replicated through
wal replication to secondaries. When they observe the flush/compaction or
bulk load event, the secondary regions replay the event to pick up the new
files and drop the old ones.
Committing writes in the same order as in primary ensures that the secondaries
won’t diverge from the primary regions data, but since the log replication is
asynchronous, the data might still be stale in secondary regions. Since this
feature works as a replication endpoint, the performance and latency
characteristics is expected to be similar to inter-cluster replication.
Async WAL Replication is *disabled* by default. You can enable this feature by
setting `hbase.region.replica.replication.enabled` to `true`.
-Asyn WAL Replication feature will add a new replication peer named
`region_replica_replication` as a replication peer when you create a table with
region replication > 1 for the first time. Once enabled, if you want to disable
this feature, you need to do two actions:
+The Async WAL Replication feature will add a new replication peer named
`region_replica_replication` as a replication peer when you createa table with
region replication > 1 for the first time. Once enabled, if you want to disable
this feature, you need to do two actions:
* Set configuration property `hbase.region.replica.replication.enabled` to
false in `hbase-site.xml` (see Configuration section below)
* Disable the replication peer named `region_replica_replication` in the
cluster using hbase shell or `Admin` class:
[source,bourne]
----
hbase> disable_peer 'region_replica_replication'
----
+Async WAL Replication and the `hbase:meta` table is a little more involved and
gets its own section below; see <<async.wal.replication.meta>>
+
=== Store File TTL
In both of the write propagation approaches mentioned above, store files of
the primary will be opened in secondaries independent of the primary region. So
for files that the primary compacted away, the secondaries might still be
referring to these files for reading. Both features are using HFileLinks to
refer to files, but there is no protection (yet) for guaranteeing that the file
will not be deleted prematurely. Thus, as a guard, you should set the
configuration property `hbase.master.hfilecleaner.ttl` to a larger value, such
as 1 hour to guarantee that you will not receive IOExceptions for requests
going to replicas.
+[[async.wal.replication.meta]]
=== Region replication for META table’s region
-Currently, Async WAL Replication is not done for the META table’s WAL. The
meta table’s secondary replicas still refreshes themselves from the persistent
store files. Hence the `hbase.regionserver.meta.storefile.refresh.period` needs
to be set to a certain non-zero value for refreshing the meta store files. Note
that this configuration is configured differently than
-`hbase.regionserver.storefile.refresh.period`.
+Up until hbase-2.4.0, Async WAL Replication did not work for the META table’s
WAL. The meta table’s secondary replicas refreshed themselves from the
persistent store files every `hbase.regionserver.meta.storefile.refresh.period`,
+(a non-zero value). Note how the META replication period is distinct from the
user-space `hbase.regionserver.storefile.refresh.period` value.
+
+Async WAL replication for META is a new feature in 2.4.0 still under active
development. Use with caution.
+Set `hbase.region.replica.replication.catalog.enabled` to enable async WAL
Replication for META region replicas.
+Its off by default.
+
+Regards the META replicas count, up to hbase-2.4.0, you would set the special
property 'hbase.meta.replica.count'.
Review comment:
nit: s/Regards the META replicas count/Regarding the META replicas count/
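Separate from the nit: the "alter the META table as you would a user-space
table" point could use a short example for the 2.4.0+ path. Untested sketch
(replica count 3 is just an example; `admin` is as in the earlier sketch):
[source,java]
----
// Untested sketch: on 2.4.0+, bump META's replica count like any other table.
// If hbase.meta.replica.count is set, it takes precedence (per the text above).
TableDescriptor meta = admin.getDescriptor(TableName.META_TABLE_NAME);
admin.modifyTable(TableDescriptorBuilder.newBuilder(meta)
    .setRegionReplication(3)
    .build());
----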
##########
File path: src/main/asciidoc/_chapters/architecture.adoc
##########
@@ -2865,26 +2865,51 @@ The first mechanism is store file refresher which is
introduced in HBase-1.0+. S
For turning this feature on, you should configure
`hbase.regionserver.storefile.refresh.period` to a non-zero value. See
Configuration section below.
-==== Asnyc WAL replication
-The second mechanism for propagation of writes to secondaries is done via
“Async WAL Replication” feature and is only available in HBase-1.1+. This works
similarly to HBase’s multi-datacenter replication, but instead the data from a
region is replicated to the secondary regions. Each secondary replica always
receives and observes the writes in the same order that the primary region
committed them. In some sense, this design can be thought of as “in-cluster
replication”, where instead of replicating to a different datacenter, the data
goes to secondary regions to keep secondary region’s in-memory state up to
date. The data files are shared between the primary region and the other
replicas, so that there is no extra storage overhead. However, the secondary
regions will have recent non-flushed data in their memstores, which increases
the memory overhead. The primary region writes flush, compaction, and bulk load
events to its WAL as well, which are also replicated through wal
replication to secondaries. When they observe the flush/compaction or bulk
load event, the secondary regions replay the event to pick up the new files and
drop the old ones.
+[[async.wal.replication]]
+==== Async WAL replication
+The second mechanism for propagation of writes to secondaries is done via the
“Async WAL Replication” feature. It is only available in HBase-1.1+. This works
similarly to HBase’s multi-datacenter replication, but instead the data from a
region is replicated to the secondary regions. Each secondary replica always
receives and observes the writes in the same order that the primary region
committed them. In some sense, this design can be thought of as “in-cluster
replication”, where instead of replicating to a different datacenter, the data
goes to secondary regions to keep secondary region’s in-memory state up to
date. The data files are shared between the primary region and the other
replicas, so that there is no extra storage overhead. However, the secondary
regions will have recent non-flushed data in their memstores, which increases
the memory overhead. The primary region writes flush, compaction, and bulk load
events to its WAL as well, which are also replicated through
wal replication to secondaries. When they observe the flush/compaction or
bulk load event, the secondary regions replay the event to pick up the new
files and drop the old ones.
Committing writes in the same order as in primary ensures that the secondaries
won’t diverge from the primary regions data, but since the log replication is
asynchronous, the data might still be stale in secondary regions. Since this
feature works as a replication endpoint, the performance and latency
characteristics is expected to be similar to inter-cluster replication.
Async WAL Replication is *disabled* by default. You can enable this feature by
setting `hbase.region.replica.replication.enabled` to `true`.
-Asyn WAL Replication feature will add a new replication peer named
`region_replica_replication` as a replication peer when you create a table with
region replication > 1 for the first time. Once enabled, if you want to disable
this feature, you need to do two actions:
+The Async WAL Replication feature will add a new replication peer named
`region_replica_replication` as a replication peer when you createa table with
region replication > 1 for the first time. Once enabled, if you want to disable
this feature, you need to do two actions:
* Set configuration property `hbase.region.replica.replication.enabled` to
false in `hbase-site.xml` (see Configuration section below)
* Disable the replication peer named `region_replica_replication` in the
cluster using hbase shell or `Admin` class:
[source,bourne]
----
hbase> disable_peer 'region_replica_replication'
----
+Async WAL Replication and the `hbase:meta` table is a little more involved and
gets its own section below; see <<async.wal.replication.meta>>
+
=== Store File TTL
In both of the write propagation approaches mentioned above, store files of
the primary will be opened in secondaries independent of the primary region. So
for files that the primary compacted away, the secondaries might still be
referring to these files for reading. Both features are using HFileLinks to
refer to files, but there is no protection (yet) for guaranteeing that the file
will not be deleted prematurely. Thus, as a guard, you should set the
configuration property `hbase.master.hfilecleaner.ttl` to a larger value, such
as 1 hour to guarantee that you will not receive IOExceptions for requests
going to replicas.
+[[async.wal.replication.meta]]
=== Region replication for META table’s region
-Currently, Async WAL Replication is not done for the META table’s WAL. The
meta table’s secondary replicas still refreshes themselves from the persistent
store files. Hence the `hbase.regionserver.meta.storefile.refresh.period` needs
to be set to a certain non-zero value for refreshing the meta store files. Note
that this configuration is configured differently than
-`hbase.regionserver.storefile.refresh.period`.
+Up until hbase-2.4.0, Async WAL Replication did not work for the META table’s
WAL. The meta table’s secondary replicas refreshed themselves from the
persistent store files every `hbase.regionserver.meta.storefile.refresh.period`,
+(a non-zero value). Note how the META replication period is distinct from the
user-space `hbase.regionserver.storefile.refresh.period` value.
+
+Async WAL replication for META is a new feature in 2.4.0 still under active
development. Use with caution.
+Set `hbase.region.replica.replication.catalog.enabled` to enable async WAL
Replication for META region replicas.
+Its off by default.
+
+Regards the META replicas count, up to hbase-2.4.0, you would set the special
property 'hbase.meta.replica.count'.
+Now you can alter the META table as you would a user-space table (if
`hbase.meta.replica.count` is set, it
+will take precedent over what is set for replica count in the META table
updating META replica count to
+match).
+
+==== Load Balancing META table load ====
+
+hbase-2.4.0 adds a new client-side `LoadBalance` mode. When enabled
client-side, clients will try to read META replicas first before falling back
on the primary. Before this,
+the lookup mode -- now named `HedgedRead` -- had clients read the primary and
if no response after a configurable amount of time had elapsed, it would start
up reads against the replicas.
+The new 'LoadBalance' mode helps alleviate hotspotting on the META table
distributing the META read load.
Review comment:
You give a name to AND describe the old behavior, but you only give a
name to the new behavior. Would be a kindness to describe and name both, OR
just name them and link off to the documentation on this enum in code.
##########
File path: src/main/asciidoc/_chapters/architecture.adoc
##########
@@ -2865,26 +2865,51 @@ The first mechanism is store file refresher which is
introduced in HBase-1.0+. S
For turning this feature on, you should configure
`hbase.regionserver.storefile.refresh.period` to a non-zero value. See
Configuration section below.
-==== Asnyc WAL replication
-The second mechanism for propagation of writes to secondaries is done via
“Async WAL Replication” feature and is only available in HBase-1.1+. This works
similarly to HBase’s multi-datacenter replication, but instead the data from a
region is replicated to the secondary regions. Each secondary replica always
receives and observes the writes in the same order that the primary region
committed them. In some sense, this design can be thought of as “in-cluster
replication”, where instead of replicating to a different datacenter, the data
goes to secondary regions to keep secondary region’s in-memory state up to
date. The data files are shared between the primary region and the other
replicas, so that there is no extra storage overhead. However, the secondary
regions will have recent non-flushed data in their memstores, which increases
the memory overhead. The primary region writes flush, compaction, and bulk load
events to its WAL as well, which are also replicated through wal
replication to secondaries. When they observe the flush/compaction or bulk
load event, the secondary regions replay the event to pick up the new files and
drop the old ones.
+[[async.wal.replication]]
+==== Async WAL replication
+The second mechanism for propagation of writes to secondaries is done via the
“Async WAL Replication” feature. It is only available in HBase-1.1+. This works
similarly to HBase’s multi-datacenter replication, but instead the data from a
region is replicated to the secondary regions. Each secondary replica always
receives and observes the writes in the same order that the primary region
committed them. In some sense, this design can be thought of as “in-cluster
replication”, where instead of replicating to a different datacenter, the data
goes to secondary regions to keep secondary region’s in-memory state up to
date. The data files are shared between the primary region and the other
replicas, so that there is no extra storage overhead. However, the secondary
regions will have recent non-flushed data in their memstores, which increases
the memory overhead. The primary region writes flush, compaction, and bulk load
events to its WAL as well, which are also replicated through
wal replication to secondaries. When they observe the flush/compaction or
bulk load event, the secondary regions replay the event to pick up the new
files and drop the old ones.
Committing writes in the same order as in primary ensures that the secondaries
won’t diverge from the primary regions data, but since the log replication is
asynchronous, the data might still be stale in secondary regions. Since this
feature works as a replication endpoint, the performance and latency
characteristics is expected to be similar to inter-cluster replication.
Async WAL Replication is *disabled* by default. You can enable this feature by
setting `hbase.region.replica.replication.enabled` to `true`.
-Asyn WAL Replication feature will add a new replication peer named
`region_replica_replication` as a replication peer when you create a table with
region replication > 1 for the first time. Once enabled, if you want to disable
this feature, you need to do two actions:
+The Async WAL Replication feature will add a new replication peer named
`region_replica_replication` as a replication peer when you createa table with
region replication > 1 for the first time. Once enabled, if you want to disable
this feature, you need to do two actions:
* Set configuration property `hbase.region.replica.replication.enabled` to
false in `hbase-site.xml` (see Configuration section below)
* Disable the replication peer named `region_replica_replication` in the
cluster using hbase shell or `Admin` class:
[source,bourne]
----
hbase> disable_peer 'region_replica_replication'
----
+Async WAL Replication and the `hbase:meta` table is a little more involved and
gets its own section below; see <<async.wal.replication.meta>>
+
=== Store File TTL
In both of the write propagation approaches mentioned above, store files of
the primary will be opened in secondaries independent of the primary region. So
for files that the primary compacted away, the secondaries might still be
referring to these files for reading. Both features are using HFileLinks to
refer to files, but there is no protection (yet) for guaranteeing that the file
will not be deleted prematurely. Thus, as a guard, you should set the
configuration property `hbase.master.hfilecleaner.ttl` to a larger value, such
as 1 hour to guarantee that you will not receive IOExceptions for requests
going to replicas.
+[[async.wal.replication.meta]]
=== Region replication for META table’s region
-Currently, Async WAL Replication is not done for the META table’s WAL. The
meta table’s secondary replicas still refreshes themselves from the persistent
store files. Hence the `hbase.regionserver.meta.storefile.refresh.period` needs
to be set to a certain non-zero value for refreshing the meta store files. Note
that this configuration is configured differently than
-`hbase.regionserver.storefile.refresh.period`.
+Up until hbase-2.4.0, Async WAL Replication did not work for the META table’s
WAL. The meta table’s secondary replicas refreshed themselves from the
persistent store files every `hbase.regionserver.meta.storefile.refresh.period`,
+(a non-zero value). Note how the META replication period is distinct from the
user-space `hbase.regionserver.storefile.refresh.period` value.
+
+Async WAL replication for META is a new feature in 2.4.0 still under active
development. Use with caution.
+Set `hbase.region.replica.replication.catalog.enabled` to enable async WAL
Replication for META region replicas.
+Its off by default.
+
+Regards the META replicas count, up to hbase-2.4.0, you would set the special
property 'hbase.meta.replica.count'.
+Now you can alter the META table as you would a user-space table (if
`hbase.meta.replica.count` is set, it
+will take precedent over what is set for replica count in the META table
updating META replica count to
+match).
+
+==== Load Balancing META table load ====
+
+hbase-2.4.0 adds a new client-side `LoadBalance` mode. When enabled
client-side, clients will try to read META replicas first before falling back
on the primary. Before this,
+the lookup mode -- now named `HedgedRead` -- had clients read the primary and
if no response after a configurable amount of time had elapsed, it would start
up reads against the replicas.
+The new 'LoadBalance' mode helps alleviate hotspotting on the META table
distributing the META read load.
+
+To enable the meta replica locator's load balance mode, please set the
following configuration at on the client-side (only): set
'hbase.locator.meta.replicas.mode' to "LoadBalance".
+Valid options for this configuration are `None`, `HedgedRead`, and
`LoadBalance`. Option parse is case insensitive.
+The default mode is `None` (which falls through to `HedgedRead`, the current
default). Please do not put this configuration in any hbase server's
configuration, master or region server.
Review comment:
"please do not" is this language strong enough? What happens if i ship
out this configuration accidentally? How badly will we muck up meta? Should the
the system proactively defend against this configuration?
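One way to sidestep the "shipped to servers by accident" problem is to steer
readers toward setting the mode programmatically on the client `Configuration`,
so it never needs to live in an hbase-site.xml at all. Untested sketch; the
property name and values are taken from the text above:
[source,java]
----
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class MetaReplicaLoadBalanceClient {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // Client-side only. Valid values per the doc text: "None", "HedgedRead",
    // "LoadBalance" (parsed case-insensitively); the default is "None".
    conf.set("hbase.locator.meta.replicas.mode", "LoadBalance");
    try (Connection connection = ConnectionFactory.createConnection(conf)) {
      // hbase:meta lookups through this connection spread reads across the
      // META replicas instead of always going to the primary first.
    }
  }
}
----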
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]