ndimiduk commented on a change in pull request #2665:
URL: https://github.com/apache/hbase/pull/2665#discussion_r524765800
##########
File path: src/main/asciidoc/_chapters/architecture.adoc
##########
@@ -2865,26 +2865,51 @@ The first mechanism is store file refresher which is
introduced in HBase-1.0+. S
For turning this feature on, you should configure
`hbase.regionserver.storefile.refresh.period` to a non-zero value. See
Configuration section below.
-==== Asnyc WAL replication
-The second mechanism for propagation of writes to secondaries is done via
“Async WAL Replication” feature and is only available in HBase-1.1+. This works
similarly to HBase’s multi-datacenter replication, but instead the data from a
region is replicated to the secondary regions. Each secondary replica always
receives and observes the writes in the same order that the primary region
committed them. In some sense, this design can be thought of as “in-cluster
replication”, where instead of replicating to a different datacenter, the data
goes to secondary regions to keep secondary region’s in-memory state up to
date. The data files are shared between the primary region and the other
replicas, so that there is no extra storage overhead. However, the secondary
regions will have recent non-flushed data in their memstores, which increases
the memory overhead. The primary region writes flush, compaction, and bulk load
events to its WAL as well, which are also replicated through wal
replication to secondaries. When they observe the flush/compaction or bulk
load event, the secondary regions replay the event to pick up the new files and
drop the old ones.
+[[async.wal.replication]]
+==== Async WAL replication
+The second mechanism for propagation of writes to secondaries is done via the
“Async WAL Replication” feature. It is only available in HBase-1.1+. This works
similarly to HBase’s multi-datacenter replication, but instead the data from a
region is replicated to the secondary regions. Each secondary replica always
receives and observes the writes in the same order that the primary region
committed them. In some sense, this design can be thought of as “in-cluster
replication”, where instead of replicating to a different datacenter, the data
goes to secondary regions to keep secondary region’s in-memory state up to
date. The data files are shared between the primary region and the other
replicas, so that there is no extra storage overhead. However, the secondary
regions will have recent non-flushed data in their memstores, which increases
the memory overhead. The primary region writes flush, compaction, and bulk load
events to its WAL as well, which are also replicated through
wal replication to secondaries. When they observe the flush/compaction or
bulk load event, the secondary regions replay the event to pick up the new
files and drop the old ones.
Review comment:
nit: mind adding some line breaks at <100 chars?
##########
File path: src/main/asciidoc/_chapters/architecture.adoc
##########
@@ -2865,26 +2865,51 @@ The first mechanism is store file refresher which is
introduced in HBase-1.0+. S
For turning this feature on, you should configure
`hbase.regionserver.storefile.refresh.period` to a non-zero value. See
Configuration section below.
-==== Asnyc WAL replication
-The second mechanism for propagation of writes to secondaries is done via
“Async WAL Replication” feature and is only available in HBase-1.1+. This works
similarly to HBase’s multi-datacenter replication, but instead the data from a
region is replicated to the secondary regions. Each secondary replica always
receives and observes the writes in the same order that the primary region
committed them. In some sense, this design can be thought of as “in-cluster
replication”, where instead of replicating to a different datacenter, the data
goes to secondary regions to keep secondary region’s in-memory state up to
date. The data files are shared between the primary region and the other
replicas, so that there is no extra storage overhead. However, the secondary
regions will have recent non-flushed data in their memstores, which increases
the memory overhead. The primary region writes flush, compaction, and bulk load
events to its WAL as well, which are also replicated through wal
replication to secondaries. When they observe the flush/compaction or bulk
load event, the secondary regions replay the event to pick up the new files and
drop the old ones.
+[[async.wal.replication]]
+==== Async WAL replication
+The second mechanism for propagation of writes to secondaries is done via the
“Async WAL Replication” feature. It is only available in HBase-1.1+. This works
similarly to HBase’s multi-datacenter replication, but instead the data from a
region is replicated to the secondary regions. Each secondary replica always
receives and observes the writes in the same order that the primary region
committed them. In some sense, this design can be thought of as “in-cluster
replication”, where instead of replicating to a different datacenter, the data
goes to secondary regions to keep secondary region’s in-memory state up to
date. The data files are shared between the primary region and the other
replicas, so that there is no extra storage overhead. However, the secondary
regions will have recent non-flushed data in their memstores, which increases
the memory overhead. The primary region writes flush, compaction, and bulk load
events to its WAL as well, which are also replicated through
wal replication to secondaries. When they observe the flush/compaction or
bulk load event, the secondary regions replay the event to pick up the new
files and drop the old ones.
Committing writes in the same order as in primary ensures that the secondaries
won’t diverge from the primary regions data, but since the log replication is
asynchronous, the data might still be stale in secondary regions. Since this
feature works as a replication endpoint, the performance and latency
characteristics is expected to be similar to inter-cluster replication.
Async WAL Replication is *disabled* by default. You can enable this feature by
setting `hbase.region.replica.replication.enabled` to `true`.
-Asyn WAL Replication feature will add a new replication peer named
`region_replica_replication` as a replication peer when you create a table with
region replication > 1 for the first time. Once enabled, if you want to disable
this feature, you need to do two actions:
+The Async WAL Replication feature will add a new replication peer named
`region_replica_replication` as a replication peer when you createa table with
region replication > 1 for the first time. Once enabled, if you want to disable
this feature, you need to do two actions:
Review comment:
Must these two actions be performed in the order specified here, or is
it okay to perform them in the opposite order?
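Also, for folks using the `Admin` API rather than the shell, it might be worth
showing both actions together. Untested sketch, assuming the standard HBase 2.x
client API (action 1 is really an hbase-site.xml change on the cluster, not
something done from client code):
[source,java]
----
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class DisableRegionReplicaReplication {
  public static void main(String[] args) throws Exception {
    // Action 1: set hbase.region.replica.replication.enabled=false in the
    // cluster's hbase-site.xml (see the Configuration section).
    // Action 2: disable the replication peer; same effect as
    //   hbase> disable_peer 'region_replica_replication'
    Configuration conf = HBaseConfiguration.create();
    try (Connection connection = ConnectionFactory.createConnection(conf);
         Admin admin = connection.getAdmin()) {
      admin.disableReplicationPeer("region_replica_replication");
    }
  }
}
----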
##########
File path: src/main/asciidoc/_chapters/architecture.adoc
##########
@@ -2865,26 +2865,51 @@ The first mechanism is store file refresher which is
introduced in HBase-1.0+. S
For turning this feature on, you should configure
`hbase.regionserver.storefile.refresh.period` to a non-zero value. See
Configuration section below.
-==== Asnyc WAL replication
-The second mechanism for propagation of writes to secondaries is done via
“Async WAL Replication” feature and is only available in HBase-1.1+. This works
similarly to HBase’s multi-datacenter replication, but instead the data from a
region is replicated to the secondary regions. Each secondary replica always
receives and observes the writes in the same order that the primary region
committed them. In some sense, this design can be thought of as “in-cluster
replication”, where instead of replicating to a different datacenter, the data
goes to secondary regions to keep secondary region’s in-memory state up to
date. The data files are shared between the primary region and the other
replicas, so that there is no extra storage overhead. However, the secondary
regions will have recent non-flushed data in their memstores, which increases
the memory overhead. The primary region writes flush, compaction, and bulk load
events to its WAL as well, which are also replicated through wal
replication to secondaries. When they observe the flush/compaction or bulk
load event, the secondary regions replay the event to pick up the new files and
drop the old ones.
+[[async.wal.replication]]
+==== Async WAL replication
+The second mechanism for propagation of writes to secondaries is done via the
“Async WAL Replication” feature. It is only available in HBase-1.1+. This works
similarly to HBase’s multi-datacenter replication, but instead the data from a
region is replicated to the secondary regions. Each secondary replica always
receives and observes the writes in the same order that the primary region
committed them. In some sense, this design can be thought of as “in-cluster
replication”, where instead of replicating to a different datacenter, the data
goes to secondary regions to keep secondary region’s in-memory state up to
date. The data files are shared between the primary region and the other
replicas, so that there is no extra storage overhead. However, the secondary
regions will have recent non-flushed data in their memstores, which increases
the memory overhead. The primary region writes flush, compaction, and bulk load
events to its WAL as well, which are also replicated through
wal replication to secondaries. When they observe the flush/compaction or
bulk load event, the secondary regions replay the event to pick up the new
files and drop the old ones.
Committing writes in the same order as in primary ensures that the secondaries
won’t diverge from the primary regions data, but since the log replication is
asynchronous, the data might still be stale in secondary regions. Since this
feature works as a replication endpoint, the performance and latency
characteristics is expected to be similar to inter-cluster replication.
Async WAL Replication is *disabled* by default. You can enable this feature by
setting `hbase.region.replica.replication.enabled` to `true`.
-Asyn WAL Replication feature will add a new replication peer named
`region_replica_replication` as a replication peer when you create a table with
region replication > 1 for the first time. Once enabled, if you want to disable
this feature, you need to do two actions:
+The Async WAL Replication feature will add a new replication peer named
`region_replica_replication` as a replication peer when you createa table with
region replication > 1 for the first time. Once enabled, if you want to disable
this feature, you need to do two actions:
Review comment:
s/createa/create a/
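While we're on this sentence: the "create a table with region replication > 1"
trigger might be easier to follow with a tiny example. Untested sketch; the
table and family names ("t1", "cf") are made up, and `admin` is an open `Admin`
handle as in the sketch on my earlier comment:
[source,java]
----
// Untested sketch: the first creation of a table with region replication > 1
// is what adds the 'region_replica_replication' peer.
// (Also needs TableName, TableDescriptor, TableDescriptorBuilder, and
// ColumnFamilyDescriptorBuilder imports.)
TableDescriptor td = TableDescriptorBuilder
    .newBuilder(TableName.valueOf("t1"))
    .setColumnFamily(ColumnFamilyDescriptorBuilder.of("cf"))
    .setRegionReplication(2)
    .build();
admin.createTable(td);
----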
##########
File path: src/main/asciidoc/_chapters/architecture.adoc
##########
@@ -2865,26 +2865,51 @@ The first mechanism is store file refresher which is
introduced in HBase-1.0+. S
For turning this feature on, you should configure
`hbase.regionserver.storefile.refresh.period` to a non-zero value. See
Configuration section below.
-==== Asnyc WAL replication
-The second mechanism for propagation of writes to secondaries is done via
“Async WAL Replication” feature and is only available in HBase-1.1+. This works
similarly to HBase’s multi-datacenter replication, but instead the data from a
region is replicated to the secondary regions. Each secondary replica always
receives and observes the writes in the same order that the primary region
committed them. In some sense, this design can be thought of as “in-cluster
replication”, where instead of replicating to a different datacenter, the data
goes to secondary regions to keep secondary region’s in-memory state up to
date. The data files are shared between the primary region and the other
replicas, so that there is no extra storage overhead. However, the secondary
regions will have recent non-flushed data in their memstores, which increases
the memory overhead. The primary region writes flush, compaction, and bulk load
events to its WAL as well, which are also replicated through wal
replication to secondaries. When they observe the flush/compaction or bulk
load event, the secondary regions replay the event to pick up the new files and
drop the old ones.
+[[async.wal.replication]]
+==== Async WAL replication
+The second mechanism for propagation of writes to secondaries is done via the
“Async WAL Replication” feature. It is only available in HBase-1.1+. This works
similarly to HBase’s multi-datacenter replication, but instead the data from a
region is replicated to the secondary regions. Each secondary replica always
receives and observes the writes in the same order that the primary region
committed them. In some sense, this design can be thought of as “in-cluster
replication”, where instead of replicating to a different datacenter, the data
goes to secondary regions to keep secondary region’s in-memory state up to
date. The data files are shared between the primary region and the other
replicas, so that there is no extra storage overhead. However, the secondary
regions will have recent non-flushed data in their memstores, which increases
the memory overhead. The primary region writes flush, compaction, and bulk load
events to its WAL as well, which are also replicated through
wal replication to secondaries. When they observe the flush/compaction or
bulk load event, the secondary regions replay the event to pick up the new
files and drop the old ones.
Committing writes in the same order as in primary ensures that the secondaries
won’t diverge from the primary regions data, but since the log replication is
asynchronous, the data might still be stale in secondary regions. Since this
feature works as a replication endpoint, the performance and latency
characteristics is expected to be similar to inter-cluster replication.
Async WAL Replication is *disabled* by default. You can enable this feature by
setting `hbase.region.replica.replication.enabled` to `true`.
-Asyn WAL Replication feature will add a new replication peer named
`region_replica_replication` as a replication peer when you create a table with
region replication > 1 for the first time. Once enabled, if you want to disable
this feature, you need to do two actions:
+The Async WAL Replication feature will add a new replication peer named
`region_replica_replication` as a replication peer when you createa table with
region replication > 1 for the first time. Once enabled, if you want to disable
this feature, you need to do two actions:
* Set configuration property `hbase.region.replica.replication.enabled` to
false in `hbase-site.xml` (see Configuration section below)
* Disable the replication peer named `region_replica_replication` in the
cluster using hbase shell or `Admin` class:
[source,bourne]
----
hbase> disable_peer 'region_replica_replication'
----
+Async WAL Replication and the `hbase:meta` table is a little more involved and
gets its own section below; see <<async.wal.replication.meta>>
+
=== Store File TTL
In both of the write propagation approaches mentioned above, store files of
the primary will be opened in secondaries independent of the primary region. So
for files that the primary compacted away, the secondaries might still be
referring to these files for reading. Both features are using HFileLinks to
refer to files, but there is no protection (yet) for guaranteeing that the file
will not be deleted prematurely. Thus, as a guard, you should set the
configuration property `hbase.master.hfilecleaner.ttl` to a larger value, such
as 1 hour to guarantee that you will not receive IOExceptions for requests
going to replicas.
+[[async.wal.replication.meta]]
=== Region replication for META table’s region
-Currently, Async WAL Replication is not done for the META table’s WAL. The
meta table’s secondary replicas still refreshes themselves from the persistent
store files. Hence the `hbase.regionserver.meta.storefile.refresh.period` needs
to be set to a certain non-zero value for refreshing the meta store files. Note
that this configuration is configured differently than
-`hbase.regionserver.storefile.refresh.period`.
+Up until hbase-2.4.0, Async WAL Replication did not work for the META table’s
WAL. The meta table’s secondary replicas refreshed themselves from the
persistent store files every `hbase.regionserver.meta.storefile.refresh.period`,
+(a non-zero value). Note how the META replication period is distinct from the
user-space `hbase.regionserver.storefile.refresh.period` value.
+
+Async WAL replication for META is a new feature in 2.4.0 still under active
development. Use with caution.
+Set `hbase.region.replica.replication.catalog.enabled` to enable async WAL
Replication for META region replicas.
Review comment:
Instead of going back and forth with instructions for < 2.4.0 and
2.4.0+, please write two sections, one for < 2.4.0 and the other for 2.4.0+.
The words might be 80% duplicated, but it makes it crystal clear what's
applicable to which versions.
##########
File path: src/main/asciidoc/_chapters/architecture.adoc
##########
@@ -2865,26 +2865,51 @@ The first mechanism is store file refresher which is
introduced in HBase-1.0+. S
For turning this feature on, you should configure
`hbase.regionserver.storefile.refresh.period` to a non-zero value. See
Configuration section below.
-==== Asnyc WAL replication
-The second mechanism for propagation of writes to secondaries is done via
“Async WAL Replication” feature and is only available in HBase-1.1+. This works
similarly to HBase’s multi-datacenter replication, but instead the data from a
region is replicated to the secondary regions. Each secondary replica always
receives and observes the writes in the same order that the primary region
committed them. In some sense, this design can be thought of as “in-cluster
replication”, where instead of replicating to a different datacenter, the data
goes to secondary regions to keep secondary region’s in-memory state up to
date. The data files are shared between the primary region and the other
replicas, so that there is no extra storage overhead. However, the secondary
regions will have recent non-flushed data in their memstores, which increases
the memory overhead. The primary region writes flush, compaction, and bulk load
events to its WAL as well, which are also replicated through wal
replication to secondaries. When they observe the flush/compaction or bulk
load event, the secondary regions replay the event to pick up the new files and
drop the old ones.
+[[async.wal.replication]]
+==== Async WAL replication
+The second mechanism for propagation of writes to secondaries is done via the
“Async WAL Replication” feature. It is only available in HBase-1.1+. This works
similarly to HBase’s multi-datacenter replication, but instead the data from a
region is replicated to the secondary regions. Each secondary replica always
receives and observes the writes in the same order that the primary region
committed them. In some sense, this design can be thought of as “in-cluster
replication”, where instead of replicating to a different datacenter, the data
goes to secondary regions to keep secondary region’s in-memory state up to
date. The data files are shared between the primary region and the other
replicas, so that there is no extra storage overhead. However, the secondary
regions will have recent non-flushed data in their memstores, which increases
the memory overhead. The primary region writes flush, compaction, and bulk load
events to its WAL as well, which are also replicated through
wal replication to secondaries. When they observe the flush/compaction or
bulk load event, the secondary regions replay the event to pick up the new
files and drop the old ones.
Committing writes in the same order as in primary ensures that the secondaries
won’t diverge from the primary regions data, but since the log replication is
asynchronous, the data might still be stale in secondary regions. Since this
feature works as a replication endpoint, the performance and latency
characteristics is expected to be similar to inter-cluster replication.
Async WAL Replication is *disabled* by default. You can enable this feature by
setting `hbase.region.replica.replication.enabled` to `true`.
-Asyn WAL Replication feature will add a new replication peer named
`region_replica_replication` as a replication peer when you create a table with
region replication > 1 for the first time. Once enabled, if you want to disable
this feature, you need to do two actions:
+The Async WAL Replication feature will add a new replication peer named
`region_replica_replication` as a replication peer when you createa table with
region replication > 1 for the first time. Once enabled, if you want to disable
this feature, you need to do two actions:
* Set configuration property `hbase.region.replica.replication.enabled` to
false in `hbase-site.xml` (see Configuration section below)
* Disable the replication peer named `region_replica_replication` in the
cluster using hbase shell or `Admin` class:
[source,bourne]
----
hbase> disable_peer 'region_replica_replication'
----
+Async WAL Replication and the `hbase:meta` table is a little more involved and
gets its own section below; see <<async.wal.replication.meta>>
+
=== Store File TTL
In both of the write propagation approaches mentioned above, store files of
the primary will be opened in secondaries independent of the primary region. So
for files that the primary compacted away, the secondaries might still be
referring to these files for reading. Both features are using HFileLinks to
refer to files, but there is no protection (yet) for guaranteeing that the file
will not be deleted prematurely. Thus, as a guard, you should set the
configuration property `hbase.master.hfilecleaner.ttl` to a larger value, such
as 1 hour to guarantee that you will not receive IOExceptions for requests
going to replicas.
+[[async.wal.replication.meta]]
=== Region replication for META table’s region
-Currently, Async WAL Replication is not done for the META table’s WAL. The
meta table’s secondary replicas still refreshes themselves from the persistent
store files. Hence the `hbase.regionserver.meta.storefile.refresh.period` needs
to be set to a certain non-zero value for refreshing the meta store files. Note
that this configuration is configured differently than
-`hbase.regionserver.storefile.refresh.period`.
+Up until hbase-2.4.0, Async WAL Replication did not work for the META table’s
WAL. The meta table’s secondary replicas refreshed themselves from the
persistent store files every `hbase.regionserver.meta.storefile.refresh.period`,
+(a non-zero value). Note how the META replication period is distinct from the
user-space `hbase.regionserver.storefile.refresh.period` value.
+
+Async WAL replication for META is a new feature in 2.4.0 still under active
development. Use with caution.
+Set `hbase.region.replica.replication.catalog.enabled` to enable async WAL
Replication for META region replicas.
+Its off by default.
+
+Regards the META replicas count, up to hbase-2.4.0, you would set the special
property 'hbase.meta.replica.count'.
Review comment:
nit: s/Regards the META replicas count/Regarding the META replicas count/
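Separate from the nit: the "alter the META table as you would a user-space
table" point could use a short example for the 2.4.0+ path. Untested sketch
(replica count 3 is just an example; `admin` is as in the earlier sketch):
[source,java]
----
// Untested sketch: on 2.4.0+, bump META's replica count like any other table.
// If hbase.meta.replica.count is set, it takes precedence (per the text above).
TableDescriptor meta = admin.getDescriptor(TableName.META_TABLE_NAME);
admin.modifyTable(TableDescriptorBuilder.newBuilder(meta)
    .setRegionReplication(3)
    .build());
----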
##########
File path: src/main/asciidoc/_chapters/architecture.adoc
##########
@@ -2865,26 +2865,51 @@ The first mechanism is store file refresher which is
introduced in HBase-1.0+. S
For turning this feature on, you should configure
`hbase.regionserver.storefile.refresh.period` to a non-zero value. See
Configuration section below.
-==== Asnyc WAL replication
-The second mechanism for propagation of writes to secondaries is done via
“Async WAL Replication” feature and is only available in HBase-1.1+. This works
similarly to HBase’s multi-datacenter replication, but instead the data from a
region is replicated to the secondary regions. Each secondary replica always
receives and observes the writes in the same order that the primary region
committed them. In some sense, this design can be thought of as “in-cluster
replication”, where instead of replicating to a different datacenter, the data
goes to secondary regions to keep secondary region’s in-memory state up to
date. The data files are shared between the primary region and the other
replicas, so that there is no extra storage overhead. However, the secondary
regions will have recent non-flushed data in their memstores, which increases
the memory overhead. The primary region writes flush, compaction, and bulk load
events to its WAL as well, which are also replicated through wal
replication to secondaries. When they observe the flush/compaction or bulk
load event, the secondary regions replay the event to pick up the new files and
drop the old ones.
+[[async.wal.replication]]
+==== Async WAL replication
+The second mechanism for propagation of writes to secondaries is done via the
“Async WAL Replication” feature. It is only available in HBase-1.1+. This works
similarly to HBase’s multi-datacenter replication, but instead the data from a
region is replicated to the secondary regions. Each secondary replica always
receives and observes the writes in the same order that the primary region
committed them. In some sense, this design can be thought of as “in-cluster
replication”, where instead of replicating to a different datacenter, the data
goes to secondary regions to keep secondary region’s in-memory state up to
date. The data files are shared between the primary region and the other
replicas, so that there is no extra storage overhead. However, the secondary
regions will have recent non-flushed data in their memstores, which increases
the memory overhead. The primary region writes flush, compaction, and bulk load
events to its WAL as well, which are also replicated through
wal replication to secondaries. When they observe the flush/compaction or
bulk load event, the secondary regions replay the event to pick up the new
files and drop the old ones.
Committing writes in the same order as in primary ensures that the secondaries
won’t diverge from the primary regions data, but since the log replication is
asynchronous, the data might still be stale in secondary regions. Since this
feature works as a replication endpoint, the performance and latency
characteristics is expected to be similar to inter-cluster replication.
Async WAL Replication is *disabled* by default. You can enable this feature by
setting `hbase.region.replica.replication.enabled` to `true`.
-Asyn WAL Replication feature will add a new replication peer named
`region_replica_replication` as a replication peer when you create a table with
region replication > 1 for the first time. Once enabled, if you want to disable
this feature, you need to do two actions:
+The Async WAL Replication feature will add a new replication peer named
`region_replica_replication` as a replication peer when you createa table with
region replication > 1 for the first time. Once enabled, if you want to disable
this feature, you need to do two actions:
* Set configuration property `hbase.region.replica.replication.enabled` to
false in `hbase-site.xml` (see Configuration section below)
* Disable the replication peer named `region_replica_replication` in the
cluster using hbase shell or `Admin` class:
[source,bourne]
----
hbase> disable_peer 'region_replica_replication'
----
+Async WAL Replication and the `hbase:meta` table is a little more involved and
gets its own section below; see <<async.wal.replication.meta>>
+
=== Store File TTL
In both of the write propagation approaches mentioned above, store files of
the primary will be opened in secondaries independent of the primary region. So
for files that the primary compacted away, the secondaries might still be
referring to these files for reading. Both features are using HFileLinks to
refer to files, but there is no protection (yet) for guaranteeing that the file
will not be deleted prematurely. Thus, as a guard, you should set the
configuration property `hbase.master.hfilecleaner.ttl` to a larger value, such
as 1 hour to guarantee that you will not receive IOExceptions for requests
going to replicas.
+[[async.wal.replication.meta]]
=== Region replication for META table’s region
-Currently, Async WAL Replication is not done for the META table’s WAL. The
meta table’s secondary replicas still refreshes themselves from the persistent
store files. Hence the `hbase.regionserver.meta.storefile.refresh.period` needs
to be set to a certain non-zero value for refreshing the meta store files. Note
that this configuration is configured differently than
-`hbase.regionserver.storefile.refresh.period`.
+Up until hbase-2.4.0, Async WAL Replication did not work for the META table’s
WAL. The meta table’s secondary replicas refreshed themselves from the
persistent store files every `hbase.regionserver.meta.storefile.refresh.period`,
+(a non-zero value). Note how the META replication period is distinct from the
user-space `hbase.regionserver.storefile.refresh.period` value.
+
+Async WAL replication for META is a new feature in 2.4.0 still under active
development. Use with caution.
+Set `hbase.region.replica.replication.catalog.enabled` to enable async WAL
Replication for META region replicas.
+Its off by default.
+
+Regards the META replicas count, up to hbase-2.4.0, you would set the special
property 'hbase.meta.replica.count'.
+Now you can alter the META table as you would a user-space table (if
`hbase.meta.replica.count` is set, it
+will take precedent over what is set for replica count in the META table
updating META replica count to
+match).
+
+==== Load Balancing META table load ====
+
+hbase-2.4.0 adds a new client-side `LoadBalance` mode. When enabled
client-side, clients will try to read META replicas first before falling back
on the primary. Before this,
+the lookup mode -- now named `HedgedRead` -- had clients read the primary and
if no response after a configurable amount of time had elapsed, it would start
up reads against the replicas.
+The new 'LoadBalance' mode helps alleviate hotspotting on the META table
distributing the META read load.
Review comment:
You give a name to AND describe the old behavior, but you only give a
name to the new behavior. Would be a kindness to describe and name both, OR
just name them and link off to the documentation on this enum in code.
##########
File path: src/main/asciidoc/_chapters/architecture.adoc
##########
@@ -2865,26 +2865,51 @@ The first mechanism is store file refresher which is
introduced in HBase-1.0+. S
For turning this feature on, you should configure
`hbase.regionserver.storefile.refresh.period` to a non-zero value. See
Configuration section below.
-==== Asnyc WAL replication
-The second mechanism for propagation of writes to secondaries is done via
“Async WAL Replication” feature and is only available in HBase-1.1+. This works
similarly to HBase’s multi-datacenter replication, but instead the data from a
region is replicated to the secondary regions. Each secondary replica always
receives and observes the writes in the same order that the primary region
committed them. In some sense, this design can be thought of as “in-cluster
replication”, where instead of replicating to a different datacenter, the data
goes to secondary regions to keep secondary region’s in-memory state up to
date. The data files are shared between the primary region and the other
replicas, so that there is no extra storage overhead. However, the secondary
regions will have recent non-flushed data in their memstores, which increases
the memory overhead. The primary region writes flush, compaction, and bulk load
events to its WAL as well, which are also replicated through wal
replication to secondaries. When they observe the flush/compaction or bulk
load event, the secondary regions replay the event to pick up the new files and
drop the old ones.
+[[async.wal.replication]]
+==== Async WAL replication
+The second mechanism for propagation of writes to secondaries is done via the
“Async WAL Replication” feature. It is only available in HBase-1.1+. This works
similarly to HBase’s multi-datacenter replication, but instead the data from a
region is replicated to the secondary regions. Each secondary replica always
receives and observes the writes in the same order that the primary region
committed them. In some sense, this design can be thought of as “in-cluster
replication”, where instead of replicating to a different datacenter, the data
goes to secondary regions to keep secondary region’s in-memory state up to
date. The data files are shared between the primary region and the other
replicas, so that there is no extra storage overhead. However, the secondary
regions will have recent non-flushed data in their memstores, which increases
the memory overhead. The primary region writes flush, compaction, and bulk load
events to its WAL as well, which are also replicated through
wal replication to secondaries. When they observe the flush/compaction or
bulk load event, the secondary regions replay the event to pick up the new
files and drop the old ones.
Committing writes in the same order as in primary ensures that the secondaries
won’t diverge from the primary regions data, but since the log replication is
asynchronous, the data might still be stale in secondary regions. Since this
feature works as a replication endpoint, the performance and latency
characteristics is expected to be similar to inter-cluster replication.
Async WAL Replication is *disabled* by default. You can enable this feature by
setting `hbase.region.replica.replication.enabled` to `true`.
-Asyn WAL Replication feature will add a new replication peer named
`region_replica_replication` as a replication peer when you create a table with
region replication > 1 for the first time. Once enabled, if you want to disable
this feature, you need to do two actions:
+The Async WAL Replication feature will add a new replication peer named
`region_replica_replication` as a replication peer when you createa table with
region replication > 1 for the first time. Once enabled, if you want to disable
this feature, you need to do two actions:
* Set configuration property `hbase.region.replica.replication.enabled` to
false in `hbase-site.xml` (see Configuration section below)
* Disable the replication peer named `region_replica_replication` in the
cluster using hbase shell or `Admin` class:
[source,bourne]
----
hbase> disable_peer 'region_replica_replication'
----
+Async WAL Replication and the `hbase:meta` table is a little more involved and
gets its own section below; see <<async.wal.replication.meta>>
+
=== Store File TTL
In both of the write propagation approaches mentioned above, store files of
the primary will be opened in secondaries independent of the primary region. So
for files that the primary compacted away, the secondaries might still be
referring to these files for reading. Both features are using HFileLinks to
refer to files, but there is no protection (yet) for guaranteeing that the file
will not be deleted prematurely. Thus, as a guard, you should set the
configuration property `hbase.master.hfilecleaner.ttl` to a larger value, such
as 1 hour to guarantee that you will not receive IOExceptions for requests
going to replicas.
+[[async.wal.replication.meta]]
=== Region replication for META table’s region
-Currently, Async WAL Replication is not done for the META table’s WAL. The
meta table’s secondary replicas still refreshes themselves from the persistent
store files. Hence the `hbase.regionserver.meta.storefile.refresh.period` needs
to be set to a certain non-zero value for refreshing the meta store files. Note
that this configuration is configured differently than
-`hbase.regionserver.storefile.refresh.period`.
+Up until hbase-2.4.0, Async WAL Replication did not work for the META table’s
WAL. The meta table’s secondary replicas refreshed themselves from the
persistent store files every `hbase.regionserver.meta.storefile.refresh.period`,
+(a non-zero value). Note how the META replication period is distinct from the
user-space `hbase.regionserver.storefile.refresh.period` value.
+
+Async WAL replication for META is a new feature in 2.4.0 still under active
development. Use with caution.
+Set `hbase.region.replica.replication.catalog.enabled` to enable async WAL
Replication for META region replicas.
+Its off by default.
+
+Regards the META replicas count, up to hbase-2.4.0, you would set the special
property 'hbase.meta.replica.count'.
+Now you can alter the META table as you would a user-space table (if
`hbase.meta.replica.count` is set, it
+will take precedent over what is set for replica count in the META table
updating META replica count to
+match).
+
+==== Load Balancing META table load ====
+
+hbase-2.4.0 adds a new client-side `LoadBalance` mode. When enabled
client-side, clients will try to read META replicas first before falling back
on the primary. Before this,
+the lookup mode -- now named `HedgedRead` -- had clients read the primary and
if no response after a configurable amount of time had elapsed, it would start
up reads against the replicas.
+The new 'LoadBalance' mode helps alleviate hotspotting on the META table
distributing the META read load.
+
+To enable the meta replica locator's load balance mode, please set the
following configuration at on the client-side (only): set
'hbase.locator.meta.replicas.mode' to "LoadBalance".
+Valid options for this configuration are `None`, `HedgedRead`, and
`LoadBalance`. Option parse is case insensitive.
+The default mode is `None` (which falls through to `HedgedRead`, the current
default). Please do not put this configuration in any hbase server's
configuration, master or region server.
Review comment:
"please do not" is this language strong enough? What happens if i ship
out this configuration accidentally? How badly will we muck up meta? Should the
the system proactively defend against this configuration?
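One way to sidestep the "shipped to servers by accident" problem is to steer
readers toward setting the mode programmatically on the client `Configuration`,
so it never needs to live in an hbase-site.xml at all. Untested sketch; the
property name and values are taken from the text above:
[source,java]
----
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class MetaReplicaLoadBalanceClient {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // Client-side only. Valid values per the doc text: "None", "HedgedRead",
    // "LoadBalance" (parsed case-insensitively); the default is "None".
    conf.set("hbase.locator.meta.replicas.mode", "LoadBalance");
    try (Connection connection = ConnectionFactory.createConnection(conf)) {
      // hbase:meta lookups through this connection spread reads across the
      // META replicas instead of always going to the primary first.
    }
  }
}
----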
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]