ndimiduk commented on a change in pull request #1786:
URL: https://github.com/apache/hbase/pull/1786#discussion_r431437782
##########
File path:
hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
##########
@@ -1911,43 +1912,51 @@ public boolean normalizeRegions() throws IOException {
return false;
}
- synchronized (this.normalizer) {
+ if (!normalizationInProgressLock.tryLock()) {
Review comment:
It hangs out here on the master because, sadly, a bunch of logic is
implemented in `HMaster`. All the work of deciding which tables to
normalize and of executing the generated plans happens in
`HMaster#normalizeRegions` (under the same lock we were just discussing). I
don't love this design, but I do think it makes sense for a normalizer
algorithm to exist mostly outside of the master context in which it runs. If
I were to make the master more of a composition, I would implement a class
called something like `NormalizationService`, which would be responsible for
instantiating the `Normalizer` algorithm, registering the `Chore`, and being
the delegation target for invocations coming from RPC. The lock would be
internal to this `NormalizationService`.
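Something roughly like this is what I have in mind (just a sketch; the class
name and constructor wiring here are hypothetical, not part of this PR):

```java
// Hypothetical sketch only -- illustrates the composition, not proposed code.
import java.io.IOException;
import java.util.concurrent.locks.ReentrantLock;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.master.MasterServices;
import org.apache.hadoop.hbase.master.normalizer.RegionNormalizer;
import org.apache.hadoop.hbase.master.normalizer.RegionNormalizerFactory;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class NormalizationService {
  private static final Logger LOG = LoggerFactory.getLogger(NormalizationService.class);

  private final RegionNormalizer normalizer;
  // the lock lives here instead of on HMaster
  private final ReentrantLock normalizationInProgressLock = new ReentrantLock();

  NormalizationService(final Configuration conf, final MasterServices masterServices) {
    // the service owns the normalizer instance; chore construction/registration would
    // also move here (elided -- today RegionNormalizerChore is built around the HMaster).
    this.normalizer = RegionNormalizerFactory.getRegionNormalizer(conf);
    this.normalizer.setMasterServices(masterServices);
  }

  /** Delegation target for both the chore and MasterRpcServices. */
  public boolean normalizeRegions() throws IOException {
    if (!normalizationInProgressLock.tryLock()) {
      LOG.info("Normalization already in progress. Skipping request.");
      return true;
    }
    try {
      // ... table selection and plan execution, as in HMaster#normalizeRegions today ...
      return true;
    } finally {
      normalizationInProgressLock.unlock();
    }
  }
}
```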
##########
File path:
hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
##########
@@ -400,6 +401,8 @@ public void run() {
private final LockManager lockManager = new LockManager(this);
private RSGroupBasedLoadBalancer balancer;
+ // a lock to prevent concurrent normalization actions.
+ private final ReentrantLock normalizationInProgressLock = new ReentrantLock();
Review comment:
There is a `RegionNormalizerChore`, yes. It invokes
`master.normalizeRegions()`. However, the same method can also be called from
`MasterRpcServices`, so this lock (previously a `synchronized(normalizer)`
block) prevents the chore and an operator invocation from colliding.
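For reference, both entry points funnel into that one method on the master.
Roughly (simplified sketches of the two call sites; the real methods carry a
bit more, like response-building and error-handling details):

```java
// Scheduled path -- simplified sketch of RegionNormalizerChore
@Override
protected void chore() {
  try {
    master.normalizeRegions();
  } catch (IOException e) {
    LOG.error("Failed to normalize regions.", e);
  }
}

// Operator path -- simplified sketch of MasterRpcServices#normalize
@Override
public NormalizeResponse normalize(RpcController controller, NormalizeRequest request)
    throws ServiceException {
  try {
    return NormalizeResponse.newBuilder()
      .setNormalizerRan(master.normalizeRegions())
      .build();
  } catch (IOException e) {
    throw new ServiceException(e);
  }
}
```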
##########
File path:
hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
##########
@@ -797,7 +800,6 @@ protected void initializeZKBasedSystemTrackers()
this.balancer.setConf(conf);
this.normalizer = RegionNormalizerFactory.getRegionNormalizer(conf);
this.normalizer.setMasterServices(this);
- this.normalizer.setMasterRpcServices((MasterRpcServices)rpcServices);
Review comment:
It was using `masterRpcServices` to reach out to check the split and
merge switches. I've replaced that with direct calls using `masterServices`.
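In the normalizer, that check now looks roughly like this (a sketch; it
assumes `isSplitOrMergeEnabled(MasterSwitchType)` is reachable through the
`MasterServices` interface rather than via the RPC request/response objects):

```java
// Sketch only: query the split/merge switches through MasterServices directly.
final boolean splitEnabled = masterServices.isSplitOrMergeEnabled(MasterSwitchType.SPLIT);
final boolean mergeEnabled = masterServices.isSplitOrMergeEnabled(MasterSwitchType.MERGE);
if (!splitEnabled && !mergeEnabled) {
  LOG.debug("Split and merge are both disabled; no normalization plans for {}", table);
  return Collections.emptyList();
}
```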
##########
File path:
hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
##########
@@ -1911,43 +1912,51 @@ public boolean normalizeRegions() throws IOException {
return false;
}
- synchronized (this.normalizer) {
+ if (!normalizationInProgressLock.tryLock()) {
// Don't run the normalizer concurrently
+ LOG.info("Normalization already in progress. Skipping request.");
+ } else {
+ try {
+ List<TableName> allEnabledTables = new ArrayList<>(
+ tableStateManager.getTablesInStates(TableState.State.ENABLED));
+ Collections.shuffle(allEnabledTables);
- List<TableName> allEnabledTables = new ArrayList<>(
- this.tableStateManager.getTablesInStates(TableState.State.ENABLED));
-
- Collections.shuffle(allEnabledTables);
-
- for (TableName table : allEnabledTables) {
- TableDescriptor tblDesc = getTableDescriptors().get(table);
- if (table.isSystemTable() || (tblDesc != null &&
- !tblDesc.isNormalizationEnabled())) {
- LOG.trace("Skipping normalization for {}, as it's either system"
- + " table or doesn't have auto normalization turned on", table);
- continue;
- }
+ for (TableName table : allEnabledTables) {
+ if (table.isSystemTable()) {
+ continue;
+ }
+ final TableDescriptor tblDesc = getTableDescriptors().get(table);
+ if (tblDesc != null && !tblDesc.isNormalizationEnabled()) {
+ LOG.debug(
+ "Skipping {} because normalization is disabled in its table
properties.", table);
+ continue;
+ }
- // make one last check that the cluster isn't shutting down before proceeding.
- if (skipRegionManagementAction("region normalizer")) {
- return false;
- }
+ // make one last check that the cluster isn't shutting down before proceeding.
+ if (skipRegionManagementAction("region normalizer")) {
+ return false;
+ }
- final List<NormalizationPlan> plans = this.normalizer.computePlanForTable(table);
- if (CollectionUtils.isEmpty(plans)) {
- return true;
- }
+ final List<NormalizationPlan> plans = normalizer.computePlansForTable(table);
+ if (CollectionUtils.isEmpty(plans)) {
+ return true;
Review comment:
Yikes! good point.
##########
File path:
hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
##########
@@ -1911,43 +1912,51 @@ public boolean normalizeRegions() throws IOException {
return false;
}
- synchronized (this.normalizer) {
+ if (!normalizationInProgressLock.tryLock()) {
// Don't run the normalizer concurrently
+ LOG.info("Normalization already in progress. Skipping request.");
+ } else {
+ try {
+ List<TableName> allEnabledTables = new ArrayList<>(
+ tableStateManager.getTablesInStates(TableState.State.ENABLED));
+ Collections.shuffle(allEnabledTables);
- List<TableName> allEnabledTables = new ArrayList<>(
- this.tableStateManager.getTablesInStates(TableState.State.ENABLED));
-
- Collections.shuffle(allEnabledTables);
-
- for (TableName table : allEnabledTables) {
- TableDescriptor tblDesc = getTableDescriptors().get(table);
- if (table.isSystemTable() || (tblDesc != null &&
- !tblDesc.isNormalizationEnabled())) {
- LOG.trace("Skipping normalization for {}, as it's either system"
- + " table or doesn't have auto normalization turned on", table);
- continue;
- }
+ for (TableName table : allEnabledTables) {
+ if (table.isSystemTable()) {
+ continue;
+ }
+ final TableDescriptor tblDesc = getTableDescriptors().get(table);
+ if (tblDesc != null && !tblDesc.isNormalizationEnabled()) {
+ LOG.debug(
+ "Skipping {} because normalization is disabled in its table
properties.", table);
+ continue;
+ }
- // make one last check that the cluster isn't shutting down before proceeding.
- if (skipRegionManagementAction("region normalizer")) {
- return false;
- }
+ // make one last check that the cluster isn't shutting down before proceeding.
+ if (skipRegionManagementAction("region normalizer")) {
+ return false;
+ }
- final List<NormalizationPlan> plans = this.normalizer.computePlanForTable(table);
- if (CollectionUtils.isEmpty(plans)) {
- return true;
- }
+ final List<NormalizationPlan> plans = normalizer.computePlansForTable(table);
+ if (CollectionUtils.isEmpty(plans)) {
+ return true;
+ }
- try (final Admin admin = asyncClusterConnection.toConnection().getAdmin()) {
- for (NormalizationPlan plan : plans) {
- plan.execute(admin);
- if (plan.getType() == PlanType.SPLIT) {
- splitPlanCount++;
- } else if (plan.getType() == PlanType.MERGE) {
- mergePlanCount++;
+ try (final Admin admin = asyncClusterConnection.toConnection().getAdmin()) {
+ // as of this writing, `plan.execute()` is non-blocking, so there's no artificial rate-
+ // limiting of merge requests due to this serial loop.
Review comment:
Well, sort of: you can limit the work done. There's no way to say "execute
at most N actions per table and M actions in a single invocation." Do you
think we need something like this?
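If we did want it, I imagine something along these lines wrapped around the
plan-execution loop (purely hypothetical; the config keys and defaults below
are made up for illustration):

```java
// Hypothetical sketch -- these config keys do not exist today; they illustrate
// capping actions per table and per invocation of normalizeRegions().
final int maxActionsPerTable = conf.getInt("hbase.normalizer.max.actions.per.table", 5);
final int maxActionsPerRun = conf.getInt("hbase.normalizer.max.actions.per.run", 50);

int actionsThisRun = 0;
for (TableName table : allEnabledTables) {
  int actionsThisTable = 0;
  for (NormalizationPlan plan : normalizer.computePlansForTable(table)) {
    if (actionsThisTable >= maxActionsPerTable || actionsThisRun >= maxActionsPerRun) {
      break;
    }
    plan.execute(admin);
    actionsThisTable++;
    actionsThisRun++;
  }
  if (actionsThisRun >= maxActionsPerRun) {
    LOG.info("Reached per-invocation action limit ({}); stopping this run.", maxActionsPerRun);
    break;
  }
}
```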