[
https://issues.apache.org/jira/browse/HBASE-25549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17791010#comment-17791010
]
Zhuoyue Huang commented on HBASE-25549:
---------------------------------------
{quote}I wonder if https://issues.apache.org/jira/browse/HBASE-28215 is better
for handling RIT storms, since the hmaster procedure framework already has
error handling and state recovery built in, just not throttling (yet).
{quote}
I agree that HBASE-28215 is another good way to handle RIT storms.
In our scenario, we once hit a critical bug that was triggered when we added a
feature's configuration to a table: regions could never leave the
'opening' state, and eventually all regions of the table were 'dead'. So even if
HMaster could throttle the speed of this process, we would still need a way to
alter a table carefully, e.g., altering the table without reopening its regions,
so that we can check whether the change is safe before going forward. If it is
not, we can roll back with only one region affected.
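
To make the workflow above concrete, here is a minimal toy model, not HBase
code. Every class and function name in it (Region, Table, validate,
alter_lazily, reopen, canary_alter) is hypothetical and exists only for
illustration. It assumes that a bad descriptor leaves a reopened region stuck
in 'opening', and that a lazy alter changes the descriptor without touching
online regions.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """Toy region: serves with the config it loaded when it was last opened."""
    name: str
    active_config: dict
    state: str = "OPEN"


@dataclass
class Table:
    """Toy table: a descriptor plus the regions that were opened from it."""
    descriptor: dict
    regions: list = field(default_factory=list)


def validate(config):
    # Hypothetical check standing in for whatever makes a real region fail
    # to open; here a non-positive multiplier is treated as invalid.
    return config.get("hbase.busy.wait.multiplier.max", 1) > 0


def alter_lazily(table, changes):
    """Update the descriptor only; no region is reopened.

    Returns a copy of the previous descriptor so the caller can roll back.
    """
    previous = dict(table.descriptor)
    table.descriptor = {**table.descriptor, **changes}
    return previous


def reopen(table, region):
    """Reopen one region so that it picks up the current descriptor."""
    if validate(table.descriptor):
        region.active_config = dict(table.descriptor)
        region.state = "OPEN"
    else:
        region.state = "OPENING"  # stuck in transition in this toy model
    return region.state == "OPEN"


def canary_alter(table, changes):
    """Alter lazily, reopen a single region, roll back if it fails to open."""
    previous = alter_lazily(table, changes)
    canary = table.regions[0]
    if reopen(table, canary):
        return True  # the remaining regions pick the change up on later reopens
    table.descriptor = previous
    reopen(table, canary)  # recover the canary with the old descriptor
    return False
```

In this sketch a bad change only ever touches the first region: the other
regions keep serving with their old configuration until something reopens them,
which is the property the comment above is asking for.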
> Provide a switch that allows avoiding reopening all regions when modifying a
> table to prevent RIT storms.
> ---------------------------------------------------------------------------------------------------------
>
> Key: HBASE-25549
> URL: https://issues.apache.org/jira/browse/HBASE-25549
> Project: HBase
> Issue Type: Improvement
> Components: master, shell
> Affects Versions: 3.0.0-alpha-1
> Reporter: Zhuoyue Huang
> Assignee: Zhuoyue Huang
> Priority: Major
> Fix For: 2.6.0, 3.0.0-beta-1, 2.5.7
>
>
> Under normal circumstances, modifying a table will cause all regions
> belonging to the table to enter RIT. Imagine the following two scenarios:
> # Someone enters a wrong configuration value (e.g. a negative
> 'hbase.busy.wait.multiplier.max') when altering the table, causing
> thousands of online regions to fail to open and leading to a production
> incident.
> # The configuration of a table is modified, but the change is not urgent, so
> the regions do not need to enter RIT immediately.
> -'alter_lazy' is a new command to modify a table without reopening any online
> regions except those regions were assigned by other threads or split etc.-
>
> Provide an optional lazy_mode for the alter command that modifies the
> TableDescriptor without the table's regions entering RIT. The modification
> takes effect for each region the next time that region is reopened.