[ 
https://issues.apache.org/jira/browse/HBASE-28151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17774989#comment-17774989
 ] 

Andrew Kyle Purtell edited comment on HBASE-28151 at 10/13/23 4:46 PM:
-----------------------------------------------------------------------

hbck2 is meant to be used by people who know what they are doing. Preventing 
the operator from doing something removes the value. 
Adding a config variable does not add any safety. Operators will see that in 
order for '-o' to be useful as promised they must set that config, which 
renders the change moot.  

bq. It is important to keep "unset of the procedure from RegionStateNode" and 
"bypassing preTransitCheck" separate so that when the cluster state is bad, we 
don't explicitly deteriorate it further

I would approach this by asking for operator confirmation if preTransitCheck 

So make these things separate options or hbck2 commands, and that solves the 
problem without reducing the freedom of an operator to solve an operational 
challenge.  
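
As an illustration only, here is a minimal sketch of what keeping the two 
actions separate could look like. It is not hbck2's actual interface: the flag 
names `--detach-procedure` and `--skip-pre-transit-check`, the `OverrideAction` 
enum, and the `parse` helper are all hypothetical. It only shows that each 
action can be requested on its own, so neither implies the other.

```java
import java.util.EnumSet;
import java.util.Set;

public class OverrideOptionsSketch {

    /** Hypothetical, independent override actions; these are not real hbck2 flags. */
    enum OverrideAction {
        DETACH_PROCEDURE,       // unset the procedure attached to the RegionStateNode
        SKIP_PRE_TRANSIT_CHECK  // bypass preTransitCheck
    }

    /** Hypothetical parser: each flag enables exactly one action. */
    static Set<OverrideAction> parse(String... args) {
        Set<OverrideAction> actions = EnumSet.noneOf(OverrideAction.class);
        for (String arg : args) {
            switch (arg) {
                case "--detach-procedure":
                    actions.add(OverrideAction.DETACH_PROCEDURE);
                    break;
                case "--skip-pre-transit-check":
                    actions.add(OverrideAction.SKIP_PRE_TRANSIT_CHECK);
                    break;
                default:
                    throw new IllegalArgumentException("Unknown option: " + arg);
            }
        }
        return actions;
    }

    public static void main(String[] args) {
        // Detaching the procedure alone leaves the pre-transit check in force.
        Set<OverrideAction> actions = parse("--detach-procedure");
        System.out.println("Requested: " + actions);
        System.out.println("Skips preTransitCheck: "
            + actions.contains(OverrideAction.SKIP_PRE_TRANSIT_CHECK));
    }
}
```

With this shape the operator keeps full freedom: passing both hypothetical 
flags requests both actions, and doing so is an explicit choice each time.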


was (Author: apurtell):
hbck2 is meant to be used by people who know what they are doing. Preventing 
the operator from doing something removes the value. 
Adding a config variable does not add any safety. Operators will see that in 
order for '-o' to be useful as promised they must set that config, which 
renders the change moot.  

bq. It is important to keep "unset of the procedure from RegionStateNode" and 
"bypassing preTransitCheck" separate so that when the cluster state is bad, we 
don't explicitly deteriorate it further

I would approach this by asking for operator confirmation if preTransitCheck 

So make these things separate options or hbck2 commands, and that solves the 
problem without reducing the freedom of an operator to solve a problem. 

> hbck -o should not allow bypassing pre transit check by default
> ---------------------------------------------------------------
>
>                 Key: HBASE-28151
>                 URL: https://issues.apache.org/jira/browse/HBASE-28151
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 2.4.17, 2.5.5
>            Reporter: Viraj Jasani
>            Priority: Major
>
> When an operator uses hbck assigns or unassigns with "-o", the override will 
> also skip pre transit checks. While this is one of the intentions with "-o", 
> the primary purpose should still be only to detach the existing procedure 
> from the RegionStateNode so that the newly scheduled assign proc can take the 
> exclusive region level lock.
> We should restrict bypassing preTransitCheck by only providing it as a site 
> config.
> Only if bypassing preTransitCheck is configured should any hbck "-o" be 
> allowed to bypass this check; otherwise, by default, it should go through 
> the check.
>  
> It is important to keep "unset of the procedure from RegionStateNode" and 
> "bypassing preTransitCheck" separate so that when the cluster state is bad, 
> we don't explicitly deteriorate it further. For example, if a region was 
> successfully split and an operator now performs "hbck assigns {region} -o" 
> and it bypasses the transit check, the master would bring the region online, 
> and it could compact store files and archive a store file that is referenced 
> by a daughter region. This would not allow the daughter region to come online.
> Let's introduce an hbase site config to allow bypassing preTransitCheck; it 
> should not be doable by an operator using hbck alone.
>  
> "-o" should mean "override" the procedure that is attached to the 
> RegionStateNode; it should not mean forcefully skipping any region transition 
> validation checks.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
