[ 
https://issues.apache.org/jira/browse/HBASE-28151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17776353#comment-17776353
 ] 

Andrew Kyle Purtell commented on HBASE-28151:
---------------------------------------------

Let's not bring back hbck1-style complex arguments. They were one of the design 
mistakes we made with hbck1. 

Consider the hbck2 documentation 
(https://github.com/apache/hbase-operator-tools/blob/master/hbase-hbck2/README.md#philosophy):
 
{quote}
HBCK2 performs a single, discrete task each time it is run. It does not presume 
a tool can analyze all about the running cluster and then repair 'all problems' 
found as hbck1 used suggest.
HBCK2 is for fixes.
{quote}

Requirements:
- Simplicity and predictability. Each command does a single, discrete task each 
time it is run. Options are only added if absolutely necessary.
- Commands have clear and simple names.
- Command arguments have simple names. The current implementation shows they 
are all UNIX-like. Maintain this naming philosophy: each argument gets a short 
name (a single character) and a long name. Consider this resource: 
https://nullprogram.com/blog/2020/08/01/

If we want to keep the preflight checks by default even when bypassing, provide 
a simple argument like '-f' (force) in addition to '-o' (bypass). When the 
-f option is not provided, keep the preflight check.
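
As a rough sketch of the option shape described above, and not HBCK2's actual 
parser (which is Java), the following uses Python's argparse. The command name, 
the help strings, the long name '--force', and the rule that the check is 
skipped only when both options are given are assumptions made for illustration; 
'-f' does not exist today.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for an 'assigns'-style command; not the real HBCK2 parser.
    parser = argparse.ArgumentParser(prog="assigns")
    # Each option has a single-character short name and a long name.
    parser.add_argument("-o", "--override", action="store_true",
                        help="override the procedure attached to the region")
    # '-f/--force' is the option proposed in this comment; it is an assumption.
    parser.add_argument("-f", "--force", action="store_true",
                        help="also skip the preflight check")
    parser.add_argument("regions", nargs="+", help="encoded region names")
    return parser


def runs_preflight_check(args: argparse.Namespace) -> bool:
    # Assumed rule: the check is skipped only when both -o and -f are given.
    return not (args.override and args.force)
```

With this shape, "assigns -o REGION" still runs the preflight check, and only 
"assigns -o -f REGION" skips it.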

The procedure framework and implementations will require updates to incorporate 
the distinction between bypass with preflight checks and bypass without 
preflight checks. 
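
To make that distinction concrete, here is a toy model in Python. It is not the 
procedure framework's code: RegionNode, PreflightCheckError, schedule_assign, 
and the 'force' parameter are all hypothetical names, and the preflight check 
is reduced to a single "was this region split?" test.

```python
from dataclasses import dataclass
from typing import Optional


class PreflightCheckError(Exception):
    """Raised when a region is not in a state that allows the transition."""


@dataclass
class RegionNode:
    # Toy stand-in for the master's per-region state; all fields are hypothetical.
    name: str
    is_split_parent: bool = False
    attached_procedure: Optional[str] = None


def schedule_assign(node: RegionNode, override: bool = False, force: bool = False) -> str:
    # Preflight check: runs unless force is explicitly requested.
    if not force and node.is_split_parent:
        raise PreflightCheckError(f"{node.name} was split; refusing to assign")
    # Override only decides whether an already attached procedure may be replaced.
    if node.attached_procedure is not None and not override:
        raise RuntimeError(f"{node.name} is owned by {node.attached_procedure}")
    node.attached_procedure = "assign"
    return node.attached_procedure
```

In this model, override alone replaces a stuck procedure but still refuses to 
assign a split parent; the check is skipped only when force is passed as well.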

> hbck -o should not allow bypassing pre transit check by default
> ---------------------------------------------------------------
>
>                 Key: HBASE-28151
>                 URL: https://issues.apache.org/jira/browse/HBASE-28151
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 2.4.17, 2.5.5
>            Reporter: Viraj Jasani
>            Priority: Major
>
> When an operator uses hbck assigns or unassigns with "-o", the override will 
> also skip pre transit checks. While this is one of the intentions of "-o", 
> the primary purpose should still be only to detach the existing procedure from 
> the RegionStateNode so that the newly scheduled assign proc can take the 
> exclusive region-level lock.
> We should restrict bypassing preTransitCheck by only providing it as a site 
> config.
> If bypassing preTransitCheck is configured, only then should any hbck "-o" be 
> allowed to bypass this check; otherwise, by default, it should go through the 
> check.
>  
> It is important to keep "unset of the procedure from RegionStateNode" and 
> "bypassing preTransitCheck" separate so that when the cluster state is bad, 
> we don't deteriorate it further. For example, if a region was successfully 
> split and the operator then performs "hbck assigns \{region} -o", and this 
> bypasses the transit check, the master would bring the region online, and it 
> could compact store files and archive a store file that is referenced by a 
> daughter region. This would not allow the daughter region to come online.
> Let's introduce hbase site config to allow bypassing preTransitCheck, it 
> should not be doable only by operator using hbck alone.
>  
> "-o" should mean "override" the procedure that is attached to the 
> RegionStateNode, it should not mean forcefully skip any region transition 
> validation checks.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
