[
https://issues.apache.org/jira/browse/HBASE-28151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17776353#comment-17776353
]
Andrew Kyle Purtell commented on HBASE-28151:
---------------------------------------------
Let's not bring back hbck1-style complex arguments. They were one of the design
mistakes we made with hbck1.
Consider the hbck2 documentation
(https://github.com/apache/hbase-operator-tools/blob/master/hbase-hbck2/README.md#philosophy):
{quote}
HBCK2 performs a single, discrete task each time it is run. It does not presume
a tool can analyze all about the running cluster and then repair 'all problems'
found as hbck1 used suggest.
HBCK2 is for fixes.
{quote}
Requirements:
- Simplicity and predictability. Each command does a single, discrete task each
time it is run. Options are only added if absolutely necessary.
- Commands have clear and simple names.
- Command arguments have simple names. The current implementation shows that
they are all UNIX-like. Maintain this naming philosophy: each option has a short
name (a single character) and a long name. Consider this resource:
https://nullprogram.com/blog/2020/08/01/
If we want to keep the preflight checks by default even when bypassing, provide
a simple argument like '-f' (force) in addition to '-o' (bypass). When the -f
option is not provided, keep the preflight check.
The procedure framework and implementations will require updates to incorporate
the distinction between bypass with preflight checks and bypass without
preflight checks.
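As a rough illustration of the flag semantics described above, here is a minimal
Java sketch. It is not HBase or hbck2 code: the class AssignOptions, the Mode
enum, the long name --force, and the rule that -f is rejected without -o are all
assumptions made for the example. Combined short flags such as -of are not
handled.

```java
import java.util.List;

/** Hypothetical sketch of the proposed -o / -f semantics; not HBase or hbck2 code. */
final class AssignOptions {

    /** What a single assigns/unassigns request asks the master to do. */
    enum Mode {
        NORMAL,              // no flags: nothing is overridden, preflight check runs
        OVERRIDE_WITH_CHECK, // -o: override the attached procedure, preflight check still runs
        OVERRIDE_SKIP_CHECK  // -o -f: override the attached procedure and skip the preflight check
    }

    final boolean override;
    final boolean force;

    AssignOptions(boolean override, boolean force) {
        this.override = override;
        this.force = force;
    }

    /** Parses UNIX-like options: a single-character short name and a long name each. */
    static AssignOptions parse(List<String> args) {
        boolean override = false;
        boolean force = false;
        for (String arg : args) {
            switch (arg) {
                case "-o":
                case "--override":
                    override = true;
                    break;
                case "-f":
                case "--force": // hypothetical long name, assumed for this sketch
                    force = true;
                    break;
                default:
                    // Anything else is treated as a region name and ignored here.
                    break;
            }
        }
        // Assumption of this sketch: -f is only meaningful in addition to -o.
        if (force && !override) {
            throw new IllegalArgumentException("-f requires -o");
        }
        return new AssignOptions(override, force);
    }

    Mode mode() {
        if (!override) {
            return Mode.NORMAL;
        }
        return force ? Mode.OVERRIDE_SKIP_CHECK : Mode.OVERRIDE_WITH_CHECK;
    }

    /** The preflight check is kept unless -f was given together with -o. */
    boolean runsPreflightCheck() {
        return mode() != Mode.OVERRIDE_SKIP_CHECK;
    }
}
```

With this shape, "-o" alone keeps the preflight check, and only "-o -f" skips
it, so each option has one predictable meaning.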
> hbck -o should not allow bypassing pre transit check by default
> ---------------------------------------------------------------
>
> Key: HBASE-28151
> URL: https://issues.apache.org/jira/browse/HBASE-28151
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 2.4.17, 2.5.5
> Reporter: Viraj Jasani
> Priority: Major
>
> When an operator uses hbck assigns or unassigns with "-o", the override also
> skips the pre-transit checks. While this is one of the intentions of "-o",
> its primary purpose should still be only to detach the existing procedure from
> the RegionStateNode so that the newly scheduled assign procedure can take the
> exclusive region-level lock.
> We should restrict bypassing preTransitCheck by making it available only as a
> site config.
> Only if bypassing preTransitCheck is configured should any hbck "-o" be
> allowed to bypass this check; otherwise, by default, it should go through the
> check.
>
> It is important to keep "unset of the procedure from RegionStateNode" and
> "bypassing preTransitCheck" separate so that when the cluster state is bad,
> we don't make it worse. For example, if a region was successfully split and
> an operator then runs "hbck assigns {region} -o", and that bypasses the
> transit check, the master would bring the parent region online. The region
> could then compact store files and archive a store file that is still
> referenced by a daughter region, which would prevent the daughter region from
> coming online.
> Let's introduce an hbase site config to allow bypassing preTransitCheck; it
> should not be possible for an operator to do this using hbck alone.
>
> "-o" should mean "override" the procedure that is attached to the
> RegionStateNode; it should not mean forcefully skipping any region transition
> validation checks.