I did a both-barrels-type response to a suggestion Wellington made, which I hope communicates the right level of dismay at the prevailing line of thought in this thread.
Let me say I agree hbck1 was sometimes oversold as a magic tool. However, if you analyze all of its options and then look to branch-2, where are the gaps? In branch-1 there is a command-line tool that can be executed by operations and first-level support. Its options can be described in a runbook with cut-and-paste examples. In branch-2 ... ?

There appears to be no ready solution for detecting and deploying undeployed "missing" regions. There appears to be no ready solution for fixing a failed split or merge or other corruption producing a hole or overlap in the region chain. There appears to be no tool capable of rebuilding meta from scratch from HDFS-level metadata; a last but crucial resort, as this is what holds the line against a complete and time-intensive restore from backup. I may have an incorrect impression of some of this. If so, that would be a big relief. If not, these are suggested areas of focus.

I'm not saying that 2 needs hbck exactly as it is in 1. However, the lack of simple recovery tools or actions that can be taken by a non-expert guided by a runbook means the risk to operations when there is the inevitable problem is higher. And I don't mean theoretical problems. I mean the commonly occurring issues hbck1 was coded up to address in a mostly automated way, like failed splits or failed deployments or simple HDFS-level corruptions like loss of meta region hfiles. Lacking simple tooling, our operations will have to do <something> more complex, labor-intensive, and/or risky. This factors into the major version upgrade risk analysis.

What I would advise is an analysis that enumerates all of the risks and specific conditions hbck1 addresses, then excludes those not relevant for the 2 code base, then excludes those which have easy and simple tools existing right now to solve. What you have left is a list of action items.
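For reference, the kind of cut-and-paste runbook entry branch-1 supports is short. A sketch using hbck1's documented flags (branch-1 only; these sit in the unsupported options set on branch-2). The commands are echoed here rather than executed; on a live branch-1 cluster an operator would drop the leading "echo":

```shell
# hbck1 runbook sketch (branch-1 only). Commands are printed, not run;
# drop the leading "echo" to execute against a live cluster.
echo hbase hbck                                   # read-only inconsistency report
echo hbase hbck -fixAssignments                   # repair unassigned/doubly assigned regions
echo hbase hbck -fixMeta                          # reconcile hbase:meta with HDFS state
echo hbase hbck -fixHdfsHoles -fixHdfsOverlaps    # patch holes/overlaps in the region chain
```

This is exactly the shape of thing first-level support can run from a document, which is the gap being argued here.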
Then there should be an analysis of the new risks in 2 given AMv2's theory of operation. For example, for each procedure-based action: if the procedure is always failing, how can the operator recover the prerequisites for successful completion? Provide a simple tool or option for applying a fix or remediation to cluster state.

> On May 30, 2019, at 7:16 AM, Josh Elser <els...@apache.org> wrote:
>
> Right, this discussion isn't meant to imply that any of this exists -- instead, I wanted to make sure we're focused on building tooling which both devs and users will find usable and effective.
>
> What's your gut reaction to what I suggested? I think you're saying you see operators having to apply more understanding/insight to fix a "complex problem" as taking on more risk which you'd have to weigh. In other words, anything less than the verbatim "fix these problems" flags you mentioned earlier would require you to do the risk-analysis math if moving to HBase 2?
>
> Thanks for your insights.
>
>> On 5/29/19 4:45 PM, Andrew Purtell wrote:
>> I have yet to see essential HBCK functions in 1 replaced by anything - documentation, script, hbck2, whatever.
>>
>> Do we have a tool or script in HBase 2 that can rebuild meta from HDFS state? This would be faster than a complete restore from backup. It would be useful and important to offer this option to operators, but not essential, because it could be valid to say if meta is screwed so are you and you have to restore completely from backup. Meta is small, a fraction of total data footprint. Seems a real shame to impose such a high cost when there could be an alternative. I'd have to think for a while about accepting this kind of operational risk when HBase 1 has such tooling.
>>
>> What I am more worried about is this: Do we have a tool or script in HBase 2 that can fix errors in the region chain caused by failed splits, failed merges, or double assignment?
>> It seems not, and the implications for service availability are not good when compared with HBase 1. With HBase 1, hbck is an option. Sure, it has a lot of problematic aspects, but I have seen it recover a cluster's total availability with fairly fast execution.
>>
>> It could be valid, not saying I agree with this point, to clearly document that all aspects of recovery from corrupted metadata are the responsibility of the operator; at least this is full disclosure. We can then weigh the cost and risk associated with this policy when deciding if ever to upgrade.
>>
>>> On Wed, May 29, 2019 at 1:13 PM Josh Elser <els...@apache.org> wrote:
>>>
>>> My understanding was that recreating sweeping "fix it" flags was an anti-goal of HBCK2, but I'm surprised a grey-beard hasn't come in to confirm/dispute that :). I could be taking that out of context, or my dog remembers things better than I do.
>>>
>>> The reasoning behind this line of thinking for HBCK2 is:
>>>
>>> * Smaller actions are easier to implement correctly and be well-tested.
>>> * The more complex the action, the more likely it is for something we (as devs) didn't expect to happen, which results in a bug.
>>>
>>> The "stretch" in my mind is that we can string together small actions to recreate the bigger ones (the fix* type commands from hbck1), *but* teach operators to apply knowledge about their cluster instead of treating hbck like a black box.
>>>
>>> For example, we can try to decompose something like fixAssignments into something like: `for region in $(list non-open regions); do assign $region; done`. As developers, we don't have to catch every edge case of _something_ that might be specific to the admin's actual situation (e.g. what if a table is disabled and we don't want to assign those regions), and it lets us write better test cases.
>>>
>>> Again, this is what I have floating around in my head -- nothing more than that at present.
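The loop Josh sketches can be made concrete. A minimal sketch, assuming hbck2's `assigns` command invoked via `hbase hbck -j hbase-hbck2.jar` (per the hbase-operator-tools docs) and a hypothetical report step that produced a file of non-open regions; the encoded region names are made up, and the commands are echoed rather than executed:

```shell
# Stringing together primitive actions, per the decomposition above.
# /tmp/non_open_regions.txt stands in for a hypothetical "list non-open
# regions" report; the encoded region names are made-up samples.
cat > /tmp/non_open_regions.txt <<'EOF'
1588230740
a3f51b2c9d847e
EOF

while read -r region; do
  # Operator judgment goes here (e.g. skip regions of disabled tables),
  # replacing hbck1's built-in heuristics.
  echo hbase hbck -j hbase-hbck2.jar assigns "$region"   # drop "echo" to execute
done < /tmp/non_open_regions.txt
```

The point of contention in the thread is precisely that the judgment in the loop body, which hbck1 encoded once as tested logic, is here left to each operator.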
>>>> On 5/29/19 11:54 AM, Andrew Purtell wrote:
>>>>
>>>> To me this is a succinct specification of minimum functionality for a recovery tool: using on-disk bits, rebuild the meta table, with the end result a working cluster that did not miss any data during the reconstruction.
>>>>
>>>> Of course focusing on root causes of metadata mismanagement is appropriate when investigating a specific incident, but this is orthogonal to the question of whether or not recovery is possible after a bug corrupts metadata. It is customary for filesystems and databases to ship with a tool that attempts recovery after corruption, on the (correct, IMHO) assumption that corruption is inevitable, either due to logic bugs, hardware problems, or operator error.
>>>>
>>>> The features of hbck in HBase 1 that have resolved availability problems where I work are: fixMeta, fixAssignments, fixHdfsHoles, fixHdfsOverlaps. In HBaseFsck.java in branch-2 these are all in the unsupported options set. Because these are all lacking in HBase 2 I will not certify it ready for production to my employer. If there is some other tool which offers these recovery options, I'm not aware of it nor of documentation for it, and would appreciate a pointer if you have one.
>>>>
>>>> On Wed, May 29, 2019 at 7:11 AM Toshihiro Suzuki <brfrn...@apache.org> wrote:
>>>>
>>>>> Thanks Wellington.
>>>>>
>>>>>> I guess those can still be fixed with some combinations of commands today, such as merge/assign.
>>>>>
>>>>> Let me explain the situation I faced in the customer's cluster a little bit more. It seemed like the table data in HDFS was intact but they lost some metadata (in hbase:meta) for the table. So I needed to rebuild the meta from HDFS data. In this case, can we still fix it with some combination of commands today?
>>>>> If so, I would appreciate it if you could suggest the steps to me.
>>>>>
>>>>>> And focus on fixing the main root cause of such problems, as a means to soften the need to use such commands.
>>>>>
>>>>> Yes, correct. Actually I usually do that. But I didn't do that in this case.
>>>>>
>>>>> On Wed, May 29, 2019 at 5:47 AM Wellington Chevreuil <wellington.chevre...@gmail.com> wrote:
>>>>>
>>>>>> Thanks Toshihiro! I guess those can still be fixed with some combinations of commands today, such as merge/assign. Of course, it requires some extra scripting and log reading in cases where many regions are in an inconsistent state; maybe we should work on providing a one-liner command that relies on the currently existing ones. And focus on fixing the main root cause of such problems, as a means to soften the need to use such commands.
>>>>>>
>>>>>> I'm not really a fan of offlinemetarepair, nor of hbck1's fix holes/overlaps; I would rather not have those back. Sure, those are easy and convenient to trigger, but hbck1 reports are sometimes misleading (for instance, it reports holes when region(s) on the chain is/are simply not online), and that, combined with the availability of such heavy hammers, has led inexperienced operators to fall into running it and getting into a worse state.
>>>>>>
>>>>>> On Wed, May 29, 2019 at 13:22, Toshihiro Suzuki <brfrn...@apache.org> wrote:
>>>>>>
>>>>>>> Hi Wellington,
>>>>>>>
>>>>>>> I saw table holes in a customer's cluster, and I just fixed the issues by the workaround I mentioned in HBASE-21665 <https://issues.apache.org/jira/browse/HBASE-21665>. I didn't dig into the reason why the table holes happened at that time because the customer didn't want it.
>>>>>>> However, IMO, whatever the reason, I think we should have a direct way to fix holes and overlaps.
>>>>>>>
>>>>>>> On Wed, May 29, 2019 at 4:57 AM Wellington Chevreuil <wellington.chevre...@gmail.com> wrote:
>>>>>>>
>>>>>>>> So JMS, Toshihiro, it seems like upgrading from some 1.x to 2.x consistently triggers this problem? Do you guys know if there are any bug jiras open that would cover these scenarios? If not, and if you guys have enough resources for investigating it, maybe it's worth opening a specific jira?
>>>>>>>>
>>>>>>>> On Wed, May 29, 2019 at 11:40, Jean-Marc Spaggiari <jean-m...@spaggiari.org> wrote:
>>>>>>>>
>>>>>>>>> Personally, when I tried to upgrade from 1.4.x to 2.2.x I ended up in a situation where my meta was empty and had to get it repaired, but I lacked OfflineMetaRepair for 2.2.x, so I just had to delete all my tables, get a brand new installation, recreate the tables, and bulkload the data back into them. I would have been happy to have an OfflineMetaRepair.
>>>>>>>>>
>>>>>>>>> But it's more like an experimental cluster than a production one...
>>>>>>>>>
>>>>>>>>> JMS
>>>>>>>>>
>>>>>>>>> On Wed, May 29, 2019 at 06:36, Wellington Chevreuil <wellington.chevre...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Interesting, I haven't seen any cases where OfflineMetaRepair was really required among our customer base (running cdh6.1.x/hbase2.1.1, cdh6.2/hbase2.1.2). The majority of RIT issues I came across on hbase 2.x were related to AP/SCP failures, most of which could be sorted with the hbck2 commands available by then (in some cases, it required some CLI scripting to build up a "bulk" assign command).
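The "bulk" assign scripting Wellington describes can be as simple as collapsing a region list into one command line. A sketch, assuming hbck2's `assigns` command accepts multiple encoded region names (as the hbase-operator-tools docs describe; verify against your hbck2 version). Region names are made-up samples, and the command is echoed rather than executed:

```shell
# Build one bulk "assigns" invocation instead of a call per region.
# /tmp/non_open_regions.txt stands in for the output of log reading or a
# report step; the encoded region names are made-up samples.
printf '%s\n' 1588230740 a3f51b2c9d847e > /tmp/non_open_regions.txt

# Collapse the list onto a single command line.
regions=$(tr '\n' ' ' < /tmp/non_open_regions.txt)
echo hbase hbck -j hbase-hbck2.jar assigns $regions   # drop "echo" to execute
```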
>>>>>>>>>> On Wed, May 29, 2019 at 00:55, Toshihiro Suzuki <brfrn...@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Josh,
>>>>>>>>>>>
>>>>>>>>>>> Thank you for the explanation. I agree with the direction for HBCK2.
>>>>>>>>>>>
>>>>>>>>>>> The problem I wanted to tell you about in the Jira is that until we implement the features you mentioned, we don't have any direct way to fix holes and overlaps. The holes and overlaps can be created by bugs or operation errors, so I think we should be able to fix these issues.
>>>>>>>>>>>
>>>>>>>>>>> I thought OfflineMetaRepair could be a workaround for the issues until we implement the features of HBCK2.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Toshi
>>>>>>>>>>>
>>>>>>>>>>> On Tue, May 28, 2019 at 9:12 AM Josh Elser <els...@apache.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Context: https://issues.apache.org/jira/browse/HBASE-21665
>>>>>>>>>>>>
>>>>>>>>>>>> I left a comment on the above issue about what I thought good things to build into HBCK2 would be -- a focus on specific "primitive" operations that an admin/operator could use to help repair an otherwise broken HBase installation. Some examples I had in my head were:
>>>>>>>>>>>>
>>>>>>>>>>>> * Create an empty region (to plug a hole)
>>>>>>>>>>>> * Report holes in a region chain
>>>>>>>>>>>>
>>>>>>>>>>>> In my head, the difference for HBCK2 was that we want to give folks the tools to fix their cluster, but we did not want to own the "just fix everything" kind of tool that HBCK1 had become.
>>>>>>>>>>>> That problem with HBCK1 was that it was often difficult/problematic for us to know how to correctly fix a problem (the same problem could be corrected in different ways).
>>>>>>>>>>>>
>>>>>>>>>>>> Andrew had some confusion about this, so I'm not sure if I'm off-base or if we're all in agreement on direction and we just need to do a better job documenting things. Thanks for keeping me honest either way :)
>>>>>>>>>>>>
>>>>>>>>>>>> And just in case it doesn't go without saying, HBCK2 would be something that helps fix a system, while we want to always understand the root cause of how/why we got into a situation where we needed HBCK2 and also address that.
>>>>>>>>>>>>
>>>>>>>>>>>> - Josh