Re: [DISCUSS] Direction of HBCK2

Andrew Purtell Thu, 30 May 2019 14:07:35 -0700

Composable tools are fine if simple and scriptable. 

If you read the thread, I think my complaint is justifiable. It is not simply
that they are lacking. It is that they are lacking and the response to the
concern is a breezy “oh just do this <thing that requires dev with deep
knowledge>”. Just so we are clear about what I am criticizing. Someone needs to
call out in no uncertain terms how operator-unfriendly this position is,
whether intentional or not.


Thanks for the consideration. 

> On May 30, 2019, at 2:00 PM, Josh Elser <els...@apache.org> wrote:
> 
> It sounds to me like you're saying: "No, I don't think compose-able tools are 
> a sufficient substitute in HBCK2 for what HBCK1 did".
> 
> I'm going to just delete everything else I want to write because it's going 
> to turn into a massive argument and de-rail this further. For a second time, 
> please stop the complaints about things that don't exist on this thread. We 
> all know this already.
> 
>> On 5/30/19 12:58 PM, Andrew Purtell wrote:
>> I did a both barrels type response to a suggestion Wellington made that I 
>> hope communicates the right level of dismay at the prevailing line of 
>> thought in this thread.
>> Let me say I agree hbck 1 was sometimes oversold as a magic tool.
>> However, if you analyze all of its options and then look to branch 2, where 
>> are the gaps? In branch 1 there is a command line tool that can be executed 
>> by operations and first level support. Its options can be described in a 
>> runbook with cut and paste examples. In branch 2 ... ?
>> There appears no ready solution for detecting and deploying undeployed 
>> “missing” regions.
>> There appears no ready solution for fixing a failed split or merge or other 
>> corruption producing a hole or overlap in the region chain.
>> There appears no tool capable of rebuilding meta from scratch from HDFS 
>> level metadata; a last but crucial resort as this is what holds the line 
>> against a complete and time intensive restore from backup.
>> I may have an incorrect impression of some of this. If so that would be a 
>> big relief. If not these are suggested areas of focus.
>> I’m not saying that 2 needs Hbck exactly as it is in 1. However the lack of 
>> simple recovery tools or actions that can be taken by a non expert guided by 
>> a runbook means the risk to operations when there is the inevitable problem 
>> is higher. And I don’t mean theoretical problems. I mean the commonly 
>> occurring issues Hbck 1 was coded up to address in a mostly automated way, 
>> like failed splits or failed deployments or simple HDFS level corruptions 
>> like loss of meta region hfiles. Lacking simple tooling, our operations will 
>> have to do <something> more complex, labor intensive, and/or risky. This 
>> factors into the major version upgrade risk analysis.
>> What I would advise is an analysis that enumerates all of the risks and 
>> specific conditions Hbck 1 addresses, then excludes those not relevant for 
>> the 2 code base, then excludes those which have easy and simple tools 
>> existing right now to solve. What you have left is a list of action items. 
>> Then there should be an analysis of the new risks in 2 given AMv2's theory of 
>> operation. For example, for each procedure-based action: if the procedure is 
>> always failing, how can the operator recover the prerequisites for successful 
>> completion? The outcome should be a simple tool or option for applying a fix 
>> or remediation to cluster state.
>>> On May 30, 2019, at 7:16 AM, Josh Elser <els...@apache.org> wrote:
>>> 
>>> Right, this discussion isn't meant to imply that any of this exists -- 
>>> instead, I wanted to make sure we're focused on building tooling which both 
>>> devs and users will find usable and effective.
>>> 
>>> What's your gut-reaction to what I suggested? I think you're saying you see 
>>> operators having to apply more understanding/insight to fix a "complex 
>>> problem" as taking on more risk which you'd have to weigh. In other words, 
>>> anything less than the verbatim "fix these problems" flags you mentioned 
>>> earlier would require you to do the risk-analysis math if moving to HBase2?
>>> 
>>> Thanks for your insights.
>>> 
>>>> On 5/29/19 4:45 PM, Andrew Purtell wrote:
>>>> I have yet to see essential HBCK functions in 1 replaced by anything -
>>>> documentation, script, hbck2, whatever.
>>>> Do we have a tool or script in HBase 2 that can rebuild meta from HDFS
>>>> state? This would be faster than a complete restore from backup. It would
>>>> be useful and important to offer this option to operators, but not
>>>> essential, because it could be valid to say if meta is screwed so are you
>>>> and you have to restore completely from backup. Meta is small, a fraction
>>>> of total data footprint. Seems a real shame to impose such a high cost when
>>>> there could be an alternative. I'd have to think for a while about
>>>> accepting this kind of operational risk when HBase 1 has such tooling.
>>>> What I am more worried about is this: Do we have a tool or script in HBase
>>>> 2 that can fix errors in the region chain caused by failed splits, failed
>>>> merges, or double assignment? It seems not, and the implications for
>>>> service availability are not good when compared with HBase 1. With HBase 1,
>>>> hbck is an option. Sure, it has a lot of problematic aspects, but I have
>>>> seen it recover a cluster's total availability with fairly fast execution.
>>>> It could be valid, not saying I agree with this point, to clearly document
>>>> that all aspects of recovery from corrupted metadata are the responsibility
>>>> of the operator; at least this is full disclosure. We can then weigh the
>>>> cost and risk associated with this policy when deciding if ever to upgrade.
>>>>> On Wed, May 29, 2019 at 1:13 PM Josh Elser <els...@apache.org> wrote:
>>>>> My understanding was that recreating sweeping "fix it" flags was an
>>>>> anti-goal of HBCK2, but I'm surprised a grey-beard hasn't come in to
>>>>> confirm/dispute that :). I could be taking that out of context or my dog
>>>>> remembers things better than I do.
>>>>> 
>>>>> The reasoning behind this line of thinking for HBCK2 is:
>>>>> 
>>>>> * Smaller actions are easier to implement correctly and be well-tested
>>>>> * The more complex the action, the more likely it is for something we
>>>>> (as devs) didn't expect to happen which results in a bug.
>>>>> 
>>>>> The "stretch" in my mind is that we can string together small actions to
>>>>> recreate the bigger ones (the fix* type commands from hbck1), *but*
>>>>> teach operators to apply knowledge about their cluster instead of
>>>>> treating hbck like a black box.
>>>>> 
>>>>> For example, we could try to decompose something like fixAssignments into
>>>>> something like: `for region in $(list non-open regions); do assign
>>>>> $region; done`. As developers, we don't have to catch every edge case of
>>>>> _something_ that might be specific to the admin's actual situation (e.g.
>>>>> what if a table is disabled and we don't want to assign those regions)
>>>>> and it lets us write better test cases.
>>>>> 
>>>>> Again, this is what I have floating around in my head -- nothing more
>>>>> than that at present.
>>>>> 
>>>>>> On 5/29/19 11:54 AM, Andrew Purtell wrote:
>>>>>> To me this is a succinct specification of minimum functionality for a
>>>>>> recovery tool: using on disk bits, rebuild meta table, with end result a
>>>>>> working cluster that did not miss any data during the reconstruction.
>>>>>> 
>>>>>> Of course focusing on root causes of metadata mismanagement is
>>>>>> appropriate when investigating a specific incident, but this is
>>>>>> orthogonal from the question of whether or not recovery is possible
>>>>>> after a bug corrupts metadata. It is customary for filesystems and
>>>>>> databases to ship with a tool that attempts recovery after corruption,
>>>>>> on the (correct, IMHO) assumption that corruption is inevitable, either
>>>>>> due to logic bug, hardware problems, or operator error.
>>>>>> 
>>>>>> The features of hbck in HBase 1 that have resolved availability problems
>>>>>> where I work are: fixMeta, fixAssignments, fixHdfsHoles, fixHdfsOverlaps.
>>>>>> In HBaseFsck.java in branch-2 these are all in the unsupported options
>>>>>> set.
>>>>>> Because these are all lacking in HBase 2 I will not certify it ready for
>>>>>> production to my employer. If there is some other tool which offers these
>>>>>> recovery options I'm not aware of it nor documentation for it and would
>>>>>> appreciate a pointer if you have one.
>>>>>> 
>>>>>> 
>>>>>> On Wed, May 29, 2019 at 7:11 AM Toshihiro Suzuki <brfrn...@apache.org>
>>>>>> wrote:
>>>>>> 
>>>>>>> Thanks Wellington.
>>>>>>> 
>>>>>>>> I guess those can still be fixed with some combinations of commands
>>>>>>>> today, such as merge/assign.
>>>>>>> 
>>>>>>> Let me explain the situation I faced in the customer's cluster a little
>>>>>>> bit more. It seemed like the table data in HDFS was intact, but they had
>>>>>>> lost some of the table's metadata (in hbase:meta). So I needed to
>>>>>>> rebuild the meta from HDFS data. In this case, can we still fix it with
>>>>>>> some combination of commands today? If so, I would appreciate it if you
>>>>>>> could suggest the steps to me.
>>>>>>> 
>>>>>>>> And focus on fixing the main root cause of such problems, as a means to
>>>>>>>> reduce the need to use such commands.
>>>>>>> 
>>>>>>> Yes, correct. Actually I usually do that. But I didn't do that in that
>>>>>>> case.
>>>>>>> 
>>>>>>> 
>>>>>>> On Wed, May 29, 2019 at 5:47 AM Wellington Chevreuil <
>>>>>>> wellington.chevre...@gmail.com> wrote:
>>>>>>> 
>>>>>>>> Thanks Toshihiro! I guess those can still be fixed with some
>>>>>>>> combinations of commands today, such as merge/assign. Of course, it
>>>>>>>> requires some extra scripting and log reading in cases where many
>>>>>>>> regions are in an inconsistent state, so maybe we should work on
>>>>>>>> providing a one-liner command that relies on the existing ones. And
>>>>>>>> focus on fixing the main root cause of such problems, as a means to
>>>>>>>> reduce the need to use such commands.
>>>>>>>> 
>>>>>>>> I'm not really a fan of OfflineMetaRepair, nor of hbck1 fix
>>>>>>>> holes/overlaps, and would rather not have those back. Sure, those are
>>>>>>>> easy and convenient to trigger, but hbck1 reports are sometimes
>>>>>>>> misleading (for instance, it reports holes when region(s) on the chain
>>>>>>>> is/are simply not online), and that, combined with the availability of
>>>>>>>> such heavy hammers, has led inexperienced operators to run them and
>>>>>>>> end up in a worse state.
>>>>>>>> 
>>>>>>>> On Wed, May 29, 2019 at 13:22, Toshihiro Suzuki <brfrn...@apache.org>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Hi Wellington,
>>>>>>>>> 
>>>>>>>>> I actually saw table holes in a customer's cluster, and I just fixed
>>>>>>>>> the issues with the workaround I mentioned in HBASE-21665
>>>>>>>>> <https://issues.apache.org/jira/browse/HBASE-21665>. I didn't dig
>>>>>>>>> into why the table holes happened at that time because the customer
>>>>>>>>> didn't want me to.
>>>>>>>>> 
>>>>>>>>> However, IMO, whatever the reason, I think we should have a direct
>>>>>>>>> way to fix holes and overlaps.
>>>>>>>>> 
>>>>>>>>> On Wed, May 29, 2019 at 4:57 AM Wellington Chevreuil <
>>>>>>>>> wellington.chevre...@gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>>> So JMS, Toshihiro, it seems like upgrading from some 1.x to 2.x
>>>>>>>>>> consistently triggers this problem? Do you guys know if there are
>>>>>>>>>> any bug jiras open that would cover these scenarios? If not, and if
>>>>>>>>>> you guys have enough resources for investigating it, maybe it is
>>>>>>>>>> worth opening a specific jira?
>>>>>>>>>> 
>>>>>>>>>> On Wed, May 29, 2019 at 11:40, Jean-Marc Spaggiari <
>>>>>>>>>> jean-m...@spaggiari.org> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Personally, when I tried to upgrade from 1.4.x to 2.2.x I ended up
>>>>>>>>>>> in a situation where my meta was empty and had to get it repaired,
>>>>>>>>>>> but I lacked OfflineMetaRepair for 2.2.x, so I just had to delete
>>>>>>>>>>> all my tables, get a brand new installation, recreate the tables
>>>>>>>>>>> and bulkload the data back into them. I would have been happy to
>>>>>>>>>>> have an OfflineMetaRepair.
>>>>>>>>>>> 
>>>>>>>>>>> But it's more like an experimental cluster than a production one...
>>>>>>>>>>> 
>>>>>>>>>>> JMS
>>>>>>>>>>> 
>>>>>>>>>>> On Wed, May 29, 2019 at 06:36, Wellington Chevreuil <
>>>>>>>>>>> wellington.chevre...@gmail.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> Interesting, I haven't seen any cases where OfflineMetaRepair was
>>>>>>>>>>>> really required among our customer base (running
>>>>>>>>>>>> cdh6.1.x/hbase2.1.1, cdh6.2/hbase2.1.2). The majority of RIT
>>>>>>>>>>>> issues I have come across on hbase 2.x were related to AP/SCP
>>>>>>>>>>>> failures, most of which could be sorted out with the hbck2
>>>>>>>>>>>> commands available by then (in some cases, this required some CLI
>>>>>>>>>>>> scripting to build up a "bulk" assign command).
>>>>>>>>>>>> 
>>>>>>>>>>>> On Wed, May 29, 2019 at 00:55, Toshihiro Suzuki <
>>>>>>>>>>>> brfrn...@apache.org> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> Hi Josh,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thank you for the explanation. I agree with the direction for
>>>>>>>>>>>>> HBCK2.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> The problem I wanted to tell you about in the Jira is that until
>>>>>>>>>>>>> we implement the features you mentioned, we don't have any
>>>>>>>>>>>>> direct way to fix holes and overlaps. The holes and overlaps can
>>>>>>>>>>>>> be created by bugs or operation errors, so I think we should be
>>>>>>>>>>>>> able to fix these issues.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I thought OfflineMetaRepair could be a workaround for the issues
>>>>>>>>>>>>> until we implement the features of HBCK2.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Toshi
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Tue, May 28, 2019 at 9:12 AM Josh Elser <els...@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Context: https://issues.apache.org/jira/browse/HBASE-21665
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I left a comment on the above issue about what I thought good
>>>>>>>>>>>>>> things to build into HBCK2 would be -- a focus on specific
>>>>>>>>>>>>>> "primitive" operations that an admin/operator could use to help
>>>>>>>>>>>>>> repair an otherwise broken HBase installation. Some examples I
>>>>>>>>>>>>>> had in my head were:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> * Create an empty region (to plug a hole)
>>>>>>>>>>>>>> * Report holes in a region chain
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> In my head, the difference for HBCK2 was that we want to give
>>>>>>>>>>>>>> folks the tools to fix their cluster, but we did not want to own
>>>>>>>>>>>>>> the "just fix everything" kind of tool that HBCK1 had become.
>>>>>>>>>>>>>> The problem with HBCK1 was that it was often
>>>>>>>>>>>>>> difficult/problematic for us to know how to correctly fix a
>>>>>>>>>>>>>> problem (the same problem could be corrected in different ways).
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Andrew had some confusion about this, so I'm not sure if I'm
>>>>>>>>>>>>>> off-base or if we're all in agreement on direction and we just
>>>>>>>>>>>>>> need to do a better job documenting things. Thanks for keeping
>>>>>>>>>>>>>> me honest either way :)
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> And just in case it doesn't go without saying, HBCK2 would be
>>>>>>>>>>>>>> something that helps fix a system, while we want to always
>>>>>>>>>>>>>> understand the root cause of how/why we got into a situation
>>>>>>>>>>>>>> where we needed HBCK2 and also address that.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> - Josh
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>
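
The thread describes two kinds of small, composable primitives: reporting holes
in a region chain, and assigning only those non-open regions that should be
assigned (for example, skipping disabled tables). The following is a minimal
Python sketch of that idea, not HBCK2 code. The Region type and the functions
chain_problems and regions_to_assign are hypothetical names introduced here for
illustration; a real tool would read this state from hbase:meta or HDFS.

```python
from typing import List, NamedTuple, Set, Tuple


class Region(NamedTuple):
    """Hypothetical, simplified view of one region's metadata."""
    table: str
    name: str
    start: bytes   # inclusive start key; b"" means "start of table"
    end: bytes     # exclusive end key; b"" means "end of table"
    is_open: bool


def chain_problems(regions: List[Region]) -> List[Tuple[str, bytes, bytes]]:
    """Report holes and overlaps in ONE table's region chain.

    Returns (kind, from_key, to_key) tuples, where kind is "hole" or "overlap".
    """
    problems: List[Tuple[str, bytes, bytes]] = []
    ordered = sorted(regions, key=lambda r: r.start)
    if not ordered:
        return problems
    if ordered[0].start != b"":
        problems.append(("hole", b"", ordered[0].start))
    covered_to = ordered[0].end  # b"" means coverage reaches end of table
    for region in ordered[1:]:
        if covered_to == b"":
            # Earlier regions already cover to the end of the table.
            problems.append(("overlap", region.start, region.end))
            continue
        if region.start > covered_to:
            problems.append(("hole", covered_to, region.start))
        elif region.start < covered_to:
            overlap_end = covered_to if region.end == b"" else min(covered_to, region.end)
            problems.append(("overlap", region.start, overlap_end))
        if region.end == b"" or region.end > covered_to:
            covered_to = region.end
    if covered_to != b"":
        problems.append(("hole", covered_to, b""))
    return problems


def regions_to_assign(regions: List[Region], disabled_tables: Set[str]) -> List[str]:
    """Pick non-open regions to assign, skipping regions of disabled tables."""
    return [r.name for r in regions
            if not r.is_open and r.table not in disabled_tables]
```

An operator script could print the output of chain_problems as a report and feed
the names returned by regions_to_assign to whatever assign command is available.
The decision about which tables count as disabled stays with the operator, which
is the kind of cluster-specific knowledge the thread argues a single "fix
everything" flag cannot safely assume.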
