Re: [DISCUSS] Direction of HBCK2

Wellington Chevreuil Mon, 03 Jun 2019 05:19:56 -0700

My take from this discussion is that some problems are worth adding "simple
to use" hbck2 commands for (such as the missing meta regions one), while some
simpler/less critical problems could have documented recipes that use the
currently available commands. I suppose HBASE-21745 should be used to
triage recurring/potential problems and discuss the ideal solution for each
(whether it demands a new hbck2 command or can be a documented recipe).
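
A documented recipe of the kind discussed here might look like the sketch
below. It is only an illustration, not an existing HBCK2 feature: it mirrors
the fixAssignments decomposition Josh describes further down the thread by
printing (not running) one `assigns` call per region that is not OPEN, and it
lets the operator skip tables they know are disabled. The input format
(`<encoded_region> <state> <table>` per line), the `plan_assigns` name and the
jar path are all assumptions for illustration, and the
`hbase hbck -j <jar> assigns` form follows the HBCK2 README and may differ by
version.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run recipe: turn a region listing into HBCK2 "assigns" calls.
# Assumed input on stdin, one region per line: <encoded_region> <state> <table>
# HBCK2_JAR is a placeholder path; point it at a real hbase-hbck2 jar.

HBCK2_JAR="${HBCK2_JAR:-/path/to/hbase-hbck2.jar}"

# Print one "assigns" command per region that is not OPEN.
# Arguments: names of tables to skip (e.g. tables the operator knows are disabled).
plan_assigns() {
  local skip_tables=" $* "
  local region state table
  while read -r region state table; do
    [ -z "$region" ] && continue          # ignore blank lines
    [ "$state" = "OPEN" ] && continue     # already deployed, nothing to do
    if [ -n "$table" ]; then
      case "$skip_tables" in
        *" $table "*) continue ;;         # operator chose to leave this table alone
      esac
    fi
    # Dry run: only print the command so it can be reviewed first.
    echo "hbase hbck -j $HBCK2_JAR assigns $region"
  done
}
```

For example, `plan_assigns my_disabled_table < regions.txt` (both names are
placeholders) would print the commands for review before anything is run
against the cluster.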


On Thu, May 30, 2019 at 22:17, Josh Elser <els...@apache.org> wrote:

> Great! Thanks for clarifying.
>
> Script-able (and recipes for the common-ish problems -- both those we
> know and those we don't) are definitely goals in my head.
>
> On 5/30/19 5:06 PM, Andrew Purtell wrote:
> > Composable tools are fine if simple and scriptable.
> >
> > If you read the thread I think my complaint is justifiable. It is not that
> > they are lacking. It is that they are lacking and the response to the
> > concern is a breezy “oh just do this <thing that requires dev with deep
> > knowledge>”. Just so we are clear what I am criticizing. Someone needs to
> > call out in no uncertain terms how operator unfriendly this position is,
> > whether intentional or not.
> >
> > Thanks for the consideration.
> >
> >> On May 30, 2019, at 2:00 PM, Josh Elser <els...@apache.org> wrote:
> >>
> >> It sounds to me like you're saying: "No, I don't think compose-able
> tools are a sufficient substitute in HBCK2 for what HBCK1 did".
> >>
> >> I'm going to just delete everything else I want to write because it's
> going to turn into a massive argument and de-rail this further. For a
> second time, please stop the complaints about things that don't exist on
> this thread. We all know this already.
> >>
> >>> On 5/30/19 12:58 PM, Andrew Purtell wrote:
> >>> I did a both barrels type response to a suggestion Wellington made
> that I hope communicates the right level of dismay at the prevailing line
> of thought in this thread.
> >>> Let me say I agree hbck 1 was sometimes oversold as a magic tool.
> >>> However if you analyze all of its options and then look to branch 2,
> >>> where are the gaps? In branch 1 there is a command line tool that can be
> >>> executed by operations and first level support. Its options can be
> >>> described in a runbook with cut and paste examples. In branch 2 ... ?
> >>> There appears no ready solution for detecting and deploying undeployed
> “missing” regions.
> >>> There appears no ready solution for fixing a failed split or merge or
> other corruption producing a hole or overlap in the region chain.
> >>> There appears no tool capable of rebuilding meta from scratch from
> HDFS level metadata; a last but crucial resort as this is what holds the
> line against a complete and time intensive restore from backup.
> >>> I may have an incorrect impression of some of this. If so that would
> be a big relief. If not these are suggested areas of focus.
> >>> I’m not saying that 2 needs Hbck exactly as it is in 1. However the
> >>> lack of simple recovery tools or actions that can be taken by a non expert
> >>> guided by a runbook means the risk to operations when there is the
> >>> inevitable problem is higher. And I don’t mean theoretical problems. I mean
> >>> the commonly occurring issues Hbck 1 was coded up to address in a mostly
> >>> automated way, like failed splits or failed deployments or simple HDFS
> >>> level corruptions like loss of meta region hfiles. Lacking simple tooling
> >>> our operations will have to do <something> more complex, labor intensive,
> >>> and/or risky. This factors into the major version upgrade risk analysis.
> >>> What I would advise is an analysis that enumerates all of the risks
> >>> and specific conditions Hbck 1 addresses, then excludes those not relevant
> >>> for the 2 code base, then excludes those which have easy and simple tools
> >>> existing right now to solve. What you have left is a list of action items.
> >>> Then there should be an analysis of the new risks in 2 given AMv2's theory
> >>> of operation: for example, for each procedure based action, if the
> >>> procedure is always failing, how can the operator recover the prerequisites
> >>> for successful completion? Then provide a simple tool or option for
> >>> applying a fix or remediation to cluster state.
> >>>> On May 30, 2019, at 7:16 AM, Josh Elser <els...@apache.org> wrote:
> >>>>
> >>>> Right, this discussion isn't meant to imply that any of this
> >>>> exists -- instead, I wanted to make sure we're focused on building tooling
> >>>> which both devs and users will find usable and effective.
> >>>>
> >>>> What's your gut-reaction to what I suggested? I think you're saying
> you see operators having to apply more understanding/insight to fix a
> "complex problem" as taking on more risk which you'd have to weigh. In
> other words, anything less than the verbatim "fix these problems" flags you
> mentioned earlier would require you to do the risk-analysis math if moving
> to HBase2?
> >>>>
> >>>> Thanks for your insights.
> >>>>
> >>>>> On 5/29/19 4:45 PM, Andrew Purtell wrote:
> >>>>> I have yet to see essential HBCK functions in 1 replaced by anything -
> >>>>> documentation, script, hbck2, whatever.
> >>>>> Do we have a tool or script in HBase 2 that can rebuild meta from
> HDFS
> >>>>> state? This would be faster than a complete restore from backup. It
> would
> >>>>> be useful and important to offer this option to operators, but not
> >>>>> essential, because it could be valid to say if meta is screwed so
> are you
> >>>>> and you have to restore completely from backup. Meta is small, a
> fraction
> >>>>> of total data footprint. Seems a real shame to impose such a high
> cost when
> >>>>> there could be an alternative. I'd have to think for a while about
> >>>>> accepting this kind of operational risk when HBase 1 has such
> tooling.
> >>>>> What I am more worried about is this: Do we have a tool or script in
> HBase
> >>>>> 2 that can fix errors in the region chain caused by failed splits,
> failed
> >>>>> merges, or double assignment? It seems not, and the implications for
> >>>>> service availability are not good when compared with HBase 1. With
> HBase 1,
> >>>>> hbck is an option. Sure, it has a lot of problematic aspects, but I
> have
> >>>>> seen it recover a cluster's total availability with fairly fast
> execution.
> >>>>> It could be valid, not saying I agree with this point, to clearly
> >>>>> document that all aspects of recovery from corrupted metadata are the
> >>>>> responsibility of the operator; at least this is full disclosure. We can
> >>>>> then weigh the cost and risk associated with this policy when deciding if
> >>>>> ever to upgrade.
> >>>>>> On Wed, May 29, 2019 at 1:13 PM Josh Elser <els...@apache.org>
> wrote:
> >>>>>> My understanding was that recreating sweeping "fix it" flags was an
> >>>>>> anti-goal of HBCK2, but I'm surprised a grey-beard hasn't come in to
> >>>>>> confirm/dispute that :). I could be taking that out of context or my dog
> >>>>>> remembers things better than I do.
> >>>>>>
> >>>>>> The reasoning behind this line of thinking for HBCK2 is:
> >>>>>>
> >>>>>> * Smaller actions are easier to implement correctly and be
> well-tested
> >>>>>> * The more complex the action, the more likely it is for something
> we
> >>>>>> (as devs) didn't expect to happen which results in a bug.
> >>>>>>
> >>>>>> The "stretch" in my mind is that we can string together small
> actions to
> >>>>>> recreate the bigger ones (the fix* type commands from hbck1), *but*
> >>>>>> teach operators to apply knowledge about their cluster instead of
> >>>>>> treating hbck like a black box.
> >>>>>>
> >>>>>> For example, say we decompose something like fixAssignments into
> >>>>>> something like: `for region in $(list non-open regions); do assign
> >>>>>> $region; done`. As developers, we don't have to catch every edge case of
> >>>>>> _something_ that might be specific to the admin's actual situation (e.g.
> >>>>>> what if a table is disabled and we don't want to assign those regions)
> >>>>>> and it lets us write better test cases.
> >>>>>>
> >>>>>> Again, this is what I have floating around in my head -- nothing
> more
> >>>>>> than that at present.
> >>>>>>
> >>>>>>> On 5/29/19 11:54 AM, Andrew Purtell wrote:
> >>>>>>> To me this is a succinct specification of minimum functionality
> for a
> >>>>>>> recovery tool: using on disk bits, rebuild meta table, with end
> result a
> >>>>>>> working cluster that did not miss any data during the
> reconstruction.
> >>>>>>>
> >>>>>>> Of course focusing on root causes of metadata mismanagement is
> >>>>>>> appropriate when investigating a specific incident, but this is
> >>>>>>> orthogonal to the question of whether or not recovery is possible after
> >>>>>>> a bug corrupts metadata. It is customary for filesystems and databases to
> >>>>>>> ship with a tool that attempts recovery after corruption, on the
> >>>>>>> (correct, IMHO) assumption that corruption is inevitable, either due to
> >>>>>>> logic bug, hardware problems, or operator error.
> >>>>>>>
> >>>>>>> The features of hbck in HBase 1 that have resolved availability
> problems
> >>>>>>> where I work are: fixMeta, fixAssignments, fixHdfsHoles,
> fixHdfsOverlaps.
> >>>>>>> In HBaseFsck.java in branch-2 these are all in the unsupported
> options
> >>>>>> set.
> >>>>>>> Because these are all lacking in HBase 2 I will not certify it
> ready for
> >>>>>>> production to my employer. If there is some other tool which
> offers these
> >>>>>>> recovery options I'm not aware of it nor documentation for it and
> would
> >>>>>>> appreciate a pointer if you have one.
> >>>>>>>
> >>>>>>>
> >>>>>>> On Wed, May 29, 2019 at 7:11 AM Toshihiro Suzuki <
> brfrn...@apache.org>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Thanks Wellington.
> >>>>>>>>
> >>>>>>>>> I guess those can still be fixed with some combinations of
> commands
> >>>>>>>> today,
> >>>>>>>>> such as merge/assign.
> >>>>>>>>
> >>>>>>>> Let me explain the situation I faced in the customer's cluster a
> little
> >>>>>> bit
> >>>>>>>> more.
> >>>>>>>> It seemed like the table data in HDFS was intact but they lost
> some meta
> >>>>>>>> data
> >>>>>>>> (in hbase:meta) of the table. So I needed to rebuild the meta
> from HDFS
> >>>>>>>> data.
> >>>>>>>> In this case, can we still fix it with some combination of commands
> >>>>>>>> today? If so, I would appreciate it if you could suggest the steps to me.
> >>>>>>>>
> >>>>>>>>> And focus on fixing the main root cause of such problems, as a
> >>>>>>>>> means to reduce the need for such commands.
> >>>>>>>>
> >>>>>>>> Yes, correct. Actually I usually do that. But I didn't do that in that
> >>>>>>>> case.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Wed, May 29, 2019 at 5:47 AM Wellington Chevreuil <
> >>>>>>>> wellington.chevre...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> Thanks Toshihiro! I guess those can still be fixed with some
> >>>>>>>>> combinations of commands today, such as merge/assign. Of course, it
> >>>>>>>>> requires some extra scripting and log reading in cases where many
> >>>>>>>>> regions are in an inconsistent state; maybe we should work on providing
> >>>>>>>>> a one-liner command that relies on the existing ones. And focus on
> >>>>>>>>> fixing the main root cause of such problems, as a means to reduce the
> >>>>>>>>> need for such commands.
> >>>>>>>>>
> >>>>>>>>> I'm not really a fan of offlinemetarepair, nor hbck1 fix
> >>>>>>>>> holes/overlaps, and would rather not have those back. Sure, those are
> >>>>>>>>> easy and convenient to trigger, but hbck1 reports are sometimes
> >>>>>>>>> misleading (for instance, it reports holes when region(s) on the chain
> >>>>>>>>> is/are simply not online), and that, combined with the availability of
> >>>>>>>>> such heavy hammers, has led inexperienced operators to run it and end
> >>>>>>>>> up in a worse state.
> >>>>>>>>>
> >>>>>>>>> On Wed, May 29, 2019 at 13:22, Toshihiro Suzuki <brfrn...@apache.org>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi Wellington,
> >>>>>>>>>>
> >>>>>>>>>> I saw table holes in a customer's cluster actually, and I just fixed
> >>>>>>>>>> the issues with the workaround I mentioned in HBASE-21665
> >>>>>>>>>> <https://issues.apache.org/jira/browse/HBASE-21665>. I didn't dig into
> >>>>>>>>>> the reason why the table holes happened at that time because the
> >>>>>>>>>> customer didn't want me to.
> >>>>>>>>>>
> >>>>>>>>>> However, IMO, whatever the reason I think we should have a
> direct way
> >>>>>>>> to
> >>>>>>>>>> fix
> >>>>>>>>>> holes and overlaps.
> >>>>>>>>>>
> >>>>>>>>>> On Wed, May 29, 2019 at 4:57 AM Wellington Chevreuil <
> >>>>>>>>>> wellington.chevre...@gmail.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> So JMS, Toshihiro, it seems like upgrading from some 1.x to 2.x
> >>>>>>>>>>> consistently triggers this problem? Do you guys know if there are
> >>>>>>>>>>> any bug jiras open that would cover these scenarios? If not, and
> >>>>>>>>>>> if you guys have enough resources for investigating it, maybe it's
> >>>>>>>>>>> worth opening a specific jira?
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, May 29, 2019 at 11:40, Jean-Marc Spaggiari <
> >>>>>>>>>>> jean-m...@spaggiari.org> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Personally, when I tried to upgrade from 1.4.x to 2.2.x I ended up
> >>>>>>>>>>>> in a situation where my meta was empty and had to get it repaired,
> >>>>>>>>>>>> but lacked OfflineMetaRepair for 2.2.x, so I just had to delete all my
> >>>>>>>>>>>> tables, get a brand new installation, recreate the tables and bulkload
> >>>>>>>>>>>> the data back into them. Would have been happy to have an
> >>>>>>>>>>>> OfflineMetaRepair.
> >>>>>>>>>>>>
> >>>>>>>>>>>> But it's more like an experimental cluster than a production
> one...
> >>>>>>>>>>>>
> >>>>>>>>>>>> JMS
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Wed, May 29, 2019 at 06:36, Wellington Chevreuil <
> >>>>>>>>>>>> wellington.chevre...@gmail.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Interesting, I haven't seen any cases where OfflineMetaRepair was
> >>>>>>>>>>>>> really required among our customer base (running
> >>>>>>>>>>>>> cdh6.1.x/hbase2.1.1, cdh6.2/hbase2.1.2). The majority of RIT issues I
> >>>>>>>>>>>>> have come across on hbase 2.x were related to AP/SCP failures, most
> >>>>>>>>>>>>> of which could be sorted out with the hbck2 commands available by then
> >>>>>>>>>>>>> (in some cases, it required some CLI scripting to build up a "bulk"
> >>>>>>>>>>>>> assign command).
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Wed, May 29, 2019 at 00:55, Toshihiro Suzuki <
> >>>>>>>>>>>>> brfrn...@apache.org> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hi Josh,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thank you for the explanation. I agree with the direction
> for
> >>>>>>>>>> HBCK2.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> The problem I wanted to tell you about in the Jira is that until we
> >>>>>>>>>>>>>> implement the features you mentioned, we don't have any direct way
> >>>>>>>>>>>>>> to fix holes and overlaps. The holes and overlaps can be created by
> >>>>>>>>>>>>>> bugs or operation errors, so I think we should be able to fix these
> >>>>>>>>>>>>>> issues.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I thought OfflineMetaRepair could be a workaround for the
> >>>>>>>> issues
> >>>>>>>>>>> until
> >>>>>>>>>>>> we
> >>>>>>>>>>>>>> implement
> >>>>>>>>>>>>>> the features of HBCK2.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>> Toshi
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Tue, May 28, 2019 at 9:12 AM Josh Elser <
> els...@apache.org>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Context: https://issues.apache.org/jira/browse/HBASE-21665
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I left a comment on the above issue about what I thought
> good
> >>>>>>>>>>> things
> >>>>>>>>>>>> to
> >>>>>>>>>>>>>>> build into HBCK2 would be -- a focus on specific
> "primitive"
> >>>>>>>>>>>> operations
> >>>>>>>>>>>>>>> that an admin/operator could use to help repair an
> otherwise
> >>>>>>>>>> broken
> >>>>>>>>>>>>>>> HBase installation. Some examples I had in my head were:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> * Create an empty region (to plug a hole)
> >>>>>>>>>>>>>>> * Report holes in a region chain
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> In my head, the difference for HBCK2 was that we want to give
> >>>>>>>>>>>>>>> folks the tools to fix their cluster, but we did not want to own the
> >>>>>>>>>>>>>>> "just fix everything" kind of tool that HBCK1 had become. The problem
> >>>>>>>>>>>>>>> with HBCK1 was that it was often difficult/problematic for us to know
> >>>>>>>>>>>>>>> how to correctly fix a problem (the same problem could be corrected
> >>>>>>>>>>>>>>> in different ways).
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Andrew had some confusion about this, so I'm not sure if
> I'm
> >>>>>>>>>>> off-base
> >>>>>>>>>>>>> or
> >>>>>>>>>>>>>>> if we're all in agreement on direction and we just need to
> >>>>>>>> do a
> >>>>>>>>>>>> better
> >>>>>>>>>>>>>>> job documenting things. Thanks for keeping me honest either
> >>>>>>>> way
> >>>>>>>>>> :)
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> And just in case it doesn't go without saying, HBCK2 would
> be
> >>>>>>>>>>>> something
> >>>>>>>>>>>>>>> that helps fix a system, while we want to always understand
> >>>>>>>> the
> >>>>>>>>>>> root
> >>>>>>>>>>>>>>> cause of how/why we got into a situation where we needed
> >>>>>>>> HBCK2
> >>>>>>>>>> and
> >>>>>>>>>>>> also
> >>>>>>>>>>>>>>> address that.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> - Josh
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
>
