Great! Thanks for clarifying.

Scriptability (and recipes for the common-ish problems -- both those we know about and those we don't) are definitely goals in my head.

On 5/30/19 5:06 PM, Andrew Purtell wrote:
Composable tools are fine if simple and scriptable.

If you read the thread I think my complaint is justifiable. It is not just that the tools are 
lacking. It is that they are lacking and the response to the concern is a breezy “oh 
just do this <thing that requires a dev with deep knowledge>”. Just so we are 
clear about what I am criticizing: someone needs to call out in no uncertain terms how 
operator-unfriendly this position is, whether intentional or not.

Thanks for the consideration.

On May 30, 2019, at 2:00 PM, Josh Elser <els...@apache.org> wrote:

It sounds to me like you're saying: "No, I don't think compose-able tools are a 
sufficient substitute in HBCK2 for what HBCK1 did".

I'm going to just delete everything else I want to write because it's going to 
turn into a massive argument and derail this further. For a second time, 
please stop the complaints on this thread about things that don't exist. We all 
know this already.

On 5/30/19 12:58 PM, Andrew Purtell wrote:
I did a both barrels type response to a suggestion Wellington made that I hope 
communicates the right level of dismay at the prevailing line of thought in 
this thread.
Let me say I agree hbck 1 was sometimes oversold as a magic tool.
However, if you analyze all of its options and then look to branch 2, where are 
the gaps? In branch 1 there is a command line tool that can be executed by 
operations and first-level support. Its options can be described in a runbook 
with cut-and-paste examples. In branch 2 ... ?
There appears to be no ready solution for detecting and deploying undeployed 
“missing” regions.
There appears to be no ready solution for fixing a failed split or merge or other 
corruption producing a hole or overlap in the region chain.
There appears to be no tool capable of rebuilding meta from scratch from HDFS-level 
metadata; a last but crucial resort, as this is what holds the line against a 
complete and time-intensive restore from backup.
I may have an incorrect impression of some of this. If so, that would be a big 
relief. If not, these are suggested areas of focus.
I'm not saying that 2 needs Hbck exactly as it is in 1. However, the lack of simple 
recovery tools or actions that can be taken by a non-expert guided by a runbook means 
the risk to operations when there is the inevitable problem is higher. And I don't 
mean theoretical problems. I mean the commonly occurring issues Hbck 1 was coded up 
to address in a mostly automated way, like failed splits or failed deployments or 
simple HDFS-level corruptions like loss of meta region hfiles. Lacking simple tooling, 
our operations will have to do <something> more complex, labor intensive, and/or 
risky. This factors into the major version upgrade risk analysis.
What I would advise is an analysis that enumerates all of the risks and 
specific conditions Hbck 1 addresses, then excludes those not relevant for the 
2 code base, then excludes those for which easy and simple tools already exist 
right now to solve them. What you have left is a list of action items. Then there 
should be an analysis of the new risks in 2 given AMv2's theory of operation 
(for example, for each procedure-based action, if the procedure is always failing, 
how can the operator recover the prerequisites for successful completion?), and a 
simple tool or option should be provided for applying a fix or remediation to cluster 
state.
On May 30, 2019, at 7:16 AM, Josh Elser <els...@apache.org> wrote:

Right, this discussion isn't meant to imply that any of this exists -- 
instead, I wanted to make sure we're focused on building tooling which both 
devs and users will find usable and effective.

What's your gut reaction to what I suggested? I think you're saying you see operators having to 
apply more understanding/insight to fix a "complex problem" as taking on more risk, which 
you'd have to weigh. In other words, anything less than the verbatim "fix these problems" 
flags you mentioned earlier would require you to do the risk-analysis math if moving to HBase 2?

Thanks for your insights.

On 5/29/19 4:45 PM, Andrew Purtell wrote:
I have yet to see essential HBCK functions in 1 replaced by anything -
documentation, script, hbck2, whatever.
Do we have a tool or script in HBase 2 that can rebuild meta from HDFS
state? This would be faster than a complete restore from backup. It would
be useful and important to offer this option to operators, but not
essential, because it could be valid to say if meta is screwed so are you
and you have to restore completely from backup. Meta is small, a fraction
of total data footprint. Seems a real shame to impose such a high cost when
there could be an alternative. I'd have to think for a while about
accepting this kind of operational risk when HBase 1 has such tooling.
What I am more worried about is this: Do we have a tool or script in HBase
2 that can fix errors in the region chain caused by failed splits, failed
merges, or double assignment? It seems not, and the implications for
service availability are not good when compared with HBase 1. With HBase 1,
hbck is an option. Sure, it has a lot of problematic aspects, but I have
seen it recover a cluster's total availability with fairly fast execution.
It could be valid, not saying I agree with this point, to clearly document
that all aspects of recovery from corrupted metadata are the responsibility
of the operator; at least then this is full disclosure. We can then weigh the
cost and risk associated with this policy when deciding whether to ever upgrade.
On Wed, May 29, 2019 at 1:13 PM Josh Elser <els...@apache.org> wrote:
My understanding was that recreating sweeping "fix it" flags was an
anti-goal of HBCK2, but I'm surprised a grey-beard hasn't come in to
confirm/dispute that :). I could be taking that out of context, or my dog
remembers things better than I do.

The reasoning behind this line of thinking for HBCK2 is:

* Smaller actions are easier to implement correctly and to test well
* The more complex the action, the more likely it is that something we
(as devs) didn't expect will happen, which results in a bug.

The "stretch" in my mind is that we can string together small actions to
recreate the bigger ones (the fix* type commands from hbck1), *but*
teach operators to apply knowledge about their cluster instead of
treating hbck like a black box.

For example, we could decompose something like fixAssignments into
something like: `for region in $(list non-open regions); do assign
$region; done`. As developers, we then don't have to catch every edge case of
_something_ that might be specific to the admin's actual situation (e.g.
what if a table is disabled and we don't want to assign those regions),
and it lets us write better test cases.
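
To make that concrete, here is a rough sketch of what such a composed script could look like. It assumes the HBCK2 `assigns` command from hbase-operator-tools, and the `list_stuck_regions` helper is purely hypothetical -- an operator would substitute whatever check fits their situation (and skip regions of intentionally disabled tables):

    #!/usr/bin/env bash
    # Rough sketch, not a supported tool: compose small primitives instead of a
    # monolithic fixAssignments.
    HBCK2_JAR=/path/to/hbase-hbck2.jar   # assumption: hbase-operator-tools HBCK2 jar

    # Hypothetical helper: emit one encoded region name per line for regions that
    # should be open but are not (from the master UI, RIT list, a meta scan, ...).
    list_stuck_regions() {
      cat regions_to_assign.txt
    }

    while read -r region; do
      [ -z "$region" ] && continue
      echo "Scheduling assign for ${region}"
      # HBCK2 'assigns' schedules an assign procedure for the encoded region name
      hbase hbck -j "${HBCK2_JAR}" assigns "${region}"
    done < <(list_stuck_regions)

The point being that the loop body stays a small, well-tested primitive, while the selection logic is where the operator applies knowledge about their cluster.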

Again, this is what I have floating around in my head -- nothing more
than that at present.

On 5/29/19 11:54 AM, Andrew Purtell wrote:
To me this is a succinct specification of minimum functionality for a
recovery tool: using on-disk bits, rebuild the meta table, with the end result a
working cluster that did not miss any data during the reconstruction.

Of course focusing on root causes of metadata mismanagement is appropriate
when investigating a specific incident, but this is orthogonal to the
question of whether or not recovery is possible after a bug corrupts
metadata. It is customary for filesystems and databases to ship with a tool
that attempts recovery after corruption, on the (correct, IMHO) assumption
that corruption is inevitable, either due to logic bugs, hardware problems,
or operator error.

The features of hbck in HBase 1 that have resolved availability problems
where I work are: fixMeta, fixAssignments, fixHdfsHoles, fixHdfsOverlaps.
In HBaseFsck.java in branch-2 these are all in the unsupported options set.
Because these are all lacking in HBase 2, I will not certify it ready for
production to my employer. If there is some other tool which offers these
recovery options, I'm not aware of it nor of documentation for it, and would
appreciate a pointer if you have one.
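
Just to make the "runbook with cut and paste examples" point concrete, the branch-1 entries amount to one-liners of roughly the following shape (a sketch only, using the hbck1 option names above; a real runbook first runs a plain read-only hbck and reviews the report before applying anything):

    # hbck1 one-liners as they might appear in a branch-1 runbook (sketch)
    hbase hbck                                # read-only report of inconsistencies
    hbase hbck -fixAssignments                # repair unassigned/double-assigned regions
    hbase hbck -fixMeta -fixAssignments       # reconcile meta with HDFS, then assignments
    hbase hbck -fixHdfsHoles -fixHdfsOverlaps # patch holes/overlaps in the region chain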


On Wed, May 29, 2019 at 7:11 AM Toshihiro Suzuki <brfrn...@apache.org> wrote:

Thanks Wellington.

I guess those can still be fixed with some combination of commands today,
such as merge/assign.

Let me explain the situation I faced in the customer's cluster a little bit more.
It seemed like the table data in HDFS was intact but they lost some metadata
(in hbase:meta) of the table. So I needed to rebuild the meta from HDFS data.
In this case, can we still fix it with some combination of commands today? If so,
I would appreciate it if you could suggest the steps to me.

And focus on fixing the main root cause of such problems, as a means to
soften the need to use such commands.

Yes, correct. Actually I usually do that. But I didn't do that in that case...


On Wed, May 29, 2019 at 5:47 AM Wellington Chevreuil <wellington.chevre...@gmail.com> wrote:

Thanks Toshihiro! I guess those can still be fixed with some combination
of commands today, such as merge/assign. Of course, it requires some extra
scripting and log reading in cases where many regions are in an
inconsistent state; maybe we should work on providing a one-liner command
that relies on the currently existing ones. And focus on fixing the main root
cause of such problems, as a means to soften the need to use such commands.

I'm not really a fan of offlinemetarepair, nor of the hbck1 fix holes/overlaps,
and would rather not have those back. Sure, those are easy and convenient to
trigger, but hbck1 reports are sometimes misleading (for instance, it
reports holes when region(s) on the chain is/are simply not online), and
that, combined with the availability of such heavy hammers, has led
inexperienced operators to fall into running it and getting into a worse
state.

On Wed, May 29, 2019 at 1:22 PM Toshihiro Suzuki <brfrn...@apache.org> wrote:

Hi Wellington,

I actually saw table holes in a customer's cluster, and I just fixed the issues
by the workaround I mentioned in HBASE-21665
<https://issues.apache.org/jira/browse/HBASE-21665>. I didn't dig into the reason
why the table holes happened at that time because the customer didn't want me to.

However, IMO, whatever the reason, I think we should have a direct way to
fix holes and overlaps.

On Wed, May 29, 2019 at 4:57 AM Wellington Chevreuil <wellington.chevre...@gmail.com> wrote:

So JMS, Toshihiro, it seems like upgrading from some 1.x to 2.x consistently
triggers this problem? Do you guys know if there are any bug jiras open
that would cover these scenarios? If not, and if you guys have enough
resources for investigating it, maybe it's worth opening a specific jira?

On Wed, May 29, 2019 at 11:40 AM Jean-Marc Spaggiari <jean-m...@spaggiari.org> wrote:

Personally, when I tried to upgrade from 1.4.x to 2.2.x I ended up in a
situation where my meta was empty and had to get it repaired, but
OfflineMetaRepair was not available for 2.2.x, so I just had to delete all my
tables, get a brand new installation, recreate the tables and bulkload the
data back into them. I would have been happy to have an OfflineMetaRepair.

But it's more like an experimental cluster than a production one...

JMS

On Wed, May 29, 2019 at 6:36 AM Wellington Chevreuil <wellington.chevre...@gmail.com> wrote:

Interesting, I haven't seen any cases where OfflineMetaRepair was really
required among our customer base (running cdh6.1.x/hbase2.1.1,
cdh6.2/hbase2.1.2). The majority of RIT issues I came across on hbase 2.x
were related to AP/SCP failures, most of which could be sorted with the hbck2
commands available by then (in some cases, this required some CLI scripting to
build up a "bulk" assign command).
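
To give an idea of the kind of CLI scripting I mean, something along these lines (a rough sketch; it assumes the encoded region names were already pulled out of the master log/UI into a file, and that the HBCK2 `assigns` command accepts several encoded region names in one invocation):

    # Rough sketch: build one bulk "assigns" call from a file with one encoded
    # region name per line. Paths and file names are placeholders.
    HBCK2_JAR=/path/to/hbase-hbck2.jar
    REGIONS=$(tr '\n' ' ' < stuck_regions.txt)
    hbase hbck -j "${HBCK2_JAR}" assigns ${REGIONS}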

On Wed, May 29, 2019 at 12:55 AM Toshihiro Suzuki <brfrn...@apache.org> wrote:

Hi Josh,

Thank you for the explanation. I agree with the direction for HBCK2.

The problem I wanted to point out in the Jira is that until we implement the
features you mentioned, we don't have any direct way to fix holes and overlaps.
Holes and overlaps can be created by bugs or operation errors, so I think we
should be able to fix these issues.

I thought OfflineMetaRepair could be a workaround for the issues until we
implement the features of HBCK2.

Regards,
Toshi


On Tue, May 28, 2019 at 9:12 AM Josh Elser <els...@apache.org> wrote:

Context: https://issues.apache.org/jira/browse/HBASE-21665

I left a comment on the above issue about what I thought good things to
build into HBCK2 would be -- a focus on specific "primitive" operations
that an admin/operator could use to help repair an otherwise broken
HBase installation. Some examples I had in my head were:

* Create an empty region (to plug a hole)
* Report holes in a region chain

In my head, the difference for HBCK2 was that we want to give folks the
tools to fix their cluster, but we did not want to own the "just fix
everything" kind of tool that HBCK1 had become. The problem with HBCK1
was that it was often difficult/problematic for us to know how to
correctly fix a problem (the same problem could be corrected in
different ways).

Andrew had some confusion about this, so I'm not sure if I'm off-base or
if we're all in agreement on direction and we just need to do a better
job documenting things. Thanks for keeping me honest either way :)

And just in case it doesn't go without saying, HBCK2 would be something
that helps fix a system, while we want to always understand the root
cause of how/why we got into a situation where we needed HBCK2 and also
address that.

- Josh










