My take from this discussion is that some problems justify adding "simple to use" hbck2 commands (such as the one for regions missing from meta), while simpler or less critical problems could be covered by documented recipes built from the currently available commands. I suppose HBASE-21745 should be used to triage recurring or potential problems and to discuss the ideal solution for each (whether it demands a new hbck2 command or can be a documented recipe).
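As a rough illustration of what a documented recipe might look like, here is a minimal sketch of the "list non-open regions, then assign each" decomposition raised earlier in the thread, including the "skip disabled tables" edge case. It only builds command lines and runs nothing. The function names, the shape of the region records, and the jar path are hypothetical and exist only for this example. The `hbase hbck -j <jar> assigns <encoded region name>...` form is how I understand the HBCK2 README, so please verify it against the version you actually run.

```python
# Sketch only: builds HBCK2 "assigns" command lines; it does not execute them.
from typing import Dict, Iterable, List

# Assumption: location of the HBCK2 jar; adjust for your installation.
HBCK2_JAR = "/path/to/hbase-hbck2.jar"


def regions_to_assign(regions: Iterable[Dict[str, str]],
                      disabled_tables: Iterable[str]) -> List[str]:
    """Return encoded names of non-open regions, skipping disabled tables.

    Each region record is a hypothetical dict with the keys
    "table", "encoded_name" and "state".
    """
    skip = set(disabled_tables)
    return [r["encoded_name"] for r in regions
            if r["state"] != "OPEN" and r["table"] not in skip]


def build_assign_commands(encoded_names: List[str],
                          batch_size: int = 50) -> List[List[str]]:
    """Split the region list into batches, one 'assigns' call per batch."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    commands = []
    for start in range(0, len(encoded_names), batch_size):
        batch = encoded_names[start:start + batch_size]
        commands.append(["hbase", "hbck", "-j", HBCK2_JAR, "assigns", *batch])
    return commands


if __name__ == "__main__":
    # Made-up region records, purely for demonstration.
    sample = [
        {"table": "t1", "encoded_name": "a" * 32, "state": "OPEN"},
        {"table": "t1", "encoded_name": "b" * 32, "state": "CLOSED"},
        {"table": "t2", "encoded_name": "c" * 32, "state": "OFFLINE"},
    ]
    for cmd in build_assign_commands(regions_to_assign(sample, ["t2"])):
        print(" ".join(cmd))
```

The operator still decides which tables to skip and reviews the generated commands before running them, which keeps the cluster-specific knowledge with the person applying the fix rather than hiding it in a black box.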
On Thu, May 30, 2019 at 22:17, Josh Elser <els...@apache.org> wrote:

> Great! Thanks for clarifying.
>
> Script-able (and recipes for the common-ish problems -- both those we know and those we don't) are definitely goals in my head.
>
> On 5/30/19 5:06 PM, Andrew Purtell wrote:
> > Composable tools are fine if simple and scriptable.
> >
> > If you read the thread I think my complaint is justifiable. It is not that they are lacking. It is that they are lacking and the response to the concern is breezy “oh just do this <thing that requires dev with deep knowledge>”. Just so we are clear what I am criticizing. Someone needs to call out in no uncertain terms how operator unfriendly this position is whether intentional or not.
> >
> > Thanks for the consideration.
> >
> >> On May 30, 2019, at 2:00 PM, Josh Elser <els...@apache.org> wrote:
> >>
> >> It sounds to me like you're saying: "No, I don't think compose-able tools are a sufficient substitute in HBCK2 for what HBCK1 did".
> >>
> >> I'm going to just delete everything else I want to write because it's going to turn into a massive argument and de-rail this further. For a second time, please stop the complaints about things that don't exist on this thread. We all know this already.
> >>
> >>> On 5/30/19 12:58 PM, Andrew Purtell wrote:
> >>> I did a both barrels type response to a suggestion Wellington made that I hope communicates the right level of dismay at the prevailing line of thought in this thread.
> >>> Let me say I agree hbck 1 was sometimes oversold as a magic tool.
> >>> However if you analyze all of its options and then look to branch 2, where are the gaps. In branch 1 there is a command line tool that can be executed by operations and first level support. Its options can be described in a runbook with cut and paste examples. In branch 2 ... ?
> >>> There appears no ready solution for detecting and deploying undeployed “missing” regions.
> >>> There appears no ready solution for fixing a failed split or merge or other corruption producing a hole or overlap in the region chain.
> >>> There appears no tool capable of rebuilding meta from scratch from HDFS level metadata; a last but crucial resort as this is what holds the line against a complete and time intensive restore from backup.
> >>> I may have an incorrect impression of some of this. If so that would be a big relief. If not these are suggested areas of focus.
> >>> I’m not saying that 2 needs Hbck exactly as it is in 1. However the lack of simple recovery tools or actions that can be taken by a non expert guided by a runbook means the risk to operations when there is the inevitable problem is higher. And I don’t mean theoretical problems. I mean the commonly occurring issues Hbck 1 was coded up to address in a mostly automated way, like failed splits or failed deployments or simple HDFS level corruptions like loss of meta region hfiles. Lacking simple tooling our operations will have to do <something> more complex, labor intensive, and/or risky. This factors in to the major version upgrade risk analysis.
> >>> What I would advise is an analysis that enumerates all of the risks and specific conditions Hbck 1 addresses, then excludes those not relevant for the 2 code base, then excludes those which have easy and simple tools existing right now to solve. What you have left is a list of action items. Then there should be an analysis of the new risks in 2 given AMv2's theory of operation, for example for each procedure based action if the procedure is always failing how can the operator recover the prerequisites for successful completion, and provide a simple tool or option for applying a fix or remediation to cluster state.
> >>>> On May 30, 2019, at 7:16 AM, Josh Elser <els...@apache.org> wrote:
> >>>>
> >>>> Right, this discussion isn't meant to be implying that any of this exists -- instead, I wanted to make sure we're focused on building tooling which both devs and users will find usable and effective.
> >>>>
> >>>> What's your gut-reaction to what I suggested? I think you're saying you see operators having to apply more understanding/insight to fix a "complex problem" as taking on more risk which you'd have to weigh. In other words, anything less than the verbatim "fix these problems" flags you mentioned earlier would require you to do the risk-analysis math if moving to HBase2?
> >>>>
> >>>> Thanks for your insights.
> >>>>
> >>>>> On 5/29/19 4:45 PM, Andrew Purtell wrote:
> >>>>> I have yet to see essential HBCK functions in 1 replaced by anything - documentation, script, hbck2, whatever.
> >>>>> Do we have a tool or script in HBase 2 that can rebuild meta from HDFS state? This would be faster than a complete restore from backup. It would be useful and important to offer this option to operators, but not essential, because it could be valid to say if meta is screwed so are you and you have to restore completely from backup. Meta is small, a fraction of total data footprint. Seems a real shame to impose such a high cost when there could be an alternative. I'd have to think for a while about accepting this kind of operational risk when HBase 1 has such tooling.
> >>>>> What I am more worried about is this: Do we have a tool or script in HBase 2 that can fix errors in the region chain caused by failed splits, failed merges, or double assignment? It seems not, and the implications for service availability are not good when compared with HBase 1. With HBase 1, hbck is an option.
> >>>>> Sure, it has a lot of problematic aspects, but I have seen it recover a cluster's total availability with fairly fast execution.
> >>>>> It could be valid, not saying I agree with this point, to clearly document that all aspects of recovery from corrupted metadata is the responsibility of the operator, at least this is full disclosure. We can then weigh the cost and risk associated with this policy when deciding if ever to upgrade.
> >>>>>> On Wed, May 29, 2019 at 1:13 PM Josh Elser <els...@apache.org> wrote:
> >>>>>> My understanding was that recreating sweeping "fix it" flags was an anti-goal of HBCK2, but I'm surprised a grey-beard hasn't come in to say confirm/dispute that :). I could be taking that out of context or my dog remembers things better than I do.
> >>>>>>
> >>>>>> The reasoning behind this line of thinking for HBCK2 is:
> >>>>>>
> >>>>>> * Smaller actions are easier to implement correctly and be well-tested
> >>>>>> * The more complex the action, the more likely it is for something we (as devs) didn't expect to happen which results in a bug.
> >>>>>>
> >>>>>> The "stretch" in my mind is that we can string together small actions to recreate the bigger ones (the fix* type commands from hbck1), *but* teach operators to apply knowledge about their cluster instead of treating hbck like a black box.
> >>>>>>
> >>>>>> For example, if we try to decompose something like fixAssignments into something like: `for region in $(list non-open regions); do assign $region; end`. As developers, we don't have to catch every edge case of _something_ that might be specific to the admin's actual situation (e.g. what if a table is disabled and we don't want to assign those regions) and it lets us write better test cases.
> >>>>>>
> >>>>>> Again, this is what I have floating around in my head -- nothing more than that at present.
> >>>>>>
> >>>>>>> On 5/29/19 11:54 AM, Andrew Purtell wrote:
> >>>>>>> To me this is a succinct specification of minimum functionality for a recovery tool: using on disk bits, rebuild meta table, with end result a working cluster that did not miss any data during the reconstruction.
> >>>>>>>
> >>>>>>> Of course focusing on root causes of metadata mismanagement is appropriate when investigating a specific incident, but this is orthogonal from the question of whether or not recovery is possible after a bug corrupts metadata. It is customary for filesystems and databases to ship with a tool that attempts recovery after corruption, on the (correct, IMHO) assumption that corruption is inevitable, either due to logic bug, hardware problems, or operator error.
> >>>>>>>
> >>>>>>> The features of hbck in HBase 1 that have resolved availability problems where I work are: fixMeta, fixAssignments, fixHdfsHoles, fixHdfsOverlaps. In HBaseFsck.java in branch-2 these are all in the unsupported options set. Because these are all lacking in HBase 2 I will not certify it ready for production to my employer. If there is some other tool which offers these recovery options I'm not aware of it nor documentation for it and would appreciate a pointer if you have one.
> >>>>>>>
> >>>>>>>
> >>>>>>> On Wed, May 29, 2019 at 7:11 AM Toshihiro Suzuki <brfrn...@apache.org> wrote:
> >>>>>>>
> >>>>>>>> Thanks Wellington.
> >>>>>>>>
> >>>>>>>>> I guess those can still be fixed with some combinations of commands today, such as merge/assign.
> >>>>>>>>
> >>>>>>>> Let me explain the situation I faced in the customer's cluster a little bit more. It seemed like the table data in HDFS was intact but they lost some meta data (in hbase:meta) of the table. So I needed to rebuild the meta from HDFS data. In this case, can we still fix it with some combinations of commands today? If so, I would appreciate it if you could suggest the steps to me.
> >>>>>>>>
> >>>>>>>>> And focus on fixing the main root cause of such problems, as a means to soften the need to use such commands.
> >>>>>>>>
> >>>>>>>> Yes, correct. Actually I usually do that. But I didn't do that in that case.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Wed, May 29, 2019 at 5:47 AM Wellington Chevreuil <wellington.chevre...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> Thanks Toshihiro! I guess those can still be fixed with some combinations of commands today, such as merge/assign. Of course, it requires some extra scripting and log reading in cases where many regions are in an inconsistent state, so maybe we should work on providing a one-liner command that relies on the existing ones. And focus on fixing the main root cause of such problems, as a means to soften the need to use such commands.
> >>>>>>>>>
> >>>>>>>>> I'm not really a fan of offlinemetarepair, nor hbck1 fix holes/overlaps, would rather not have those back.
> >>>>>>>>> Sure those are easy and convenient to trigger, but hbck1 reports are sometimes misleading (for instance, it reports holes when region(s) on the chain is/are simply not online), and that, combined with availability of such heavy hammers, has led inexperienced operators to fall into running it and getting into a worse state.
> >>>>>>>>>
> >>>>>>>>> On Wed, May 29, 2019 at 13:22, Toshihiro Suzuki <brfrn...@apache.org> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi Wellington,
> >>>>>>>>>>
> >>>>>>>>>> I saw table holes in a customer's cluster actually, and I just fixed the issues by the workaround I mentioned in HBASE-21665 <https://issues.apache.org/jira/browse/HBASE-21665> and I didn't dig into the reason why the table holes happened at that time because the customer didn't want that.
> >>>>>>>>>>
> >>>>>>>>>> However, IMO, whatever the reason I think we should have a direct way to fix holes and overlaps.
> >>>>>>>>>>
> >>>>>>>>>> On Wed, May 29, 2019 at 4:57 AM Wellington Chevreuil <wellington.chevre...@gmail.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> So JMS, Toshihiro, seems like upgrading from some 1.x to 2.x consistently triggers this problem? Do you guys know if there are any bug jiras open that would cover these scenarios? If not, and if you guys have enough resources for investigating it, maybe worth opening a specific jira?
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, May 29, 2019 at 11:40, Jean-Marc Spaggiari <jean-m...@spaggiari.org> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Personally, when I tried to upgrade from 1.4.x to 2.2.x I ended up in a situation where my meta was empty and had to get it repaired, but lacked OfflineMetaRepair for 2.2.x so I just had to delete all my tables, get a brand new installation, recreate the tables and bulkload back the data into them. Would have been happy to have an OfflineMetaRepair.
> >>>>>>>>>>>>
> >>>>>>>>>>>> But it's more like an experimental cluster than a production one...
> >>>>>>>>>>>>
> >>>>>>>>>>>> JMS
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Wed, May 29, 2019 at 06:36, Wellington Chevreuil <wellington.chevre...@gmail.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Interesting, I haven't seen any cases where OfflineMetaRepair was really required, among our customer base (running cdh6.1.x/hbase2.1.1, cdh6.2/hbase2.1.2). The majority of RIT issues I have come across on hbase 2.x were related to AP/SCP failures, most of which could be sorted with hbck2 commands available by then (in some cases, required some CLI scripting to build up a "bulk" assign command).
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Wed, May 29, 2019 at 00:55, Toshihiro Suzuki <brfrn...@apache.org> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hi Josh,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thank you for the explanation. I agree with the direction for HBCK2.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> The problem I wanted to tell you in the Jira is that until we implement the features you mentioned, we don't have any direct way to fix holes and overlaps. The holes and overlaps can be created by bugs or operation errors, so I think we should be able to fix these issues.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I thought OfflineMetaRepair could be a workaround for the issues until we implement the features of HBCK2.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>> Toshi
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Tue, May 28, 2019 at 9:12 AM Josh Elser <els...@apache.org> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Context: https://issues.apache.org/jira/browse/HBASE-21665
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I left a comment on the above issue about what I thought good things to build into HBCK2 would be -- a focus on specific "primitive" operations that an admin/operator could use to help repair an otherwise broken HBase installation. Some examples I had in my head were:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> * Create an empty region (to plug a hole)
> >>>>>>>>>>>>>>> * Report holes in a region chain
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> In my head, the difference for HBCK2 was that we want to give folks the tools to fix their cluster, but we did not want to own the "just fix everything" kind of tool that HBCK1 had become.
> >>>>>>>>>>>>>>> The problem with HBCK1 was that it was often difficult/problematic for us to know how to correctly fix a problem (the same problem could be corrected in different ways).
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Andrew had some confusion about this, so I'm not sure if I'm off-base or if we're all in agreement on direction and we just need to do a better job documenting things. Thanks for keeping me honest either way :)
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> And just in case it doesn't go without saying, HBCK2 would be something that helps fix a system, while we want to always understand the root cause of how/why we got into a situation where we needed HBCK2 and also address that.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> - Josh