On Wed, Nov 1, 2017 at 5:08 PM, Vladimir Rodionov <[email protected]> wrote:

> There is no way to validate correctness of backup in a general case.
>
> You can restore backup into temp table, but then what? Read rows one-by-one
> from temp table and look them up in a primary table? Won't work, because
> rows can be deleted or modified since the last backup was done.
>

Replication has a verify table tool. You can ask a cluster not to delete
rows. You can read at a specific timestamp. Or you could create backups
during an extended ITBLL. When ITBLL completes, verify it on src cluster.
Create a table from the incremental backups. Verify in the restore. Etc.

St.Ack

> Your results most of the time will be approximate: validation completed,
> found 99.5% of rows. Will this satisfy the user?
>
> Offtop here. I hope feature requester will explain in corresponding JIRA
> what type of *validation* they perform and expect.
>
> On Wed, Nov 1, 2017 at 4:59 PM, Apekshit Sharma <[email protected]> wrote:
>
> > As for HBASE-19106, when someone says that it's fundamental, i think they
> > mean that some kind of validation that backup is correct is necessary,
> > and i concur.
> > Saying that something wasn't in initial feature list is hardly a
> > justification! It's not like the idea was known when initial list was
> > planned and was decided not to be done. It's new. And new things can be
> > important!
> >
> > On Wed, Nov 1, 2017 at 4:34 PM, Apekshit Sharma <[email protected]> wrote:
> >
> > > Came here just to track anything related to Distributed Log Replay
> > > which I am trying to purge. But looks like it's another discussion
> > > thread about hbase-backup.
> > > Am coming here with limited knowledge about the feature (did a review
> > > initially once, lost track after). But then, looks like discussion is
> > > not about technical aspects of feature, but trust in it.
> > >
> > > Something which can help get trust in B&R, or otherwise, is an accurate
> > > summary of it as of now. Basically
> > > 1) What features are there in 2.0
> > > 2) What features are being targeted for 2.1 onwards
> > > 3) What testing has been done so far. Not just names...details. For eg.
> > > ITBLL w/ 50 node cluster and x,y,z fault tolerances.
> > > 4) What tests are planned before 2.0. I think a good basis to judge
> > > that would be, will that testing convince Elliot/ Andrew to use that
> > > feature in their internal clusters.
> > > 5) List of existing bugs
> > >
> > > Once it's there, hopefully everyone agrees that list in (1) is enough
> > > and items in (2) are non-critical for basic B&R.
> > > 3 and 4 are most important.
> > > Missing anything in (5) will be counter-productive.
> > > I'd appreciate if the summary is followed by opinions, and not mixed
> > > together.
> > >
> > > Just a suggestion which can help you get right attention.
> > > Thanks.
> > >
> > > -- Appy
> > >
> > > On Wed, Nov 1, 2017 at 3:33 PM, Vladimir Rodionov <[email protected]>
> > > wrote:
> > >
> > >> >> HBASE-19106 at least is a fundamental
> > >>
> > >> This new feature was requested 9 days ago (between alpha 3 and alpha 4
> > >> releases). It has never been on a list of features we have agreed to
> > >> implement for 2.0 release.
> > >> When backup started almost 2 years ago, we described what features and
> > >> capabilities will be implemented. We have had discussions before and I
> > >> do not remember any complaints from community that we lack important
> > >> functionalities.
> > >>
> > >> You can not point to it as a blocker for 2.0 release, Stack.
> > >>
> > >> Testing at scale (lack of) - the only real issue I see in B&R now. The
> > >> question: can it justify your willingness to postpone feature till
> > >> next 2.x release, Stack?
> > >>
> > >> All blockers are resolved, including pending HBASE-17852 patch. All
> > >> functionality for 2.0 has been implemented. Scalability and
> > >> performance improvements patch is in the works and expected to be
> > >> ready next week. In any case, this is improvement - not a new feature.
> > >>
> > >> We have been testing B&R in our internal QA clusters for months.
> > >> Others (SF) have done testing as well. I am pretty confident in
> > >> implementation.
> > >>
> > >> On Wed, Nov 1, 2017 at 3:15 PM, Josh Elser <[email protected]> wrote:
> > >>
> > >> > On 11/1/17 5:52 PM, Stack wrote:
> > >> >
> > >> >> On Wed, Nov 1, 2017 at 12:25 PM, Vladimir Rodionov
> > >> >> <[email protected]> wrote:
> > >> >>
> > >> >> 1. HBASE-19104 - 19109
> > >> >>>
> > >> >>> None of them are basic, Stack. These requests came from SF after
> > >> >>> discussion we had with them recently
> > >> >>> No single comments is because I was out of country last week.
> > >> >>>
> > >> >>> 2. Backup tables are not system ones, they belong to a separate
> > >> >>> namespace - "backup"
> > >> >>>
> > >> >>> 3. We make no assumptions on assignment order of these tables.
> > >> >>>
> > >> >>> As for real scale testing and documentation, we still have time
> > >> >>> before 2.0GA. Can't be blocker IMO
> > >> >>>
> > >> >>>
> > >> >>> First off, wrong response.
> > >> >>
> > >> >> Better would have been pointers to a description of the feature as
> > >> >> it stands in branch-2 (a list of JIRAs is insufficient), what is to
> > >> >> be done still, and evidence of heavy testing in particular at scale
> > >> >> (as Josh reminds us, we agreed to last time backup-in-hbase2 was
> > >> >> broached) ending with list of what will be done between here and
> > >> >> beta-1 to assuage any concerns that backup is incomplete. As to the
> > >> >> issues filed, IMO, HBASE-19106 at least is a fundamental. W/o it,
> > >> >> how do you even know backup works at anything above toy scale.
> > >> >>
> > >> >> Pardon my mistake on 'system' tables. I'd made the statement 9 days
> > >> >> ago up in HBASE-17852 trying to figure what was going on in the
> > >> >> issue and it stood unchallenged (Josh did let me know later that
> > >> >> you were traveling).
> > >> >>
> > >> >> I'm not up for waiting till GA before we decide what is in the
> > >> >> release. This DISCUSSION is about deciding now, before beta-1,
> > >> >> what's in and what's out. Backup would be a great to have but it is
> > >> >> currently on the chopping block. I've tried to spend time figuring
> > >> >> what is there and where it stands but I always end up stymied (e.g.
> > >> >> see HBASE-17852; see how it starts out; see the patch attached w/
> > >> >> no description of what it comprises or the approach decided upon;
> > >> >> and so on). Maybe it's me, but hey, unfortunately, it's me who is
> > >> >> the RM.
> > >> >>
> > >> >
> > >> > As much as it pains me, I can't argue with the lack of confidence
> > >> > via testing. While it feels like an eternity ago since we posited on
> > >> > B&R's scale/correctness testing, it's only been 1.5 months. In
> > >> > reality, getting to this was delayed by some of the (really good!)
> > >> > FT fixes that Vlad has made.
> > >> >
> > >> > We set the bar for the feature and we missed it; there's no arguing
> > >> > that. Yes, it stinks. I see two paths forward: 1) come up with its
> > >> > own release to let those downstream use it now (risks
> > >> > notwithstanding) or 2) shoot for HBase 2.1.0. The latter is how
> > >> > we've approached this in the past. Building the test needs to happen
> > >> > regardless of the release vehicle.
> > >> >
> > >> > New issues/feature-requests are always going to come in as people
> > >> > experiment with it. I hope to avoid getting bogged down in this -- I
> > >> > sincerely doubt that there is any single answer to what is
> > >> > "required" for an initial backup and restore implementation. I feel
> > >> > like anything more will turn into a battle of opinions. When we
> > >> > bring up the feature again, we should make a concerted effort to say
> > >> > "this is the state of the feature, with the design choices made, and
> > >> > this is the result of our testing for correctness." Hopefully much
> > >> > of this is already contained in documentation and just needs to be
> > >> > collected/curated.
> > >> >
> > >> > - Josh
> > >> >
> > >
> > > --
> > >
> > > -- Appy
> > >
> >
> > --
> >
> > -- Appy
