On Wed, Mar 5, 2014 at 9:16 PM, Rakesh R <[email protected]> wrote:

> If the failure is more than the tolerated failures, it would not be safe
> to go ahead with any cleanup.
> For ex, quorum size is 2 and say failed 2 bookies out of 3, according to
> me for this ledger allowed failure is only 1.
>
> Also, please someone tell me, how the admin will get the confidence to
> safely do any cleanups.

I already pointed this out: the admin should be aware of the potential data loss, so there is no such confidence.

> IMHO postponing the recovery would be safe.

The postponing is already there, since the ledger couldn't be opened and fenced.

> -Rakesh
>
> -----Original Message-----
> From: Uma Maheswara Rao G [mailto:[email protected]]
> Sent: 06 March 2014 10:05
> To: [email protected]
> Subject: Re: Problem in rereplication algorithm
>
> > As Sijie pointed out, we lost quorum, so the ledger is not good any longer.
> > Because we might not be able to detect such cases automatically, I was
> > wondering if we need to manually delete it.
>
> Yes. As Sijie and Flavio pointed out, how about providing a tool to clean
> such ledgers.
> At the same time I agree, we have to think some automatic way to detect it
> as we claim the feature as Auto.

At any time, if the quorum requirement is broken, we shouldn't do anything automatically. Leave it to a human.

> or shall we delay such quorum failure ledgers replication cycle
> incrementally by somehow tracking time in underreplication ledger nodes?
> [I am not very sure on this, we have to think more]
>
> Regards,
> Uma
>
> On Thu, Mar 6, 2014 at 7:35 AM, Flavio Junqueira <[email protected]> wrote:
>
> > I'm not sure what the desirable outcome is here. When you say that the
> > underreplicated L1 node hangs around forever, does it mean that we
> > keep trying to create new replicas?

The hang means that the ledger couldn't be opened and fenced.

> > As Sijie pointed out, we lost quorum, so the ledger is not good any longer.
> > Because we might not be able to detect such cases automatically, I was
> > wondering if we need to manually delete it.
> >
> > -Flavio
> >
> > -----Original Message-----
> > From: Ivan Kelly [mailto:[email protected]]
> > Sent: Wednesday, March 5, 2014 5:15 AM
> > To: [email protected]
> > Subject: Problem in rereplication algorithm
> >
> > Hi folks,
> >
> > We've come across a problem in autorecovery, which I've been banging
> > my head against for the last day so I decided to open it up to
> > everyone to see if a solution is any clearer.
> >
> > The problem was observed in production, and while it doesn't cause
> > data loss, it does appear to the admin as if entries have been lost.
> >
> > = Problem scenario =
> >
> > You have a ledger L1. There is one segment in the ledger with quorum
> > 2, ensemble 3 starting at entry 0. This segment is on the bookies B1,
> > B2 & B3. So metadata looks like
> >
> > 0: B1, B2, B3
> >
> > No data has been written to the ledger.
> >
> > B3 crashes. The auditor notes that L1 contains a segment with B3, so
> > it schedules the ledger to be checked. A recovery worker opens the ledger
> > without fencing. The recovery worker sees that the segment is still
> > open and that the lastAddConfirmed is less than the segment start id,
> > so it reads forward. Ultimately it gets a lastAddConfirmed which is
> > less than the segment start id, as all bookies in the quorum [B1,B2]
> > respond with NoSuchEntry for entry 0. So the recovery worker sees that
> > there are no underreplicated fragments, so there's nothing to
> > recover. So far, so good.
> >
> > But now consider if B2 crashes. L1 will be scheduled to be checked
> > again. A recovery worker will try to open with fencing. It won't be
> > able to reach all quorums; [B2, B3] is now unavailable. Open will
> > fail.
> >
> > As a result, the underreplicated node for L1 hangs around forever.
> >
> > I have some ideas for a fix, but none is straightforward, so I'd like
> > to hear other opinions first.
> >
> > -Ivan

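
For reference, the quorum arithmetic in the problem scenario can be sketched in a few lines of Python. This is only an illustration, not BookKeeper code: the function names are hypothetical, and it assumes that write quorums are laid out round-robin over the ensemble, which is what yields the quorums [B1,B2], [B2,B3] and [B3,B1] for ensemble 3, quorum 2. Following the description above, an open with fencing is treated as possible only if every write quorum still has at least one live bookie.

```python
def write_quorums(ensemble, quorum_size):
    """List the write quorums of an ensemble.

    Assumption (not taken from the thread): quorums are assigned
    round-robin, so quorum k starts at ensemble position k.
    """
    n = len(ensemble)
    return [
        [ensemble[(start + i) % n] for i in range(quorum_size)]
        for start in range(n)
    ]


def unreachable_quorums(ensemble, quorum_size, failed):
    """Return the write quorums in which every bookie has failed."""
    failed = set(failed)
    return [
        quorum
        for quorum in write_quorums(ensemble, quorum_size)
        if all(bookie in failed for bookie in quorum)
    ]


def can_open_with_fencing(ensemble, quorum_size, failed):
    """Hypothetical check: fencing needs a live bookie in every quorum."""
    return not unreachable_quorums(ensemble, quorum_size, failed)


if __name__ == "__main__":
    ensemble = ["B1", "B2", "B3"]
    # After B3 crashes, every quorum still contains a live bookie.
    print(can_open_with_fencing(ensemble, 2, ["B3"]))
    # After B2 also crashes, the quorum [B2, B3] is entirely gone.
    print(unreachable_quorums(ensemble, 2, ["B2", "B3"]))
```

With one failed bookie the check passes, which matches the tolerated failure count of 1 mentioned earlier in the thread. With B2 and B3 both down, [B2, B3] has no live member, so the open cannot complete and the underreplicated node for L1 stays in place.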