In your case, you have already lost a quorum. Any action here can cause
data loss. If you really want to address it, provide a tool that lets an
admin force-close the ledger, aware of the potential data loss.

- Sijie


On Wed, Mar 5, 2014 at 10:01 AM, Ivan Kelly <[email protected]> wrote:

> It was during the open that it failed, but it was at the
> readLastAddConfirmed part, not at recovery, as recovery didn't run
> because it was opening without fencing.
>
> -Ivan
>
> On Wed, Mar 05, 2014 at 02:50:26PM +0000, Rakesh R wrote:
> > Hi Ivan,
> >
> > I suspect the following happened in your environment.
> >
> > During fencing, the ReplicationWorker (RW) hits the exception
> > "org.apache.bookkeeper.client.BKException$BKLedgerRecoveryException"
> > because the ledger did not receive success responses from all quorums.
> > The RW then tries to fence again and again, and this cycle never ends.
> > Is that right?
> >
> >
> > If that is the case, I think graceful fencing will be difficult; we may
> > need to find some alternative way of handling this case.
> >
> >
> > -Rakesh
> >
> > -----Original Message-----
> > From: Ivan Kelly [mailto:[email protected]]
> > Sent: 05 March 2014 18:45
> > To: [email protected]
> > Subject: Problem in rereplication algorithm
> >
> > Hi folks,
> >
> > We've come across a problem in autorecovery, which I've been banging my
> > head against for the last day, so I decided to open it up to everyone to
> > see if a solution is any clearer.
> >
> > The problem was observed in production, and while it doesn't cause data
> > loss, it does appear to the admin as if entries have been lost.
> >
> > = Problem scenario =
> >
> > You have a ledger L1. There is one segment in the ledger, with quorum 2
> > and ensemble 3, starting at entry 0. This segment is on the bookies B1,
> > B2 and B3. So the metadata looks like
> >
> > 0: B1, B2, B3
> >
> > No data has been written to the ledger.
> >
> > B3 crashes. The auditor notes that L1 contains a segment with B3, so it
> > schedules the ledger to be checked. A recovery worker opens the ledger
> > without fencing. The recovery worker sees that the segment is still open
> > and that the lastAddConfirmed is less than the segment start id, so it
> > reads forward. Ultimately it gets a lastAddConfirmed which is less than
> > the segment start id, as all bookies in the quorum [B1,B2] respond with
> > NoSuchEntry for entry 0. The recovery worker therefore sees that there
> > are no underreplicated fragments, so there's nothing to recover. So far,
> > so good.
> >
> > But now consider what happens if B2 crashes. L1 will be scheduled to be
> > checked again. A recovery worker will try to open with fencing. It won't
> > be able to reach all quorums; [B2, B3] is now unavailable. Open will fail.
> >
> > As a result, the underreplicated node for L1 hangs around forever.
> >
> > I have some ideas for a fix, but none is straightforward, so I'd like to
> > hear other opinions first.
> >
> > -Ivan
>
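
The scenario quoted above can be sketched in a few lines. This is only an
illustration, not BookKeeper code: it assumes entries are striped
round-robin across the ensemble, so that entry e goes to the bookies at
positions e, e+1, ... (mod ensemble size), which matches the quorums
[B1,B2] and [B2,B3] named in the thread. The function names write_set and
unreachable_write_sets are hypothetical.

```python
def write_set(entry_id, ensemble, write_quorum):
    """Bookies that would store an entry, assuming round-robin striping."""
    n = len(ensemble)
    return [ensemble[(entry_id + i) % n] for i in range(write_quorum)]


def unreachable_write_sets(ensemble, write_quorum, down):
    """Distinct write sets in which every bookie is in `down`."""
    lost = []
    # The striping pattern repeats every len(ensemble) entries,
    # so checking one full cycle covers every distinct write set.
    for entry_id in range(len(ensemble)):
        ws = write_set(entry_id, ensemble, write_quorum)
        if all(bookie in down for bookie in ws):
            lost.append(ws)
    return lost


# Segment "0: B1, B2, B3" from the thread, with quorum 2.
ensemble = ["B1", "B2", "B3"]

# After B3 crashes, every write set still has at least one live bookie.
after_b3 = unreachable_write_sets(ensemble, 2, {"B3"})

# After B2 crashes as well, the write set [B2, B3] has no live bookie.
after_b2_b3 = unreachable_write_sets(ensemble, 2, {"B2", "B3"})

print(after_b3)
print(after_b2_b3)
```

Under these assumptions the first call finds no fully unreachable write
set, while the second finds [B2, B3], which is the quorum the open cannot
reach.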
