Re: Problem in rereplication algorithm

Sijie Guo Thu, 06 Mar 2014 00:21:12 -0800

On Wed, Mar 5, 2014 at 9:16 PM, Rakesh R <[email protected]> wrote:

>
> If the number of failures exceeds the tolerated failures, it would not be
> safe to go ahead with any cleanup.
> For example, with quorum size 2 and 2 out of 3 bookies failed, as I see it
> the allowed number of failures for this ledger is only 1.
>
> Also, could someone please tell me how the admin would get the confidence
> to safely do any cleanup?



As I already pointed out, the admin should be aware of the potential data
loss, so there is no such confidence to be had.


> IMHO postponing the recovery would be safe.
>

The postponement is already there, since the ledger can't be opened and
fenced.


>
> -Rakesh
>
> -----Original Message-----
> From: Uma Maheswara Rao G [mailto:[email protected]]
> Sent: 06 March 2014 10:05
> To: [email protected]
> Subject: Re: Problem in rereplication algorithm
>
> > As Sijie pointed out, we lost quorum, so the ledger is not good any
> > longer.
> > Because we might not be able to detect such cases automatically, I was
> > wondering if we need to manually delete it.
>
> Yes. As Sijie and Flavio pointed out, how about providing a tool to clean
> up such ledgers?
> At the same time, I agree that we have to think of some automatic way to
> detect it, since we advertise the feature as "Auto".
>

At any time, if the quorum requirement is broken, we shouldn't do anything
automatically. Leave it to a human.
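
To make that rule concrete, here is a rough sketch. The function name and
parameters are made up for illustration and are not BookKeeper APIs. It
assumes every entry is stored on quorum_size bookies, so a ledger tolerates at
most quorum_size - 1 failed bookies before some entry may have no live replica.

```python
def auto_recovery_allowed(quorum_size: int, failed_bookies: int) -> bool:
    """Return True only while failures stay within what the quorum tolerates.

    Hypothetical helper for illustration, not part of BookKeeper.
    Assumption: each entry has quorum_size replicas, so quorum_size - 1
    failures are tolerated; anything beyond that is left to a human.
    """
    if quorum_size < 1 or failed_bookies < 0:
        raise ValueError("quorum_size must be >= 1 and failed_bookies >= 0")
    tolerated = quorum_size - 1
    return failed_bookies <= tolerated
```

For Rakesh's example (quorum size 2), one failed bookie is still within the
tolerated limit, while two failed bookies are not.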

>
> Or shall we incrementally delay the replication cycle for ledgers with
> such quorum failures, by somehow tracking time in the underreplicated
> ledger nodes? [I am not very sure about this; we have to think more.]
>
> Regards,
> Uma
>
>
>
> On Thu, Mar 6, 2014 at 7:35 AM, Flavio Junqueira <[email protected]> wrote:
>
> > I'm not sure what the desirable outcome is here. When you say that the
> > underreplicated L1 node hangs around forever, does it mean that we
> > keep trying to create new replicas?
>

The hang means that the ledger couldn't be opened and fenced.


> >
> > As Sijie pointed out, we lost quorum, so the ledger is not good any
> > longer.
> > Because we might not be able to detect such cases automatically, I was
> > wondering if we need to manually delete it.
> >
> > -Flavio
> >
> >
> > -----Original Message-----
> > From: Ivan Kelly [mailto:[email protected]]
> > Sent: Wednesday, March 5, 2014 5:15 AM
> > To: [email protected]
> > Subject: Problem in rereplication algorithm
> >
> > Hi folks,
> >
> > We've come across a problem in autorecovery, which I've been banging
> > my head against for the last day so I decided to open it up to
> > everyone to see if a solution is any clearer.
> >
> > The problem was observed in production, and while it doesn't cause
> > data loss, it does appear to the admin as if entries have been lost.
> >
> > = Problem scenario =
> >
> > You have a ledger L1. There is one segment in the ledger with quorum
> > 2, ensemble 3 starting at entry 0. This segment is on the bookies B1,
> > B2 & B3. So the metadata looks like
> >
> > 0: B1, B2, B3
> >
> > No data has been written to the ledger.
> >
> > B3 crashes. The auditor notes that L1 contains a segment with B3, so
> > it schedules the ledger to be checked. A recovery worker opens the ledger
> > without fencing. The recovery worker sees that the segment is still
> > open and that the lastAddConfirmed is less than the segment start id,
> > so it reads forward. Ultimately it gets a lastAddConfirmed which is
> > less than the segment start id, as all bookies in the quorum [B1,B2]
> > respond with NoSuchEntry for entry 0. The recovery worker therefore sees
> > that there are no underreplicated fragments, so there's nothing to
> > recover. So far, so good.
> >
> > But now consider if B2 crashes. L1 will be scheduled to be checked
> > again. A recovery worker will try to open with fencing. It won't be
> > able to reach all quorums; [B2, B3] is now unavailable. Open will
> > fail.
> >
> > As a result, the underreplicated node for L1 hangs around forever.
> >
> > I have some ideas for a fix, but none is straightforward, so I'd like
> > to hear other opinions first.
> >
> > -Ivan
> >
> >
>
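
The scenario in Ivan's original message can be sketched as below. This is an
illustrative model, not BookKeeper code: it assumes entries are striped
round-robin across the ensemble (entry e goes to the quorum_size bookies
starting at index e mod ensemble size), and the helper names are hypothetical.

```python
from typing import List, Set


def write_sets(ensemble: List[str], quorum_size: int) -> List[List[str]]:
    """List the distinct bookie sets an entry can land on.

    Assumption: round-robin striping, so there is one write set per
    starting position in the ensemble.
    """
    n = len(ensemble)
    return [
        [ensemble[(start + i) % n] for i in range(quorum_size)]
        for start in range(n)
    ]


def unreachable_write_sets(
    ensemble: List[str], quorum_size: int, down: Set[str]
) -> List[List[str]]:
    """Return the write sets in which every bookie is currently down."""
    return [
        ws for ws in write_sets(ensemble, quorum_size)
        if all(bookie in down for bookie in ws)
    ]


ensemble = ["B1", "B2", "B3"]

# Step 1: only B3 has crashed.
after_b3 = unreachable_write_sets(ensemble, 2, {"B3"})

# Step 2: B2 has crashed as well.
after_b2_b3 = unreachable_write_sets(ensemble, 2, {"B2", "B3"})

print(after_b3)      # []
print(after_b2_b3)   # [['B2', 'B3']]
```

With only B3 down, every write set still has a live bookie. Once B2 is down
as well, the set [B2, B3] has none, which matches the failed open with fencing
described in the original message.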
