Re: Improve the process of removing bookies from a cluster

Venkateswara Rao Jujjuri Tue, 07 Sep 2021 07:08:39 -0700

Glad to see this thread. The difficulty of removing bookies is one of the
biggest limitations to autoscaling.


On Tue, Sep 7, 2021 at 6:11 AM Jonathan Ellis <[email protected]> wrote:

> On Tue, Sep 7, 2021 at 8:05 AM Ivan Kelly <[email protected]> wrote:
>
> > Hi Yang,
> >
> > > Autoscaling is exactly one motivation for me to bring this topic up. I
> > > understand that the auto-recovery is not perfect at the moment, but it's
> > > an important component that maintains the core invariants of a bookkeeper
> > > cluster, so I think we may keep improving it until we find a better
> > > replacement.
> >
> > Internally we have replaced auto recovery with another mechanism that
> > checks that the bookie has all the data it says it has. We have plans to
> > push this upstream in the next month or two. A side effect of the change
> > is that it allows you to run without a journal safely. However, it
> > doesn't cover the decommission use case. For decommission, our thinking
> > is that once we have tiered storage at the bookie level, the decommission
> > story becomes a lot easier. Basically, you switch to read-only and wait
> > for tiered storage to clear it out, even bumping the bookie's ledgers in
> > priority for offloading to tiered storage. We're still early in this
> > process (utilization metrics have to come first).
> >
>
> This sounds very useful, I'm excited to see more details.
>


-- 
Jvrao
---
First they ignore you, then they laugh at you, then they fight you, then
you win. - Mahatma Gandhi
