Thank you, Duo. I have also encountered this issue, and it is somewhere on
the to-do list. Let me review the PR; this is fantastic.


On Wed, Mar 30, 2022 at 5:26 PM 张铎(Duo Zhang) <[email protected]> wrote:

> Liangjun He from Alibaba has tested the patch in their cloud redeployment
> scenario, and it works fine if we stop the masters first and then the
> region servers. Please check the comments on the Jira issue for more
> details.
>
> Let me try to get this in. This will be very useful for users who deploy
> HBase on the cloud.
>
> Thanks.
>
> On Mon, Mar 28, 2022 at 12:24, 张铎(Duo Zhang) <[email protected]> wrote:
>
> > The issue aims to solve the problem of redeploying HBase clusters on
> > the cloud.
> >
> > I cannot find the issue, but IIRC the AWS guys said they tried the
> > following steps while redeploying a customer's HBase cluster:
> >
> > 1. Disable writes to the cluster and flush all data to disk (which is
> > actually S3).
> > 2. Recreate the cluster with a set of new machines, and also a new
> > ZooKeeper and a new HDFS (for writing the WAL).
> >
> > Then the new cluster just hung there and no regions were online.
> >
> > This is because during HMaster startup we rely on scanning the WAL
> > directory on HDFS to get the previously live region servers. We compare
> > that list with the list stored in ZooKeeper to find the dead region
> > servers and schedule SCPs (ServerCrashProcedures) for them, and the SCPs
> > then bring the regions online.
> >
> > The problem with the above redeployment is that the WAL directory is
> > also cleaned, so we cannot get the previously live region servers, and
> > therefore no SCP is scheduled.
> >
> > This is a bit annoying, as we have already flushed all the data out, so
> > it should be safe to delete all the WAL data.
> >
> > The idea in HBASE-26245 is to also store a copy of the live region
> > servers in the master local region, so that on restart we can also load
> > the previously live region servers from the master local region instead
> > of relying only on the WAL directory. This solves the problem with the
> > above redeployment.
> >
> > The PR is also ready.
> >
> > https://github.com/apache/hbase/pull/4136
> >
> > Suggestions and reviews are always welcome.
> >
> > Thanks.
> >
>
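
For illustration, the startup logic described in the quoted message can be
sketched roughly as below. This is a minimal sketch, not HBase's actual code:
the class name, method name, and server-name strings are all hypothetical,
and it assumes the copy kept in the master local region survives the
redeployment while the WAL directory and ZooKeeper start out empty.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; these names are illustrative and are not HBase APIs.
public class DeadServerSketch {

  /**
   * Returns the servers that were live before the restart but are not
   * registered in ZooKeeper now. Each one would need an SCP scheduled.
   */
  public static Set<String> findDeadServers(Set<String> fromWalDir,
      Set<String> fromMasterLocalRegion, Set<String> liveInZk) {
    // Union of both sources of "previously live" region servers.
    Set<String> previouslyLive = new TreeSet<>(fromWalDir);
    previouslyLive.addAll(fromMasterLocalRegion);
    // Anything previously live but not live now is treated as dead.
    previouslyLive.removeAll(liveInZk);
    return previouslyLive;
  }

  public static void main(String[] args) {
    // Redeployment scenario: WAL directory wiped, new ZooKeeper, new machines.
    Set<String> walDir = Set.of();
    Set<String> masterLocalRegion = Set.of("rs-old-1", "rs-old-2");
    Set<String> zk = Set.of("rs-new-1");
    // With only the WAL directory as input the result would be empty, so no
    // SCP would be scheduled; the master local region copy fills that gap.
    System.out.println(findDeadServers(walDir, masterLocalRegion, zk));
  }
}
```

With the WAL directory as the only source, the first set is empty and nothing
is reported as dead, which matches the hang described above.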


-- 
Best regards,
Andrew

Unrest, ignorance distilled, nihilistic imbeciles -
    It's what we’ve earned
Welcome, apocalypse, what’s taken you so long?
Bring us the fitting end that we’ve been counting on
   - A23, Welcome, Apocalypse
