Re: HBase master unable to recover with error "Cannot seek after EOF"

Yulin Niu Sat, 18 Dec 2021 18:41:34 -0800

https://issues.apache.org/jira/browse/HBASE-25053
This looks like the bug described in that issue. You can try cherry-picking
the patch, Claude M.


Viraj Jasani <vjas...@apache.org> wrote on Sun, Dec 19, 2021 at 02:17:

> > Your fix is a bit dangerous since you may lose some ongoing procedures, but
> > if you did not experience any inconsistency on your cluster, for example,
> > some regions are not online, then it is OK.
>
> Duo, out of curiosity, even if some regions are offline and/or some servers
> go offline, wouldn't master failover re-trigger SCPs and TRSPs to bring all
> regions ONLINE?
> I have played around with removal of MasterProcWAL on hbase1 only (WAL proc
> store) and have seen new SCPs getting triggered, i.e. AM does bring all
> regions ONLINE eventually.
>
>
> On Thu, Dec 16, 2021 at 9:57 PM 张铎(Duo Zhang) <palomino...@gmail.com>
> wrote:
>
> > I guess this should be a bug. For the master local region we do not handle
> > broken WAL files which do not even have a valid header.
> >
> > Will take a look at the code tomorrow to confirm whether this is the case.
> >
> > Your fix is a bit dangerous since you may lose some ongoing procedures, but
> > if you did not experience any inconsistency on your cluster, for example,
> > some regions are not online, then it is OK.
> >
> > Thanks for reporting.
> >
> > Claude M <claudemur...@gmail.com> wrote on Thu, Dec 16, 2021 at 03:37:
> >
> > > Hello,
> > >
> > > I have the following installed:
> > >
> > >    - Hadoop 3.2.2
> > >    - HBase 2.3.5
> > >
> > >
> > > When all the datanodes in Hadoop are stopped but the HBase cluster is
> > > still running, the HBase master crashes with the attached exception and
> > > is not recoverable.
> > >
> > > If I delete the contents under the following directories in hdfs, the
> > > master will then recover:
> > >
> > >    - /hbase/MasterData/WALs/
> > >    - /hbase/MasterData/data/master/store/*/recovered.wals/
> > >
> > > Is this an appropriate way to resolve the issue? If not, what should be
> > > done?
> > >
> > >
> > > Thanks
> > >
> >
>
