[GitHub] [hbase] z-york commented on pull request #2113: HBASE-24286: HMaster won't become healthy after after cloning or crea…

GitBox Thu, 23 Jul 2020 15:19:26 -0700


z-york commented on pull request #2113:
URL: https://github.com/apache/hbase/pull/2113#issuecomment-663260010

So the use case here is starting a new cluster in the cloud, where the HDFS
(WAL) data from the previous cluster will not be available. One of the benefits
of storing the data off the cluster (in our case, in S3) is not having to
replicate it: you just create a new cluster pointed at the same root directory.
IMO, this is a valid use case, and we shouldn't need the WAL directories to
exist just to tell us to reassign regions.

I get that there is pushback against enabling this in the catalog cleaner, and
I think that's fine. For this case, it's a one-time issue, not something that
needs periodic fixing. (There might be other unknown-server cases that would
require that, but that isn't blocking us at the moment.) So instead, maybe a
one-time run to clean up old servers and schedule SCPs for them (this is what
the code that was removed in HBASE-20708 actually did) makes the most sense? I
understand that it was removed to simplify assignment, but the resulting
behavior is very different. In fact, it looks like we don't even try to read
hbase:meta if it is found (without SCP/WAL) and simply delete the directory [1].
What problem is being solved by deleting it instead of trying to assign it if
the data is there?

[1]
https://github.com/apache/hbase/blob/bad2d4e409ba57014e0f5931b72e54cc397e268a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/InitMetaProcedure.java#L75-L81
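
To make the one-time pass concrete, here is a rough sketch of the idea in
Python. It is not HBase code: the function names, the server-name strings, and
the schedule_scp callback are all hypothetical, and it assumes we can already
list the servers referenced in hbase:meta, the live servers, and the servers
that still have a WAL directory.

```python
from typing import Callable, Iterable, List, Set


def find_unknown_servers(
    servers_in_meta: Iterable[str],
    live_servers: Iterable[str],
    servers_with_wal_dirs: Iterable[str],
) -> Set[str]:
    """Return servers that meta still references but that are neither live
    nor backed by a WAL directory, so nothing would trigger their recovery."""
    known = set(live_servers) | set(servers_with_wal_dirs)
    return set(servers_in_meta) - known


def one_time_cleanup(
    servers_in_meta: Iterable[str],
    live_servers: Iterable[str],
    servers_with_wal_dirs: Iterable[str],
    schedule_scp: Callable[[str], None],
) -> List[str]:
    """Schedule one SCP per unknown server (hypothetical callback) and
    return the servers handled, in sorted order."""
    unknown = sorted(
        find_unknown_servers(servers_in_meta, live_servers, servers_with_wal_dirs)
    )
    for server in unknown:
        schedule_scp(server)
    return unknown


if __name__ == "__main__":
    # Hypothetical names: two servers from the old cluster, one from the new.
    scheduled: List[str] = []
    one_time_cleanup(
        servers_in_meta=["old-rs-1", "old-rs-2", "new-rs-1"],
        live_servers=["new-rs-1"],
        servers_with_wal_dirs=[],  # the old cluster's HDFS WALs are gone
        schedule_scp=scheduled.append,
    )
    print(scheduled)
```

The point is only that the set of servers to recover can be derived from
hbase:meta itself, so the pass does not depend on the old WAL directories
still existing.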

> For your scenario, it is OK as you can confirm that there is no data loss
> as you manually flushed all the data. But what if another user just configs the
> wrong WAL directory? In this case, if we schedule SCPs automatically, there
> will be data loss.

In this scenario, regardless of what we do, there will be data loss unless
the correct WAL directory is specified again. In fact, I don't believe you
can change the WAL dir without restarting servers (I also don't think it works
with a rolling restart). I don't think this is a valid scenario for this issue.

