z-york commented on pull request #2113:
URL: https://github.com/apache/hbase/pull/2113#issuecomment-663260010


   So the use case here is starting a new cluster in the cloud where the HDFS 
(WAL) data from the previous cluster is no longer available. One of the 
benefits of storing the data off the cluster (in our case, in S3) is that we 
don't have to replicate it; we can just create a new cluster pointed at the 
same root directory. IMO, this is a valid use case, and we shouldn't need the 
WAL directories to exist just to tell us to reassign regions.  
   
   I get that there is pushback against enabling this in the catalog cleaner, 
and I think that's fine. For this case, it's a one-time issue, not something 
that periodically needs fixing. (There might be other unknown server cases 
that would require that, but that isn't blocking us at the moment.) So 
instead, maybe a one-time run to clean up old servers and schedule SCPs for 
them (which is what the code removed in HBASE-20708 actually did) makes the 
most sense? I understand that it was removed to simplify assignment, but the 
current behavior is very different. In fact, it looks like we don't even try 
to read hbase:meta if it is found (without SCP/WAL) and simply delete the 
directory [1]. What problem is being solved by deleting it instead of trying 
to assign it if the data is there?
   
   [1] 
https://github.com/apache/hbase/blob/bad2d4e409ba57014e0f5931b72e54cc397e268a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/InitMetaProcedure.java#L75-L81
 
   
   > For your scenario, it is OK as you can confirm that there is no data loss 
as you manually flushed all the data. But what if another user just configs the 
wrong WAL directory? In this case, if we schedule SCPs automatically, there 
will be data loss.
   
   In this scenario, regardless of what we do, there will be data loss unless 
the correct WAL directory is specified again. In fact, I don't believe you can 
change the WAL directory without restarting servers (and I don't think it 
works with a rolling restart either). I don't think this is a valid scenario 
for this issue.



