wchevreuil commented on pull request #2255:
URL: https://github.com/apache/hbase/pull/2255#issuecomment-673960675


   > What's next if we ignore the exception? We will retry later? Or we will 
just go on without this replication source? 
   
   As you can see in `ReplicationSource.startup`, it keeps looping until 
`initialize` succeeds without throwing any uncaught exceptions.
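   
   To illustrate the retry behaviour described here, below is a minimal, self-contained sketch. It is not the actual `ReplicationSource` code: the `Initializer` interface, the `startupWithRetry` method, the sleep interval and the `maxAttempts` bound are all hypothetical names and parameters added for illustration (the bound is only there so the sketch always terminates).

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RetryingStartupSketch {

    /** Hypothetical stand-in for the initialization step; not the HBase API. */
    @FunctionalInterface
    interface Initializer {
        void initialize() throws Exception;
    }

    /**
     * Calls initialize() until it completes without throwing, logging every
     * failure instead of aborting. Returns the number of attempts made.
     * maxAttempts is an assumption of this sketch so that it cannot loop forever.
     */
    static int startupWithRetry(Initializer init, long sleepMillis, int maxAttempts)
            throws InterruptedException {
        int attempts = 0;
        while (attempts < maxAttempts) {
            attempts++;
            try {
                init.initialize();
                return attempts; // initialization succeeded, stop looping
            } catch (Exception e) {
                // The failure is logged, not swallowed silently.
                System.err.println("Initialization attempt " + attempts
                        + " failed, will retry: " + e);
                Thread.sleep(sleepMillis);
            }
        }
        throw new IllegalStateException("gave up after " + attempts + " attempts");
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger calls = new AtomicInteger();
        // Simulated endpoint that fails twice before becoming available.
        int attempts = startupWithRetry(() -> {
            if (calls.incrementAndGet() < 3) {
                throw new IllegalStateException("peer unavailable");
            }
        }, 10L, 10);
        System.out.println("initialized after " + attempts + " attempts");
    }
}
```

   In this sketch the source only becomes active once initialization succeeds, and every failed attempt leaves a log line for operators to follow up on.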
   
   > Users will then find out that the cluster is fine but data has not been 
replicated out?
   
   It's common practice to verify replication status after maintenance. 
   
   >  I'm not sure if this is correct way, we fix an issue but introduce 
another hard to find issue?
   
   It does not fail silently: errors are logged, which gives operators the 
chance to investigate what is going wrong without a complete downtime of 
their source clusters.
   
   >Adding a flag can keep the old behavior but we give users an impression 
that the exception can be ignored? Still not sure if this is the correct way to 
fix this... 
   Mind explaining more on your real usage?
   
   We use some custom replication endpoints that, when some of the target 
peer hosts became unavailable, ended up throwing an uncaught exception and 
aborting the source RSes. Sure, the custom code could be improved, and the 
root cause was an internal infra issue, but with a flag like this we would 
not have to face a period of outage at the source.
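   
   As a rough sketch of what such a flag decides, assuming a boolean setting that keeps the old abort behaviour when enabled: the `Outcome` enum, the `onUncaughtException` method and the `abortOnError` parameter are hypothetical names for illustration, not the names used in this PR.

```java
public class AbortFlagSketch {

    /** What the region server does after an uncaught exception in the source. */
    enum Outcome { ABORT_SERVER, LOG_AND_RETRY }

    /**
     * Hypothetical decision point: the error is always logged; the flag only
     * controls whether the server aborts (old behaviour) or keeps running
     * while the replication source retries.
     */
    static Outcome onUncaughtException(Throwable t, boolean abortOnError) {
        System.err.println("Unexpected exception in replication source: " + t);
        return abortOnError ? Outcome.ABORT_SERVER : Outcome.LOG_AND_RETRY;
    }

    public static void main(String[] args) {
        Throwable failure = new RuntimeException("target peer host unavailable");
        System.out.println(onUncaughtException(failure, true));
        System.out.println(onUncaughtException(failure, false));
    }
}
```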
    



