Wellington Chevreuil created HBASE-24877:
--------------------------------------------
Summary: Add option to avoid aborting RS process upon uncaught
exceptions happen on replication source
Key: HBASE-24877
URL: https://issues.apache.org/jira/browse/HBASE-24877
Project: HBase
Issue Type: Improvement
Components: Replication
Reporter: Wellington Chevreuil
Assignee: Wellington Chevreuil
Currently, we abort the entire RS process if any uncaught exception happens
during ReplicationSource initialization. This may be too extreme on certain
deployments, where custom replication endpoint implementations may choose to
throw such exceptions when remote peers are unavailable, but the source cluster
shouldn't be brought down entirely. Similarly, the source reader and shipper
threads cause the RS to abort on any runtime exception that occurs while they
are running.
This patch adds a configuration option (false by default, to keep the original
behaviour) to avoid aborting the entire RS process under these conditions.
Instead, if ReplicationSource initialization fails with a RuntimeException, it
keeps retrying the source startup. In the case of reader/shipper runtime
errors, it refreshes the replication source, terminating the current source and
its readers/shippers and creating new ones.
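
The behaviour described above can be illustrated with a small, self-contained
sketch. This is not the patch itself: the class, the method names, the
avoidAbort flag (standing in for the new configuration option) and the
maxAttempts cap are all hypothetical and exist only so the example runs
without HBase on the classpath. The description says the source keeps
retrying startup; the cap here is only so the demo terminates.

```java
/** Illustrative only: none of these names come from the HBase code base. */
final class ReplicationSourceSketch {

    /** What to do when a reader or shipper thread hits a runtime error. */
    enum Action { ABORT_REGION_SERVER, REFRESH_SOURCE }

    /**
     * Runs the source initialization until it succeeds and returns the number
     * of attempts it took.
     *
     * @param init        hypothetical stand-in for ReplicationSource initialization
     * @param avoidAbort  stand-in for the new option; false keeps the original
     *                    behaviour of aborting on the first RuntimeException
     * @param maxAttempts demo-only cap (an assumption, not part of the description)
     */
    static int startSource(Runnable init, boolean avoidAbort, int maxAttempts) {
        for (int attempt = 1; ; attempt++) {
            try {
                init.run();
                return attempt;
            } catch (RuntimeException e) {
                if (!avoidAbort) {
                    // Original behaviour: the failure takes the whole process down.
                    throw new IllegalStateException("aborting region server", e);
                }
                if (attempt >= maxAttempts) {
                    throw new IllegalStateException(
                        "giving up after " + attempt + " attempts", e);
                }
                // A real implementation would likely sleep or back off before retrying.
            }
        }
    }

    /** Decides how a reader/shipper runtime error is handled. */
    static Action onWorkerFailure(boolean avoidAbort) {
        return avoidAbort ? Action.REFRESH_SOURCE : Action.ABORT_REGION_SERVER;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulates an endpoint whose remote peer is unavailable for the first two tries.
        Runnable flakyInit = () -> {
            if (++calls[0] < 3) {
                throw new RuntimeException("remote peer unavailable");
            }
        };
        System.out.println("attempts: " + startSource(flakyInit, true, 10));
        System.out.println("on worker failure: " + onWorkerFailure(true));
    }
}
```

With avoidAbort set to false, the same flaky initialization would fail on the
first attempt, which mirrors the default described above.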