[ 
https://issues.apache.org/jira/browse/RATIS-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiacheng Liu updated RATIS-1695:
--------------------------------
    Description: The Daemon thread class has multiple overloaded constructors. 
For ease of maintenance and better flexibility (such as adding more fields to 
Daemon in RATIS-1709), the Daemon constructors should be replaced with a 
Builder.  
(was: In Ratis, many threads are created manually using the `Daemon` class. If 
such a thread hits an uncaught exception, it crashes silently without other 
components knowing. If the thread is a critical component, then part of the 
RaftServer is effectively down while the RaftServer's lifecycle is still 
RUNNING (it is never set to EXCEPTION because the thread had no chance to do 
so).

One example where this can happen is 
[https://github.com/apache/ratis/pull/417/files]. Before that change, the 
StateMachineUpdater thread could throw an NPE and exit, leaving the follower 
RaftServer stale forever. The RaftServer's lifecycle stays RUNNING, and an 
external party has no way to find out through 
`RaftServer.getLifeCycleState()`.

The proposal is to improve observability on RaftServer so that an uncaught 
exception is caught and propagated to the external user, in three steps:
 # All `Daemon` threads should have an UncaughtExceptionHandler set.
 # Add an extra field to the RaftServer to store an exception; that field can 
be set by the UncaughtExceptionHandler instances.
 # The UncaughtExceptionHandler also transitions the RaftServer to the 
EXCEPTION state.

External users can then check the state like this:
{code:java}
RaftServer server = RaftServer.newBuilder().build();
// Periodically check
if (server.getLifeCycleState() == State.EXCEPTION) {
  Throwable t = server.getError();
  // Deal with the throwable
}{code})
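
The three steps in the earlier description (a handler on every daemon thread, 
an error field, and a transition to EXCEPTION) could be sketched roughly as 
below. This is only an illustration under assumptions: `SketchServer`, 
`onUncaught` and `newDaemon` are hypothetical names, not the Ratis API, and 
the real lifecycle has more states than the two shown here.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for a server; NOT the real RaftServer API.
class SketchServer {
  enum State { RUNNING, EXCEPTION }

  private final AtomicReference<State> state = new AtomicReference<>(State.RUNNING);
  // Step 2: extra field that stores the first uncaught exception.
  private final AtomicReference<Throwable> error = new AtomicReference<>();

  State getLifeCycleState() {
    return state.get();
  }

  Throwable getError() {
    return error.get();
  }

  // Steps 2 and 3: record the error first, then move to EXCEPTION, so a
  // caller that observes EXCEPTION can also read the error.
  void onUncaught(Thread thread, Throwable e) {
    error.compareAndSet(null, e);
    state.set(State.EXCEPTION);
  }

  // Step 1: every daemon thread created here gets the handler.
  Thread newDaemon(String name, Runnable task) {
    Thread t = new Thread(task, name);
    t.setDaemon(true);
    t.setUncaughtExceptionHandler(this::onUncaught);
    return t;
  }
}
```

Only the first exception is kept in this sketch; whether later ones should 
be chained or logged is a separate design decision.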

> Use a Builder for Daemon
> ------------------------
>
>                 Key: RATIS-1695
>                 URL: https://issues.apache.org/jira/browse/RATIS-1695
>             Project: Ratis
>          Issue Type: Improvement
>          Components: server
>            Reporter: Jiacheng Liu
>            Assignee: Jiacheng Liu
>            Priority: Major
>             Fix For: 3.0.0
>
>         Attachments: 733_review.patch
>
>          Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> The Daemon thread class has multiple overloaded constructors. For ease of 
> maintenance and better flexibility (such as adding more fields to Daemon in 
> RATIS-1709), the Daemon constructors should be replaced with a Builder.
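
For illustration, a Builder in place of overloaded constructors might look 
roughly like the sketch below. The class name `SketchDaemon`, the setter 
names, and the choice of fields are assumptions made for this example, not 
the actual Daemon API in Ratis.

```java
import java.util.Objects;

// Hypothetical sketch; names are illustrative, NOT the actual Ratis Daemon.
class SketchDaemon extends Thread {
  // Single private constructor: a new field is added to the Builder
  // instead of becoming yet another constructor overload.
  private SketchDaemon(Builder b) {
    super(b.threadGroup, b.runnable, b.name);
    setDaemon(true);
  }

  static Builder newBuilder() {
    return new Builder();
  }

  static final class Builder {
    private String name;
    private Runnable runnable;
    private ThreadGroup threadGroup; // optional; null lets Thread pick the default group

    Builder setName(String name) {
      this.name = name;
      return this;
    }

    Builder setRunnable(Runnable runnable) {
      this.runnable = runnable;
      return this;
    }

    Builder setThreadGroup(ThreadGroup threadGroup) {
      this.threadGroup = threadGroup;
      return this;
    }

    SketchDaemon build() {
      // In this sketch, the name is the only required field.
      Objects.requireNonNull(name, "name == null");
      return new SketchDaemon(this);
    }
  }
}
```

A caller would then write something like 
`SketchDaemon.newBuilder().setName("updater").setRunnable(task).build()`, 
and later fields could be added as further setters without touching existing 
call sites.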



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
