Hi William,

Thanks a lot for reporting the problem! Fortunately, the race condition happens during shutdown, since "SegmentRaftLog is closed" as mentioned in the description of RATIS-1699. We would still like to fix it, of course. I will look into it in more detail.

Tsz-Wo

On Wed, Sep 7, 2022 at 11:39 PM William Song <[email protected]> wrote:
>
> Hi,
>
> We found abnormal behaviors of GrpcLogAppender in a recent run, please refer
> to https://issues.apache.org/jira/browse/RATIS-1699 for detailed error log and
> problem description.
>
> We think that this problem roots in the data race in LeaderStateImpl. Its
> method stop() and restart(LogAppender) can be called concurrently, and under
> certain event sequence, will cause this weird situation. Maybe we can add
> synchronizations to coordinate these two methods. The event sequence and full
> logs are also provided in this issue.
>
> Please help me to confirm whether our analysis stands. Thanks in advance!
>
> Regards,
> William
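
For illustration only, the synchronization suggested in the quoted message could look roughly like the sketch below. This is not the Ratis code: `LeaderStateSketch`, `SketchAppender`, `addAppender()` and the `stopped` flag are hypothetical names made up for this example, and the real LeaderStateImpl and LogAppender are considerably more involved. The sketch only shows the general idea of guarding stop() and restart(...) with the same lock, so that a restart can never create a new appender after shutdown has begun.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a log appender; NOT the Ratis LogAppender interface. */
class SketchAppender {
    private boolean running;

    synchronized void start() { running = true; }
    synchronized void stop() { running = false; }
    synchronized boolean isRunning() { return running; }
}

/** Hypothetical, heavily simplified leader state; NOT the Ratis LeaderStateImpl. */
class LeaderStateSketch {
    private final List<SketchAppender> appenders = new ArrayList<>();
    private boolean stopped = false; // assumed flag, guarded by "this"

    /** Creates and starts a new appender; rejected once stop() has run. */
    synchronized SketchAppender addAppender() {
        if (stopped) {
            throw new IllegalStateException("leader state already stopped");
        }
        SketchAppender appender = new SketchAppender();
        appender.start();
        appenders.add(appender);
        return appender;
    }

    /** Stops all appenders. Holds the same lock as restart(), so the two cannot interleave. */
    synchronized void stop() {
        stopped = true;
        for (SketchAppender appender : appenders) {
            appender.stop();
        }
        appenders.clear();
    }

    /**
     * Replaces the given appender with a fresh one.
     * Returns null, without starting anything, if stop() has already been called.
     */
    synchronized SketchAppender restart(SketchAppender old) {
        old.stop();
        appenders.remove(old);
        if (stopped) {
            return null; // shutting down: do not bring up a new appender
        }
        SketchAppender fresh = new SketchAppender();
        fresh.start();
        appenders.add(fresh);
        return fresh;
    }
}
```

With both methods synchronized on the same object, a restart(...) that loses the race against stop() sees `stopped == true` and returns without starting a new appender. Whether a single coarse lock like this is acceptable in the real class is a separate question.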
