[
https://issues.apache.org/jira/browse/RATIS-785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mukul Kumar Singh updated RATIS-785:
------------------------------------
Reporter: Sammi Chen (was: Shashikant Banerjee)
> Statemachine updater fails with assertion
> -----------------------------------------
>
> Key: RATIS-785
> URL: https://issues.apache.org/jira/browse/RATIS-785
> Project: Ratis
> Issue Type: Bug
> Components: server
> Reporter: Sammi Chen
> Assignee: Shashikant Banerjee
> Priority: Major
> Fix For: 0.5.0
>
>
> {code:java}
> java.lang.IllegalStateException: retry cache entry should be pending: client-7E602ACF0902:70:done
>     at org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:63)
>     at org.apache.ratis.server.impl.RetryCache.getOrCreateEntry(RetryCache.java:170)
>     at org.apache.ratis.server.impl.RaftServerImpl.replyPendingRequest(RaftServerImpl.java:1242)
>     at org.apache.ratis.server.impl.RaftServerImpl.applyLogToStateMachine(RaftServerImpl.java:1303)
>     at org.apache.ratis.server.impl.StateMachineUpdater.applyLog(StateMachineUpdater.java:226)
>     at org.apache.ratis.server.impl.StateMachineUpdater.run(StateMachineUpdater.java:167)
>     at java.lang.Thread.run(Thread.java:748)
> 2019-12-20 11:27:24,343 ERROR org.apache.ratis.server.impl.StateMachineUpdater: ed90869c-317e-4303-8922-9fa83a3983cb@group-9D552F016938-StateMachineUpdater: the StateMachineUpdater hits Throwable
> java.lang.IllegalStateException: retry cache entry should be pending: client-7E602ACF0902:70:done
>     at org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:63)
>     at org.apache.ratis.server.impl.RetryCache.getOrCreateEntry(RetryCache.java:170)
>     at org.apache.ratis.server.impl.RaftServerImpl.replyPendingRequest(RaftServerImpl.java:1242)
>     at org.apache.ratis.server.impl.RaftServerImpl.applyLogToStateMachine(RaftServerImpl.java:1303)
>     at org.apache.ratis.server.impl.StateMachineUpdater.applyLog(StateMachineUpdater.java:226)
>     at org.apache.ratis.server.impl.StateMachineUpdater.run(StateMachineUpdater.java:167)
>     at java.lang.Thread.run(Thread.java:748)
> {code}
> The issue seems to be caused by a precondition check: the reply future in
> the retry cache is already marked complete, but the check expects it to be
> pending.
>
> One possible cause: if the entry is evicted from the cache, we end up
> creating two different requests (two log entries) for the same client ID and
> call ID, which together form the retryCache key. If the server then restarts
> and starts reapplying transactions, applying the earlier log index may add
> the entry to the retryCache and complete it; when the later log index is
> applied, it may find the future already marked complete, because both log
> entries share the same retry cache key.
> FYI, the issue happens only after a restart.
> cc [~msingh], [~ljain] [~szetszwo]
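
For illustration, the suspected sequence can be reproduced with a minimal, self-contained sketch. It does not use the Ratis classes: RetryCacheReplaySketch, Key, LogEntry, and getOrCreatePending are hypothetical names, and the cache is a plain map of futures. The sketch assumes that replay completes each entry's future as soon as its log index is applied, which is how the description above reads but is not confirmed here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch only; these are not the real Ratis types. */
public class RetryCacheReplaySketch {

  /** Cache key: a client request is identified by (clientId, callId). */
  static final class Key {
    final String clientId;
    final long callId;

    Key(String clientId, long callId) {
      this.clientId = clientId;
      this.callId = callId;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof Key)) {
        return false;
      }
      Key other = (Key) o;
      return callId == other.callId && clientId.equals(other.clientId);
    }

    @Override
    public int hashCode() {
      return Objects.hash(clientId, callId);
    }

    @Override
    public String toString() {
      return clientId + ":" + callId;
    }
  }

  /** A log entry that remembers which client request produced it. */
  static final class LogEntry {
    final long index;
    final Key key;

    LogEntry(long index, Key key) {
      this.index = index;
      this.key = key;
    }
  }

  private final Map<Key, CompletableFuture<String>> cache = new ConcurrentHashMap<>();

  /** Models the failing check: an entry found in the cache must still be pending. */
  CompletableFuture<String> getOrCreatePending(Key key) {
    CompletableFuture<String> future =
        cache.computeIfAbsent(key, k -> new CompletableFuture<>());
    if (future.isDone()) {
      throw new IllegalStateException(
          "retry cache entry should be pending: " + key + ":done");
    }
    return future;
  }

  /** Applies entries in log order, as a replay after restart would (assumption). */
  void replay(List<LogEntry> log) {
    for (LogEntry entry : log) {
      getOrCreatePending(entry.key).complete("reply for index " + entry.index);
    }
  }

  public static void main(String[] args) {
    // Two log entries that share one key, as could happen if the first
    // cache entry was evicted before the client retried.
    Key key = new Key("client-7E602ACF0902", 70);
    RetryCacheReplaySketch server = new RetryCacheReplaySketch();
    try {
      server.replay(Arrays.asList(new LogEntry(1, key), new LogEntry(2, key)));
    } catch (IllegalStateException e) {
      // prints "retry cache entry should be pending: client-7E602ACF0902:70:done"
      System.out.println(e.getMessage());
    }
  }
}
```

In this sketch the first log entry creates and completes the future, so the second entry with the same key trips the check. Log entries with distinct keys replay without error, which is consistent with the failure needing a duplicated (clientId, callId) pair.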
--
This message was sent by Atlassian Jira
(v8.3.4#803005)