Igor created IGNITE-21639:
-----------------------------

             Summary: Server does not start after being killed and gets stuck on election
                 Key: IGNITE-21639
                 URL: https://issues.apache.org/jira/browse/IGNITE-21639
             Project: Ignite
          Issue Type: Improvement
          Components: general, networking, platforms
    Affects Versions: 3.0.0-beta1
            Reporter: Igor
         Attachments: 
poc-tester-SERVER-192.168.1.117-id-0-2024-02-29-22-56-11-client.log.0

*Steps to reproduce:*
 # Start a 3-node cluster with each node on a separate machine (not in Docker).
 # Insert about 500 000 rows across 500 tables, with 3 replicas.
 # Kill one node.
 # Start the killed node again.

*Expected:*
The node starts, joins the cluster, and works normally.

*Actual:*
The node gets stuck during startup and keeps repeating messages like this:
{code:java}
2024-02-29 23:06:21:261 +0300 [INFO][%poc-tester-SERVER-192.168.1.117-id-0%JRaft-ElectionTimer-18][NodeImpl] Unsuccessful election round number 128
2024-02-29 23:06:21:261 +0300 [INFO][%poc-tester-SERVER-192.168.1.117-id-0%JRaft-ElectionTimer-18][NodeImpl] Node <154_part_24/poc-tester-SERVER-192.168.1.117-id-0> term 3 start preVote.
2024-02-29 23:06:21:282 +0300 [ERROR][%poc-tester-SERVER-192.168.1.117-id-0%JRaft-FSMCaller-Disruptor_stripe_5-0][StripedDisruptor] Handle disruptor event error [name=%poc-tester-SERVER-192.168.1.117-id-0%JRaft-FSMCaller-Disruptor-, event=org.apache.ignite.raft.jraft.core.FSMCallerImpl$ApplyTask@efb699b, hasHandler=false]
java.lang.AssertionError: Safe time reordering detected [current=112016525904248838, proposed=112016523364991002]
    at org.apache.ignite.internal.table.distributed.raft.PartitionListener.lambda$onWrite$1(PartitionListener.java:169)
    at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
    at org.apache.ignite.internal.table.distributed.raft.PartitionListener.onWrite(PartitionListener.java:159)
    at org.apache.ignite.internal.raft.server.impl.JraftServerImpl$DelegatingStateMachine.onApply(JraftServerImpl.java:674)
    at org.apache.ignite.raft.jraft.core.FSMCallerImpl.doApplyTasks(FSMCallerImpl.java:557)
    at org.apache.ignite.raft.jraft.core.FSMCallerImpl.doCommitted(FSMCallerImpl.java:525)
    at org.apache.ignite.raft.jraft.core.FSMCallerImpl.runApplyTask(FSMCallerImpl.java:444)
    at org.apache.ignite.raft.jraft.core.FSMCallerImpl$ApplyTaskHandler.onEvent(FSMCallerImpl.java:136)
    at org.apache.ignite.raft.jraft.core.FSMCallerImpl$ApplyTaskHandler.onEvent(FSMCallerImpl.java:130)
    at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:266)
    at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:231)
    at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:137)
    at java.base/java.lang.Thread.run(Thread.java:829){code}
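
To help read the two numbers in the assertion, here is a minimal standalone sketch. It is not the Ignite code. It assumes the values are hybrid timestamps packed as epoch milliseconds in the upper bits with a 16-bit logical counter in the lower bits, and it assumes the check simply rejects a proposed safe time that is lower than the current one. The class {{SafeTimeSketch}} and its methods are hypothetical names used only for illustration.

```java
import java.time.Instant;

/** Illustrative sketch only; NOT the Ignite implementation. */
final class SafeTimeSketch {
    /** ASSUMPTION: the low 16 bits are a logical counter, the rest is epoch milliseconds. */
    static final int LOGICAL_BITS = 16;

    /** Last accepted safe time (packed value); starts at 0. */
    private long current;

    /** Extracts the physical part (epoch milliseconds) under the assumed layout. */
    static long physicalMillis(long packed) {
        return packed >>> LOGICAL_BITS;
    }

    /** Extracts the logical counter under the assumed layout. */
    static int logical(long packed) {
        return (int) (packed & ((1L << LOGICAL_BITS) - 1));
    }

    /**
     * Advances the tracked safe time. ASSUMPTION: a value lower than the
     * current one is treated as reordering; equal values are accepted here.
     */
    void advance(long proposed) {
        if (proposed < current) {
            throw new AssertionError("Safe time reordering detected [current=" + current
                    + ", proposed=" + proposed + "]");
        }
        current = proposed;
    }

    public static void main(String[] args) {
        // Values copied from the log in this issue.
        long current = 112016525904248838L;
        long proposed = 112016523364991002L;

        System.out.println("current:  " + Instant.ofEpochMilli(physicalMillis(current))
                + ", logical=" + logical(current));
        System.out.println("proposed: " + Instant.ofEpochMilli(physicalMillis(proposed))
                + ", logical=" + logical(proposed));
        System.out.println("proposed is behind by "
                + (physicalMillis(current) - physicalMillis(proposed)) + " ms");

        SafeTimeSketch tracker = new SafeTimeSketch();
        tracker.advance(current);
        try {
            tracker.advance(proposed);
        } catch (AssertionError e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If the assumed layout is right, the proposed safe time is about 38.7 seconds behind the current one, and both fall shortly before the time in the attached log file name. That would suggest the command being applied after restart carries a timestamp older than the safe time the partition has already recorded.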
 

[^poc-tester-SERVER-192.168.1.117-id-0-2024-02-29-22-56-11-client.log.0]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
