Mahesh Renduchintala created IGNITE-8728:
--------------------------------------------

             Summary: Nodes down after other nodes reboot in the cluster
                 Key: IGNITE-8728
                 URL: https://issues.apache.org/jira/browse/IGNITE-8728
             Project: Ignite
          Issue Type: Bug
    Affects Versions: 2.5
            Reporter: Mahesh Renduchintala


I have two nodes on which 3 tables are partitioned. Indexes are also built on these nodes.
For 24 hours the caches work fine. The tables are definitely distributed across both nodes.

Node 2 reboots, the Ignite service starts, and on Node 1 we see a crash:

[10:38:35,437][INFO][tcp-disco-srvr-#2][TcpDiscoverySpi] TCP discovery accepted incoming connection [rmtAddr=/192.168.1.7, rmtPort=45102]
[10:38:35,437][INFO][tcp-disco-srvr-#2][TcpDiscoverySpi] TCP discovery spawning a new thread for connection [rmtAddr=/192.168.1.7, rmtPort=45102]
[10:38:35,437][INFO][tcp-disco-sock-reader-#12][TcpDiscoverySpi] Started serving remote node connection [rmtAddr=/192.168.1.7:45102, rmtPort=45102]
[10:38:35,451][INFO][tcp-disco-sock-reader-#12][TcpDiscoverySpi] Finished serving remote node connection [rmtAddr=/192.168.1.7:45102, rmtPort=45102
[10:38:35,457][SEVERE][tcp-disco-msg-worker-#3][TcpDiscoverySpi] TcpDiscoverSpi's message worker thread failed abnormally. Stopping the node in order to prevent cluster wide instability.
java.lang.IllegalStateException: Duplicate key
    at org.apache.ignite.cache.QueryEntity.checkIndexes(QueryEntity.java:223)
    at org.apache.ignite.cache.QueryEntity.makePatch(QueryEntity.java:174)
    at org.apache.ignite.internal.processors.query.QuerySchema.makePatch(QuerySchema.java:114)
    at org.apache.ignite.internal.processors.cache.DynamicCacheDescriptor.makeSchemaPatch(DynamicCacheDescriptor.java:360)
    at org.apache.ignite.internal.processors.cache.GridCacheProcessor.validateNode(GridCacheProcessor.java:2536)
    at org.apache.ignite.internal.managers.GridManagerAdapter$1.validateNode(GridManagerAdapter.java:566)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processJoinRequestMessage(ServerImpl.java:3629)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2736)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2536)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body(ServerImpl.java:6775)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2621)
    at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
[10:38:35,459][SEVERE][tcp-disco-msg-worker-#3][] Critical system error detected. Will be handled accordingly to configured handler [hnd=class o.a.i.failure.StopNodeOrHaltFailureHandler, failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=java.lang.IllegalStateException: Duplicate key]]
java.lang.IllegalStateException: Duplicate key
    at org.apache.ignite.cache.QueryEntity.checkIndexes(QueryEntity.java:223)
    at org.apache.ignite.cache.QueryEntity.makePatch(QueryEntity.java:174)
    at org.apache.ignite.internal.processors.query.QuerySchema.makePatch(QuerySchema.java:114)
    at org.apache.ignite.internal.processors.cache.DynamicCacheDescriptor.makeSchemaPatch(DynamicCacheDescriptor.java:360)
    at org.apache.ignite.internal.processors.cache.GridCacheProcessor.validateNode(GridCacheProcessor.java:2536)
    at org.apache.ignite.internal.managers.GridManagerAdapter$1.validateNode(GridManagerAdapter.java:566)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processJoinRequestMessage(ServerImpl.java:3629)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2736)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2536)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body(ServerImpl.java:6775)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2621)
    at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
[10:38:35,460][SEVERE][tcp-disco-msg-worker-#3][] JVM will be halted immediately due to the failure: [failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=java.lang.IllegalStateException: Duplicate key]]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)