[ 
https://issues.apache.org/jira/browse/IGNITE-8728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16743794#comment-16743794
 ] 

Mahesh Renduchintala commented on IGNITE-8728:
----------------------------------------------

Stan, thanks for the visibility. 
 # Over the last year, we move from various versions of ignite 2.4, 2.5 to 2.7. 
I always keep work folder in tact. 
 # Over a period of development, we might have tried to create index a second 
or many times on the same column on which an index already existed. Now, could 
that cause a confusion at ignite level, especially in a multi-node scenario? 
Was something out of sync? Was a check missing?
 # Over a period of time, we dropped the table several times and recreated the 
table several times and indexes. Was something stable left out in work folder. 
 # We always used 2 or more nodes. 
 # Over a period of time, we saw issues with index creation as well. My 
colleague posted another strange behaviour with index creation. See the issue 
here, 
[http://apache-ignite-users.70518.x6.nabble.com/Failing-to-create-index-on-Ignite-table-column-td26252.html#a26258]
 # Summary is if we don't give index names the ignite gives exceptions.         
                        

 

Something seems to be wrong with Ignite index handling in multi-node 
environment. 

Regarding your point 2, absolutely, makes sense not to crash the node. We have 
about 100GB data (tables) on ignite and the only work around right now seems to 
be 

Boot node 1. Keep its work folder. 

Boot node 2 after removing its work folder

This scenario though does not produce an issue, gives the cluster a down-time 
of about 1-2 hours. 

 

 

> "IllegalStateException: Duplicate Key" on node join
> ---------------------------------------------------
>
>                 Key: IGNITE-8728
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8728
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 2.7
>            Reporter: Mahesh Renduchintala
>            Priority: Critical
>         Attachments: NS1_ignite-9676df15.0.log, NS2_ignite-7cfc8008.0.log, 
> node-config.xml
>
>
> I have two nodes on which we have 3 tables which are partitioned.  Index are 
> also built on these tables. 
> For 24 hours caches work fine.  The tables are definitely distributed across 
> both the nodes
> Node 2 reboots due to some issue - goes out of the baseline - comes back and 
> joins the baseline.  Other baseline nodes crash and in the logs we see 
> duplicate Key error
> [10:38:35,437][INFO][tcp-disco-srvr-#2|#2][TcpDiscoverySpi] TCP discovery 
> accepted incoming connection [rmtAddr=/192.168.1.7, rmtPort=45102]
>  [10:38:35,437][INFO][tcp-disco-srvr-#2|#2][TcpDiscoverySpi] TCP discovery 
> spawning a new thread for connection [rmtAddr=/192.168.1.7, rmtPort=45102]
>  [10:38:35,437][INFO][tcp-disco-sock-reader-#12|#12][TcpDiscoverySpi] Started 
> serving remote node connection [rmtAddr=/192.168.1.7:45102, rmtPort=45102]
>  [10:38:35,451][INFO][tcp-disco-sock-reader-#12|#12][TcpDiscoverySpi] 
> Finished serving remote node connection [rmtAddr=/192.168.1.7:45102, 
> rmtPort=45102
>  [10:38:35,457][SEVERE][tcp-disco-msg-worker-#3|#3][TcpDiscoverySpi] 
> TcpDiscoverSpi's message worker thread failed abnormally. Stopping the node 
> in order to prevent cluster wide instability.
>  *java.lang.IllegalStateException: Duplicate key*
> at org.apache.ignite.cache.QueryEntity.checkIndexes(QueryEntity.java:223)
> at org.apache.ignite.cache.QueryEntity.makePatch(QueryEntity.java:174)
> at 
> org.apache.ignite.internal.processors.query.QuerySchema.makePatch(QuerySchema.java:114)
> at 
> org.apache.ignite.internal.processors.cache.DynamicCacheDescriptor.makeSchemaPatch(DynamicCacheDescriptor.java:360)
> at 
> org.apache.ignite.internal.processors.cache.GridCacheProcessor.validateNode(GridCacheProcessor.java:2536)
> at 
> org.apache.ignite.internal.managers.GridManagerAdapter$1.validateNode(GridManagerAdapter.java:566)
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processJoinRequestMessage(ServerImpl.java:3629)
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2736)
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2536)
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body(ServerImpl.java:6775)
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2621)
> at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
>  [10:38:35,459][SEVERE][tcp-disco-msg-worker-#3|#3][] Critical system error 
> detected. Will be handled accordingly to configured handler [hnd=class 
> o.a.i.failure.StopNodeOrHaltFailureHandler, failureCtx=FailureContext 
> [type=SYSTEM_WORKER_TERMINATION, err=java.lang.IllegalStateException: 
> Duplicate key]]
> java.lang.IllegalStateException: Duplicate key
> at org.apache.ignite.cache.QueryEntity.checkIndexes(QueryEntity.java:223)
> at org.apache.ignite.cache.QueryEntity.makePatch(QueryEntity.java:174)
> at 
> org.apache.ignite.internal.processors.query.QuerySchema.makePatch(QuerySchema.java:114)
> at 
> org.apache.ignite.internal.processors.cache.DynamicCacheDescriptor.makeSchemaPatch(DynamicCacheDescriptor.java:360)
> at 
> org.apache.ignite.internal.processors.cache.GridCacheProcessor.validateNode(GridCacheProcessor.java:2536)
> at 
> org.apache.ignite.internal.managers.GridManagerAdapter$1.validateNode(GridManagerAdapter.java:566)
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processJoinRequestMessage(ServerImpl.java:3629)
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2736)
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2536)
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerAdapter.body(ServerImpl.java:6775)
> at 
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:2621)
> at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
>  [10:38:35,460][SEVERE][tcp-disco-msg-worker-#3|#3][] JVM will be halted 
> immediately due to the failure: [failureCtx=FailureContext 
> [type=SYSTEM_WORKER_TERMINATION, err=java.lang.IllegalStateException: 
> Duplicate key]]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to