Hi folks,

We have been facing instability when using an Ignite cluster through the
Redis interface, across multiple release versions. I have already filed a
Jira ticket (https://issues.apache.org/jira/projects/IGNITE/issues/IGNITE-23551),
but I could not find anyone reporting a similar issue. So I am wondering
whether anyone else is running Ignite in this configuration and has dealt
with this problem.

The gist of the problem is as follows (copied from the Jira ticket):

We are using Ignite as a persistent caching system, primarily to write KVs
through the Redis interface; we define the Redis caches statically in the
XML config.
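
As an illustration of this kind of setup, here is a minimal sketch of a
Spring XML config with a statically defined Redis cache and native
persistence. It is not the exact production config. The cache name assumes
Ignite's `redis-ignite-internal-cache-<db index>` naming convention for Redis
databases, and the port and persistence settings are placeholder assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch only; values are placeholders, not the actual config. -->
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans.xsd">
  <bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <!-- Redis clients connect through the connector endpoint (11211 is Ignite's default). -->
    <property name="connectorConfiguration">
      <bean class="org.apache.ignite.configuration.ConnectorConfiguration">
        <property name="port" value="11211"/>
      </bean>
    </property>

    <!-- Native persistence on the default data region (assumed). -->
    <property name="dataStorageConfiguration">
      <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
        <property name="defaultDataRegionConfiguration">
          <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
            <property name="persistenceEnabled" value="true"/>
          </bean>
        </property>
      </bean>
    </property>

    <!-- Statically defined cache backing Redis database 0 (assumed name). -->
    <property name="cacheConfiguration">
      <list>
        <bean class="org.apache.ignite.configuration.CacheConfiguration">
          <property name="name" value="redis-ignite-internal-cache-0"/>
        </bean>
      </list>
    </property>
  </bean>
</beans>
```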

We have been plagued by an issue where restarting a node in an existing,
stable cluster does not work: the node fails every time it tries to join
the cluster, with a NullPointerException. This happens with every node in
the cluster and persists no matter how many times the node is started.
After a full cluster restart the issue goes away.



Following is the stacktrace we see in the logs of the failed node:

[10:51:34,769][SEVERE][tcp-disco-msg-worker-[fa915882 10.132.0.114:47500]-#2-#57][TcpDiscoverySpi] TcpDiscoverSpi's message worker thread failed abnormally. Stopping the node in order to prevent cluster wide instability.
java.lang.NullPointerException: Cannot invoke "org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$CachePredicate.addClientNode(java.util.UUID, boolean)" because "p" is null
        at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.addClientNode(GridDiscoveryManager.java:428)
        at org.apache.ignite.internal.processors.cache.ClusterCachesInfo.addReceivedClientNodesToDiscovery(ClusterCachesInfo.java:1600)
        at org.apache.ignite.internal.processors.cache.ClusterCachesInfo.onGridDataReceived(ClusterCachesInfo.java:1519)
        at org.apache.ignite.internal.processors.cache.GridCacheProcessor.onGridDataReceived(GridCacheProcessor.java:3137)
        at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$4.onExchange(GridDiscoveryManager.java:1019)
        at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.onExchange(TcpDiscoverySpi.java:2197)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processNodeAddFinishedMessage(ServerImpl.java:5359)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:3242)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2918)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorker.body(ServerImpl.java:8048)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:3089)
        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
        at org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerThread.body(ServerImpl.java:7979)
        at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:58)


Thanks in advance for any pointers.

Thanks and Regards,
Ashu Pachauri
