Alexandr Kuramshin created IGNITE-6491:
------------------------------------------

             Summary: Race in TopologyValidator.validate() and EVT_NODE_LEFT listener calls (split-brain activator)
                 Key: IGNITE-6491
                 URL: https://issues.apache.org/jira/browse/IGNITE-6491
             Project: Ignite
          Issue Type: Bug
          Components: cache, general
    Affects Versions: 2.1
            Reporter: Alexandr Kuramshin
            Assignee: Alexandr Kuramshin
             Fix For: 2.2


The following wrong cache {{validate}}/{{put}} sequence may occur.

When a node leaves, a {{GridDhtPartitionsExchangeFuture}} is generated by the {{disco-event-worker}} thread. The {{exchange-worker}} thread then runs the validation:
{noformat}
Split-brain detected [cacheName=test40, activatorTopVer=0, cacheTopVer=14]
    at org.apache.ignite.internal.util.IgniteUtils.dumpStack(IgniteUtils.java:1141)
    at org.apache.ignite.internal.processors.cache.IgniteTopologyValidatorGridSplitCacheTest$SplitAwareTopologyValidator.validate(IgniteTopologyValidatorGridSplitCacheTest.java:307)
    at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTopologyFutureAdapter.validateCacheGroup(GridDhtTopologyFutureAdapter.java:64)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onDone(GridDhtPartitionsExchangeFuture.java:1456)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onDone(GridDhtPartitionsExchangeFuture.java:115)
    at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:450)
    at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:668)
    at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2278)
{noformat}
The result of the validation is stored in {{grpValidRes}} with the value {{false}}.

After some delay the {{disco-event-worker}} thread runs the {{EVT_NODE_LEFT}} listener:
{noformat}
java.lang.Exception: Node is segment activator [cacheName=test40, activatorTopVer=14]
    at org.apache.ignite.internal.util.IgniteUtils.dumpStack(IgniteUtils.java:1141)
    at org.apache.ignite.internal.processors.cache.IgniteTopologyValidatorGridSplitCacheTest$SplitAwareTopologyValidator$2.apply(IgniteTopologyValidatorGridSplitCacheTest.java:360)
    at org.apache.ignite.internal.processors.cache.IgniteTopologyValidatorGridSplitCacheTest$SplitAwareTopologyValidator$2.apply(IgniteTopologyValidatorGridSplitCacheTest.java:349)
    at org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager$UserListenerWrapper.onEvent(GridEventStorageManager.java:1463)
    at org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.notifyListeners(GridEventStorageManager.java:859)
    at org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.notifyListeners(GridEventStorageManager.java:844)
    at org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.record0(GridEventStorageManager.java:341)
    at org.apache.ignite.internal.managers.eventstorage.GridEventStorageManager.record(GridEventStorageManager.java:307)
    at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryWorker.recordEvent(GridDiscoveryManager.java:2478)
    at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryWorker.body0(GridDiscoveryManager.java:2684)
    at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager$DiscoveryWorker.body(GridDiscoveryManager.java:2507)
{noformat}
After this invocation the result of {{SplitAwareTopologyValidator.validate}} should change to {{true}}, but the validator has already been invoked and its result is cached in {{grpValidRes}} with the value {{false}}. So every subsequent call to {{cache.put}} fails:
{noformat}
Test failed.
java.lang.RuntimeException: tryPut() failed [gridName=cache.IgniteTopologyValidatorGridSplitCacheTest0]
    at org.apache.ignite.internal.processors.cache.IgniteTopologyValidatorGridSplitCacheTest.tryPut(IgniteTopologyValidatorGridSplitCacheTest.java:262)
    at org.apache.ignite.internal.processors.cache.IgniteTopologyValidatorGridSplitCacheTest.testTopologyValidator(IgniteTopologyValidatorGridSplitCacheTest.java:182)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at junit.framework.TestCase.runTest(TestCase.java:176)
    at org.apache.ignite.testframework.junits.GridAbstractTest.runTestInternal(GridAbstractTest.java:2000)
    at org.apache.ignite.testframework.junits.GridAbstractTest.access$000(GridAbstractTest.java:132)
    at org.apache.ignite.testframework.junits.GridAbstractTest$5.run(GridAbstractTest.java:1915)
    at java.lang.Thread.run(Thread.java:748)
Caused by: javax.cache.CacheException: class org.apache.ignite.IgniteCheckedException: Failed to perform cache operation (cache topology is not valid): test40
    at org.apache.ignite.internal.processors.cache.GridCacheUtils.convertToCacheException(GridCacheUtils.java:1327)
    at org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.cacheException(IgniteCacheProxyImpl.java:1672)
    at org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.put(IgniteCacheProxyImpl.java:1032)
    at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.put(GatewayProtectedCacheProxy.java:872)
    at org.apache.ignite.internal.processors.cache.IgniteTopologyValidatorGridSplitCacheTest.tryPut(IgniteTopologyValidatorGridSplitCacheTest.java:252)
    ... 10 more
Caused by: class org.apache.ignite.IgniteCheckedException: Failed to perform cache operation (cache topology is not valid): test40
    at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTopologyFutureAdapter.validateCache(GridDhtTopologyFutureAdapter.java:112)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.mapOnTopology(GridNearAtomicSingleUpdateFuture.java:415)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.map(GridNearAtomicAbstractUpdateFuture.java:248)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.update0(GridDhtAtomicCache.java:1170)
    at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.put0(GridDhtAtomicCache.java:659)
    at org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2334)
    at org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2311)
    at org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.put(IgniteCacheProxyImpl.java:1029)
    ... 12 more
{noformat}

The updated test {{IgniteTopologyValidatorGridSplitCacheTest}} fails frequently on my laptop with 8 nodes and 100 caches.
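
The ordering problem reported here can be modelled without Ignite. The sketch below is only an illustration under stated assumptions: the class names {{ValidationRaceSketch}}, {{Validator}} and {{ExchangeFuture}} and the method {{replay}} are hypothetical stand-ins, not Ignite classes. Only the field name {{grpValidRes}} is borrowed from the description. The two orderings are replayed sequentially rather than with real threads, so the outcome is deterministic.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical, Ignite-free model of the validate/listener ordering problem. */
final class ValidationRaceSketch {
    /** Stand-in for a split-aware validator: valid only once this node is the activator. */
    static final class Validator {
        private final AtomicBoolean activator = new AtomicBoolean(false);

        /** Models the EVT_NODE_LEFT listener that makes this node the segment activator. */
        void onNodeLeft() {
            activator.set(true);
        }

        boolean validate() {
            return activator.get();
        }
    }

    /** Stand-in for an exchange future that validates once and caches the result. */
    static final class ExchangeFuture {
        /** Cached validation result; it is never recomputed. */
        private final boolean grpValidRes;

        ExchangeFuture(Validator validator) {
            grpValidRes = validator.validate();
        }

        /** Models the check made before a cache update. */
        boolean putAllowed() {
            return grpValidRes;
        }
    }

    /** Replays one of the two orderings and reports whether a put would be allowed. */
    static boolean replay(boolean listenerFirst) {
        Validator validator = new Validator();

        if (listenerFirst)
            validator.onNodeLeft();

        ExchangeFuture fut = new ExchangeFuture(validator);

        // Ordering from the report: the listener runs after the result was cached.
        if (!listenerFirst)
            validator.onNodeLeft();

        return fut.putAllowed();
    }

    public static void main(String[] args) {
        System.out.println("listener first: put allowed = " + replay(true));
        System.out.println("validate first: put allowed = " + replay(false));
    }
}
```

In this model the put is rejected only when {{validate}} runs before the listener, which matches the sequence in the stack traces above: the cached {{false}} outlives the state change that would have made the topology valid.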