[ https://issues.apache.org/jira/browse/IGNITE-10376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16704658#comment-16704658 ]
Roman Kondakov edited comment on IGNITE-10376 at 11/30/18 12:24 PM:
--------------------------------------------------------------------

[~ivanan.fed], in my opinion the NPE here is a consequence, not the cause. Here is how I see the situation:
# Some set of user and system threads hangs on the binary metadata registration future, {{CacheObjectBinaryProcessorImpl.addMeta(CacheObjectBinaryProcessorImpl.java:495)}}, for an *unknown* reason.
# The blocked system threads are detected by the failure handler watchdog; we can see this in the log.
# The grid remains hung until the test timeout fires.
# On test timeout, the test framework stops all nodes. While the nodes are stopping, all managers are stopped and cleaned up, including {{CacheOffheapEvictionManager}}. After this point, a request for the eviction manager from the cache context returns {{null}}.
# Along with stopping the managers, all waiting threads are interrupted; we can also see this in the log: {{IgniteInterruptedCheckedException: Got interrupted while waiting for future to complete.}}
# Having been interrupted, the threads release their locks and unblock other threads, which continue to perform their tasks (sending messages, etc.).
# Messages sent by these threads cannot be processed properly: the nodes are stopping, and the managers, including {{CacheOffheapEvictionManager}}, are stopping too. An attempt to obtain the eviction manager from the cache context fails (the cache context is {{null}} at this point), so we get an NPE.

As you can see, the NPE appears only at the very last stage. The main bug here is the unexplained thread hang described in step 1.
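
The sequence described in the comment can be reproduced in miniature. The following is only a sketch under stated assumptions: {{ShutdownRaceSketch}}, {{CacheContext}}, {{EvictionManager}}, {{evicts()}}, {{touch()}} and {{stop()}} are hypothetical stand-ins written for this illustration, not Ignite's real classes or signatures, and a plain {{CompletableFuture}} that is never completed plays the role of the metadata registration future.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.atomic.AtomicReference;

/** Illustration only: hypothetical stand-ins, not Ignite classes. */
class ShutdownRaceSketch {
    /** Hypothetical stand-in for an eviction manager. */
    static class EvictionManager {
        void touch(String key) {
            // No-op: only the dereference matters for this sketch.
        }
    }

    /** Hypothetical stand-in for a cache context whose manager is cleared on stop. */
    static class CacheContext {
        private volatile EvictionManager evicts = new EvictionManager();

        EvictionManager evicts() {
            return evicts;
        }

        void stop() {
            evicts = null; // Managers are cleaned up while the node stops.
        }
    }

    /** Runs the hang -> stop -> interrupt -> NPE sequence and reports what the worker saw. */
    static String runScenario() {
        CacheContext ctx = new CacheContext();
        // Never completed: models the future the threads hang on (step 1).
        CompletableFuture<Void> metaFut = new CompletableFuture<>();
        AtomicReference<String> outcome = new AtomicReference<>("no failure");
        CountDownLatch aboutToWait = new CountDownLatch(1);

        Thread worker = new Thread(() -> {
            try {
                aboutToWait.countDown();
                metaFut.get(); // Blocks until the thread is interrupted.
            } catch (InterruptedException e) {
                // Steps 5-7: the interrupted thread carries on and uses the context.
                try {
                    ctx.evicts().touch("key");
                } catch (NullPointerException npe) {
                    outcome.set("NPE after interrupt");
                }
            } catch (ExecutionException e) {
                outcome.set("unexpected: " + e);
            }
        });

        try {
            worker.start();
            aboutToWait.await();
            ctx.stop();         // Step 4: managers are cleared first...
            worker.interrupt(); // Step 5: ...then waiting threads are interrupted.
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return outcome.get();
    }

    public static void main(String[] args) {
        System.out.println(runScenario()); // prints "NPE after interrupt"
    }
}
```

In this sketch the NPE is thrown only after shutdown has already begun, so guarding the dereference would merely hide the symptom; the worker would still have been stuck on the future for the whole run.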

> Failed to touch in CacheOffheapEvictionManager
> ----------------------------------------------
>
>                 Key: IGNITE-10376
>                 URL: https://issues.apache.org/jira/browse/IGNITE-10376
>             Project: Ignite
>          Issue Type: Test
>    Affects Versions: 2.7
>            Reporter: Ivan Fedotov
>            Assignee: Ivan Fedotov
>            Priority: Blocker
>              Labels: MakeTeamcityGreenAgain, stability, test-fail
>         Attachments: IGNITE-10376 log.txt
>
>
> A BinaryObjectException sometimes appears in
> [testAtomicOnheapTwoBackupAsyncFullSync|https://ci.ignite.apache.org/viewLog.html?buildId=2398013&tab=buildResultsDiv&buildTypeId=IgniteTests24Java8_ContinuousQuery4#testNameId3300126853696550025]
> at the
> [moment|https://github.com/apache/ignite/blob/master/modules/core/src/test/java/org/apache/ignite/internal/processors/cache/query/continuous/CacheContinuousQueryOrderingEventTest.java#L371]
> of CacheEntryProcessor invocation.
> {code}
> class org.apache.ignite.binary.BinaryObjectException: Failed to update meta data for type: org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryOrderingEventTest$QueryTestValue
> 	at org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl.addMeta(CacheObjectBinaryProcessorImpl.java:516)
> 	at org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl$1.addMeta(CacheObjectBinaryProcessorImpl.java:194)
> 	at org.apache.ignite.internal.binary.BinaryContext.updateMetadata(BinaryContext.java:1332)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1815)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1668)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.sendSingleRequest(GridNearAtomicAbstractUpdateFuture.java:299)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.map(GridNearAtomicSingleUpdateFuture.java:483)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.mapOnTopology(GridNearAtomicSingleUpdateFuture.java:443)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.map(GridNearAtomicAbstractUpdateFuture.java:248)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.update0(GridDhtAtomicCache.java:1150)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.invoke0(GridDhtAtomicCache.java:831)
> 	at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.invoke(GridDhtAtomicCache.java:787)
> 	at org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.invoke(IgniteCacheProxyImpl.java:1438)
> 	at org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.invoke(IgniteCacheProxyImpl.java:1482)
> 	at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.invoke(GatewayProtectedCacheProxy.java:1228)
> 	at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryOrderingEventTest$1.run(CacheContinuousQueryOrderingEventTest.java:373)
> 	at org.apache.ignite.testframework.GridTestUtils$7.call(GridTestUtils.java:1300)
> 	at org.apache.ignite.testframework.GridTestThread.run(GridTestThread.java:84)
> {code}
> This may be caused by the absence of locks in
> GridCacheMapEntry#touch(GridCacheMapEntry.java:5063).
> It seems that the test stopped working after MVCC was integrated into Continuous Query.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)