[ 
https://issues.apache.org/jira/browse/IGNITE-10376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16704658#comment-16704658
 ] 

Roman Kondakov edited comment on IGNITE-10376 at 11/30/18 12:24 PM:
--------------------------------------------------------------------

[~ivanan.fed], in my opinion the NPE here is a consequence, not the cause. Here 
is how I see the situation:
 # Some set of user and system threads hangs on the binary metadata registration 
future: 
{{CacheObjectBinaryProcessorImpl.addMeta(CacheObjectBinaryProcessorImpl.java:495)}}
 for an *unknown* reason.
 # Blocked system threads are detected by the failure handler watchdog - we can 
see it in the log.
 # The grid remains hung until the test timeout happens.
 # On test timeout all nodes are stopped by the test framework. While the nodes 
are stopping, all managers are stopped and cleaned up, including 
{{CacheOffheapEvictionManager}}. After this point, if we request the eviction 
manager from the cache context, it will return {{null}}.
 # Along with stopping managers, all waiting threads are interrupted - we can 
also see it in the log: {{IgniteInterruptedCheckedException: Got interrupted while 
waiting for future to complete.}}
 # Having been interrupted, the threads release their locks and unblock other 
threads, which continue to perform their tasks - send messages etc.
 # Messages sent by these threads cannot be processed properly - nodes are 
stopping, and managers, including {{CacheOffheapEvictionManager}}, are stopping 
too. Attempts to obtain the eviction manager from the cache context fail - the 
cache context is {{null}} at this point, so we get an NPE.

As you can see, the NPE appears only at the very last stage. The main bug here 
is the unexplained thread hang described in p. 1. 
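
The sequence above can be illustrated with a small standalone sketch. It is not Ignite code: {{ShutdownRaceSketch}}, {{CacheContext}}, {{EvictionManager}} and {{reproduce()}} are hypothetical names made up for the illustration, and a never-completed {{CompletableFuture}} stands in for the hanging metadata registration future. It models the variant from p. 4, where the eviction manager getter returns {{null}} after stop.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.atomic.AtomicReference;

/** Illustration only: every type here is a hypothetical stand-in, not an Ignite class. */
final class ShutdownRaceSketch {
    /** Stand-in for an eviction manager. */
    static final class EvictionManager {
        void touch(String key) {
            // No-op: only the call itself matters for the sketch.
        }
    }

    /** Stand-in for a cache context whose manager is cleared on stop. */
    static final class CacheContext {
        private volatile EvictionManager evicts = new EvictionManager();

        EvictionManager evicts() {
            return evicts;
        }

        void stop() {
            evicts = null; // p. 4: after stop the getter returns null.
        }
    }

    /** Runs the scenario once and reports what the worker thread observed. */
    static String reproduce() {
        CacheContext ctx = new CacheContext();
        // p. 1: a future that never completes, so the worker hangs on it.
        CompletableFuture<Void> metadataFut = new CompletableFuture<>();
        CountDownLatch started = new CountDownLatch(1);
        AtomicReference<String> outcome = new AtomicReference<>("not run");

        Thread worker = new Thread(() -> {
            started.countDown();
            try {
                metadataFut.get();
            } catch (InterruptedException e) {
                // p. 5-6: the wait is interrupted and the thread carries on.
            } catch (ExecutionException e) {
                outcome.set("unexpected failure");
                return;
            }
            try {
                ctx.evicts().touch("key"); // p. 7: the manager is already gone.
                outcome.set("touched");
            } catch (NullPointerException e) {
                outcome.set("NPE after stop");
            }
        });

        worker.start();
        try {
            started.await();
            ctx.stop();         // p. 4: managers are stopped and cleaned up.
            worker.interrupt(); // p. 5: waiting threads are interrupted.
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return outcome.get();
    }

    public static void main(String[] args) {
        System.out.println(reproduce());
    }
}
```

In this sketch the NPE is reached only because the wait in p. 1 never completes on its own. If the future completed normally, the worker would touch the manager long before the stop.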


> Failed to touch in CacheOffheapEvictionManager
> ----------------------------------------------
>
>                 Key: IGNITE-10376
>                 URL: https://issues.apache.org/jira/browse/IGNITE-10376
>             Project: Ignite
>          Issue Type: Test
>    Affects Versions: 2.7
>            Reporter: Ivan Fedotov
>            Assignee: Ivan Fedotov
>            Priority: Blocker
>              Labels: MakeTeamcityGreenAgain, stability, test-fail
>         Attachments: IGNITE-10376 log.txt
>
>
> A BinaryObjectException sometimes appears in 
> [testAtomicOnheapTwoBackupAsyncFullSync|https://ci.ignite.apache.org/viewLog.html?buildId=2398013&tab=buildResultsDiv&buildTypeId=IgniteTests24Java8_ContinuousQuery4#testNameId3300126853696550025]
>  at the 
> [moment|https://github.com/apache/ignite/blob/master/modules/core/src/test/java/org/apache/ignite/internal/processors/cache/query/continuous/CacheContinuousQueryOrderingEventTest.java#L371]
>  of CacheEntryProcessor invocation.
> {code}class org.apache.ignite.binary.BinaryObjectException: Failed to update meta data for type: org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryOrderingEventTest$QueryTestValue
>       at org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl.addMeta(CacheObjectBinaryProcessorImpl.java:516)
>       at org.apache.ignite.internal.processors.cache.binary.CacheObjectBinaryProcessorImpl$1.addMeta(CacheObjectBinaryProcessorImpl.java:194)
>       at org.apache.ignite.internal.binary.BinaryContext.updateMetadata(BinaryContext.java:1332)
>       at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1815)
>       at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1668)
>       at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.sendSingleRequest(GridNearAtomicAbstractUpdateFuture.java:299)
>       at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.map(GridNearAtomicSingleUpdateFuture.java:483)
>       at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.mapOnTopology(GridNearAtomicSingleUpdateFuture.java:443)
>       at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.map(GridNearAtomicAbstractUpdateFuture.java:248)
>       at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.update0(GridDhtAtomicCache.java:1150)
>       at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.invoke0(GridDhtAtomicCache.java:831)
>       at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.invoke(GridDhtAtomicCache.java:787)
>       at org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.invoke(IgniteCacheProxyImpl.java:1438)
>       at org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.invoke(IgniteCacheProxyImpl.java:1482)
>       at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.invoke(GatewayProtectedCacheProxy.java:1228)
>       at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryOrderingEventTest$1.run(CacheContinuousQueryOrderingEventTest.java:373)
>       at org.apache.ignite.testframework.GridTestUtils$7.call(GridTestUtils.java:1300)
>       at org.apache.ignite.testframework.GridTestThread.run(GridTestThread.java:84){code}
> It may be caused by the absence of locks in 
> GridCacheMapEntry#touch(GridCacheMapEntry.java:5063).
> It seems that the test stopped working after MVCC was integrated into Continuous Query.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
