[ 
https://issues.apache.org/jira/browse/IGNITE-28662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Petrov updated IGNITE-28662:
------------------------------------
    Attachment: reproducer.patch

> Node stop may be infinitely blocked by atomic cache operations invoked from 
> thin client
> ---------------------------------------------------------------------------------------
>
>                 Key: IGNITE-28662
>                 URL: https://issues.apache.org/jira/browse/IGNITE-28662
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Mikhail Petrov
>            Priority: Major
>         Attachments: reproducer.patch
>
>
> Consider a cluster of 3 nodes: node0, node1, and node2.
> 1. node1 and node2 each receive a put request from a thin client. node1 is
> the "primary" for the cache keys received by node2, and node2 is the
> "primary" for the cache keys received by node1. Both nodes start executing
> the operations and wait for them to complete.
> 2. node1 and node2 receive a stop signal (Ignite#close). The stop procedure
> on both nodes blocks in GridNioAsyncNotifyFilter#stop, which waits for the
> thin client operations to complete.
> 3. node1 and node2 fail to process the cache request for some reason (a cache
> interceptor raised an exception).
> 4. node1 and node2 do not send GridNearAtomicUpdateResponse with the failed
> keys to each other because they are both stopping (see
> GridCacheIoManager#onSend). This message tells the "near" node that some
> keys could not be processed and that the operation should be terminated
> with an exception.
> 5. node1 and node2 are unable to complete the cache operations received from
> the thin client, because neither of them will ever receive
> GridNearAtomicUpdateResponse or a NODE_LEFT event for the primary node. As a
> result, they are unable to complete the stop procedure.
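
The mutual wait in steps 2-5 can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Ignite code: Node, sendFailureResponse, awaitClientOps and stopCompletes are hypothetical names, the dropWhileStopping flag stands in for the behaviour described for GridCacheIoManager#onSend, and a timeout replaces the unbounded wait so that the example terminates.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Toy model of the mutual wait; none of these types are Ignite classes. */
public class StopDeadlockSketch {
    /** Hypothetical stand-in for a server node. */
    static final class Node {
        final String name;
        volatile boolean stopping;

        /** Completed once the peer (the "primary") answers this node's pending update. */
        final CompletableFuture<String> pendingClientOp = new CompletableFuture<>();

        Node(String name) {
            this.name = name;
        }

        /** Primary-side failure reply; skipped on a stopping node if dropWhileStopping is set. */
        void sendFailureResponse(Node near, boolean dropWhileStopping) {
            if (stopping && dropWhileStopping)
                return; // Models the response not being sent while the node is stopping.

            near.pendingClientOp.complete("failed keys reported by " + name);
        }

        /** Models the stop step that waits for in-flight thin client operations. */
        boolean awaitClientOps(long timeoutMs) {
            try {
                pendingClientOp.get(timeoutMs, TimeUnit.MILLISECONDS);

                return true;
            }
            catch (TimeoutException e) {
                return false; // In the reported scenario this wait has no timeout.
            }
            catch (InterruptedException e) {
                Thread.currentThread().interrupt();

                return false;
            }
            catch (ExecutionException e) {
                return true; // The operation finished with an error, but it did finish.
            }
        }
    }

    /** Runs steps 2-5 and reports whether both nodes get past the wait. */
    static boolean stopCompletes(boolean dropWhileStopping, long timeoutMs) {
        Node node1 = new Node("node1");
        Node node2 = new Node("node2");

        // Step 2: both nodes start stopping while a client operation is pending on each.
        node1.stopping = true;
        node2.stopping = true;

        // Steps 3-4: each node, acting as primary for the other, tries to report the failure.
        node1.sendFailureResponse(node2, dropWhileStopping);
        node2.sendFailureResponse(node1, dropWhileStopping);

        // Step 5: the stop procedure waits for the pending client operations.
        return node1.awaitClientOps(timeoutMs) && node2.awaitClientOps(timeoutMs);
    }

    public static void main(String[] args) {
        System.out.println("responses dropped   -> stop completes: " + stopCompletes(true, 200));
        System.out.println("responses delivered -> stop completes: " + stopCompletes(false, 200));
    }
}
```

In this model stopCompletes(true, ...) returns false and stopCompletes(false, ...) returns true, which mirrors the description: the wait can only finish if the failure response (or, in a real cluster, a NODE_LEFT event) reaches the "near" node.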



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
