[ 
https://issues.apache.org/jira/browse/IGNITE-20828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Puchkovskiy updated IGNITE-20828:
---------------------------------------
    Summary: Do not retry attempts to unsubscribe in 
TopologyAwareRaftGroupService  (was: Do not retry attempts to (un)subscribe in 
TopologyAwareRaftGroupService)

> Do not retry attempts to unsubscribe in TopologyAwareRaftGroupService
> ---------------------------------------------------------------------
>
>                 Key: IGNITE-20828
>                 URL: https://issues.apache.org/jira/browse/IGNITE-20828
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Roman Puchkovskiy
>            Assignee: Roman Puchkovskiy
>            Priority: Major
>              Labels: ignite-3
>             Fix For: 3.0.0-beta2
>
>
> When TopologyAwareRaftGroupService is shut down, it tries to unsubscribe 
> itself from all peers. If the unsubscription fails, it tries to get the 
> logical topology (by calling the CMG leader via RAFT), checks that the target 
> node is still in the topology and, if so, retries the unsubscription request. 
> So, if the CMG leader has already left the topology, an attempt to check the 
> logical topology will take 10 seconds. This makes the partition stop in 
> TableManager time out (as it has a limit of 10 seconds), which in turn results 
> in a partition group staying registered with Loza even after 
> TableManager#stop() returns, which causes Loza#stop() to fail the Ignite node 
> stop procedure (leaving HTTP(S) ports bound).
> It seems that it makes no sense to retry unsubscription requests at all. 
> Moreover, subscription requests should not be retried either (instead, the 
> exception should be propagated right away). The difference between the two 
> scenarios should be that for unsubscription an exception should never be 
> propagated (unless it is an Error).
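
The proposed behaviour can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual TopologyAwareRaftGroupService code: the class NoRetrySketch, its two static methods and the Supplier-based request parameter are all hypothetical names. Each request is attempted exactly once; a failed subscription is propagated as is, while a failed unsubscription is swallowed unless the cause is an Error.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.function.Supplier;

/** Hypothetical sketch; names and signatures are assumptions, not Ignite APIs. */
final class NoRetrySketch {
    /** Single subscription attempt: any failure reaches the caller right away. */
    static CompletableFuture<Void> subscribe(Supplier<CompletableFuture<Void>> request) {
        return request.get();
    }

    /** Single unsubscription attempt: exceptions are swallowed, Errors still propagate. */
    static CompletableFuture<Void> unsubscribe(Supplier<CompletableFuture<Void>> request) {
        return request.get().<Void>handle((res, err) -> {
            Throwable cause = unwrap(err);
            if (cause instanceof Error) {
                // Rethrowing completes the returned future exceptionally.
                throw (Error) cause;
            }
            // Success or an ordinary exception: complete normally, no retry.
            return null;
        });
    }

    /** Strips the CompletionException wrapper that dependent stages may add. */
    private static Throwable unwrap(Throwable err) {
        if (err instanceof CompletionException && err.getCause() != null) {
            return err.getCause();
        }
        return err;
    }

    public static void main(String[] args) {
        Supplier<CompletableFuture<Void>> failing =
                () -> CompletableFuture.failedFuture(new RuntimeException("peer unreachable"));

        System.out.println(subscribe(failing).isCompletedExceptionally());   // true
        System.out.println(unsubscribe(failing).isCompletedExceptionally()); // false
    }
}
```

Because neither path consults the logical topology, a stop sequence built on this shape would not wait on an unreachable CMG leader.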



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
