Nick, *TcpDiscoveryMetricsUpdateMessage* messages (aka heartbeats) are processed in the discovery thread, not in the exchange thread. So if your *cancel()* method hangs, the exchange won't be able to proceed, but discovery will keep working fine. A hanging deactivation is not a pleasant thing, but I think it is a fair consequence of a badly implemented *cancel()* method.

Denis

Tue, May 15, 2018 at 20:24, npordash <[email protected]>:

> Hey Denis,
>
> I agree with your assessment. I think it's unexpected that the cluster
> transitions to inactive state prior to services being given a chance to
> shut down gracefully.
>
> However, to protect the overall cluster health I think we do need some kind
> of protection against errant services with cancel methods that may block
> for an unacceptable period of time. What we probably don't want to have
> happen is that the exchange thread blocks for so long while trying to shut
> down services that it can't handle heartbeats within the configured failure
> detection timeout and triggers a node failure.
>
> Perhaps during deactivation the cancel method can be invoked in the service
> deployment thread pool and the exchange thread can wait for that task to
> complete until some timeout is reached before it gives up. Between each
> service cancellation it would need to prioritize heartbeat handling to
> prevent node failures.
>
> WDYT?
>
> -Nick
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
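
For illustration only, the idea quoted above (invoke cancel in a separate pool and wait for it with a timeout) could be sketched roughly as follows. This is plain `java.util.concurrent` code, not how Ignite actually implements deactivation; the class `BoundedCancel`, the method `cancelWithTimeout`, and the timeout parameter are hypothetical names chosen for this sketch.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper: runs a cancel action off the calling thread with a bounded wait. */
final class BoundedCancel {
    private BoundedCancel() {
    }

    /**
     * Submits cancelAction to the given pool and waits at most timeoutMs for it.
     *
     * @return true if the action finished in time, false if it timed out or threw.
     */
    static boolean cancelWithTimeout(ExecutorService pool, Runnable cancelAction, long timeoutMs) {
        Future<?> fut = pool.submit(cancelAction);
        try {
            fut.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // Give up waiting; interrupt the worker so a cooperative cancel() can exit.
            fut.cancel(true);
            return false;
        } catch (ExecutionException e) {
            // The cancel action itself threw; treat it as a failed cancellation.
            return false;
        } catch (InterruptedException e) {
            // Preserve the caller's interrupt status and stop waiting.
            Thread.currentThread().interrupt();
            fut.cancel(true);
            return false;
        }
    }
}
```

A caller would pass something like `() -> service.cancel(ctx)` as the action and log or skip the service when the method returns `false`. Note that interrupting the worker only helps if the cancel code responds to interruption; a method that ignores it will keep occupying a pool thread.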
