Re: Service isn't initialized on cluster re-activation

Denis Mekhanikov Thu, 17 May 2018 04:25:49 -0700

Nick,

*TcpDiscoveryMetricsUpdateMessage* messages (aka heartbeats) are processed in the
discovery thread, not in the exchange thread.
So if your *cancel()* method hangs, the exchange won't be able to
proceed, but discovery will keep working fine.
A hanging deactivation is not a pleasant thing, but I think it is a fair
consequence of a badly implemented *cancel()* method.
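
To illustrate what a *cancel()* that does not hang could look like, here is a minimal sketch in plain Java. It only mirrors the execute/cancel shape of a service and does not implement Ignite's Service interface; the class name CooperativeService and the 50 ms polling interval are assumptions made for the example.

```java
// Hypothetical stand-in for a service; NOT Ignite's Service interface.
final class CooperativeService {
    // Set by cancel(), polled by the worker loop.
    private volatile boolean cancelled;

    // Thread currently running execute(), if any.
    private volatile Thread worker;

    /** Long-running body: loops until cancellation is requested. */
    void execute() {
        worker = Thread.currentThread();
        while (!cancelled) {
            try {
                Thread.sleep(50); // placeholder for real work
            } catch (InterruptedException e) {
                break; // interrupted by cancel(): leave the loop
            }
        }
    }

    /** Only signals the worker and returns; it never waits for it. */
    void cancel() {
        cancelled = true;
        Thread t = worker;
        if (t != null) {
            t.interrupt(); // wake the worker if it is sleeping or blocked
        }
    }
}
```

The point of the sketch is that cancel() only sets a flag and interrupts the worker, so whichever thread calls it is not held up by the service's own work.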


Denis

Tue, 15 May 2018 at 20:24, npordash <[email protected]>:

> Hey Denis,
>
> I agree with your assessment. I think it's unexpected that the cluster
> transitions to the inactive state before services are given a chance to
> shut down gracefully.
>
> However, to protect the overall cluster health I think we do need some kind
> of protection against errant services with cancel methods that may block for
> an unacceptable period of time. What we probably don't want to have happen
> is that the exchange thread blocks for so long while trying to shut down
> services that it can't handle heartbeats within the configured failure
> detection timeout and triggers a node failure.
>
> Perhaps during deactivation the cancel method can be invoked in the service
> deployment thread pool and the exchange thread can wait for that task to
> complete until some timeout is reached before it gives up. Between each
> service cancellation it would need to prioritize heartbeat handling to
> prevent node failures.
>
> WDYT?
>
> -Nick
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>
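
The bounded wait suggested in the quoted message could be sketched roughly as follows. This is plain java.util.concurrent code, not Ignite's implementation; BoundedCancel, cancelWithTimeout and the timeout parameter are hypothetical names, and the pool merely stands in for the service deployment thread pool.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Ignite.
final class BoundedCancel {
    /**
     * Runs the cancel callback on the given pool and waits at most timeoutMs.
     * Returns true if the callback finished (normally or with an exception),
     * false if the wait timed out or the calling thread was interrupted.
     */
    static boolean cancelWithTimeout(Runnable cancel, ExecutorService pool, long timeoutMs) {
        Future<?> fut = pool.submit(cancel);
        try {
            fut.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            fut.cancel(true); // give up: interrupt the callback and move on
            return false;
        } catch (ExecutionException e) {
            return true; // the callback threw, but it did finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            return false;
        }
    }
}
```

With something like this, the waiting thread is blocked for at most the timeout per service, although a callback that ignores interruption would still keep a pool thread busy.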
