[
https://issues.apache.org/jira/browse/SLING-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13899228#comment-13899228
]
Stefan Egli commented on SLING-3382:
------------------------------------
To let the topology still react in a reasonably timely manner, discovery.impl
could hook into the shutdown procedure and send out connector disconnects, so
that a normal shutdown doesn't leave that instance in the topology for too long.
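
A minimal sketch of that shutdown idea, under stated assumptions: `Connector`,
`ConnectorRegistry` and `disconnectAll` are hypothetical names for illustration
and not part of the discovery.impl API. It only shows the best-effort pattern
of notifying every peer once before the instance goes away.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical stand-in for an outgoing topology connector. */
interface Connector {
    /** Tells the remote side that this instance is leaving. */
    void sendDisconnect();
}

/** Hypothetical registry that keeps track of the active connectors. */
final class ConnectorRegistry {
    private final List<Connector> connectors = new CopyOnWriteArrayList<>();

    void register(Connector connector) {
        connectors.add(connector);
    }

    /**
     * Assumed to be called from the shutdown path (for example from a
     * component's deactivate method). Returns how many peers were notified.
     */
    int disconnectAll() {
        int notified = 0;
        for (Connector connector : connectors) {
            try {
                connector.sendDisconnect();
                notified++;
            } catch (RuntimeException e) {
                // Best effort: an unreachable peer must not block shutdown.
                // That peer falls back to the normal heartbeat timeout.
            }
        }
        connectors.clear();
        return notified;
    }
}
```

Peers that cannot be reached are simply skipped, so they would still detect the
departure through the regular timeout.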
> introduce back-off strategy for topology connector frequency
> ------------------------------------------------------------
>
> Key: SLING-3382
> URL: https://issues.apache.org/jira/browse/SLING-3382
> Project: Sling
> Issue Type: Improvement
> Components: Extensions
> Affects Versions: Discovery Impl 1.0.2
> Reporter: Stefan Egli
> Assignee: Stefan Egli
>
> Currently topology heartbeats are sent every 15 or 30 sec, which might seem a
> lot – especially as they were way too chatty (which is fixed now with
> SLING-3377). The suggestion by [~fmeschbe] is to lower this heartbeat
> frequency.
> The main reason for having a high heartbeat frequency is quicker failure
> detection – but it's obviously a trade-off as it increases load.
> Here's a proposal for how to tackle this:
> * introduce two different sets of heartbeats, one for repository and one for
> connectors
> * the repository ones would remain at the current frequency (suggested
> default: 30sec interval, 60sec timeout). The idea is that we would want to
> detect crashes within a cluster rather quickly, more quickly than in the
> topology in general.
> * the connectors would get a back-off behavior, where initially the values
> are the same (30sec/60sec) but then they send out less frequent heartbeats
> over time, reaching a max (e.g. 5min). This would have to be controlled by the
> receiving side, i.e. both sides of the connector have to agree on the same
> interval and timeout.
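
The connector back-off described in the last bullet could be sketched roughly
as follows. This is an illustration only: the class `ConnectorBackoff` and its
methods are hypothetical, doubling the interval on each step is an assumption
(the proposal only says heartbeats become less frequent over time), and keeping
the timeout at twice the interval is inferred from the 30sec/60sec pair.

```java
import java.time.Duration;

/** Hypothetical helper that tracks the current connector heartbeat interval. */
final class ConnectorBackoff {
    private final Duration initialInterval; // 30sec in the proposal
    private final Duration maxInterval;     // e.g. 5min in the proposal
    private Duration current;

    ConnectorBackoff(Duration initialInterval, Duration maxInterval) {
        this.initialInterval = initialInterval;
        this.maxInterval = maxInterval;
        this.current = initialInterval;
    }

    /** Interval both sides of the connector should currently use. */
    Duration interval() {
        return current;
    }

    /** Assumption: timeout stays at twice the interval (30sec/60sec ratio). */
    Duration timeout() {
        return current.multipliedBy(2);
    }

    /** After a successful heartbeat: double the interval, capped at the max. */
    Duration onSuccessfulHeartbeat() {
        Duration doubled = current.multipliedBy(2);
        current = doubled.compareTo(maxInterval) > 0 ? maxInterval : doubled;
        return current;
    }

    /** Assumption: a failure or topology change restarts at the initial value. */
    void reset() {
        current = initialInterval;
    }
}
```

With a 30sec start and a 5min cap, the interval would go 30, 60, 120, 240 and
then stay at 300 seconds. Since the receiving side is meant to control the
values, one option would be for it to own this state and return `interval()`
and `timeout()` in each heartbeat response so that the sender adopts them.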