[ 
https://issues.apache.org/jira/browse/SLING-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Egli resolved SLING-3382.
--------------------------------

       Resolution: Fixed
    Fix Version/s: Discovery Impl 1.0.4

implemented the following backoff behavior:
 * for connectors whose announcements are stable (i.e. identical) for a number 
of heartbeatIntervals, the servlet instructs the client to use a 
'backoffInterval', which is increased up to a maximum.
 * the maximum backoffInterval is configurable, currently set to 5 times the 
heartbeatInterval (the config parameter is relative, i.e. a multiple of the 
heartbeatInterval)
 * whenever the client sends a different announcement (i.e. anything changes in 
the topology from its point of view), the backoffInterval is reset
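
The behavior above can be sketched roughly as follows. This is only an 
illustration, not the Discovery Impl code: the class name ConnectorBackoff, 
the method onAnnouncement, the stableThreshold parameter and the linear 
step of one heartbeatInterval per stable announcement are all assumptions 
made for the example.

```java
import java.util.Objects;

/** Hypothetical per-connector backoff state kept on the receiving (servlet) side. */
final class ConnectorBackoff {
    private final long heartbeatIntervalSecs;
    private final int stableThreshold;   // assumed: identical announcements needed before backing off
    private final int maxBackoffFactor;  // relative maximum, e.g. 5 x heartbeatInterval

    private String lastAnnouncement;
    private int stableCount;
    private long currentIntervalSecs;

    ConnectorBackoff(long heartbeatIntervalSecs, int stableThreshold, int maxBackoffFactor) {
        this.heartbeatIntervalSecs = heartbeatIntervalSecs;
        this.stableThreshold = stableThreshold;
        this.maxBackoffFactor = maxBackoffFactor;
        this.currentIntervalSecs = heartbeatIntervalSecs;
    }

    /** Returns the interval (in seconds) the client is told to use for its next heartbeat. */
    long onAnnouncement(String announcement) {
        if (!Objects.equals(announcement, lastAnnouncement)) {
            // Something changed from the client's point of view: reset the backoff.
            lastAnnouncement = announcement;
            stableCount = 0;
            currentIntervalSecs = heartbeatIntervalSecs;
            return currentIntervalSecs;
        }
        stableCount++;
        if (stableCount >= stableThreshold) {
            // Assumed growth rule: one extra heartbeatInterval per stable announcement, capped.
            long max = heartbeatIntervalSecs * maxBackoffFactor;
            currentIntervalSecs = Math.min(currentIntervalSecs + heartbeatIntervalSecs, max);
        }
        return currentIntervalSecs;
    }
}
```

With a 30sec heartbeatInterval and a factor of 5, the interval in this sketch 
never exceeds 150sec, and any changed announcement brings it back to 30sec.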

> introduce back-off strategy for topology connector frequency
> ------------------------------------------------------------
>
>                 Key: SLING-3382
>                 URL: https://issues.apache.org/jira/browse/SLING-3382
>             Project: Sling
>          Issue Type: Improvement
>          Components: Extensions
>    Affects Versions: Discovery Impl 1.0.2
>            Reporter: Stefan Egli
>            Assignee: Stefan Egli
>             Fix For: Discovery Impl 1.0.4
>
>
> Currently topology heartbeats are sent every 15 or 30 sec, which might seem a 
> lot – especially as they were way too chatty (which is fixed now with 
> SLING-3377). The suggestion by [~fmeschbe] is to lower this heartbeat 
> frequency.
> The main reason for having a high heartbeat frequency is quicker failure 
> detection – but it's obviously a trade-off as it increases load.
> Here's a proposal for how to tackle this:
>  * introduce two different sets of heartbeats, one for repository and one for 
> connectors
>  * the repository ones would remain at the current frequency (suggested 
> default: 30sec interval, 60sec timeout). The idea is that we would want to 
> detect crashes within a cluster rather quickly, more quickly than in the 
> topology in general.
>  * the connectors would get a back-off behavior, where initially the values 
> are the same (30sec/60sec) but then they send out less frequent heartbeats 
> over time, reaching a max (eg 5min). This would have to be controlled by the 
> receiving side, ie both sides of the connector have to agree that interval 
> and timeout are the same.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
