Sounds like a good strategy to me

+1

Carsten


2014-02-07 11:38 GMT+01:00 Stefan Egli <[email protected]>:

> Hi,
>
> During an offline discussion, Felix brought up the suggestion to lower the
> topology connector's heartbeat frequency. Currently heartbeats are sent
> every 15 or 30 sec, which might seem like a lot - especially as they were
> way too chatty (which is now fixed with SLING-3377).
>
> The main reason for having a high heartbeat frequency is quicker failure
> detection - but it's obviously a trade-off as it increases load.
>
> I would like to get some opinions on the following proposal:
>
>   *   introduce two different sets of heartbeats, one for repository and
> one for connectors
>   *   the repository ones would remain at the current frequency (suggested
> default: 30sec interval, 60sec timeout). The idea is that we would want to
> detect crashes within a cluster rather quickly, more quickly than in the
> topology in general.
>   *   the connectors would get a back-off behavior, where initially the
> values are the same (30sec/60sec) but then they send out less frequent
> heartbeats over time, reaching a max (e.g. 5min). This would have to be
> controlled by the receiving side, i.e. both sides of the connector have to
> agree on the same interval and timeout.
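
To make the connector back-off concrete, here is a minimal sketch of how the interval could grow. It is not taken from the Sling code base: the class name HeartbeatBackoff, the doubling step, and the timeout of twice the interval (mirroring the 30sec/60sec ratio) are all assumptions for illustration. Only the 30sec start and the 5min cap come from the proposal.

```java
/**
 * Hypothetical sketch of a connector heartbeat back-off.
 * Assumptions (not from the proposal): the interval doubles after each
 * successful heartbeat, and the timeout is always twice the interval.
 */
public final class HeartbeatBackoff {

    private final long initialIntervalSec;
    private final long maxIntervalSec;
    private long currentIntervalSec;

    public HeartbeatBackoff(long initialIntervalSec, long maxIntervalSec) {
        if (initialIntervalSec <= 0 || maxIntervalSec < initialIntervalSec) {
            throw new IllegalArgumentException("need 0 < initial <= max");
        }
        this.initialIntervalSec = initialIntervalSec;
        this.maxIntervalSec = maxIntervalSec;
        this.currentIntervalSec = initialIntervalSec;
    }

    /** Interval both sides of the connector currently agree on. */
    public long intervalSec() {
        return currentIntervalSec;
    }

    /** Timeout derived from the interval (assumed ratio of 2, as in 30sec/60sec). */
    public long timeoutSec() {
        return 2 * currentIntervalSec;
    }

    /** Called by the receiving side after a successful heartbeat: back off, capped at the max. */
    public void onHeartbeatSuccess() {
        currentIntervalSec = Math.min(currentIntervalSec * 2, maxIntervalSec);
    }

    /** Called when a heartbeat is missed or the connector reconnects: start over. */
    public void reset() {
        currentIntervalSec = initialIntervalSec;
    }
}
```

With new HeartbeatBackoff(30, 300), successive successful heartbeats would give intervals of 30, 60, 120, 240 and then 300 seconds. Since the receiving side is meant to control the values, it would presumably hold this state and announce the current interval and timeout to the sender in its response.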
>
> I've opened a Jira to track this, please comment there:
>
> https://issues.apache.org/jira/browse/SLING-3382
>
> Thanks,
> Cheers,
> Stefan
>



-- 
Carsten Ziegeler
[email protected]
