[ 
https://issues.apache.org/jira/browse/SLING-8531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothee Maret updated SLING-8531:
----------------------------------
    Description: 
The average load generated by JournalAvailabilityChecker multiplies quickly for 
multi-tenant deployments. The checker can be configured (via Sling Scheduler 
{{scheduler.period}}) to reduce the polling frequency, but doing so also reduces 
the sensitivity to availability changes.

To reduce the load without giving up sensitivity, we should support an 
exponential backoff algorithm. The algorithm would divide the polling rate by 
two (up to a limit) every time the availability status does not change and 
reset the rate when the status changes. Steady states (available or 
unavailable) would eventually yield the least load. In the average case 
(availability status is steady) the load would be reduced by up to the limit 
factor. In the worst case (availability changes all the time) the load would 
not be reduced compared to today.

The base polling period would be Sling Scheduler {{scheduler.period}}. The 
polling period at time t + 1 would be computed as follows: Period~t+1~ = 
Multiplier~t+1~ * {{scheduler.period}} (doubling the period halves the rate). 
The table below summarises how the multiplier would evolve according to the 
availability status change.
||State~t~||State~t+1~||Multiplier~t+1~||
|unavailable|unavailable|min(2 * Multiplier~t~, limit)|
|unavailable|available|1|
|available|unavailable|1|
|available|available|min(2 * Multiplier~t~, limit)|

The limit would be hardcoded to 16, which would reduce the steady-state load by 
roughly an order of magnitude. We could expose the limit as a configuration 
later if needed.
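
The multiplier rule and the limit described above could be sketched as follows. This is only an illustration under stated assumptions: the class name {{AvailabilityBackoff}} and the method {{nextDelayMs}} are hypothetical and not part of the Sling code base, the base period is passed in milliseconds, and the very first check is treated as a status change (multiplier 1). How the returned delay would be applied to the Sling Scheduler is not shown.

```java
/**
 * Hypothetical sketch of the proposed backoff; not actual Sling code.
 * Tracks the last observed availability and derives the next polling delay.
 */
final class AvailabilityBackoff {

    /** Cap on the multiplier, hardcoded to 16 as proposed in this issue. */
    static final int LIMIT = 16;

    private final long basePeriodMs;  // assumed to mirror scheduler.period, in ms
    private int multiplier = 1;
    private Boolean lastAvailable;    // null until the first check (assumption)

    AvailabilityBackoff(long basePeriodMs) {
        this.basePeriodMs = basePeriodMs;
    }

    /**
     * Records the outcome of a check and returns the delay before the next one.
     * Unchanged status: double the multiplier, capped at LIMIT.
     * Changed status (or first check): reset the multiplier to 1.
     */
    long nextDelayMs(boolean available) {
        boolean unchanged = lastAvailable != null && lastAvailable == available;
        multiplier = unchanged ? Math.min(2 * multiplier, LIMIT) : 1;
        lastAvailable = available;
        return multiplier * basePeriodMs;
    }
}
```

With a base period of 1000 ms and a steady status, successive calls would return 1000, 2000, 4000, 8000 and then stay at 16000 ms, i.e. one check per 16 base periods in the steady state. Any status flip brings the delay back to 1000 ms.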

There should be no need to randomise the multiplier for now as the checkers are 
expected to be started at random times. If we hit a scenario where the checkers 
start at the same time, we could simply randomise the first scheduled event.
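
Should randomising the first scheduled event become necessary, one possible approach is to draw the initial delay uniformly from one base period. The sketch below is an assumption, not an agreed design: {{InitialJitter}} and {{firstDelayMs}} are hypothetical names, and the base period is assumed to be a positive number of milliseconds.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical sketch; not actual Sling code. */
final class InitialJitter {

    /**
     * Returns a delay drawn uniformly from [0, basePeriodMs), so that checkers
     * started at the same instant spread their first check over one base period.
     * basePeriodMs must be positive.
     */
    static long firstDelayMs(long basePeriodMs) {
        return ThreadLocalRandom.current().nextLong(basePeriodMs);
    }
}
```

Only the first event is randomised here; subsequent delays would still follow the multiplier table above.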

 


> Support JournalAvailabilityChecker exponential backoff 
> -------------------------------------------------------
>
>                 Key: SLING-8531
>                 URL: https://issues.apache.org/jira/browse/SLING-8531
>             Project: Sling
>          Issue Type: Improvement
>          Components: Content Distribution
>    Affects Versions: Content Distribution Journal Core 0.1.2
>            Reporter: Timothee Maret
>            Assignee: Timothee Maret
>            Priority: Major
>             Fix For: Content Distribution Journal Core 0.1.4
>
>
> The average load generated by JournalAvailabilityChecker multiplies quickly 
> for multi-tenant deployments. The checker can be configured (via Sling 
> Scheduler {{scheduler.period}}) to reduce the polling frequency, but doing so 
> also reduces the sensitivity to availability changes.
> To reduce the load without giving up sensitivity, we should support an 
> exponential backoff algorithm. The algorithm would divide the polling rate by 
> two (up to a limit) every time the availability status does not change and 
> reset the rate when the status changes. Steady states (available or 
> unavailable) would eventually yield the least load. In the average case 
> (availability status is steady) the load would be reduced by up to the limit 
> factor. In the worst case (availability changes all the time) the load would 
> not be reduced compared to today.
> The base polling period would be Sling Scheduler {{scheduler.period}}. The 
> polling period at time t + 1 would be computed as follows: Period~t+1~ = 
> Multiplier~t+1~ * {{scheduler.period}} (doubling the period halves the rate). 
> The table below summarises how the multiplier would evolve according to the 
> availability status change.
> ||State~t~||State~t+1~||Multiplier~t+1~||
> |unavailable|unavailable|min(2 * Multiplier~t~, limit)|
> |unavailable|available|1|
> |available|unavailable|1|
> |available|available|min(2 * Multiplier~t~, limit)|
> The limit would be hardcoded to 16, which would reduce the steady-state load 
> by roughly an order of magnitude. We could expose the limit as a 
> configuration later if needed.
> There should be no need to randomise the multiplier for now as the checkers 
> are expected to be started at random times. If we hit a scenario where the 
> checkers start at the same time, we could simply randomise the first 
> scheduled event.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
