[
https://issues.apache.org/jira/browse/SLING-8531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Timothee Maret closed SLING-8531.
---------------------------------
> Support JournalAvailabilityChecker exponential backoff
> -------------------------------------------------------
>
> Key: SLING-8531
> URL: https://issues.apache.org/jira/browse/SLING-8531
> Project: Sling
> Issue Type: Improvement
> Components: Content Distribution
> Affects Versions: Content Distribution Journal Core 0.1.2
> Reporter: Timothee Maret
> Assignee: Christian Schneider
> Priority: Major
> Fix For: Content Distribution Journal Core 0.1.4, Content
> Distribution Journal Kafka 0.1.4, Content Distribution Journal Messages 0.1.2
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> The average load generated by JournalAvailabilityChecker multiplies quickly
> for multi-tenant deployments. The checker can be configured (via Sling
> Scheduler {{scheduler.period}}) to reduce the polling frequency, but doing so
> also reduces the sensitivity to availability changes.
> To improve the sensitivity we should support an exponential backoff
> algorithm. The algorithm would divide the rate by two (up to a limit) every
> time the availability status does not change and reset the rate when the
> status changes. Steady states (available or unavailable) would eventually
> yield the least load. In the average case (availability status is steady) the
> load will be reduced by a factor of up to the limit. In the worst case
> (availability changes all the time) the load will not be reduced compared to
> today.
> The base period would be the Sling Scheduler {{scheduler.period}}. The period
> at time t + 1 would be computed as follows:
> Period~t+1~ = Multiplier~t+1~ * {{scheduler.period}}.
> The table below summarises how the multiplier would evolve according to the
> availability status change.
> ||State~t~||State~t+1~||Multiplier~t+1~||
> |unavailable|unavailable|min(2 * Multiplier~t~, limit)|
> |unavailable|available|1|
> |available|unavailable|1|
> |available|available|min(2 * Multiplier~t~, limit)|
> The limit would be hardcoded to 16, which would reduce the load by an order of
> magnitude. We could expose the limit as a configuration later if needed.
> There should be no need to randomise the multiplier for now as the checkers
> are expected to be started at random times. If we hit a scenario where the
> checkers start at the same time, we could simply randomise the first
> scheduled event.
>
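The multiplier rule from the description can be sketched as a small state holder. This is an illustration only, not the code that was committed for this issue: the class name BackoffMultiplier and its methods are hypothetical, the limit is read as an upper bound on the multiplier (a min cap), and the first observation is assumed to count as a status change so that checking starts at the base period.

```java
/**
 * Hypothetical helper illustrating the multiplier table; not part of Sling.
 * The multiplier doubles while the availability status is steady, is capped
 * at the limit, and resets to 1 whenever the status changes.
 */
final class BackoffMultiplier {

    private final int limit;
    private int multiplier = 1;
    // null until the first observation (assumption: the first check counts as a change)
    private Boolean lastAvailable;

    BackoffMultiplier(int limit) {
        if (limit < 1) {
            throw new IllegalArgumentException("limit must be >= 1");
        }
        this.limit = limit;
    }

    /** Records the latest availability status and returns the multiplier for the next check. */
    int update(boolean available) {
        if (lastAvailable != null && lastAvailable == available) {
            // Steady state: back off, but never beyond the limit.
            multiplier = Math.min(2 * multiplier, limit);
        } else {
            // Status changed (or first observation): return to the base period.
            multiplier = 1;
        }
        lastAvailable = available;
        return multiplier;
    }

    /** Delay before the next check, given the base period (e.g. scheduler.period). */
    long nextPeriod(long basePeriod) {
        return multiplier * basePeriod;
    }
}
```

With a limit of 16, a checker that keeps observing the same status reaches 16 times the base period after four unchanged checks following a reset and stays there until the status flips.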
--
This message was sent by Atlassian Jira
(v8.3.4#803005)