Timothee Maret created SLING-8531:
-------------------------------------
Summary: Support JournalAvailabilityChecker exponential backoff
Key: SLING-8531
URL: https://issues.apache.org/jira/browse/SLING-8531
Project: Sling
Issue Type: Improvement
Components: Content Distribution
Affects Versions: Content Distribution Journal Core 0.1.2
Reporter: Timothee Maret
Assignee: Timothee Maret
Fix For: Content Distribution Journal Core 0.1.4
The load generated by JournalAvailabilityChecker adds up quickly in
multi-tenant deployments. The checker can be configured (via Sling Scheduler
{{scheduler.period}}) to poll less frequently, but doing so also makes it
slower to detect availability changes.
To reduce the load without permanently slowing the detection of availability
changes, we should support an exponential backoff algorithm. The algorithm
would halve the polling rate (down to a limit) every time the availability
status does not change, and reset the rate when the status changes.
Steady states (available or unavailable) would eventually yield the least load.
In the average case (availability status is steady) the load will be reduced up
to the limit. In the worst case (availability changes all the time) the load
will not be reduced compared to today.
The base period would be Sling Scheduler {{scheduler.period}}. The period at
time t + 1 would be computed as follows: Period~t+1~ = Multiplier~t+1~ *
{{scheduler.period}}. The table below summarises how the multiplier would
evolve according to the availability status change.
||State~t~||State~t+1~||Multiplier~t+1~||
|unavailable|unavailable|min(2 * Multiplier~t~, limit)|
|unavailable|available|1|
|available|unavailable|1|
|available|available|min(2 * Multiplier~t~, limit)|
The limit would be hardcoded to 16, which would reduce the steady-state load
by roughly an order of magnitude. We could expose the limit as a configuration
later if needed.
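The multiplier rules in the table above could be sketched roughly as follows. This is only an illustration, not the actual implementation: the class name {{BackoffMultiplier}} and its methods are hypothetical, the limit is treated as an upper bound (hence {{min}}), and the very first check is assumed to start with a multiplier of 1.

```java
/**
 * Hypothetical helper (not part of the Sling code base): tracks the
 * multiplier applied to the base scheduler period.
 */
final class BackoffMultiplier {

    private final int limit;        // upper bound for the multiplier, e.g. 16
    private int multiplier = 1;     // current multiplier, always in [1, limit]
    private Boolean lastAvailable;  // null until the first check has run

    BackoffMultiplier(int limit) {
        if (limit < 1) {
            throw new IllegalArgumentException("limit must be >= 1");
        }
        this.limit = limit;
    }

    /** Records the latest check result and returns the updated multiplier. */
    int update(boolean available) {
        if (lastAvailable != null && lastAvailable == available) {
            // Status unchanged: double the multiplier, capped at the limit.
            multiplier = Math.min(2 * multiplier, limit);
        } else {
            // First check or status changed: fall back to the base period.
            multiplier = 1;
        }
        lastAvailable = available;
        return multiplier;
    }

    /** Period to wait before the next check, given the base period. */
    long nextPeriodMillis(long basePeriodMillis) {
        return multiplier * basePeriodMillis;
    }
}
```

With a limit of 16, a steady status would take the multiplier through 1, 2, 4, 8, 16 and then hold it at 16 until the status flips.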
There should be no need to randomise the multiplier for now, as the checkers
are expected to be started at random times. If we hit a scenario where the
checkers start at the same time, we could simply randomise the first scheduled
event.
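Should randomising the first scheduled event become necessary, one possible approach is to delay the first check by a random fraction of the base period. The sketch below is an assumption for illustration only; the class and method names are hypothetical, and how the delay would be passed to the scheduler is left open.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper (not part of the Sling code base). */
final class InitialDelay {

    private InitialDelay() {
    }

    /** Returns a uniformly distributed delay in [0, basePeriodMillis). */
    static long randomInitialDelayMillis(long basePeriodMillis) {
        if (basePeriodMillis <= 0) {
            throw new IllegalArgumentException("base period must be > 0");
        }
        return ThreadLocalRandom.current().nextLong(basePeriodMillis);
    }
}
```

Spreading the first event over one base period would keep checkers that start together from polling in lockstep, without touching the multiplier logic.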