[DISCUSS] Replicator scheduler improvement

Nick Vatamaniuc Tue, 12 Jan 2021 09:55:50 -0800

Hello everyone

I wanted to see what we thought about adding a scheduling improvement
to the replicator: specifically, round-robin fairness amongst
different _replicator dbs.


Currently, the scheduler runs all the jobs in the system fairly. It
does so by using the jobs' "last started" timestamp to select which
jobs to stop and start next. Each scheduling cycle, the jobs which
ran the longest (have the lowest start time) get stopped. Then, from
the list of pending jobs, we pick those with the lowest start
times (since they waited the longest) to run next. However, this
algorithm can be unfair among replication jobs from different
_replicator dbs. For example, if max_jobs=10 and one _replicator db
has 1000 jobs and another has only one doc, then that one job might
have to wait for quite a while to get its turn to run. If we picked
fairly amongst _replicator dbs, then, for example, the one job would
get at least one of the 10 max_jobs run "slots", and the remaining 9
slots would go to the replicator db with 1000 docs. If there were
11 _replicator dbs, then docs from the ones with the lowest start
times amongst them would be picked first.
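
As a rough illustration of the idea, here is a small Python sketch. It
is not replicator code: the function name pick_round_robin, the
(db, job_id, last_started) tuple layout and the db names are all made
up for the example, and it only models picking pending jobs, not
stopping running ones.

```python
from collections import defaultdict, deque


def pick_round_robin(pending, slots):
    """Pick up to `slots` jobs, taking one job per _replicator db per round.

    `pending` is an iterable of (db, job_id, last_started) tuples
    (a hypothetical layout chosen for this sketch).
    """
    # Group jobs by db; within each db the longest-waiting job
    # (lowest last_started) ends up at the front of the queue.
    queues = defaultdict(deque)
    for job in sorted(pending, key=lambda j: j[2]):
        queues[job[0]].append(job)

    picked = []
    while len(picked) < slots and queues:
        # In each round, visit dbs in order of their oldest waiting job,
        # so with more dbs than slots the lowest start times win.
        for db in sorted(queues, key=lambda d: queues[d][0][2]):
            picked.append(queues[db].popleft())
            if not queues[db]:
                del queues[db]
            if len(picked) == slots:
                break
    return picked


# The example from above: max_jobs=10, one db with 1000 jobs and
# another db with a single job that has a later start time.
pending = [("main/_replicator", "job%d" % i, i) for i in range(1000)]
pending.append(("other/_replicator", "solo", 5000))
picked = pick_round_robin(pending, 10)
main_count = sum(1 for db, _, _ in picked if db == "main/_replicator")
solo_picked = any(job_id == "solo" for _, job_id, _ in picked)
```

Here the single job from the small db gets one of the 10 slots and the
other 9 go to the db with 1000 jobs, whereas picking purely by lowest
start time across all jobs would have left it waiting.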

This feature would also allow running some quick replication jobs
when there is already a full queue of main _replicator db jobs, by
creating a "quickjobs/_replicator" db and inserting these new jobs
there. With this new scheme they would start running right away even
if the queue is full with jobs from the main _replicator db. Another,
related idea I had was to add per-job user-configurable priorities:
high/default/low or numeric. However, that scheme is not as good, as it
could lead to permanent starvation of jobs, while the round-robin db
scheduling scheme still guarantees all jobs will eventually run.

Would this feature be useful to have? I was thinking of giving it a
try on a branch. I suspect implementing this for 3.x might be a tad
simpler, since scheduling is done in memory independently on each
separate node, so I was thinking of starting there first. For main
(future 4.x) we might require some coordination state to live in FDB
and would possibly have to patch up couch_jobs to know about
priorities.

Cheers,
-Nick
