This is an automated email from the ASF dual-hosted git repository.
vatamane pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/couchdb-documentation.git
The following commit(s) were added to refs/heads/main by this push:
new 1186c83 3.x fair share scheduler documetation (#629)
1186c83 is described below
commit 1186c837e36413cbba1e076815e039384afb4744
Author: Nick Vatamaniuc <[email protected]>
AuthorDate: Wed Mar 17 11:20:53 2021 -0400
3.x fair share scheduler documetation (#629)
A short description on how the algorithm works along with the
configuration sections.
Main PR: https://github.com/apache/couchdb/pull/3364
---
src/config/replicator.rst | 49 ++++++++++++++++++++++++++++++++++++++++++
src/replication/replicator.rst | 45 ++++++++++++++++++++++++++++++++++++++
2 files changed, 94 insertions(+)
diff --git a/src/config/replicator.rst b/src/config/replicator.rst
index 9e78b59..9e46026 100644
--- a/src/config/replicator.rst
+++ b/src/config/replicator.rst
@@ -249,3 +249,52 @@ Replicator Database Configuration
.. note::
In version 2.2, the session plugin is considered experimental and
is not enabled by default.
+
+ .. config:option:: usage_coeff
+
+ .. versionadded:: 3.2.0
+
+ Usage coefficient decays historic fair share usage every
+ scheduling cycle. The value must be between 0.0 and 1.0. Lower
+ values make historic usage decay more quickly, while higher
+ values make it persist for longer::
+
+ [replicator]
+ usage_coeff = 0.5
+
+ .. config:option:: priority_coeff
+
+ .. versionadded:: 3.2.0
+
+ Priority coefficient decays all the job priorities such that they slowly
+ drift towards the front of the run queue. This coefficient defines a maximum
+ time window over which this algorithm would operate. For example, if this
+ value is too small (0.1), after a few cycles quite a few jobs would end up at
+ priority 0, and would render this algorithm useless. The default value of
+ 0.98 is picked such that if a job ran for one scheduler cycle, then didn't
+ get to run for 7 hours, it would still have priority > 0. 7 hours was picked
+ as it was close enough to 8 hours which is the default maximum error backoff
+ interval::
+
+ [replicator]
+ priority_coeff = 0.98
+
+.. _config/replicator.shares:
+
+Fair Share Replicator Share Allocation
+======================================
+
+.. config:section:: replicator.shares :: Per-Database Fair Share Allocation
+
+ .. config:option:: $replicator_db
+
+ .. versionadded:: 3.2.0
+
+ Fair share configuration section. More shares result in a
+ higher chance that jobs from that db get to run. The default
+ value is 100, minimum is 1 and maximum is 1000. The
+ configuration may be set even if the database does not exist::
+
+ [replicator.shares]
+ _replicator_db = 100
+ $another/_replicator_db = 100
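The effect of the two coefficients above can be illustrated with a short sketch. This is hypothetical, not the CouchDB scheduler's source code; the one-cycle-per-minute interval and the function names are assumptions for illustration only:

```python
# Hypothetical sketch (not CouchDB source code) of how the two
# coefficients act on each scheduling cycle.
USAGE_COEFF = 0.5      # example [replicator] usage_coeff value
PRIORITY_COEFF = 0.98  # default [replicator] priority_coeff value

def decay_usage(usage):
    # Historic usage shrinks every cycle; 0.5 halves it each time,
    # so recent activity dominates the fair share accounting.
    return usage * USAGE_COEFF

def decay_priority(priority):
    # Every job's priority drifts toward 0, the front of the run queue.
    return priority * PRIORITY_COEFF

# Assuming (hypothetically) one scheduling cycle per minute, 7 hours is
# 420 cycles; 0.98 ** 420 is a tiny fraction but still greater than
# zero, which is why a job idle that long can keep a priority > 0.
remaining_fraction = PRIORITY_COEFF ** 420
```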
diff --git a/src/replication/replicator.rst b/src/replication/replicator.rst
index de53930..05a55e6 100644
--- a/src/replication/replicator.rst
+++ b/src/replication/replicator.rst
@@ -21,6 +21,11 @@ Replicator Database
anymore. There are new replication job states and new API endpoints
``_scheduler/jobs`` and ``_scheduler/docs``.
+.. versionchanged:: 3.2.0 Fair share scheduling was introduced. Multiple
+ ``_replicator`` databases get an equal chance (configurable) of running
+ their jobs. Previously replication jobs were scheduled without any regard for
+ their originating database.
+
The ``_replicator`` database works like any other in CouchDB, but
documents added to it will trigger replications. Create (``PUT`` or
``POST``) a document to start replication. ``DELETE`` a replication
@@ -539,6 +544,46 @@ After this operation, replication pulling from server X will be stopped
and the replications in the ``_replicator`` database (pulling from
servers A and B) will continue.
+Fair Share Job Scheduling
+=========================
+
+When multiple ``_replicator`` databases are used, and the total number
+of jobs on any node is greater than ``max_jobs``, replication jobs
+will be scheduled such that each of the ``_replicator`` databases by
+default gets an equal chance of running its jobs.
+
+This is accomplished by assigning a number of "shares" to each
+``_replicator`` database and then automatically adjusting the
+proportion of running jobs to match each database's proportion of
+shares. By default, each ``_replicator`` database is assigned 100
+shares. It is possible to alter the share assignments for each
+individual ``_replicator`` database in the :ref:`[replicator.shares]
+<config/replicator.shares>` configuration section.
+
+The fair share behavior is perhaps easiest to describe with a set of
+examples. Each example assumes the default of ``max_jobs = 500``, and
+two replicator databases: ``_replicator`` and ``another/_replicator``.
+
+Example 1: If ``_replicator`` has 1000 jobs and
+``another/_replicator`` has 10, the scheduler will run about 490 jobs
+from ``_replicator`` and 10 jobs from ``another/_replicator``.
+
+Example 2: If ``_replicator`` has 200 jobs and ``another/_replicator``
+also has 200 jobs, all 400 jobs will get to run as the sum of all the
+jobs is less than the ``max_jobs`` limit.
+
+Example 3: If both replicator databases have 1000 jobs each, the
+scheduler will run about 250 jobs from each database on average.
+
+Example 4: If both replicator databases have 1000 jobs each, but
+``_replicator`` was assigned 400 shares, then on average the scheduler
+would run about 400 jobs from ``_replicator`` and 100 jobs from
+``another/_replicator``.
+
+The proportions described in the examples are approximate: they may
+oscillate somewhat, and may take anywhere from tens of minutes to an
+hour to converge.
+
Replicating the replicator database
===================================
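The proportional allocation walked through in Examples 1 through 4 above can be sketched in Python. This is a simplified model, not the CouchDB scheduler's actual implementation; the `fair_share` function and its signature are invented for illustration:

```python
def fair_share(jobs, shares, max_jobs=500):
    """Hypothetical model: split max_jobs across replicator databases
    in proportion to their shares (default 100), capped by each
    database's pending job count. Unused capacity from databases with
    few jobs is redistributed to the others."""
    # If everything fits under the limit, all jobs get to run.
    if sum(jobs.values()) <= max_jobs:
        return dict(jobs)
    running = {db: 0 for db in jobs}
    capacity = max_jobs
    while capacity > 0:
        # Databases that still have unscheduled jobs.
        active = [db for db in jobs if running[db] < jobs[db]]
        if not active:
            break
        total = sum(shares.get(db, 100) for db in active)
        granted = 0
        for db in active:
            # Proportional quota for this pass, at least 1 job.
            quota = capacity * shares.get(db, 100) // total
            grant = min(max(quota, 1), jobs[db] - running[db],
                        capacity - granted)
            running[db] += grant
            granted += grant
        if granted == 0:
            break
        capacity -= granted
    return running
```

In this simplified model, Example 1 corresponds to `fair_share({"_replicator": 1000, "another/_replicator": 10}, {})`, which allocates 490 and 10 running jobs respectively, and Example 4 corresponds to passing `{"_replicator": 400}` as the shares, which allocates 400 and 100.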