davisp commented on a change in pull request #470: Scheduling Replicator
URL: https://github.com/apache/couchdb/pull/470#discussion_r111619093
##########
File path: src/couch_replicator/README.md
##########
@@ -0,0 +1,297 @@
+Developer Oriented Replicator Description
+=========================================
+
+This description of the scheduling replicator's functionality is mainly geared
+to CouchDB developers. It dives a bit into the internals and explains how
+everything is connected together.
+
+A natural place to start is the top application supervisor:
+`couch_replicator_sup`. It's a `rest_for_one` supervisor, so if a child process
+terminates, the rest of the children in the hierarchy following it are also
+terminated. This structure implies a useful constraint -- children to the
+"right", if viewing it vertically with the root at the top, can safely call
+children on the "left", because this supervisor ensures those on the "left"
+will already be started and running.
+
+A description of each child:
+
+ * `couch_replication_event`: Starts a `gen_event` publication bus to handle some
+ replication-related events. It is used, for example, by the
+ `couch_replicator_clustering` process to publish cluster membership changes.
+ It is also used in replication tests to monitor for replication events.
+ Notification is performed via the `couch_replicator_notifier:notify/1`
+ function. It's the first (left-most) child because
+ `couch_replicator_clustering` uses it.
+
+ * `couch_replicator_clustering`: This module maintains cluster membership
+ information for the replication application and provides functions to check
+ ownership of replication jobs. A cluster membership change is published via
+ the `gen_event` event server set up in the `couch_replication_event` child
+ above. Published events are `{cluster, stable}` when cluster membership has
+ stabilized, that is, it is not fluctuating anymore, and `{cluster, unstable}`,
+ which indicates there was a recent change to the cluster membership and now
+ it's considered unstable. Listeners for cluster membership change include
+ `couch_replicator_doc_processor` and `couch_replicator_db_changes`. When
+ the doc processor gets a `{cluster, stable}` event, it will remove all the
+ replication jobs not belonging to the current node. When
+ `couch_replicator_db_changes` gets a `{cluster, stable}` event, it will
+ restart the `couch_multidb_changes` process it controls, which will launch a
+ new scan of all the replicator databases.
+
+ * `couch_replicator_connection`: Maintains a global replication connection
+ pool. It allows reusing connections across replication tasks. The main
+ interface is `acquire/1` and `release/1`. The main idea here is that once a
+ connection is established, it is kept around for
+ `replicator.connection_close_interval` milliseconds in case another
+ replication task wants to re-use it. It is worth pointing out how linking
+ and monitoring are handled: workers are linked to the connection pool when
+ they are created. If they crash, the connection pool, which listens for the
+ EXIT event, cleans up. The connection pool also monitors owners (by
+ monitoring the `Pid` from the `From` argument in the call to `acquire/1`)
+ and cleans up if an owner dies. Another interesting thing is that connection
+ establishment (creation) happens in the owner process, so the pool is not
+ blocked on it.
+
+ * `couch_replicator_rate_limiter`: Implements a rate limiter to handle
+ connection throttling from sources or targets where requests return 429
+ error codes. Uses the Additive Increase / Multiplicative Decrease feedback
+ control algorithm to converge on the channel capacity. Implemented using a
+ 16-way sharded ETS table to maintain connection state. The table sharding
+ code is split out into the `couch_replicator_rate_limiter_tables` module. The
+ main idea of the module is to maintain and continually estimate an interval
+ for each connection represented by the `{Method, Url}`. The interval is
Review comment:
The purpose of the module is to maintain and continually estimate sleep
intervals for each connection represented as a ``{Method, Url}`` pair.
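
To make the "maintain and continually estimate sleep intervals" wording concrete, here is a minimal sketch of the idea in Python rather than Erlang. It is not the module's code: the class name, the constants, and the choice to double the interval on a failure and shrink it by a fixed step on a success are all assumptions made for illustration. Growing the sleep interval multiplicatively and shrinking it additively corresponds to multiplicative decrease and additive increase of the request rate.

```python
from dataclasses import dataclass, field

# All constants are illustrative assumptions, not values taken from CouchDB.
BASE_INTERVAL_MS = 20     # interval assigned on the first failure
BACKOFF_FACTOR = 2        # multiplicative growth of the interval on failure
STEP_MS = 20              # additive shrink of the interval on success
MAX_INTERVAL_MS = 25_000  # upper bound on any interval


@dataclass
class IntervalEstimator:
    """Hypothetical estimator: one sleep interval (ms) per (method, url) pair."""

    intervals: dict = field(default_factory=dict)

    def interval(self, method, url):
        # Connections that have never failed do not sleep at all.
        return self.intervals.get((method, url), 0)

    def failure(self, method, url):
        # E.g. a 429 response: back off by growing the interval multiplicatively.
        grown = max(BASE_INTERVAL_MS, self.interval(method, url) * BACKOFF_FACTOR)
        self.intervals[(method, url)] = min(MAX_INTERVAL_MS, grown)
        return self.intervals[(method, url)]

    def success(self, method, url):
        # A successful request: probe for capacity by shrinking additively.
        shrunk = max(0, self.interval(method, url) - STEP_MS)
        self.intervals[(method, url)] = shrunk
        return shrunk
```

A caller would sleep for `interval(Method, Url)` milliseconds before each request and report the outcome through `failure` or `success`, so every `{Method, Url}` pair converges on its own interval independently.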