davisp commented on a change in pull request #470: Scheduling Replicator
URL: https://github.com/apache/couchdb/pull/470#discussion_r111618734
##########
File path: src/couch_replicator/README.md
##########
@@ -0,0 +1,297 @@
+Developer Oriented Replicator Description
+=========================================
+
+This description of the scheduling replicator's functionality is mainly geared
+to CouchDB developers. It dives a bit into the internals and explains how
+everything is connected together.
+
+A natural place to start is the top application supervisor:
+`couch_replicator_sup`. It's a `rest_for_one`, so if a child process
+terminates, the rest of the children in the hierarchy following it are also
+terminated. This structure implies a useful constraint -- children to the
+"right", if viewing the hierarchy vertically with the root at the top, can
+safely call children on the "left", because this supervisor ensures those on
+the "left" will already be started and running.
+
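+The layout above can be sketched as a standard OTP `init/1` callback. This is
+a simplified illustration, not the actual child specs from
+`couch_replicator_sup`:
+
+```erlang
+%% Sketch of a rest_for_one supervisor; child specs simplified.
+init(_Args) ->
+    Children = [
+        %% Left-most child: started first, shut down last.
+        {couch_replication_event,
+            {gen_event, start_link, [{local, couch_replication_event}]},
+            permanent, brutal_kill, worker, dynamic},
+        {couch_replicator_clustering,
+            {couch_replicator_clustering, start_link, []},
+            permanent, brutal_kill, worker, [couch_replicator_clustering]}
+        %% ...remaining children follow, left to right...
+    ],
+    {ok, {{rest_for_one, 10, 3600}, Children}}.
+```
+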
+A description of each child:
+
+ * `couch_replication_event`: Starts a `gen_event` publication bus to handle
+   some replication related events. This is used, for example, to publish
+   cluster membership changes by the `couch_replicator_clustering` process,
+   but it is also used in replication tests to monitor for replication events.
+   Notification is performed via the `couch_replicator_notifier:notify/1`
+   function. It's the first (left-most) child because
+   `couch_replicator_clustering` uses it.
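+
+   As a sketch, publishing and observing events on this bus might look like
+   the following (this assumes the bus is registered locally as
+   `couch_replication_event`, and the handler module name is hypothetical):
+
+   ```erlang
+   %% Publish an event to every subscribed handler:
+   couch_replicator_notifier:notify({cluster, stable}),
+   %% A test could attach a hypothetical gen_event handler to observe events:
+   ok = gen_event:add_handler(couch_replication_event, my_test_handler, []).
+   ```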
+
+ * `couch_replicator_clustering`: This module maintains cluster membership
+   information for the replication application and provides functions to check
+   ownership of replication jobs. A cluster membership change is published via
+   the `gen_event` event server set up in the `couch_replication_event` child
+   above. The published events are `{cluster, stable}`, when cluster
+   membership has stabilized, that is, it is no longer fluctuating, and
+   `{cluster, unstable}`, which indicates there was a recent change to the
+   cluster membership and it is now considered unstable. Listeners for cluster
+   membership changes include `couch_replicator_doc_processor` and
+   `couch_replicator_db_changes`. When the doc processor gets a
+   `{cluster, stable}` event, it will remove all the replication jobs not
+   belonging to the current node. When `couch_replicator_db_changes` gets a
+   `{cluster, stable}` event, it will restart the `couch_multidb_changes`
+   process it controls, which will launch a new scan of all the replicator
+   databases.
+
+ * `couch_replicator_connection`: Maintains a global replication connection
+   pool. It allows reusing connections across replication tasks. The main
+   interface is `acquire/1` and `release/1`. The main idea here is that once
+   a connection is established, it is kept around for
+   `replicator.connection_close_interval` milliseconds in case another
+   replication task wants to re-use it. It is worth pointing out how linking
+   and monitoring are handled: workers are linked to the connection pool when
+   they are created. If they crash, the connection pool listens for the EXIT
+   event and cleans up. The connection pool also monitors owners (by
+   monitoring the `Pid` from the `From` argument in the call to `acquire/1`)
+   and cleans up if the owner dies. Another interesting thing is that
+   connection establishment
Review comment:
and cleans up if it receives a 'DOWN' message.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services