[jira] [Commented] (COUCHDB-3324) Scheduling Replicator

ASF subversion and git services (JIRA) Thu, 20 Apr 2017 14:08:23 -0700

    [ 
https://issues.apache.org/jira/browse/COUCHDB-3324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15977520#comment-15977520
 ]


ASF subversion and git services commented on COUCHDB-3324:
----------------------------------------------------------

Commit 4c48f69b1abddf8081ef1e04a05cb1ef1add51fb in couchdb's branch 
refs/heads/63012-scheduler from [~vatamane]
[ https://gitbox.apache.org/repos/asf?p=couchdb.git;h=4c48f69 ]

Cluster ownership module implementation

This module maintains cluster membership information for replication and
provides functions to check ownership of replication jobs.
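
As a rough illustration of what an ownership check can look like, here is a minimal Python sketch. It is not the code from this commit: it assumes ownership is decided by rendezvous (highest-random-weight) hashing of the job ID over the live node list, and the names `owner` and `is_owner` are hypothetical.

```python
import hashlib


def owner(job_id, nodes):
    """Pick exactly one node for job_id (assumed scheme: rendezvous hashing)."""
    if not nodes:
        raise ValueError("no live nodes")

    def weight(node):
        # Hash the (node, job) pair; the node with the largest value wins.
        digest = hashlib.sha256(f"{node}|{job_id}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    # Sort first so the result does not depend on the input order.
    return max(sorted(nodes), key=weight)


def is_owner(job_id, nodes, local_node):
    """True if local_node is the node that should run job_id."""
    return owner(job_id, nodes) == local_node
```

With a scheme like this, every node that sees the same membership list reaches the same answer without coordination, and removing a node only moves the jobs that node owned.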

A cluster membership change is registered only after a configurable
`cluster_quiet_period` interval has passed since the last node addition or
removal. This is useful in cases of rolling node reboots in a cluster in order
to avoid rescanning for membership changes after every node up and down event,
and instead doing only one rescan at the very end.
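
The quiet-period behavior described above can be sketched as a simple debounce. This is a hedged Python illustration, not the module's actual code: the class `MembershipTracker`, its methods, and the injectable `clock` are assumptions made for the example; only the idea of a `cluster_quiet_period` comes from the commit message.

```python
import time


class MembershipTracker:
    """Debounce node up/down events into a single rescan (illustrative only)."""

    def __init__(self, quiet_period, clock=time.monotonic):
        self.quiet_period = quiet_period  # seconds of quiet required
        self.clock = clock                # injectable for testing
        self.last_change = None           # time of the latest unprocessed event
        self.rescans = 0

    def node_event(self):
        """Record a node addition or removal; this restarts the quiet period."""
        self.last_change = self.clock()

    def poll(self):
        """Return True, and count one rescan, once the cluster has been quiet."""
        if self.last_change is None:
            return False
        if self.clock() - self.last_change < self.quiet_period:
            return False
        self.last_change = None
        self.rescans += 1
        return True
```

With a 60 second quiet period, node events at t=0, t=20 and t=40 produce no rescan until t=100, and then exactly one.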

Jira: COUCHDB-3324


> Scheduling Replicator
> ---------------------
>
>                 Key: COUCHDB-3324
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-3324
>             Project: CouchDB
>          Issue Type: New Feature
>            Reporter: Nick Vatamaniuc
>
> Improve CouchDB replicator
>  * Allow running a large number of replication jobs
>  * Improve API with a focus on ease of use and performance. Avoid updating 
> replication document with transient state updates. Instead create a proper 
> API for querying replication states. At the same time provide a compatibility 
> mode to let users keep existing behavior (of getting updates in documents).
>  * Improve network resource usage and performance. Multiple connections to the 
> same cluster could share socket connections.
>  * Handle rate limiting on target and source HTTP endpoints. Let replication 
> request auto-discover rate limit capacity based on a proven algorithm such as 
> Additive Increase / Multiplicative Decrease feedback control loop.
>  * Improve performance by avoiding repeatedly retrying failing replication 
> jobs. Instead use exponential backoff. 
>  * Improve recovery from long (but temporary) network failure. Currently if 
> replication jobs fail to start 10 times in a row they will not be retried 
> anymore. This is not always desirable. In case of a long enough DNS (or other 
> network) failure replication jobs will effectively stop until they are 
> manually restarted.
>  * Better handling of filtered replications: Failing to fetch filters could 
> block couch replicator manager, lead to message queue backups and memory 
> exhaustion. Also, when replication filter code changes, update the replication 
> accordingly (the replication job ID should change in that case).
>  * Provide better metrics to introspect replicator behavior.
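
Two of the items above, Additive Increase / Multiplicative Decrease rate discovery and exponential backoff for failing jobs, can be sketched in a few lines of Python. This is only an illustration of the general techniques, not the replicator's implementation: the function names, the default constants, and the notion of a `throttled` flag (for example, the endpoint answered with HTTP 429) are all assumptions.

```python
def aimd_step(rate, throttled, increase=1.0, decrease=0.5, floor=1.0):
    """Return the next request rate after one feedback interval.

    Additive increase while requests succeed, multiplicative decrease
    as soon as the endpoint signals that it is rate limiting us.
    """
    if throttled:
        return max(floor, rate * decrease)
    return rate + increase


def backoff_delay(consecutive_failures, base=5.0, cap=3600.0):
    """Seconds to wait before retrying a job that keeps failing.

    The delay doubles with every consecutive failure and is capped, so a
    job is retried less and less often but is never abandoned for good.
    """
    if consecutive_failures <= 0:
        return 0.0
    return min(cap, base * 2 ** (consecutive_failures - 1))
```

Capping the delay instead of giving up after a fixed number of attempts is what lets jobs resume on their own after a long DNS or network outage.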



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
