[ 
https://issues.apache.org/jira/browse/COUCHDB-3088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15400044#comment-15400044
 ] 

ASF GitHub Bot commented on COUCHDB-3088:
-----------------------------------------

GitHub user iilyak opened a pull request:

    https://github.com/apache/couchdb-couch-replicator/pull/44

    Inject random delays in scan_all_dbs

    couch_replication_server scans filesystem to find all _replication
    databases. For every database found it does
    
        gen_server:cast(Server, {resume_scan, DbName})
    
    Extract independent process where we do gen_server:cast after a random
    delay.
    This effectively removes stampede and randomizes the order in which we
    process _replication databases.
    
    COUCHDB-3088

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloudant/couchdb-couch-replicator 69914-insert-random-delays

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/couchdb-couch-replicator/pull/44.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #44
    
----
commit 5715c5e25dba61442834b08d7d7202b185341a87
Author: ILYA Khlopotov <[email protected]>
Date:   2016-07-29T21:32:02Z

    Inject random delays in scan_all_dbs
    
    couch_replication_server scans filesystem to find all _replication
    databases. For every database found it does
    
        gen_server:cast(Server, {resume_scan, DbName})
    
    Extract independent process where we do gen_server:cast after a random
    delay.
    This effectively removes stampede and randomizes the order in which we
    process _replication databases.
    
    COUCHDB-3088

----
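
For readers who want a concrete picture of the approach described in the
commit message, here is a minimal sketch. It is not the code from the pull
request. The module name scan_jitter, both function names, and the
MaxDelayMsec parameter are hypothetical, and spawning one short-lived process
per database is only one possible reading of "independent process". The only
piece taken from the description is the {resume_scan, DbName} cast.

```erlang
-module(scan_jitter).
-export([cast_after_random_delay/3, random_delay/1]).

%% Hypothetical helper, not taken from the pull request.
%% MaxDelayMsec is an assumed upper bound on the jitter, in milliseconds.
random_delay(MaxDelayMsec) when is_integer(MaxDelayMsec), MaxDelayMsec > 0 ->
    %% rand:uniform(N) returns an integer between 1 and N inclusive.
    rand:uniform(MaxDelayMsec).

%% Spawn an independent process that sleeps for a random interval and then
%% sends the cast, so the scanning process itself never blocks.
cast_after_random_delay(Server, DbName, MaxDelayMsec) ->
    spawn(fun() ->
        timer:sleep(random_delay(MaxDelayMsec)),
        gen_server:cast(Server, {resume_scan, DbName})
    end).
```

Calling scan_jitter:cast_after_random_delay(Server, DbName, 30000) once per
database found would spread the resume_scan casts over up to 30 seconds
instead of delivering them all at once. The 30000 figure is an arbitrary
illustration, not a value from the patch.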


> restart of couch_replication_server causes a stampede
> -----------------------------------------------------
>
>                 Key: COUCHDB-3088
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-3088
>             Project: CouchDB
>          Issue Type: Bug
>            Reporter: ILYA
>
> couch_replication_server scans all files in database_dir searching for files
> matching "_replicator.<number>.couch". For every _replication db it does
> gen_server:cast(Server, {resume_scan, DbName}). This creates a stampede
> effect and causes sharp load spikes on the replication cluster. The problem
> gets worse if you migrate from an older version of CouchDB: in that case there
> is logic that injects a validation ddoc into every _replication db, causing a
> spike in the [couchdb, database_writes] metric.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
