[PR] Optimize replicator [couchdb]

via GitHub Mon, 04 May 2026 23:15:39 -0700


nickva opened a new pull request, #5994:
URL: https://github.com/apache/couchdb/pull/5994


   There are two related optimizations:
   
   Use gen casts to queue docs for _bulk_docs in replication workers instead of 
gen calls.
   
   Previously, the queue_fetch_loop in the worker used a gen_server call to add 
each doc to the parent's (worker's) batch. Even though the gen_server replied 
immediately, it was still a synchronous call: if the worker was in the 
middle of a _bulk_docs flush, it couldn't reply to the queue_fetch_loop, which 
stalled for that moment. The optimization is to use casts instead. This way, 
the fetch loop can continue fetching (running bulk_gets) without having to 
periodically wait for individual batch_doc calls. The bulk_docs pending queue 
is still properly bounded, because at the end of the fetch loop there is a 
`gen_server:call(Parent, flush, infinity)` call, which propagates 
backpressure from a slow target.
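
The cast-then-flush pattern described above can be illustrated with a small Python model. This is only a sketch and not the CouchDB code: `BatchWorker`, `cast_batch_doc`, `call_flush` and `fetch_loop` are hypothetical names, a thread with a FIFO mailbox stands in for the gen_server, and size-triggered flushes in the middle of a batch are not modeled.

```python
import queue
import threading


class BatchWorker:
    """Toy stand-in for a replication worker that batches docs for _bulk_docs."""

    def __init__(self):
        self._mailbox = queue.Queue()  # messages are handled strictly in order
        self._batch = []               # docs queued but not yet written
        self.written = []              # docs "sent" to the target
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def cast_batch_doc(self, doc):
        """Asynchronous, like a gen_server cast: enqueue and return at once."""
        self._mailbox.put(("batch_doc", doc, None))

    def call_flush(self):
        """Synchronous, like gen_server:call(Parent, flush, infinity)."""
        done = threading.Event()
        self._mailbox.put(("flush", None, done))
        done.wait()  # returns only after every earlier cast was handled

    def _loop(self):
        while True:
            kind, doc, done = self._mailbox.get()
            if kind == "batch_doc":
                self._batch.append(doc)
            elif kind == "flush":
                self.written.extend(self._batch)  # pretend _bulk_docs request
                self._batch = []
                done.set()


def fetch_loop(worker, doc_ids):
    for doc_id in doc_ids:
        worker.cast_batch_doc({"_id": doc_id})  # never waits on the worker
    worker.call_flush()  # the single synchronous point, where backpressure applies


worker = BatchWorker()
fetch_loop(worker, [f"doc-{i}" for i in range(1000)])
print(len(worker.written))
```

Because the mailbox is processed in order, the flush reply cannot arrive before all earlier casts have been handled, so a slow worker still holds the fetch loop back once per batch rather than once per doc.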
   
   The second optimization is to increase the bulk_docs worker memory limit 
from 500KB to 4MB. Previously, the 500KB limit kept every _bulk_docs batch 
under 500KB regardless of what the user set as the replicator 
worker batch size. The maximum batch size is still limited, now at 4MB, 
but with the higher limit the user's replicator batch size takes effect 
over a wider range (in other words, users who want to set a batch size of 
2500 can now do so).
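
To make the interaction between the two limits concrete, here is a rough calculation. It assumes the memory limit applies to the summed document sizes, that KB and MB are binary units, and that a batch always carries at least one doc; `effective_batch_docs` is a hypothetical helper, not a CouchDB function.

```python
KIB = 1024
OLD_LIMIT = 500 * KIB      # "500KB", assuming binary kilobytes
NEW_LIMIT = 4 * KIB * KIB  # "4MB"


def effective_batch_docs(batch_size, doc_bytes, limit_bytes):
    """Docs per _bulk_docs request when a count cap and a byte cap both apply."""
    by_memory = max(1, limit_bytes // doc_bytes)  # assume at least one doc fits
    return min(batch_size, by_memory)


# A user-configured batch size of 2500 docs, for two illustrative doc sizes.
for doc_bytes in (1 * KIB, 2 * KIB):
    old = effective_batch_docs(2500, doc_bytes, OLD_LIMIT)
    new = effective_batch_docs(2500, doc_bytes, NEW_LIMIT)
    print(f"{doc_bytes} B docs: {old} -> {new} docs per batch")
```

Under these assumptions a batch size of 2500 is fully honored for 1 KiB docs with the new limit, while 2 KiB docs are still capped by memory, only at a much higher count than before.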
   
   A quick benchmark with a script [1] replicating 100k 2KB docs shows a local 
speedup from about 30 to 22 seconds of wall time (before and after the change):
   
   ```
   ./rep_bench.py --ndocs 100000 --doc-size 2048 --source-url http://localhost:15984 --target-url http://localhost:25984
   source:      http://localhost:15984/rep_bench_source
   target:      http://localhost:25984 (db prefix: rep_bench_target)
   coordinator: http://localhost:15984  poll: 2.0s  http_timeout: 600s
   docs: 100000  doc_size: 2048  n_jobs: 1
   == source ==
     loading 100000 docs into rep_bench_source (doc_size=2048, batch=500)...
     loaded 100000 docs in 23.8s, size 27.0MB
    setting applied: {('replicator', 'startup_jitter'): '0', ('replicator', 'interval'): '1000'}
   == bench ==
   
   wall=30.28s
     job 00  elapsed= 30.27s  docs_read= 100000  docs_written= 100000  missing= 100000
   ```
   
   ```
   ./rep_bench.py --ndocs 100000 --doc-size 2048 --source-url http://localhost:15984 --target-url http://localhost:25984
   source:      http://localhost:15984/rep_bench_source
   target:      http://localhost:25984 (db prefix: rep_bench_target)
   coordinator: http://localhost:15984  poll: 2.0s  http_timeout: 600s
   docs: 100000  doc_size: 2048  n_jobs: 1
   == source ==
     loading 100000 docs into rep_bench_source (doc_size=2048, batch=500)...
     loaded 100000 docs in 24.2s, size 26.9MB
    setting applied: {('replicator', 'startup_jitter'): '0', ('replicator', 'interval'): '1000'}
   == bench ==
   
   wall=22.11s
     job 00  elapsed= 22.10s  docs_read= 100000  docs_written= 100000  missing= 100000
   ```
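
For reference, the wall times reported in the two runs above translate into throughput and speedup as follows. This is plain arithmetic on the printed numbers, assuming the first run is before the change and the second is after it.

```python
NDOCS = 100_000
WALL_BEFORE = 30.28  # seconds, "wall=" line of the first run
WALL_AFTER = 22.11   # seconds, "wall=" line of the second run


def docs_per_second(ndocs, wall_seconds):
    """Average replication throughput over the whole run."""
    return ndocs / wall_seconds


speedup = WALL_BEFORE / WALL_AFTER        # how many times faster
reduction = 1 - WALL_AFTER / WALL_BEFORE  # fraction of wall time saved

print(f"before: {docs_per_second(NDOCS, WALL_BEFORE):.0f} docs/s")
print(f"after:  {docs_per_second(NDOCS, WALL_AFTER):.0f} docs/s")
print(f"speedup: {speedup:.2f}x, wall time reduced by {reduction:.0%}")
```

These are single local runs, so the figures are indicative only.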
   
   [1] https://gist.github.com/nickva/2a49f6e624208c45dc0dafdd935a4aae
   


