[ 
https://issues.apache.org/jira/browse/COUCHDB-160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Kocoloski updated COUCHDB-160:
-----------------------------------

    Attachment: couch_rep_v2.diff

Here's an updated patch that uses persistent connections and pipelining to 
further accelerate replications where the source is remote.  Updated benchmarks 
indicate a roughly 3x improvement for remote-local relative to my first patch 
(35 down to 13), or replications more than 10x faster than trunk (146 down to 
13):

parallel+pipeline:
local-remote    31
remote-remote   36
remote-local    13
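
The ratios behind those figures can be checked with a short script. This is 
only a sketch of the arithmetic: the trunk and first-patch numbers are copied 
from the issue description quoted below, and the dictionary and function names 
are made up for illustration.

```python
# Timings copied from this message and from the quoted issue description.
# The source does not state the unit, so none is assumed here.
trunk = {"local-remote": 145, "remote-remote": 173, "remote-local": 146}
first_patch = {"local-remote": 38, "remote-remote": 64, "remote-local": 35}
pipelined = {"local-remote": 31, "remote-remote": 36, "remote-local": 13}


def speedups(baseline, improved):
    """Return baseline/improved for every scenario, rounded to one decimal."""
    return {name: round(baseline[name] / improved[name], 1) for name in improved}


vs_first_patch = speedups(first_patch, pipelined)
vs_trunk = speedups(trunk, pipelined)

for name in pipelined:
    print(f"{name:14s} {vs_first_patch[name]}x vs first patch, "
          f"{vs_trunk[name]}x vs trunk")
```

The remote-local row is the one the summary refers to; the other two scenarios 
gain less because writes to a remote target are not pipelined.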

Note the asymmetry for local-remote vs. remote-local.  Replications to remote 
targets are still negotiating a new TCP connection for every POST.  Now, we're 
not allowed to pipeline POSTs, but there's nothing wrong with using persistent 
connections.  Last I heard, Erlang's HTTP client needs to be updated to deal 
with that particular use case:

http://www.erlang.org/pipermail/erlang-questions/2008-August/037113.html
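
To illustrate the difference between pipelining and a persistent connection, 
here is a minimal sketch in Python rather than Erlang. It is not the patch's 
code: the toy server, the post_batches helper, and the /target/_bulk_docs path 
are assumptions for the demo. Each POST waits for its response before the next 
one is sent (so nothing is pipelined), but all of them share one TCP 
connection.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    """Toy stand-in for a remote target; records the client port of each POST."""
    protocol_version = "HTTP/1.1"  # lets the server keep the connection open
    client_ports = []

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)  # consume the request body
        CountingHandler.client_ports.append(self.client_address[1])
        body = b'{"ok": true}'
        self.send_response(201)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def post_batches(host, port, path, batches):
    """POST each batch in turn over a single reused connection."""
    conn = http.client.HTTPConnection(host, port)
    statuses = []
    try:
        for docs in batches:
            payload = json.dumps({"docs": docs})
            conn.request("POST", path, body=payload,
                         headers={"Content-Type": "application/json"})
            response = conn.getresponse()
            response.read()  # drain the response before reusing the socket
            statuses.append(response.status)
    finally:
        conn.close()
    return statuses


def demo():
    """Run three POSTs against the toy server; return statuses and socket count."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        CountingHandler.client_ports.clear()
        batches = [[{"_id": f"doc{i}"}] for i in range(3)]
        statuses = post_batches("127.0.0.1", server.server_address[1],
                                "/target/_bulk_docs", batches)
        return statuses, len(set(CountingHandler.client_ports))
    finally:
        server.shutdown()
        server.server_close()
```

Calling demo() returns the status of each POST and the number of distinct 
client sockets the server saw. With connection reuse the TCP handshake is paid 
once rather than once per POST.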

Best, Adam

> replication performance improvements
> ------------------------------------
>
>                 Key: COUCHDB-160
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-160
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>    Affects Versions: 0.9
>            Reporter: Adam Kocoloski
>            Priority: Minor
>         Attachments: couch_rep.erl.diff, couch_rep_v2.diff
>
>
> I wrote some code to speed up CouchDB's replication process by parallelizing 
> document requests and using _bulk_docs to write changes to the target.  I 
> tested the speedup as follows:
> * 1000 document DB, 1022 update_seq, ~450 KB after compaction
> * local and remote machines have ~45 ms latency
> * timed requests using timer:tc(couch_rep, replicate, [<<"source">>, 
> <<"target">>])
> * all replications are "from scratch"
>
> trunk:
> local-local     115
> local-remote    145
> remote-remote   173
> remote-local    146
> db size after replication: 1.8 MB
>
> patch:
> local-local     1.83
> local-remote    38
> remote-remote   64
> remote-local    35
> db size after replication: 453 KB
>
> I'll attach the patch as an update to this issue.  It might be worth exposing 
> the "batch size" (currently 100 docs) as a configurable parameter.  Comments 
> welcome.  Best, 
> Adam
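
The approach described in the issue (fetch documents in parallel, then write 
each batch with a single _bulk_docs request) could look roughly like the 
following. This is a sketch in Python, not the Erlang in the attached patch: 
replicate_in_batches, fetch_doc, write_bulk, and the workers parameter are 
hypothetical names, and only the batch size of 100 and the {"docs": [...]} 
body shape of _bulk_docs come from CouchDB itself.

```python
from concurrent.futures import ThreadPoolExecutor


def replicate_in_batches(doc_ids, fetch_doc, write_bulk,
                         batch_size=100, workers=10):
    """Fetch documents in parallel and write them one batch at a time.

    fetch_doc(doc_id) returns one document from the source (hypothetical).
    write_bulk(body) sends one _bulk_docs-style body to the target (hypothetical).
    Returns the number of documents written.
    """
    written = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for start in range(0, len(doc_ids), batch_size):
            batch_ids = doc_ids[start:start + batch_size]
            # pool.map runs the fetches concurrently and keeps input order.
            docs = list(pool.map(fetch_doc, batch_ids))
            # One write per batch instead of one write per document.
            write_bulk({"docs": docs})
            written += len(docs)
    return written


# Usage with in-memory stand-ins for the source and target databases.
source = {f"doc{i}": {"_id": f"doc{i}"} for i in range(250)}
bulk_calls = []
total = replicate_in_batches(list(source), source.__getitem__, bulk_calls.append)
print(total, [len(call["docs"]) for call in bulk_calls])
```

Making batch_size a parameter mirrors the suggestion in the issue to expose it 
as a configurable setting.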

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
