Why is replication so slow?

Attila Nagy Fri, 16 Jul 2010 09:52:20 -0700

Hi,

I have three identical machines, each with a Pentium(R) D CPU 3.20GHz, 2 GiB RAM, FreeBSD 8, Erlang R13B04 (erts-5.7.5) [source] [64-bit] [smp:2:2] [rq:2] [async-threads:0] [hipe] [kernel-poll:false], and CouchDB 1.0.0.

I would like to replicate documents between the three (and more machines later) in a fully meshed replication setup (every node replicates from/to every other node, to ensure that there is no SPoF and every document gets to the others ASAP). The nodes would store small but quickly changing documents (application no. 1) and larger binary attachments, from several kB to several GB (application no. 2). The applications are not mixed on the same CouchDB instance (nor even on the same machines).

I've experimented with the first and noticed that no matter how fast I insert documents (BTW, I could achieve about 230 inserts per second with parallel connections and no bulk inserts), the traffic between the machines doesn't go beyond about 500 kBps and the replicas lag far behind the node being written to.

Based on this, I've started another test, now with smaller binary attachments. The first run did this:

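# jot 128 prints the numbers 1..128 (FreeBSD); each PUT stores bin1 as attachment "file" of document $i
for i in `jot 128`
do
    curl -X PUT http://localhost:5984/testdb/$i/file -H "Content-Type: application/octet-stream" --data-binary @bin1
done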

That is, it uploads 128 MB of data (bin1 is 1 MB in size).

Without replication, it runs in 8.64 seconds (14.81 MBps; not that fast either, but hey, it's Erlang :). If I run it with background curl processes (a maximum of 128 parallel uploads), the script runs in 6.74s (18.99 MBps).
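A minimal sketch of what the parallel run could look like, assuming all 128 uploads are simply started in the background and then waited for. The URL and the bin1 file come from the loop above; the COUCH and UPLOAD variables and the function name are hypothetical, added only so the loop can be dry-run without a server.

```shell
#!/bin/sh
# Base URL of the node being written to (same as in the sequential loop).
COUCH=${COUCH:-http://localhost:5984}
# Hypothetical hook: the command used for each upload. Set UPLOAD=echo for a dry run.
UPLOAD=${UPLOAD:-curl -s -X PUT}

# Start $1 uploads in the background, then block until all of them have finished.
upload_parallel() {
    n=$1
    i=1
    while [ "$i" -le "$n" ]; do
        $UPLOAD "$COUCH/testdb/$i/file" \
            -H "Content-Type: application/octet-stream" \
            --data-binary @bin1 &
        i=$((i + 1))
    done
    wait
}

# Usage: time upload_parallel 128
```

Starting everything at once matches the "maximum 128 parallel uploads" description; a real test might want to cap the number of concurrent curl processes instead.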

Now if I set up a one-way replication to another node (connected with gigabit ethernet), the run time slightly increases to 7.04s on the master node, but it takes 42 seconds (3.04 MBps) for all the 128 documents to reach the slave node.

Things get worse when I set up a two-way replication between the two nodes: this time the upload on the "master" node takes 7.4 seconds, but 75 seconds are needed for the two nodes to become consistent. The Erlang processes on both sides eat more resources, so this slowdown is clearly visible and not network bound (of course).
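For reference, a replication like this can be started through CouchDB's /_replicate endpoint. The sketch below is an assumption about the setup, not a record of it: the message does not say whether the replications were continuous, or whether they were pushed from the source or pulled by the target. The host names nodeA and nodeB and both function names are hypothetical.

```shell
#!/bin/sh
# Print the JSON body for a continuous replication.
# $1 = source database, $2 = target database (local name or full URL)
replicate_body() {
    printf '{"source":"%s","target":"%s","continuous":true}\n' "$1" "$2"
}

# Ask the node at base URL $1 to replicate $2 -> $3.
start_replication() {
    curl -s -X POST "$1/_replicate" \
        -H "Content-Type: application/json" \
        -d "$(replicate_body "$2" "$3")"
}

# One-way A->B, pushed from A (hypothetical host names):
#   start_replication http://nodeA:5984 testdb http://nodeB:5984/testdb
# Two-way A<->B additionally needs the reverse direction:
#   start_replication http://nodeB:5984 testdb http://nodeA:5984/testdb
```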

If I set up two one-way replications (A->B, A->C), the times look like this: time needed to upload on the master (A) node: 6.52s; time needed for the slave (B, C) nodes to become consistent with A: 44s (A->B), 39s (A->C).

BTW, I measure this from the start of the script (I'm not writing the data on A first and setting up replication afterwards).
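One way such consistency times could be measured is to poll the target database until it reports all 128 documents. This is only a sketch: it assumes that the doc_count field returned by GET /testdb is a good enough proxy for "consistent", it has one-second resolution, and the host name and function names are hypothetical.

```shell
#!/bin/sh
# Extract "doc_count" from the database info JSON read on stdin.
doc_count() {
    sed -n 's/.*"doc_count": *\([0-9][0-9]*\).*/\1/p'
}

# Poll the database URL in $1 once per second until it holds $2 documents.
wait_for_docs() {
    until [ "$(curl -s "$1" | doc_count)" = "$2" ]; do
        sleep 1
    done
}

# Usage, started at the same moment as the upload script:
#   start=$(date +%s)
#   wait_for_docs http://nodeB:5984/testdb 128
#   echo "consistent after $(( $(date +%s) - start ))s"
```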

With the following replications defined: A<->B, A<->C, I get these: uploading to A: 7.34s, A->B consistency: 72s, A->C consistency: 72s.

During the process I saw this on node A:
  PID USERNAME   THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
15427 couchdb     11 109    0   217M   149M CPU0    0  14:44 135.94% beam.smp

And this was after the upload had finished, so this is what CouchDB eats when doing bilateral replication towards two nodes.


And now the full mesh (A<->B, A<->C, B<->C):
CouchDB resource usage tops:

15427 couchdb     11 110    0   270M   202M CPU1    1  18:44 140.14% beam.smp


The consistency times go up as well: A->B: 125s, A->C: 107s.
BTW, the upload took 7.59s.
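To see how the number of replication agreements grows with the node count, the full mesh can be enumerated as ordered (source, target) pairs, each standing for one one-way replication. A small sketch; the function name is hypothetical and the node labels are the ones used above.

```shell
#!/bin/sh
# Print every ordered pair "source target" for the given node names.
mesh_pairs() {
    for src in "$@"; do
        for dst in "$@"; do
            [ "$src" = "$dst" ] || echo "$src $dst"
        done
    done
}

# Usage: mesh_pairs A B C
# Three nodes give 6 one-way replications (3 bilateral agreements);
# n nodes give n*(n-1).
```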

Summary: it seems unilateral replication is consistent in its resource usage, and it's pretty slow (7s for the write on localhost vs. 42s for replication to the remote node). If I define a bilateral replication, it slows down further, to nearly half the speed. Every bilateral agreement introduces this slowdown, so one unilateral: 42s, one bilateral: 72s, two bilaterals: 125s.
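The timings reported above can be turned into effective throughput for the 128 MB data set with a one-line helper. This is plain arithmetic on the numbers already given; the function name is hypothetical. Note that printf rounds, so 128 MB in 42s comes out as 3.05 rather than the truncated 3.04 quoted earlier.

```shell
#!/bin/sh
# Effective throughput in MB per second.
# $1 = megabytes transferred, $2 = elapsed seconds
mbps() {
    awk -v mb="$1" -v s="$2" 'BEGIN { printf "%.2f\n", mb / s }'
}

# Usage:
#   mbps 128 8.64   # local upload, sequential
#   mbps 128 42     # one-way replication A->B
#   mbps 128 72     # A<->B and A<->C
#   mbps 128 125    # full mesh, A->B
```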

I'm sure it's not about waiting for the network or disk; it seems to be a pure resource usage problem. Is this known? Will it be fixed?


Thanks,
