Hi,

It sounds like every write you make goes to every node (whether directly or via replication), which explains the lack of improvement (though your fault tolerance will be amazing!).
CouchDB 1.x is not a clustered solution, but 2.0 will be; that will give you horizontal scalability (following the well-known Dynamo model).

B.

> On 10 Apr 2015, at 20:39, Christopher D. Malon <[email protected]> wrote:
>
> [cross-post from Server Fault, where apparently nobody looked at it]
>
> Everyone raves about CouchDB's horizontal scaling, but I must be doing
> something wrong, because my simple test isn't getting faster performance
> with more servers.
>
> My backend lives in an EC2 VPC, so I'm in admin party mode in a private
> subnet, using plain HTTP without authorization. Each of the N backend
> instances has (N-1) `_replicator` entries per table, continuously pulling
> from the (N-1) peers. The architecture looks like
>
> [M x m1.small] REST client -> [1 x m1.small] HaProxy -> [N x m1.medium] CouchDB
>
> Because M is small, I've set up HaProxy with `balance roundrobin`;
> otherwise the requests end up going to a single instance.
>
> I test by (manually) launching a script on each of the M clients, just a
> split-second apart, to do the following:
>
> - Each client forks into 30 processes before connecting, so that roughly
>   30 * M requests can be simulated. Each client will establish its own
>   keep-alive HTTP connection to the load balancer.
>
> - Each forked process creates 100 tiny randomly named records and PUTs
>   them in a single table. A GET is done before each PUT to make sure there
>   is no previous revision (but with random names, there never is). I
>   measure the wallclock time before all processes finish on each of the M
>   clients.
>
> - About thirty seconds after all the PUTs finish, I do the same thing with
>   GETs. Each forked child GETs the records that it just created. I measure
>   wallclock time on each of the M clients again.
>
> I find that
>
> - the PUT job gets slower as N increases (2:21 for N=1, 3:43 for N=2)
> - the GET job takes the same amount of time for N=1,2,3 (0:16)
>
> I'm not surprised that PUT is slower, because each write now has to be
> sent to N places instead of one. However, I'm surprised that GET stays
> constant. My post-facto guess at an explanation is:
>
> - No time is saved on HTTP requests per machine, because the bottleneck
>   would be at the load balancer. And according to [AWS
>   documentation](http://docs.aws.amazon.com/opsworks/latest/userguide/workinglayers-load.html),
>   "one small instance [of HaProxy] is usually sufficient to handle all
>   application server traffic" (under what assumptions, I don't know).
>
> - No time is saved on disk access because everything is still hot in the
>   disk cache.
>
> How can I make this a realistic test of the number of clients and requests
> per second I can serve with a given setup? Should I fill the disk with
> trivial records in order to make cache hits less likely? Or can I already
> conclude that there's no benefit to horizontal scaling (and the only way
> to do better is to buy provisioned IOPS)?
>
> Thanks in advance for your help!
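To make the write-amplification point concrete: the (N-1) continuous pull replications per database that the question describes amount to one `_replicator` document per peer on each node, so every write eventually lands on every node. A minimal sketch of generating those documents in Python (the node names, port 5984, and database name `mydb` are placeholders, not values from this thread):

```python
def replicator_docs(me, peers, db="mydb"):
    """Build the (N-1) continuous pull-replication documents for one node.

    Each document tells CouchDB to continuously pull `db` from one peer,
    which is why a cluster of N such nodes applies every write N times.
    """
    return [
        {
            "_id": f"pull-{db}-from-{peer}",
            "source": f"http://{peer}:5984/{db}",
            "target": db,
            "continuous": True,
        }
        for peer in peers
        if peer != me  # a node does not replicate from itself
    ]

# Example: node1 in a 3-node setup needs 2 pull replications.
docs = replicator_docs("node1", ["node1", "node2", "node3"])
```

With N nodes that is N * (N-1) replication jobs per database, which is exactly why PUT time grows with N here rather than shrinking.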
