On Mar 15, 2010, at 3:09 PM, Matthew Sinclair-Day wrote:

> Hi folks,
>
> I've been putting couch 10.1 on Solaris 10/x86 through its paces lately
> trying to understand its replication performance and behavior, and have
> noticed the size of pre-compacted replicas can vary from one host to another.
>
> In one test, the origin has roughly 1.2 million documents taking up 263MB of
> storage, but replicated size varies from one server to another:
>
> origin   : 263MB
> replica 1: 0.6GB
> replica 2: 0.7GB
> replica 3: 1.0GB
>
> As expected the replicas are larger than the compacted origin database, but I
> didn't expect such size differences from replica to replica.
>
> After compacting the origin (again) and the replicas, their sizes settle down
> to:
>
> origin   : 262.4MB
> replica 1: 262.4MB
> replica 2: 262.5MB
> replica 3: 262.4MB
>
> I'm trying to understand what the reason could be for the variance in
> pre-compacted database sizes. All replicas are running the same build of
> CouchDB on the same version of Solaris, though replica 3 is running on newer
> hardware in a VMWare container.
>
> Matt

Hi Matt,

The variation in target DB file sizes is due to variations in the number and
size of the _bulk_docs calls used by the replicator. The DB size is inversely
correlated with the size of an average _bulk_docs POST, and the size of a POST
is governed by the relative speed of the source and the target.

If the target is fast and the replication is limited by the source throughput,
you'll see lots of very small calls to _bulk_docs. Conversely, if the target is
slow, the replicator will batch writes together in blocks of 1000 and send them
over.

In short, the faster your target server is, the larger the un-compacted target
DB will be. Looks like that VMWare container isn't slowing you down much at
all :)

Best,
Adam
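
For anyone who wants to see the inverse relationship in numbers, here is a toy model in Python. It is only a sketch under stated assumptions, not CouchDB's actual storage accounting: it assumes the database file is append-only and that every _bulk_docs call appends the new documents plus a fresh copy of one root-to-leaf B-tree path. The function names and the doc_bytes, node_bytes and fanout values are made up for illustration.

```python
def tree_depth(n_docs, fanout):
    """Depth of a B-tree with the given fanout that can index n_docs entries."""
    depth = 1
    capacity = fanout
    while capacity < n_docs:
        capacity *= fanout
        depth += 1
    return depth


def uncompacted_size(num_docs, batch_size, doc_bytes=200, node_bytes=4096, fanout=50):
    """Toy estimate of an append-only file after loading num_docs in batches.

    Assumption (hypothetical, for illustration only): each batch is one commit
    that appends the document bodies plus one rewritten root-to-leaf path.
    """
    size = 0
    written = 0
    while written < num_docs:
        n = min(batch_size, num_docs - written)
        written += n
        # Document bodies are appended once, whatever the batch size.
        size += n * doc_bytes
        # Per-commit overhead: a new copy of each node on the path to the root.
        size += tree_depth(written, fanout) * node_bytes
    return size


if __name__ == "__main__":
    for batch in (1, 10, 100, 1000):
        print(batch, uncompacted_size(100_000, batch))
```

In this model the document bytes are the same for every batch size, so the whole difference comes from the per-commit overhead. That overhead is the part compaction throws away, which fits the replicas all settling at about the same size afterwards.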
