On Mar 15, 2010, at 3:09 PM, Matthew Sinclair-Day wrote:

> Hi folks,
> 
> I've been putting CouchDB 0.10.1 on Solaris 10/x86 through its paces lately 
> trying to understand its replication performance and behavior, and have 
> noticed the size of pre-compacted replicas can vary from one host to another.
> 
> In one test, the origin has roughly 1.2 million documents taking up 263MB of 
> storage, but replicated size varies from one server to another:
> 
> origin   : 263MB
> replica 1: 0.6GB
> replica 2: 0.7GB
> replica 3: 1.0GB
> 
> As expected the replicas are larger than the compacted origin database, but I 
> didn't expect such size differences from replica to replica.
> 
> After compacting the origin (again) and the replicas, their sizes settle down 
> to:
> 
> origin   : 262.4MB
> replica 1: 262.4MB
> replica 2: 262.5MB
> replica 3: 262.4MB
> 
> I'm trying to understand what the reason could be for the variance in 
> pre-compacted database sizes.  All replicas are running the same build of 
> CouchDB on the same version of Solaris, though replica 3 is running on newer 
> hardware in a VMWare container.
> 
> Matt

Hi Matt, the variation in target DB file sizes is due to variations in the 
number and size of _bulk_docs calls used by the replicator.  The DB size is 
inversely correlated with the size of an average _bulk_docs POST, and the size 
of a POST is governed by the relative speed of the source and the target.  
Because the database file is append-only, every _bulk_docs call also appends 
fresh copies of the B-tree nodes it touches, so many small calls leave more 
superseded data in the file until it is compacted.  If the target is fast and 
the replication is limited by the source throughput, you'll see lots of very 
small calls to _bulk_docs.  Conversely, if the target is slow, the replicator 
will batch writes together in blocks of 1000 and send them over.
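
To make the batch-size effect concrete, here is a toy model in Python.  It is 
not CouchDB code and will not reproduce the sizes quoted above: the function 
name, the per-document size, the node size, and the fanout are all made-up 
assumptions.  It only illustrates the argument that an append-only file grows 
faster when the same documents arrive in many small batches, because each 
batch appends its own copies of the index nodes it touched.

```python
import math


def simulated_file_size(num_docs, batch_size, doc_bytes=200,
                        node_bytes=4096, fanout=100):
    """Toy estimate of bytes appended to an append-only DB file.

    All parameters are illustrative assumptions, not CouchDB internals:
    doc_bytes  - size of one stored document
    node_bytes - size of one rewritten index (B-tree) node
    fanout     - entries per index node
    """
    total = 0
    written = 0
    while written < num_docs:
        n = min(batch_size, num_docs - written)
        written += n
        # Document bodies are appended no matter how they are batched.
        total += n * doc_bytes
        # Leaf nodes touched by this batch (sequential inserts assumed).
        leaves = math.ceil(n / fanout)
        # Inner nodes on the path up to the root are rewritten once per batch.
        height = max(1, math.ceil(math.log(written, fanout)))
        total += (leaves + height) * node_bytes
    return total


if __name__ == "__main__":
    for batch in (1, 10, 100, 1000):
        size = simulated_file_size(100_000, batch)
        print(f"batch size {batch:>4}: {size / 1e6:.1f} MB (toy estimate)")
```

Running it prints one estimate per batch size.  In this model the document 
bytes are identical in every run, and only the per-batch index overhead 
changes, which is the part that compaction later reclaims.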

In short, the faster your target server is, the larger the un-compacted target 
DB will be.  Looks like that VMWare container isn't slowing you down much at 
all :)  Best,

Adam
