There was a known problem with replicating large attachments on 0.10.1, which has been fixed. There is still a remaining problem with replication crashes that I'm looking into. Progress is tracked here:
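For reference, the continuous pull replications described below are triggered by POSTing to the _replicate endpoint. This is only a sketch of the request shape; the host and database names are placeholders, and the actual curl call is commented out so the snippet just prints the body it would send:

```shell
# Sketch of a continuous pull replication request. "remotehost",
# "src_test1", and "dest_test1" are placeholder names, not taken
# from the report below.
BODY='{"source":"http://remotehost:5984/src_test1","target":"dest_test1","continuous":true}'
# curl -X POST http://localhost:5984/_replicate \
#      -H 'Content-Type: application/json' \
#      -d "$BODY"
echo "$BODY"
```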
http://issues.apache.org/jira/browse/COUCHDB-597

On Mon, Mar 29, 2010 at 13:43, Matthew Sinclair-Day <[email protected]> wrote:
> Hi folks,
>
> I have been digging further into replication performance on Couch 0.10.1
> and have noticed a few problems when replicating attachments. The main
> problem centers around continuous replication of large attachments.
>
> Below is a summary of two test scenarios. The tests were broken up by the
> size of the attachment, 2.6MB and 14K, which represent possible working
> sets in my system. Think PDF or JAR file, versus a small text document.
>
> I know some bug reports have been filed about replication performance in
> 0.10.1, but I am offering this report in case it helps further delineate
> the problem cases.
>
> For now, I am looking for guidance on how to plan around these problems.
> It seems the upcoming 0.11 release would address some of these problems,
> but will it address all of them? Is there an ETA on the 0.11 release?
>
> Matt
>
>
> 1. Large attachment replication test
>
> The source database contains 100 documents, each with one 2.6MB
> attachment. Total database size is 263.3MB. Document IDs follow the form
> "1testx", where x is a number from 1 to 100.
>
> remote := replication across two CouchDB servers on separate hosts on
> the LAN
> local := local replication on one CouchDB server
>
>
> a. Failed: remote, continuous pull.
>
> Set up a continuous "pull" replication. Source and target were on
> separate CouchDB servers on the LAN. Only one document, named "1test26",
> was replicated. In the Couch log file, GET requests for replicated
> documents were seen. After a few more minutes, the database grew to 13MB
> and contained two documents. The target CouchDB server then crashed
> without replicating any more documents.
>
>
> b. Failed: remote, continuous push.
>
> Same as (a).
>
>
> c. Failed: local, continuous, fully-qualified source and target names.
>
> Same as (a), but the target and source were on the same Couch server.
> The result was the same as (a).
>
> See the attachment to this email, "large_local_pull_erl_crash.dump", for
> the erl crash dump.
>
> In this case, the curl command specified fully-qualified source and
> target URLs:
>
>
> d. Success: local, NOT continuous, only source and target names.
>
> Source and target were on the same Couch server. Only the names of the
> source and target databases were specified; fully-qualified URLs were
> not used:
>
> curl -d '{"source":"src_test1", "target":"dest_test1", . . .}'
>
> "continuous" was set to "false".
>
> Unlike (c), this replication was a success. All documents were
> replicated.
>
>
> e. Failed: local, continuous, only source and target names.
>
> Replication was configured the same as (d), but "continuous" was set to
> "true".
>
> The result was the same as (a): initial replication of one document was
> followed, a few minutes later, by a crash.
>
>
> 2. Small attachment replication test
>
> The source database contains 100 documents, each with one 14K
> attachment. Total database size is 2.3MB.
>
> a. Success: remote, continuous pull.
> b. Success: remote, continuous push.
> c. Success: local, continuous, fully-qualified source and target names.
> d. Success: local, NOT continuous, only source and target names.
> e. Success: local, continuous, only source and target names.
>
> In other words, small attachments replicated without problems, with or
> without continuous replication. Given the relatively small size of the
> database in this test, would it be worthwhile to gin up a larger working
> set (~10K documents) in order to increase the database size?
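On the last question: a larger working set is easy to gin up with a loop over CouchDB's standalone attachment API (a PUT to /db/docid/attname creates the document if it doesn't exist). This is only a sketch; the host, database name, and attachment filename are placeholders, the ID scheme follows the report's "1testx" form, and the actual dd/curl calls are commented out so the loop just prints the URLs it would hit:

```shell
# Sketch: generate N documents, each with one dummy attachment.
# Placeholder host/db; bump N (e.g. to 10000) for a real run.
DB=http://localhost:5984/src_test1
N=5
for i in $(seq 1 $N); do
  URL="$DB/1test$i/attachment.bin"
  # dd if=/dev/urandom of=att.bin bs=1024 count=14   # 14K dummy file
  # curl -X PUT "$URL" -H 'Content-Type: application/octet-stream' \
  #      --data-binary @att.bin
  echo "$URL"
done
```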
