Hi Damien, I'd like to see these 220 commits rebased into a set of logical
patches against trunk. It'll make the review easier and will help future devs
track down any bugs that are introduced. Best,
Adam
On Jun 23, 2011, at 6:49 PM, Damien Katz wrote:
> Hi everyone,
>
> As many of you know, Filipe and I have been working on improving
> performance, especially write performance [1]. This work has been public in
> the Couchbase GitHub account since the beginning, and the non-Couchbase-specific
> changes are now isolated in [2] and [3].
> In [3] there’s an Erlang module used to test performance when
> writing and updating batches of documents concurrently; it was used,
> amongst other tools, to measure the performance gains. This module bypasses
> the network stack and the JSON parsing, which lets us see
> more clearly how significant the changes to couch_file, couch_db and
> couch_db_updater are.
>
> The main and most important change is asynchronous writes. The file module no
> longer blocks callers until the write calls complete. Instead it
> immediately replies to the caller with the position in the file where the data
> will be written. The data is then sent to a dedicated loop process
> that continuously writes the data it receives, from the couch_file
> gen_server, to disk (batching when possible). This allows callers (such
> as the db updater, for example) to issue write calls and keep doing other work
> (preparing documents, etc.) while the writes happen in parallel. After
> issuing all the writes, callers simply call the new ‘flush’ function in the
> couch_file gen_server, which blocks the caller until everything has
> actually been written to disk - normally this flush call doesn’t block
> the caller at all, or blocks it only briefly.
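The append-then-flush pattern described above can be sketched like this (a minimal illustration only, not the actual couch_file code — it uses two plain processes instead of the real gen_server, and the message names are made up). The front process replies with the file position immediately; a dedicated writer process does the disk I/O; flush is just a synchronous round-trip that drains the writer's queue:

```erlang
-module(async_writer_sketch).
-export([start/1, append/2, flush/1]).

start(Path) ->
    {ok, Fd} = file:open(Path, [write, raw, binary]),
    Writer = spawn_link(fun() -> writer_loop(Fd) end),
    Front = spawn_link(fun() -> front_loop(Writer, 0) end),
    {ok, Front}.

%% Returns the position where Bin WILL land, before it is on disk.
append(Front, Bin) when is_binary(Bin) ->
    Front ! {append, self(), Bin},
    receive {pos, Pos} -> {ok, Pos} end.

%% Blocks only until all previously issued writes are done.
flush(Front) ->
    Front ! {flush, self()},
    receive flushed -> ok end.

front_loop(Writer, Eof) ->
    receive
        {append, From, Bin} ->
            From ! {pos, Eof},          %% reply before the write happens
            Writer ! {write, Bin},
            front_loop(Writer, Eof + byte_size(Bin));
        {flush, From} ->
            Writer ! {sync, self()},    %% mailbox order drains pending writes
            receive synced -> From ! flushed end,
            front_loop(Writer, Eof)
    end.

writer_loop(Fd) ->
    receive
        {write, Bin} -> ok = file:write(Fd, Bin), writer_loop(Fd);
        {sync, From} -> From ! synced, writer_loop(Fd)
    end.
```

The flush correctness here rests on Erlang's per-sender message ordering: the writer sees the `sync` only after every `write` the front process sent before it.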
>
> There are other changes, such as avoiding 2 btree lookups per document ID
> (COUCHDB-1084 [4]), faster sorting in the updater (O(n log n) vs O(n^2)), and
> avoiding re-sorting already-sorted lists in the updater.
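The already-sorted case amounts to a cheap O(n) guard in front of lists:sort/1 — sketched here for illustration only (the updater's real code differs):

```erlang
-module(sort_sketch).
-export([sort_if_needed/1]).

%% Skip the O(n log n) sort when a single O(n) scan shows the
%% list is already in order.
sort_if_needed(L) ->
    case is_sorted(L) of
        true  -> L;
        false -> lists:sort(L)
    end.

is_sorted([A, B | _]) when A > B -> false;
is_sorted([_ | Rest]) -> is_sorted(Rest);
is_sorted(_) -> true.
```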
>
> Checking whether attachments are compressible was also moved into a new
> module/process. We verified this was taking significant CPU time when all or
> most of the documents to write/update have attachments - building the regexps
> and matching against them for every single attachment is surprisingly expensive.
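The compile-once idea can be sketched as a long-lived process that builds its regexp at startup and answers check requests (the pattern and message protocol below are assumptions for illustration, not the branch's actual code — CouchDB derives the real pattern from configuration):

```erlang
-module(compress_check_sketch).
-export([start/0, is_compressible/2]).

start() ->
    %% Compiled exactly once, instead of per attachment.
    %% Illustrative pattern; the real list of compressible MIME
    %% types comes from the server config.
    {ok, Re} = re:compile(<<"^text/|^application/(json|xml)">>, [caseless]),
    spawn_link(fun() -> loop(Re) end).

is_compressible(Pid, MimeType) ->
    Pid ! {check, self(), MimeType},
    receive {compressible, Bool} -> Bool end.

loop(Re) ->
    receive
        {check, From, MimeType} ->
            From ! {compressible, re:run(MimeType, Re) =/= nomatch},
            loop(Re)
    end.
```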
>
> There’s also a new couch_db:update_doc/s flag named ‘optimistic’ which
> changes the behaviour to write the document bodies before entering
> the updater and to skip some attachment-related checks (duplicate names, for
> example). This flag is not yet exposed to the HTTP API, but it could be, via
> an X-Optimistic-Write header in doc PUT/POST requests and _bulk_docs, for
> example. We’ve found this useful when the client knows the documents to
> write don’t exist yet in the database and we aren’t already I/O bound, such
> as when SSDs are used.
>
> We used relaximation, Filipe’s basho_bench-based tests [5] and the Erlang
> test module mentioned before [6, 7], exposed via HTTP. Here are some
> benchmark results.
>
>
> # Using the Erlang test module (test output)
>
> ## 1Kb documents, 10 concurrent writers, batches of 500 docs
>
> trunk before snappy was added:
>
> {"db":"load_test","total":100000,"batch":500,"concurrency":10,"rounds":10,"delayed_commits":false,"optimistic":false,"total_time_ms":270071}
>
> trunk:
>
> {"db":"load_test","total":100000,"batch":500,"concurrency":10,"rounds":10,"delayed_commits":false,"optimistic":false,"total_time_ms":157328}
>
> trunk + async writes (and snappy):
>
> {"db":"load_test","total":100000,"batch":500,"concurrency":10,"rounds":10,"delayed_commits":false,"optimistic":false,"total_time_ms":121518}
>
> ## 2.5Kb documents, 10 concurrent writers, batches of 500 docs
>
> trunk before snappy was added:
>
> {"db":"load_test","total":100000,"batch":500,"concurrency":10,"rounds":10,"delayed_commits":false,"optimistic":false,"total_time_ms":507098}
>
> trunk:
>
> {"db":"load_test","total":100000,"batch":500,"concurrency":10,"rounds":10,"delayed_commits":false,"optimistic":false,"total_time_ms":230391}
>
> trunk + async writes (and snappy):
>
> {"db":"load_test","total":100000,"batch":500,"concurrency":10,"rounds":10,"delayed_commits":false,"optimistic":false,"total_time_ms":190151}
>
>
> # basho_bench tests, via the public HTTP API
>
> ## batches of 1 1Kb doc, 50 writers, 5-minute run
>
> trunk: 147 702 docs written
> branch: 149 534 docs written
>
> ## batches of 10 1Kb docs, 50 writers, 5-minute run
>
> trunk: 878 520 docs written
> branch: 991 330 docs written
>
> ## batches of 100 1Kb docs, 50 writers, 5-minute run
>
> trunk: 1 627 600 docs written
> branch: 1 865 800 docs written
>
> ## batches of 1 2.5Kb doc, 50 writers, 5-minute run
>
> trunk: 142 531 docs written
> branch: 143 012 docs written
>
> ## batches of 10 2.5Kb docs, 50 writers, 5-minute run
>
> trunk: 724 880 docs written
> branch: 780 690 docs written
>
> ## batches of 100 2.5Kb docs, 50 writers, 5-minute run
>
> trunk: 1 028 600 docs written
> branch: 1 152 800 docs written
>
>
> # basho_bench tests, via the internal Erlang API
> ## batches of 100 2.5Kb docs, 50 writers, 5-minute run
>
> trunk: 3 170 100 docs written
> branch: 3 359 900 docs written
>
>
> # Relaximation tests
>
> 1Kb docs:
>
> http://graphs.mikeal.couchone.com/#/graph/4843dbdf8fa104783870094b83002a1a
>
> 2.5Kb docs:
>
> http://graphs.mikeal.couchone.com/#/graph/4843dbdf8fa104783870094b830022c0
>
> 4Kb docs:
>
> http://graphs.mikeal.couchone.com/#/graph/4843dbdf8fa104783870094b8300330d
>
>
> All the documents used for these tests can be found at:
> https://github.com/fdmanana/basho_bench_couch/tree/master/couch_docs
>
>
> Now some view indexing tests.
>
> # indexer_test_2 database
> (http://fdmanana.couchone.com/_utils/database.html?indexer_test_2)
>
> ## trunk
>
> $ time curl
> http://localhost:5984/indexer_test_2/_design/test/_view/view1?limit=1
> {"total_rows":1102400,"offset":0,"rows":[
> {"id":"00d49881-7bcf-4c3d-a65d-e44435eeb513","key":["dwarf","assassin",2,1.1],"value":[{"x":174347.18,"y":127272.8},{"x":35179.93,"y":41550.55},{"x":157014.38,"y":172052.63},{"x":116185.83,"y":69871.73},{"x":153746.28,"y":190006.59}]}
> ]}
>
> real 20m51.388s
> user 0m0.040s
> sys 0m0.000s
>
>
> ## branch async writes
>
> $ time curl
> http://localhost:5984/indexer_test_2/_design/test/_view/view1?limit=1
> {"total_rows":1102400,"offset":0,"rows":[
> {"id":"00d49881-7bcf-4c3d-a65d-e44435eeb513","key":["dwarf","assassin",2,1.1],"value":[{"x":174347.18,"y":127272.8},{"x":35179.93,"y":41550.55},{"x":157014.38,"y":172052.63},{"x":116185.83,"y":69871.73},{"x":153746.28,"y":190006.59}]}
> ]}
>
> real 15m17.908s
> user 0m0.008s
> sys 0m0.020s
>
>
> # indexer_test_3 database
> (http://fdmanana.couchone.com/_utils/database.html?indexer_test_3)
>
> ## trunk
>
> $ time curl
> http://localhost:5984/indexer_test_3/_design/test/_view/view1?limit=1
> {"total_rows":1102400,"offset":0,"rows":[
> {"id":"00d49881-7bcf-4c3d-a65d-e44435eeb513","key":["dwarf","assassin",2,1.1],"value":[{"x":174347.18,"y":127272.8},{"x":35179.93,"y":41550.55},{"x":157014.38,"y":172052.63},{"x":116185.83,"y":69871.73},{"x":153746.28,"y":190006.59}]}
> ]}
>
> real 21m17.346s
> user 0m0.012s
> sys 0m0.028s
>
> ## branch async writes
>
> $ time curl
> http://localhost:5984/indexer_test_3/_design/test/_view/view1?limit=1
> {"total_rows":1102400,"offset":0,"rows":[
> {"id":"00d49881-7bcf-4c3d-a65d-e44435eeb513","key":["dwarf","assassin",2,1.1],"value":[{"x":174347.18,"y":127272.8},{"x":35179.93,"y":41550.55},{"x":157014.38,"y":172052.63},{"x":116185.83,"y":69871.73},{"x":153746.28,"y":190006.59}]}
> ]}
>
> real 16m28.558s
> user 0m0.012s
> sys 0m0.020s
>
> We don’t see nearly as big improvements for single-write-per-request
> benchmarks as we do with bulk writes. This is due to the HTTP request
> overhead and our own inefficiencies at that layer. We still have lots of
> room for optimization at the networking layer.
>
> We'd like to merge this code into trunk by next Wednesday. Please
> respond with any improvements, objections or comments by then. Thanks!
>
> -Damien
>
>
> [1] -
> http://blog.couchbase.com/driving-performance-improvements-couchbase-single-server-two-dot-zero
> [2] - https://github.com/fdmanana/couchdb/compare/async_file_writes_no_test
> [3] - https://github.com/fdmanana/couchdb/compare/async_file_writes
> [4] - https://issues.apache.org/jira/browse/COUCHDB-1084
> [5] - https://github.com/fdmanana/basho_bench_couch
> [6] - https://github.com/fdmanana/couchdb/blob/async_file_writes/gen_load.sh
> [7] -
> https://github.com/fdmanana/couchdb/blob/async_file_writes/src/couchdb/couch_internal_load_gen.erl