Filipe,

Thanks for the explanation. I agree on the way forward and will apply your patch to the relevant branches with attribution.
B.

On 5 March 2012 07:41, Filipe David Manana <[email protected]> wrote:
> On Sun, Mar 4, 2012 at 9:45 AM, Bob Dionne <[email protected]> wrote:
>> yes, I was surprised by the 30% claim as my numbers showed it only getting
>> back to where we were with 1.1.x
>>
>> I think Bob's suggestion to get to the root code change that caused this
>> regression is important as it will help us assess all the other cases this
>> testing hasn't even touched yet
>
> The explanation I gave in the 1.2.0 second round vote identifies the
> reason, which is that the updater is (depending on timings) collecting
> smaller batches of map results, which makes the btree updates less
> efficient (besides higher number of btree updates). The patch
> addresses this by queuing a batch of map results instead of queuing
> map results one by one. Jan's tests and mine are evidence that this is
> valid in practice and not just theory.
>
> The original main goal of COUCHDB-1186 was to make the indexing of
> views that emit reasonably large (or complex in structure) map values
> more efficient.
> Here's an example using Jason's slow_couchdb script with wow.tpl and
> map function of "function(doc) {emit([doc.type, doc.category], doc);}":
>
> 1.1.x:
>
> fdmanana 07:04:12 ~/git/hub/slow_couchdb (master)> docs=200000 batch=5000 ./bench.sh wow.tpl
> Server: CouchDB/1.1.2a785d32f-git (Erlang OTP/R14B03)
> {"couchdb":"Welcome","version":"1.1.2a785d32f-git"}
>
> [INFO] Created DB named `db1'
> [INFO] Uploaded 5000 documents via _bulk_docs
> (....)
> [INFO] Uploaded 5000 documents via _bulk_docs
> Building view.
> {"total_rows":200000,"offset":0,"rows":[
> {"id":"00144af5-9f07-448e-af88-026674e3e3d0","key":["dwarf","assassin"],"value":{"_id":"00144af5-9f07-448e-af88-026674e3e3d0","_rev":"1-785fbf5e641f3d10fa083501ad82a9fe","data3":"Vl6BftQEWY6imvNs0FasOj2CrPCptP70z5d","ratio":1.6,"integers":[48028,3170,54066,95547,70643,23763,25804,33180,89061,35274,48244,91792,37936,11855],"category":"assassin","nested":{"dict":{"3XGVdTTF":31490,"SDxKa54e":40,"XIzUloRH":7,"5Mj9F4bp":192,"1sXfjgYf":1203,"XP5YSqhX":25461,"QJr0Xhxn":9941},"string1":"3Q4tvmhHwKvedKiRnoL6xUz","string2":"dWI1mrrAypRh","values":[33712,57371,88567,88361,70873,6327,17326,91004,41840,86257],"string3":"i7OGysnXvynz41VMQJ","coords":[{"x":65350.46,"y":103881.18},{"x":24180.14,"y":8474.9},{"x":88326.66,"y":43151.76},{"x":120199.77,"y":102270.29},{"x":191924.18,"y":74479.75}]},"level":21,"type":"dwarf","data1":"Vpkplo80LshlcjBE0ySJNNpfgDy2bu8byWrmZ44B","data2":"GnyNbos75Wxm1C5MLdOeXNniHamBjld70vHqoJnEtnlfekuPXJ"}}
> ]}
>
> real 2m49.227s
> user 0m0.006s
> sys 0m0.005s
>
>
> 1.2.x:
>
> fdmanana 07:13:30 ~/git/hub/slow_couchdb (master)> docs=200000 batch=5000 ./bench.sh wow.tpl
> Server: CouchDB/1.2.0 (Erlang OTP/R14B03)
> {"couchdb":"Welcome","version":"1.2.0"}
>
> [INFO] Created DB named `db1'
> [INFO] Uploaded 5000 documents via _bulk_docs
> (....)
> [INFO] Uploaded 5000 documents via _bulk_docs
> Building view.
> {"total_rows":200000,"offset":0,"rows":[
> {"id":"0005cd07-49f4-4a99-b506-acef948f2acc","key":["dwarf","assassin"],"value":{"_id":"0005cd07-49f4-4a99-b506-acef948f2acc","_rev":"1-4b418e69618bf11124a03e1a3845f071","data3":"T0W2JBUET9yzRXHfUqcUBwFhYGKh14YFVxk","ratio":1.6999999999999999556,"integers":[25658,7573,47779,43217,49586,57992,13549,90984,45253,49560,1643,64085,38381,62544],"category":"assassin","nested":{"dict":{"oWhW4jJ6":199,"EPSVtKtS":5638,"8WpzvD5x":73714,"stD9Ynfh":8924,"0qh5Nc1g":5994,"pBa5vJyy":18,"s25oAkRc":165270},"string1":"fNNHb8lxtcy7GpwSU3yRyaC","string2":"rilbiZM7yAaK","values":[49632,93665,73258,75675,4229,15742,16965,76825,22049,79829],"string3":"IwX09SiOLMSSyxffMB","coords":[{"x":179620.45000000001164,"y":11539.989999999999782},{"x":68483.820000000006985,"y":110559.19999999999709},{"x":67197.940000000002328,"y":96702.210000000006403},{"x":25469.869999999998981,"y":79049.490000000005239},{"x":157059.89999999999418,"y":34963.410000000003492}]},"level":6,"type":"dwarf","data1":"njpz38JSfz00p2Lc2Jv0dON7UfTljRgz0J2B7w7K","data2":"4hpsT2szDrssbUCHEirTzHOIhSxTd83i1FO5aNXgoGAfO2srH1"}}
> ]}
>
> real 1m51.989s
> user 0m0.006s
> sys 0m0.004s
>
>
> 1.2.x + patch:
>
> fdmanana 07:29:11 ~/git/hub/slow_couchdb (master)> docs=200000 batch=5000 ./bench.sh wow.tpl
> Server: CouchDB/1.2.0 (Erlang OTP/R14B03)
> {"couchdb":"Welcome","version":"1.2.0"}
>
> [INFO] Created DB named `db1'
> [INFO] Uploaded 5000 documents via _bulk_docs
> (....)
> [INFO] Uploaded 5000 documents via _bulk_docs
> Building view.
> {"total_rows":200000,"offset":0,"rows":[
> {"id":"0005cd07-49f4-4a99-b506-acef948f2acc","key":["dwarf","assassin"],"value":{"_id":"0005cd07-49f4-4a99-b506-acef948f2acc","_rev":"1-4b418e69618bf11124a03e1a3845f071","data3":"T0W2JBUET9yzRXHfUqcUBwFhYGKh14YFVxk","ratio":1.6999999999999999556,"integers":[25658,7573,47779,43217,49586,57992,13549,90984,45253,49560,1643,64085,38381,62544],"category":"assassin","nested":{"dict":{"oWhW4jJ6":199,"EPSVtKtS":5638,"8WpzvD5x":73714,"stD9Ynfh":8924,"0qh5Nc1g":5994,"pBa5vJyy":18,"s25oAkRc":165270},"string1":"fNNHb8lxtcy7GpwSU3yRyaC","string2":"rilbiZM7yAaK","values":[49632,93665,73258,75675,4229,15742,16965,76825,22049,79829],"string3":"IwX09SiOLMSSyxffMB","coords":[{"x":179620.45000000001164,"y":11539.989999999999782},{"x":68483.820000000006985,"y":110559.19999999999709},{"x":67197.940000000002328,"y":96702.210000000006403},{"x":25469.869999999998981,"y":79049.490000000005239},{"x":157059.89999999999418,"y":34963.410000000003492}]},"level":6,"type":"dwarf","data1":"njpz38JSfz00p2Lc2Jv0dON7UfTljRgz0J2B7w7K","data2":"4hpsT2szDrssbUCHEirTzHOIhSxTd83i1FO5aNXgoGAfO2srH1"}}
> ]}
>
> real 1m45.573s
> user 0m0.006s
> sys 0m0.004s
>
>
> Unless someone comes up with scenarios where 1.2.x with the patch is
> significantly slower than 1.1.x, I think we should close this and move
> to release 1.2.0.
>
> Thanks all for the testing.
>
>>
>> On Mar 3, 2012, at 5:25 PM, Bob Dionne wrote:
>>
>>> I ran some tests, using Bob's latest script. The first versus the second
>>> clearly show the regression. The third is the 1.2.x with the patch
>>> to couch_os_process reverted and it seems to have no impact. The last has
>>> Filipe's latest patch to couch_view_updater discussed in the
>>> other thread and it brings the performance back to the 1.1.x level.
>>>
>>> I'd say that patch is a +1
>>>
>>> 1.2.x
>>> real 3m3.093s
>>> user 0m0.028s
>>> sys 0m0.008s
>>> ==================
>>> 1.1.x
>>> real 2m16.609s
>>> user 0m0.026s
>>> sys 0m0.007s
>>> =================
>>> 1.2.x with patch to couch_os_process reverted
>>> real 3m7.012s
>>> user 0m0.029s
>>> sys 0m0.008s
>>> =================
>>> 1.2.x with Filipe's latest patch to couch_view_updater
>>> real 2m11.038s
>>> user 0m0.028s
>>> sys 0m0.007s
>>> On Feb 28, 2012, at 8:17 AM, Jason Smith wrote:
>>>
>>>> Forgive the clean new thread. Hopefully it will not remain so.
>>>>
>>>> If you can, would you please clone https://github.com/jhs/slow_couchdb
>>>>
>>>> And build whatever Erlangs and CouchDB checkouts you see fit, and run
>>>> the test. For example:
>>>>
>>>> docs=500000 ./bench.sh small_doc.tpl
>>>>
>>>> That should run the test and, God willing, upload the results to a
>>>> couch in the cloud. We should be able to use that information to
>>>> identify who you are, whether you are on SSD, what Erlang and Couch
>>>> build, and how fast it ran. Modulo bugs.
>>>
>>
>
>
>
> --
> Filipe David Manana,
>
> "Reasonable men adapt themselves to the world.
> Unreasonable men adapt the world to themselves.
> That's why all progress depends on unreasonable men."
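For anyone comparing the runs quoted in this thread, the following is a small Python sketch that converts the `real` values printed by `time` into seconds and expresses each run relative to its own 1.1.x baseline. The timings are copied from the messages above; the helper names (`parse_real`, `relative_to`) and the dictionary labels are invented for this illustration and are not part of slow_couchdb. The two sets of numbers come from different machines and possibly different templates, so they should only be compared within a set.

```python
import re


def parse_real(value):
    """Convert a `time` value such as '2m49.227s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value)
    if match is None:
        raise ValueError("unexpected time format: %r" % value)
    return int(match.group(1)) * 60 + float(match.group(2))


# `real` times from Filipe's wow.tpl runs (docs=200000, batch=5000).
FILIPE_RUNS = {
    "1.1.x": "2m49.227s",
    "1.2.x": "1m51.989s",
    "1.2.x + patch": "1m45.573s",
}

# `real` times from Bob Dionne's runs.
BOB_RUNS = {
    "1.1.x": "2m16.609s",
    "1.2.x": "3m3.093s",
    "1.2.x, couch_os_process patch reverted": "3m7.012s",
    "1.2.x + couch_view_updater patch": "2m11.038s",
}


def relative_to(runs, baseline):
    """Return each run's wall-clock time as a ratio of the baseline run."""
    base = parse_real(runs[baseline])
    return {name: parse_real(value) / base for name, value in runs.items()}


if __name__ == "__main__":
    for label, runs in (("Filipe", FILIPE_RUNS), ("Bob", BOB_RUNS)):
        print(label)
        for name, ratio in relative_to(runs, "1.1.x").items():
            # A ratio below 1.0 means the run was faster than 1.1.x.
            print("  %-40s %.3f" % (name, ratio))
```

A ratio above 1.0 marks a run slower than 1.1.x on the same machine, which is the regression Bob's first two runs show.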

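Filipe's explanation above (the updater collecting smaller batches of map results, and the patch queuing a batch instead of one result at a time) can be illustrated with a toy model. This is only a sketch in Python, not CouchDB's Erlang code: `ToyIndex`, `chunks`, and `build_index` are hypothetical names, and the model assumes the worst case in which the updater performs one index update per queue entry, whereas the real behaviour depends on timing.

```python
from collections import deque
from itertools import islice


def chunks(items, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


class ToyIndex:
    """Stand-in for the view btree: counts update calls, keeps rows sorted."""

    def __init__(self):
        self.rows = []
        self.update_calls = 0

    def update(self, key_values):
        # Each call models one btree update, whatever the batch size.
        self.update_calls += 1
        self.rows.extend(key_values)
        self.rows.sort()


def build_index(map_results, queue_entry_size):
    """Queue map results in entries of the given size, then drain the queue.

    Assumption: the updater applies exactly one queue entry per update.
    """
    queue = deque(chunks(map_results, queue_entry_size))  # mapper side
    index = ToyIndex()
    while queue:  # updater side
        index.update(queue.popleft())
    return index


if __name__ == "__main__":
    results = [(i % 7, i) for i in range(1000)]  # made-up (key, value) pairs
    one_by_one = build_index(results, 1)
    batched = build_index(results, 100)
    print("one by one:", one_by_one.update_calls, "updates")
    print("batched:   ", batched.update_calls, "updates")
```

Both variants end with the same rows; only the number of update calls differs, which is the effect the patch targets.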