On Sun, Mar 4, 2012 at 9:45 AM, Bob Dionne <[email protected]> wrote:
> yes, I was surprised by the 30% claim as my numbers showed it only getting
> back to where we were with 1.1.x
>
> I think Bob's suggestion to get to the root code change that caused this
> regression is important as it will help us assess all the other cases this
> testing hasn't even touched yet
The explanation I gave in the 1.2.0 second round vote thread identifies
the reason: depending on timings, the updater collects smaller batches
of map results. That means more btree updates, and each of them is less
efficient. The patch addresses this by queuing a batch of map results
instead of queuing map results one by one. Jan's tests and mine are
evidence that this holds in practice and not just in theory.
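To illustrate the batching effect, here is a minimal sketch in Python.
It is not the couch_view_updater code (which is Erlang); the function
names, the batch size of 100 and the "writer wakes up after every
enqueue" timing are all assumptions chosen to show the worst case.

```python
from collections import deque


def chunked(items, size):
    """Split items into consecutive lists of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def simulate_writer(queue_entries):
    """Hypothetical worst-case timing: the writer wakes up right after
    every enqueue and applies the single entry it finds as one btree
    update. Returns the size of each update performed."""
    queue = deque()
    update_sizes = []
    for entry in queue_entries:
        queue.append(entry)          # mapper side: enqueue one entry
        drained = queue.popleft()    # writer side: dequeue immediately
        update_sizes.append(len(drained))
    return update_sizes


# 1000 made-up map results (key/value pairs).
map_results = [("key%d" % i, "value%d" % i) for i in range(1000)]

# Queuing map results one by one: every queue entry holds one result.
one_by_one = simulate_writer([[r] for r in map_results])

# Queuing batches: every queue entry holds up to 100 results.
batched = simulate_writer(chunked(map_results, 100))

print(len(one_by_one), max(one_by_one))  # 1000 1
print(len(batched), min(batched))        # 10 100
```

Under these assumptions the same 1000 results cost 1000 single-row
updates when queued one by one, but only 10 updates of 100 rows each
when queued in batches.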
The original main goal of COUCHDB-1186 was to make the indexing of
views that emit reasonably large (or complex in structure) map values
more efficient.
Here's an example using Jason's slow_couchdb script with wow.tpl and
the map function "function(doc) {emit([doc.type, doc.category],
doc);}":
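For anyone who wants to try this view outside the script, a design
document carrying that map function could be built as in the Python
sketch below. The design document name "bench" and the view name
"by_type_category" are made up for illustration; bench.sh may well use
different ones.

```python
import json

# The map function quoted above, emitting the whole document as value.
MAP_FUNCTION = "function(doc) {emit([doc.type, doc.category], doc);}"

# Hypothetical names: "_design/bench" and "by_type_category" are
# assumptions, not taken from the slow_couchdb script.
design_doc = {
    "_id": "_design/bench",
    "views": {
        "by_type_category": {
            "map": MAP_FUNCTION,
        },
    },
}

# JSON body that could be PUT to /db1/_design/bench.
body = json.dumps(design_doc)
print(body)
```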
1.1.x:
fdmanana 07:04:12 ~/git/hub/slow_couchdb (master)> docs=200000 batch=5000 ./bench.sh wow.tpl
Server: CouchDB/1.1.2a785d32f-git (Erlang OTP/R14B03)
{"couchdb":"Welcome","version":"1.1.2a785d32f-git"}
[INFO] Created DB named `db1'
[INFO] Uploaded 5000 documents via _bulk_docs
(....)
[INFO] Uploaded 5000 documents via _bulk_docs
Building view.
{"total_rows":200000,"offset":0,"rows":[
{"id":"00144af5-9f07-448e-af88-026674e3e3d0","key":["dwarf","assassin"],"value":{"_id":"00144af5-9f07-448e-af88-026674e3e3d0","_rev":"1-785fbf5e641f3d10fa083501ad82a9fe","data3":"Vl6BftQEWY6imvNs0FasOj2CrPCptP70z5d","ratio":1.6,"integers":[48028,3170,54066,95547,70643,23763,25804,33180,89061,35274,48244,91792,37936,11855],"category":"assassin","nested":{"dict":{"3XGVdTTF":31490,"SDxKa54e":40,"XIzUloRH":7,"5Mj9F4bp":192,"1sXfjgYf":1203,"XP5YSqhX":25461,"QJr0Xhxn":9941},"string1":"3Q4tvmhHwKvedKiRnoL6xUz","string2":"dWI1mrrAypRh","values":[33712,57371,88567,88361,70873,6327,17326,91004,41840,86257],"string3":"i7OGysnXvynz41VMQJ","coords":[{"x":65350.46,"y":103881.18},{"x":24180.14,"y":8474.9},{"x":88326.66,"y":43151.76},{"x":120199.77,"y":102270.29},{"x":191924.18,"y":74479.75}]},"level":21,"type":"dwarf","data1":"Vpkplo80LshlcjBE0ySJNNpfgDy2bu8byWrmZ44B","data2":"GnyNbos75Wxm1C5MLdOeXNniHamBjld70vHqoJnEtnlfekuPXJ"}}
]}
real 2m49.227s
user 0m0.006s
sys 0m0.005s
1.2.x:
fdmanana 07:13:30 ~/git/hub/slow_couchdb (master)> docs=200000 batch=5000 ./bench.sh wow.tpl
Server: CouchDB/1.2.0 (Erlang OTP/R14B03)
{"couchdb":"Welcome","version":"1.2.0"}
[INFO] Created DB named `db1'
[INFO] Uploaded 5000 documents via _bulk_docs
(....)
[INFO] Uploaded 5000 documents via _bulk_docs
Building view.
{"total_rows":200000,"offset":0,"rows":[
{"id":"0005cd07-49f4-4a99-b506-acef948f2acc","key":["dwarf","assassin"],"value":{"_id":"0005cd07-49f4-4a99-b506-acef948f2acc","_rev":"1-4b418e69618bf11124a03e1a3845f071","data3":"T0W2JBUET9yzRXHfUqcUBwFhYGKh14YFVxk","ratio":1.6999999999999999556,"integers":[25658,7573,47779,43217,49586,57992,13549,90984,45253,49560,1643,64085,38381,62544],"category":"assassin","nested":{"dict":{"oWhW4jJ6":199,"EPSVtKtS":5638,"8WpzvD5x":73714,"stD9Ynfh":8924,"0qh5Nc1g":5994,"pBa5vJyy":18,"s25oAkRc":165270},"string1":"fNNHb8lxtcy7GpwSU3yRyaC","string2":"rilbiZM7yAaK","values":[49632,93665,73258,75675,4229,15742,16965,76825,22049,79829],"string3":"IwX09SiOLMSSyxffMB","coords":[{"x":179620.45000000001164,"y":11539.989999999999782},{"x":68483.820000000006985,"y":110559.19999999999709},{"x":67197.940000000002328,"y":96702.210000000006403},{"x":25469.869999999998981,"y":79049.490000000005239},{"x":157059.89999999999418,"y":34963.410000000003492}]},"level":6,"type":"dwarf","data1":"njpz38JSfz00p2Lc2Jv0dON7UfTljRgz0J2B7w7K","data2":"4hpsT2szDrssbUCHEirTzHOIhSxTd83i1FO5aNXgoGAfO2srH1"}}
]}
real 1m51.989s
user 0m0.006s
sys 0m0.004s
1.2.x + patch:
fdmanana 07:29:11 ~/git/hub/slow_couchdb (master)> docs=200000 batch=5000 ./bench.sh wow.tpl
Server: CouchDB/1.2.0 (Erlang OTP/R14B03)
{"couchdb":"Welcome","version":"1.2.0"}
[INFO] Created DB named `db1'
[INFO] Uploaded 5000 documents via _bulk_docs
(....)
[INFO] Uploaded 5000 documents via _bulk_docs
Building view.
{"total_rows":200000,"offset":0,"rows":[
{"id":"0005cd07-49f4-4a99-b506-acef948f2acc","key":["dwarf","assassin"],"value":{"_id":"0005cd07-49f4-4a99-b506-acef948f2acc","_rev":"1-4b418e69618bf11124a03e1a3845f071","data3":"T0W2JBUET9yzRXHfUqcUBwFhYGKh14YFVxk","ratio":1.6999999999999999556,"integers":[25658,7573,47779,43217,49586,57992,13549,90984,45253,49560,1643,64085,38381,62544],"category":"assassin","nested":{"dict":{"oWhW4jJ6":199,"EPSVtKtS":5638,"8WpzvD5x":73714,"stD9Ynfh":8924,"0qh5Nc1g":5994,"pBa5vJyy":18,"s25oAkRc":165270},"string1":"fNNHb8lxtcy7GpwSU3yRyaC","string2":"rilbiZM7yAaK","values":[49632,93665,73258,75675,4229,15742,16965,76825,22049,79829],"string3":"IwX09SiOLMSSyxffMB","coords":[{"x":179620.45000000001164,"y":11539.989999999999782},{"x":68483.820000000006985,"y":110559.19999999999709},{"x":67197.940000000002328,"y":96702.210000000006403},{"x":25469.869999999998981,"y":79049.490000000005239},{"x":157059.89999999999418,"y":34963.410000000003492}]},"level":6,"type":"dwarf","data1":"njpz38JSfz00p2Lc2Jv0dON7UfTljRgz0J2B7w7K","data2":"4hpsT2szDrssbUCHEirTzHOIhSxTd83i1FO5aNXgoGAfO2srH1"}}
]}
real 1m45.573s
user 0m0.006s
sys 0m0.004s
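Converting the three "real" times above to seconds makes the comparison
easier to read. A small Python sketch that does the arithmetic (the
helper name to_seconds is mine, and single runs like these say nothing
about run-to-run variance):

```python
import re


def to_seconds(real):
    """Convert a `time`-style duration such as '2m49.227s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", real)
    if match is None:
        raise ValueError("unexpected duration format: %r" % real)
    return int(match.group(1)) * 60 + float(match.group(2))


# "real" values copied from the three runs above.
runs = {
    "1.1.x": "2m49.227s",
    "1.2.x": "1m51.989s",
    "1.2.x + patch": "1m45.573s",
}

baseline = to_seconds(runs["1.1.x"])
for name, real in runs.items():
    secs = to_seconds(real)
    saved = (baseline - secs) / baseline * 100
    print("%-14s %8.3fs  %5.1f%% less wall-clock time than 1.1.x"
          % (name, secs, saved))
```

For this workload both 1.2.x builds finish roughly a third faster than
1.1.x, with the patched build a few seconds ahead of plain 1.2.x.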
Unless someone comes up with scenarios where 1.2.x with the patch is
significantly slower than 1.1.x, I think we should close this and move
on to releasing 1.2.0.
Thanks all for the testing.
>
> On Mar 3, 2012, at 5:25 PM, Bob Dionne wrote:
>
>> I ran some tests, using Bob's latest script. The first versus the second
>> clearly show the regression. The third is the 1.2.x with the patch
>> to couch_os_process reverted and it seems to have no impact. The last has
>> Filipe's latest patch to couch_view_updater discussed in the
>> other thread and it brings the performance back to the 1.1.x level.
>>
>> I'd say that patch is a +1
>>
>> 1.2.x
>> real 3m3.093s
>> user 0m0.028s
>> sys 0m0.008s
>> ==================
>> 1.1.x
>> real 2m16.609s
>> user 0m0.026s
>> sys 0m0.007s
>> =================
>> 1.2.x with patch to couch_os_process reverted
>> real 3m7.012s
>> user 0m0.029s
>> sys 0m0.008s
>> =================
>> 1.2.x with Filipe's latest patch to couch_view_updater
>> real 2m11.038s
>> user 0m0.028s
>> sys 0m0.007s
>> On Feb 28, 2012, at 8:17 AM, Jason Smith wrote:
>>
>>> Forgive the clean new thread. Hopefully it will not remain so.
>>>
>>> If you can, would you please clone https://github.com/jhs/slow_couchdb
>>>
>>> And build whatever Erlangs and CouchDB checkouts you see fit, and run
>>> the test. For example:
>>>
>>> docs=500000 ./bench.sh small_doc.tpl
>>>
>>> That should run the test and, God willing, upload the results to a
>>> couch in the cloud. We should be able to use that information to
>>> identify who you are, whether you are on SSD, what Erlang and Couch
>>> build, and how fast it ran. Modulo bugs.
>>
>
--
Filipe David Manana,
"Reasonable men adapt themselves to the world.
Unreasonable men adapt the world to themselves.
That's why all progress depends on unreasonable men."