Indexer speedup (for non-native view servers)
---------------------------------------------
Key: COUCHDB-1334
URL: https://issues.apache.org/jira/browse/COUCHDB-1334
Project: CouchDB
Issue Type: Improvement
Components: Database Core, JavaScript View Server, View Server Support
Reporter: Filipe Manana
Assignee: Filipe Manana
Fix For: 1.2
Attachments: 0001-More-efficient-view-updater-writes.patch,
0002-More-efficient-communication-with-the-view-server.patch,
master-0002-More-efficient-communication-with-the-view-server.patch
The following two patches significantly reduce view index generation/update time
and CPU consumption.
The first patch makes the view updater's batching more efficient by ensuring that
each btree bulk insertion adds/removes a minimum of N (=100) key/value pairs.
This also slows the growth of the index file, since fewer obsolete btree nodes
(old data) accumulate in it. Master/trunk already behaves this way in the new
indexer (by Paul Davis).
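The batching idea can be sketched roughly as follows. This is an illustration in Python, not the Erlang code from the patch: the names BatchingWriter, bulk_write and min_batch are hypothetical, and only the threshold of 100 comes from the description above.

```python
# Sketch of "accumulate at least N changes before each bulk write".
# Hypothetical names; the real view updater is written in Erlang.
MIN_BATCH = 100  # the N (=100) mentioned in the description


class BatchingWriter:
    def __init__(self, bulk_write, min_batch=MIN_BATCH):
        # bulk_write: callable that takes a list of (key, value) changes
        # and applies them to the index in a single bulk operation.
        self.bulk_write = bulk_write
        self.min_batch = min_batch
        self.pending = []

    def add(self, changes):
        """Queue changes; write only once enough have accumulated."""
        self.pending.extend(changes)
        if len(self.pending) >= self.min_batch:
            self.flush()

    def flush(self):
        """Write whatever is queued (call once at the end of an update)."""
        if self.pending:
            self.bulk_write(self.pending)
            self.pending = []


if __name__ == "__main__":
    sizes = []
    writer = BatchingWriter(lambda batch: sizes.append(len(batch)))
    for i in range(25):
        # Each map result contributes 10 key/value pairs.
        writer.add([((i, j), None) for j in range(10)])
    writer.flush()
    print(sizes)
```

Fewer, larger bulk writes mean fewer rewritten btree nodes per inserted pair, which is consistent with the smaller view file reported in the benchmarks.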
The second patch maximizes throughput with an external view server (such as
couchjs). It makes the pipe (Erlang port) communication between the Erlang VM
(couch_os_process) and the view server more efficient, since the two sides
spend less time blocked on reading from the pipe.
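One general way to keep both ends of a pipe from waiting on each other is to decouple writing from reading, so that requests are sent without first waiting for each reply. The sketch below shows that general technique in Python; it is an assumption for illustration and does not claim to reproduce what the patch does inside couch_os_process. The child process is a stand-in that upper-cases each line, and the names CHILD and pipelined_map are hypothetical.

```python
import subprocess
import sys
import threading

# Stand-in for an external view server: reads one line, writes one line.
CHILD = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n"
)


def pipelined_map(lines):
    """Send all lines to the child while concurrently reading its replies."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )

    def writer():
        # Write requests back to back instead of waiting for each reply.
        for line in lines:
            proc.stdin.write(line + "\n")
        proc.stdin.close()  # signals end of input to the child

    thread = threading.Thread(target=writer)
    thread.start()
    # Read replies as they arrive, in the order they were requested.
    results = [reply.rstrip("\n") for reply in proc.stdout]
    thread.join()
    proc.wait()
    return results


if __name__ == "__main__":
    print(pipelined_map(["doc1", "doc2", "doc3"]))
```

In a strict request/reply loop, each side sits idle while the other works; with the writer decoupled, the child usually has the next request waiting in the pipe when it finishes the current one.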
Some benchmarks follow.
Test database: http://fdmanana.iriscouch.com/test_db (1 million documents)
branch 1.2.x
$ echo 3 > /proc/sys/vm/drop_caches
$ time curl http://localhost:5984/test_db/_design/test/_view/test1
{"rows":[
{"key":null,"value":1000000}
]}
real 2m45.097s
user 0m0.006s
sys 0m0.007s
view file size: 333 MB
CPU usage:
$ sar 1 60
22:27:20 %usr %nice %sys %idle
22:27:21 38 0 12 50
(....)
22:28:21 39 0 13 49
Average: 39 0 13 47
branch 1.2.x + batch patch (first patch)
$ echo 3 > /proc/sys/vm/drop_caches
$ time curl http://localhost:5984/test_db/_design/test/_view/test1
{"rows":[
{"key":null,"value":1000000}
]}
real 2m12.736s
user 0m0.006s
sys 0m0.005s
view file size: 72 MB
branch 1.2.x + batch patch + os_process patch
$ echo 3 > /proc/sys/vm/drop_caches
$ time curl http://localhost:5984/test_db/_design/test/_view/test1
{"rows":[
{"key":null,"value":1000000}
]}
real 1m9.330s
user 0m0.006s
sys 0m0.004s
view file size: 72 MB
CPU usage:
$ sar 1 60
22:22:55 %usr %nice %sys %idle
22:23:53 22 0 6 72
(....)
22:23:55 22 0 6 72
Average: 22 0 7 70
master/trunk
$ echo 3 > /proc/sys/vm/drop_caches
$ time curl http://localhost:5984/test_db/_design/test/_view/test1
{"rows":[
{"key":null,"value":1000000}
]}
real 1m57.296s
user 0m0.006s
sys 0m0.005s
master/trunk + os_process patch
$ echo 3 > /proc/sys/vm/drop_caches
$ time curl http://localhost:5984/test_db/_design/test/_view/test1
{"rows":[
{"key":null,"value":1000000}
]}
real 0m53.768s
user 0m0.006s
sys 0m0.006s
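To compare the runs above, the "real" times can be converted to seconds and divided. The snippet below is a small helper for that arithmetic; the labels in TIMINGS are shorthand for the benchmark headings, and the function names are hypothetical.

```python
import re

# "real" wall-clock times copied from the benchmark runs above.
TIMINGS = {
    "1.2.x": "2m45.097s",
    "1.2.x + batch": "2m12.736s",
    "1.2.x + batch + os_process": "1m9.330s",
    "master": "1m57.296s",
    "master + os_process": "0m53.768s",
}


def to_seconds(real):
    """Convert the `time` output format, e.g. '2m45.097s', to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", real)
    if match is None:
        raise ValueError("unexpected time format: %r" % real)
    return int(match.group(1)) * 60 + float(match.group(2))


def speedup(baseline, patched):
    """Ratio of baseline time to patched time (greater than 1 is faster)."""
    return to_seconds(TIMINGS[baseline]) / to_seconds(TIMINGS[patched])


if __name__ == "__main__":
    pairs = [
        ("1.2.x", "1.2.x + batch"),
        ("1.2.x", "1.2.x + batch + os_process"),
        ("master", "master + os_process"),
    ]
    for baseline, patched in pairs:
        print("%s -> %s: %.2fx" % (baseline, patched, speedup(baseline, patched)))
```

By this measure, both patches together roughly halve the index build time on 1.2.x, and the os_process patch alone does about the same on master/trunk.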