Speedups in the view indexer
----------------------------
Key: COUCHDB-1186
URL: https://issues.apache.org/jira/browse/COUCHDB-1186
Project: CouchDB
Issue Type: Improvement
Reporter: Filipe Manana
Assignee: Filipe Manana
Fix For: 1.2
The patches at [1] and [2] make 2 distinct optimizations to the view indexer:
1) Use a NIF to implement couch_view:less_json/2;
2) Several small optimizations to couch_view_updater. The main one is to
decode the view server's JSON only in the updater's write process, which avoids
2 EJSON term copying phases (couch_os_process -> updater processes and writes
work queue).
[1] -
https://github.com/fdmanana/couchdb/commit/3935a4a991abc32132c078e908dbc11925605602
[2] -
https://github.com/fdmanana/couchdb/commit/cce325378723c863f05cca2192ac7bd58eedde1c
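For readers unfamiliar with it, couch_view:less_json/2 is the comparison used to order view keys. The following is a minimal Python sketch of that kind of comparison, based on CouchDB's documented view collation order (null, false, true, numbers, strings, arrays, objects). It is not the code from either patch: the function names are hypothetical, and strings are compared by code point here as a simplifying assumption, whereas CouchDB uses ICU collation.

```python
from functools import cmp_to_key


def _rank(value):
    """Position of a JSON value's type in the collation order."""
    if value is None:
        return 0
    if value is False:
        return 1
    if value is True:
        return 2
    if isinstance(value, (int, float)):
        return 3
    if isinstance(value, str):
        return 4
    if isinstance(value, list):
        return 5
    if isinstance(value, dict):
        return 6
    raise TypeError(f"not a JSON value: {value!r}")


def _cmp(a, b):
    """Three-way comparison for two values of the same primitive type."""
    return (a > b) - (a < b)


def compare_json(a, b):
    """Return -1, 0 or 1 depending on how a collates relative to b."""
    rank_a, rank_b = _rank(a), _rank(b)
    if rank_a != rank_b:
        return _cmp(rank_a, rank_b)
    if rank_a <= 2:  # null, false, true: equal to themselves
        return 0
    if rank_a in (3, 4):  # numbers; strings (code point order: an assumption)
        return _cmp(a, b)
    if rank_a == 5:  # arrays: element by element, shorter prefix first
        for x, y in zip(a, b):
            result = compare_json(x, y)
            if result:
                return result
        return _cmp(len(a), len(b))
    # objects: member by member in stored order, key first, then value
    for (key_a, val_a), (key_b, val_b) in zip(a.items(), b.items()):
        result = compare_json(key_a, key_b) or compare_json(val_a, val_b)
        if result:
            return result
    return _cmp(len(a), len(b))


def less_json(a, b):
    """True if a sorts strictly before b."""
    return compare_json(a, b) < 0


keys = [["dwarf", "assassin", 2, 1.1], "dwarf", 2, None, True, ["dwarf"]]
print(sorted(keys, key=cmp_to_key(compare_json)))
```

A comparison like this runs for every key inserted into or looked up in the view index, which is why moving it into native code can affect total indexing time.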
With these 2 patches applied, I've seen significant improvements in view
generation time (about 19% to 20% less wall-clock time in the runs below).
As examples, here are results for the databases at:
A) http://fdmanana.couchone.com/indexer_test_2
B) http://fdmanana.couchone.com/indexer_test_3
## Trunk
### Database A
$ time curl http://localhost:5985/indexer_test_2/_design/test/_view/view1?limit=1
{"total_rows":1102400,"offset":0,"rows":[
{"id":"00d49881-7bcf-4c3d-a65d-e44435eeb513","key":["dwarf","assassin",2,1.1],"value":[{"x":174347.18,"y":127272.8},{"x":35179.93,"y":41550.55},{"x":157014.38,"y":172052.63},{"x":116185.83,"y":69871.73},{"x":153746.28,"y":190006.59}]}
]}
real 19m46.007s
user 0m0.024s
sys 0m0.020s
### Database B
$ time curl http://localhost:5985/indexer_test_3/_design/test/_view/view1?limit=1
{"total_rows":1102400,"offset":0,"rows":[
{"id":"00d49881-7bcf-4c3d-a65d-e44435eeb513","key":["dwarf","assassin",2,1.1],"value":[{"x":174347.18,"y":127272.8},{"x":35179.93,"y":41550.55},{"x":157014.38,"y":172052.63},{"x":116185.83,"y":69871.73},{"x":153746.28,"y":190006.59}]}
]}
real 21m41.958s
user 0m0.004s
sys 0m0.028s
## Trunk + the 2 patches
### Database A
$ time curl http://localhost:5984/indexer_test_2/_design/test/_view/view1?limit=1
{"total_rows":1102400,"offset":0,"rows":[
{"id":"00d49881-7bcf-4c3d-a65d-e44435eeb513","key":["dwarf","assassin",2,1.1],"value":[{"x":174347.18,"y":127272.8},{"x":35179.93,"y":41550.55},{"x":157014.38,"y":172052.63},{"x":116185.83,"y":69871.73},{"x":153746.28,"y":190006.59}]}
]}
real 16m1.820s
user 0m0.000s
sys 0m0.028s
(versus 19m46s with trunk)
### Database B
$ time curl http://localhost:5984/indexer_test_3/_design/test/_view/view1?limit=1
{"total_rows":1102400,"offset":0,"rows":[
{"id":"00d49881-7bcf-4c3d-a65d-e44435eeb513","key":["dwarf","assassin",2,1.1],"value":[{"x":174347.18,"y":127272.8},{"x":35179.93,"y":41550.55},{"x":157014.38,"y":172052.63},{"x":116185.83,"y":69871.73},{"x":153746.28,"y":190006.59}]}
]}
real 17m22.778s
user 0m0.020s
sys 0m0.016s
(versus 21m41s with trunk)
Repeating these tests, always clearing my OS/fs cache before running them (via
`echo 3 > /proc/sys/vm/drop_caches`), I always get about the same relative
differences.
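To put the "real" times above on a common scale, the following small Python sketch converts them to seconds and computes the relative difference. The helper names (`parse_time`, `summarize`) are hypothetical and only for illustration; the durations are copied from the runs above.

```python
import re


def parse_time(text):
    """Convert a `time`-style duration such as '19m46.007s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Wall-clock ("real") times from the runs above: (trunk, trunk + 2 patches).
RUNS = {
    "A": ("19m46.007s", "16m1.820s"),
    "B": ("21m41.958s", "17m22.778s"),
}


def summarize(trunk, patched):
    """Return the percentage of time saved and the speedup factor."""
    before, after = parse_time(trunk), parse_time(patched)
    return {
        "saved_pct": 100 * (before - after) / before,
        "speedup": before / after,
    }


for name, (trunk, patched) in RUNS.items():
    stats = summarize(trunk, patched)
    print(
        f"Database {name}: {stats['saved_pct']:.1f}% less time, "
        f"{stats['speedup']:.2f}x faster"
    )
```

For these single runs this works out to roughly 18.9% less wall-clock time for database A and 19.9% for database B.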