[
https://issues.apache.org/jira/browse/COUCHDB-1334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13146441#comment-13146441
]
Paul Joseph Davis commented on COUCHDB-1334:
--------------------------------------------
@Randall, now that 1.8.5 is out, we should be able to lean on that for quite a
long time. If people want to package versions of SpiderMonkey trunk in their
distro, I'm disinclined to put too much immediate effort into supporting those
random versions.

@Filipe, reading the patch I think the idea is pretty good in general but I'd
implement it a bit differently. Firstly, the logic for whether or not
pipelining is used shouldn't be exposed to the client. That's just going to
entangle a whole bunch of API knowledge in the wrong place. I've been meaning
to go back and finish the refactoring of couch_(os|native)_process and
couch_query_servers, which would make this behavior possible.

The other part of this that might be interesting is the erlang:port_connect/2
call that can set the destination Pid for that port. I played with it a bit
during my refactoring work but couldn't get it to work quite right. I didn't
spend too much time figuring it out, but it might be a way to skip the
intermediary process and extra message passing.

http://erlang.org/doc/man/erlang.html#port_connect-2
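
To make the point about keeping the pipelining decision away from the client
more concrete, here is a minimal sketch in Python rather than Erlang. It is not
CouchDB code: FakeViewServer, QueryServerClient, map_docs and pipeline_depth
are hypothetical names made up for illustration, and the "server" is an
in-process queue standing in for a real port to couchjs. The caller only ever
sees map_docs; whether requests go out in lockstep or pipelined is a private
detail of the client object.

```python
from collections import deque


class FakeViewServer:
    """Hypothetical stand-in for an external view server behind a pipe."""

    def __init__(self, map_fun):
        self._map_fun = map_fun
        self._inbox = deque()
        self.max_pending = 0  # largest number of unanswered requests seen

    def send(self, doc):
        self._inbox.append(doc)
        self.max_pending = max(self.max_pending, len(self._inbox))

    def receive(self):
        # Replies come back in the same order the requests were sent.
        return self._map_fun(self._inbox.popleft())


class QueryServerClient:
    """Hypothetical client: pipelining is configured here, not by callers."""

    def __init__(self, server, pipeline_depth=1):
        self._server = server
        self._depth = max(1, pipeline_depth)

    def map_docs(self, docs):
        results = []
        in_flight = 0
        for doc in docs:
            self._server.send(doc)
            in_flight += 1
            # Once the window is full, read one reply before sending more.
            if in_flight >= self._depth:
                results.append(self._server.receive())
                in_flight -= 1
        # Drain whatever is still outstanding.
        while in_flight:
            results.append(self._server.receive())
            in_flight -= 1
        return results
```

With pipeline_depth=1 this degenerates to strict request/response; a larger
depth keeps several requests outstanding, and the calling code is identical in
both cases.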
> Indexer speedup (for non-native view servers)
> ---------------------------------------------
>
> Key: COUCHDB-1334
> URL: https://issues.apache.org/jira/browse/COUCHDB-1334
> Project: CouchDB
> Issue Type: Improvement
> Components: Database Core, JavaScript View Server, View Server Support
> Reporter: Filipe Manana
> Assignee: Filipe Manana
> Fix For: 1.2
>
> Attachments: 0001-More-efficient-view-updater-writes.patch,
> 0002-More-efficient-communication-with-the-view-server.patch,
> master-0002-More-efficient-communication-with-the-view-server.patch
>
>
> The following 2 patches significantly improve view index generation/update
> time and reduce CPU consumption.
>
> The first patch makes the view updater's batching more efficient by ensuring
> each btree bulk insertion adds/removes a minimum of N (=100) key/value
> pairs. This also slows the growth of the index file caused by old data
> (essentially old btree nodes). Master/trunk already behaves this way
> in the new indexer (by Paul Davis).
>
> The second patch maximizes throughput with an external view server (such
> as couchjs). It makes the pipe (Erlang port) communication between
> the Erlang VM (essentially couch_os_process) and the view server more
> efficient, since the 2 sides spend less time blocked on reading from the pipe.
>
> Here follow some benchmarks.
> test database at http://fdmanana.iriscouch.com/test_db (1 million documents)
> branch 1.2.x
> $ echo 3 > /proc/sys/vm/drop_caches
> $ time curl http://localhost:5984/test_db/_design/test/_view/test1
> {"rows":[
> {"key":null,"value":1000000}
> ]}
> real 2m45.097s
> user 0m0.006s
> sys 0m0.007s
> view file size: 333 MB
> CPU usage:
> $ sar 1 60
> 22:27:20  %usr  %nice  %sys  %idle
> 22:27:21    38      0    12     50
> (....)
> 22:28:21    39      0    13     49
> Average:    39      0    13     47
> branch 1.2.x + batch patch (first patch)
> $ echo 3 > /proc/sys/vm/drop_caches
> $ time curl http://localhost:5984/test_db/_design/test/_view/test1
> {"rows":[
> {"key":null,"value":1000000}
> ]}
> real 2m12.736s
> user 0m0.006s
> sys 0m0.005s
> view file size: 72 MB
> branch 1.2.x + batch patch + os_process patch
> $ echo 3 > /proc/sys/vm/drop_caches
> $ time curl http://localhost:5984/test_db/_design/test/_view/test1
> {"rows":[
> {"key":null,"value":1000000}
> ]}
> real 1m9.330s
> user 0m0.006s
> sys 0m0.004s
> view file size: 72 MB
> CPU usage:
> $ sar 1 60
> 22:22:55  %usr  %nice  %sys  %idle
> 22:23:53    22      0     6     72
> (....)
> 22:23:55    22      0     6     72
> Average:    22      0     7     70
> master/trunk
> $ echo 3 > /proc/sys/vm/drop_caches
> $ time curl http://localhost:5984/test_db/_design/test/_view/test1
> {"rows":[
> {"key":null,"value":1000000}
> ]}
> real 1m57.296s
> user 0m0.006s
> sys 0m0.005s
> master/trunk + os_process patch
> $ echo 3 > /proc/sys/vm/drop_caches
> $ time curl http://localhost:5984/test_db/_design/test/_view/test1
> {"rows":[
> {"key":null,"value":1000000}
> ]}
> real 0m53.768s
> user 0m0.006s
> sys 0m0.006s
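
For a quick comparison, the "real" timings quoted above can be turned into
speedup ratios. The sketch below only does that arithmetic; to_seconds,
timings and speedup are hypothetical helper names, and the figures are copied
unchanged from the benchmark output in the issue description. On those numbers,
both patches together come out at roughly 2.4x on 1.2.x, and the os_process
patch alone at roughly 2.2x on master/trunk.

```python
def to_seconds(real):
    """Convert a `time`-style value such as '2m45.097s' to seconds."""
    minutes, seconds = real.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


# Wall-clock ("real") times as reported in the issue description.
timings = {
    "1.2.x": "2m45.097s",
    "1.2.x + batch patch": "2m12.736s",
    "1.2.x + batch patch + os_process patch": "1m9.330s",
    "master/trunk": "1m57.296s",
    "master/trunk + os_process patch": "0m53.768s",
}


def speedup(baseline, patched):
    """Ratio of baseline time to patched time (higher means faster)."""
    return to_seconds(timings[baseline]) / to_seconds(timings[patched])


if __name__ == "__main__":
    pairs = [
        ("1.2.x", "1.2.x + batch patch"),
        ("1.2.x", "1.2.x + batch patch + os_process patch"),
        ("master/trunk", "master/trunk + os_process patch"),
    ]
    for baseline, patched in pairs:
        print(f"{patched}: {speedup(baseline, patched):.2f}x vs {baseline}")
```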