[ https://issues.apache.org/jira/browse/COUCHDB-620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798962#action_12798962 ]
Roger Binns commented on COUCHDB-620:
-------------------------------------
The objective criterion is that the CPU and/or I/O is saturated, or at least
close to saturation. (Doing anything less effectively adds gratuitous delays.
Operating systems can easily prioritize tasks once saturation has been reached,
and that is a better way of dealing with the issue. For example, would you add
gratuitous delays to a C compiler so that it has less impact on a machine?)
I tried current svn trunk to see what the state is with my data set (10 million
documents).
Generating the view now takes about 75 minutes, whereas before it took 4 hours.
(The machine was also upgraded this weekend from two to four cores, 2.8GHz
speed to 3.6GHz and double the disk bandwidth - new drives and striping, so the
numbers are not strictly comparable.)
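As a rough back-of-the-envelope check on those timings, the per-document cost
can be worked out from the figures in this comment and in the original report
quoted below (10 million documents, under 10 minutes to insert them). This is
only a sketch: the names are made up for illustration, and because the hardware
changed between the two runs the ratio is not a clean measure of the code
improvement.

```python
DOCS = 10_000_000  # size of the data set mentioned above


def ms_per_doc(total_seconds, docs=DOCS):
    """Average wall-clock milliseconds spent per document."""
    return total_seconds * 1000 / docs


before = ms_per_doc(4 * 3600)   # view build reported as taking 4 hours
after = ms_per_doc(75 * 60)     # view build on trunk: about 75 minutes
insert = ms_per_doc(10 * 60)    # upper bound: inserting took under 10 minutes
speedup = (4 * 3600) / (75 * 60)

print(f"view build before: {before:.2f} ms/doc")
print(f"view build after:  {after:.2f} ms/doc")
print(f"insert (at most):  {insert:.2f} ms/doc")
print(f"speedup: {speedup:.1f}x")
```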
During the view generation the CouchDB process took between 70% and 130% of a
CPU, usually in the 110% range. The couchjs process was hovering around 25% of
a CPU. Using iostat I could see between 5 and 30% of disk utilization, usually
closer to 30%. (iostat showed there was still plenty of disk access available.)
Quite simply, view generation still takes a very long time, though it is
somewhat more tolerable. About 2 1/2 cores of my machine sat idle during this
time, and the disk was idle 66% of the time. Trunk is consequently an
improvement, but nowhere near as good as it could be.
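The idle-capacity figures can be approximated from the typical readings above.
The sketch below assumes the "usual" values (110% for CouchDB, 25% for couchjs,
30% disk utilization) on the four-core machine. The result lands near the
rounded numbers quoted in the text rather than matching them exactly, since the
readings fluctuated during the run.

```python
CORES = 4            # machine after the upgrade
couchdb_cpu = 1.10   # CouchDB (Erlang) process: "usually in the 110% range"
couchjs_cpu = 0.25   # couchjs: "hovering around 25% of a CPU"
disk_util = 0.30     # iostat: "usually closer to 30%"


def idle_cores(cores, *loads):
    """Cores left unused, given per-process loads expressed in cores."""
    return cores - sum(loads)


busy = couchdb_cpu + couchjs_cpu
idle = idle_cores(CORES, couchdb_cpu, couchjs_cpu)
disk_idle = 1 - disk_util

print(f"busy cores: {busy:.2f} of {CORES}")
print(f"idle cores: {idle:.2f}")
print(f"overall CPU utilization: {busy / CORES:.0%}")
print(f"disk idle: {disk_idle:.0%}")
```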
I think this ticket should be re-opened, with the word 'extremely' removed.
To understand why I care so much: my views are my document access. That is how
the documents are found: by looking for strings in the view. If the string is
not found then the document may as well not exist. However, the documents
reference each other, so while view generation is happening it is possible to
have inconsistencies: the view claims a document doesn't exist while another,
linked document says that it does.
And since views are generated on each machine after replication I can't incur
the generation overhead on one machine and then replicate the results.
> Generating views is extremely slow - makes CouchDB hard to use with
> non-trivial number of docs
> ----------------------------------------------------------------------------------------------
>
> Key: COUCHDB-620
> URL: https://issues.apache.org/jira/browse/COUCHDB-620
> Project: CouchDB
> Issue Type: Improvement
> Components: Infrastructure
> Affects Versions: 0.10
> Environment: Ubuntu 9.10 64 bit, CouchDB 0.10
> Reporter: Roger Binns
> Assignee: Damien Katz
>
> Generating views is extremely slow. For example adding 10 million documents
> takes less than 10 minutes but generating some simple views on the same docs
> takes over 4 hours.
> Using top you can see that CouchDB (erlang) and couchjs between them cannot
> even saturate a single CPU let alone the I/O system. Under ideal conditions
> performance should be limited by cpu, disk or memory. This implies that the
> processes are doing simple things in lockstep, accumulating latencies in each
> process as well as in the communication between them, which, when multiplied
> by the number of documents, can amount to a lot.
> Some suggestions:
> * Run as many couchjs instances as there are processor cores and scatter work
> amongst them
> * Have some sort of pipelining in the erlang so that the moment the first
> byte of response is received from couchjs the data is sent for the next
> request (the JSON conversion, HTTP headers etc should all have been assembled
> already) to reduce latencies. Do whatever is most similar in couchjs (eg use
> separate threads to read requests, process them and write responses).
> * Use the equivalent of HTTP pipelining when talking to couchjs so that it
> always has a doc ready to work on rather than having to transmit an entire
> response and then wait for erlang to think and provide an entire new request
> A simple test of success is to have a database with a million or so documents
> with a trivial view and have view creation max out the CPU, memory or disk.
> Some things in CouchDB make this a particularly nasty problem. View data is
> not replicated so replicating documents can lead the view data by a large
> margin on the recipient database. This can lead to inconsistencies. You
> also can't expect users to then wait minutes (or hours) for a request to
> complete because the view generation got that far behind. (My own plans now
> are to not use replication and instead create the database file on another
> couchdb instance and then rsync the binary database file over instead!)
> Although stale=ok is available, you still have no idea if the response will
> be quick or take however long view generation does. (Sure I could add some
> sort of timeout and complicate the code but then what value do I pick? If I
> have a user waiting I want an answer ASAP or I have to give them some
> horrible error message. Taking a long wait and then giving a timeout is even
> worse!)
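
The suggestions in the quoted report (scatter work across one worker per core,
and keep every worker supplied with a document so it never waits for a full
round trip) can be illustrated with a small sketch. This is not how CouchDB or
couchjs is implemented: `map_doc`, `worker` and `build_view` are hypothetical
names, the map function is a stand-in for a trivial view, and threads are used
only to keep the example self-contained. In CPython, threads do not run Python
code in parallel, so a real implementation would use separate OS processes, as
the report proposes.

```python
import os
import queue
import threading


def map_doc(doc):
    """Hypothetical stand-in for a trivial map function: emit (word, doc id)."""
    return [(word, doc["_id"]) for word in doc["text"].split()]


def worker(inbox, outbox):
    # Pull the next document as soon as the previous one is done, so the
    # worker does not sit idle waiting for a request/response round trip.
    while True:
        doc = inbox.get()
        if doc is None:  # sentinel: no more documents
            break
        outbox.put(map_doc(doc))


def build_view(docs, workers=None, prefetch=64):
    """Scatter docs across workers and return the sorted view rows."""
    workers = workers or os.cpu_count() or 1
    inbox = queue.Queue(maxsize=prefetch)  # bounded: producer stays just ahead
    outbox = queue.Queue()                 # unbounded: workers never block here
    threads = [
        threading.Thread(target=worker, args=(inbox, outbox))
        for _ in range(workers)
    ]
    for t in threads:
        t.start()

    count = 0
    for doc in docs:
        inbox.put(doc)  # blocks only when `prefetch` docs are already queued
        count += 1
    for _ in threads:
        inbox.put(None)  # one sentinel per worker
    for t in threads:
        t.join()

    rows = []
    for _ in range(count):
        rows.extend(outbox.get())
    return sorted(rows)


if __name__ == "__main__":
    sample = [
        {"_id": "a", "text": "x y"},
        {"_id": "b", "text": "y"},
    ]
    for key, doc_id in build_view(sample, workers=2):
        print(key, doc_id)
```

The bounded input queue plays the role of pipelining: the producer keeps a few
documents queued ahead of the workers, without reading the whole data set into
memory at once.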