[ 
https://issues.apache.org/jira/browse/COUCHDB-620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799411#action_12799411
 ] 

Roger Binns commented on COUCHDB-620:
-------------------------------------

The latency in a non-pipelined implementation really adds up.  For example, an 
additional 1 millisecond of latency per document adds almost 3 hours to 
generation time with a 10 million document database.  Have a millisecond here, 
a millisecond there and pretty soon you are measuring generation times in 
hours  :-)
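
As a back-of-envelope check of that arithmetic, here is a minimal sketch.  It 
assumes the extra latency is paid once per document and that nothing overlaps 
(strict lockstep); the function name is made up for illustration.

```python
def added_generation_time_s(num_docs, added_latency_ms):
    """Extra wall-clock seconds if every document pays added_latency_ms in lockstep."""
    return num_docs * added_latency_ms / 1000.0


extra = added_generation_time_s(10_000_000, 1)
# 10 million docs at 1 ms each is 10,000 seconds of pure waiting.
print(f"{extra:.0f} s = {extra / 3600:.2f} hours")
```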

Since couchjs is not threaded, I don't see any way for it to answer requests in 
a different order than they were sent.  (OK, you can do it with some sort of 
internal state machine and non-blocking I/O like Python's Twisted, but I'm 
pretty sure couchjs is not doing that either.)

The only complication with pipelining is error handling.  For example, there 
may be 5 requests in the pipeline when the couchjs process crashes.  Any 
unanswered requests would then need to be resubmitted to a freshly spawned 
couchjs.
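
To make the resubmission idea concrete, here is a rough sketch.  It is not 
how CouchDB or couchjs is implemented: FakeWorker, WorkerCrashed and 
run_pipelined are hypothetical names, the "worker" is simulated in-process, 
and it relies on the worker answering strictly in the order requests were sent.

```python
from collections import deque


class WorkerCrashed(Exception):
    """Raised by the simulated worker to stand in for a process crash."""


class FakeWorker:
    """Hypothetical stand-in for a couchjs process; answers in the order sent."""

    def __init__(self, crash_after=None):
        self.inbox = deque()
        self.answered = 0
        self.crash_after = crash_after  # crash once this many answers were given

    def send(self, request):
        self.inbox.append(request)

    def receive(self):
        if self.crash_after is not None and self.answered >= self.crash_after:
            raise WorkerCrashed()
        self.answered += 1
        return self.inbox.popleft().upper()  # pretend this is the map result


def run_pipelined(requests, spawn_worker, depth=5):
    """Keep up to `depth` requests in flight; resubmit unanswered ones on a crash."""
    pending = deque(requests)  # not yet sent
    in_flight = deque()        # sent but unanswered, oldest first
    results = []
    worker = spawn_worker()
    while pending or in_flight:
        # Top up the pipeline so the worker always has something queued.
        while pending and len(in_flight) < depth:
            request = pending.popleft()
            worker.send(request)
            in_flight.append(request)
        try:
            results.append(worker.receive())
            in_flight.popleft()  # the oldest in-flight request is now answered
        except WorkerCrashed:
            # Put the unanswered requests back at the front, in original order,
            # and carry on with a freshly spawned worker.
            pending.extendleft(reversed(in_flight))
            in_flight.clear()
            worker = spawn_worker()
    return results


# First worker dies after 3 answers; the replacement runs to completion.
workers = iter([FakeWorker(crash_after=3), FakeWorker()])
docs = [f"doc{i}" for i in range(8)]
out = run_pipelined(docs, lambda: next(workers))
print(out)
```

A real implementation would also need a retry limit, so that a single document 
that always crashes the worker does not get resubmitted forever.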

(BTW Brian, 110% CPU consumption has nothing to do with afterburners.  Strictly 
speaking I meant core not CPU.  It just means multiple threads at the OS level 
were running and aggregate consumption between them amounted to 110% of a 
single core.  Or in other words CouchDB/beam.smp consumed 27.5% of the total 
compute resources that were available - 4 cores in one CPU.  CouchDB also seems 
to avoid using more than 3% of my available RAM.)

> Generating views is extremely slow - makes CouchDB hard to use with 
> non-trivial number of docs
> ----------------------------------------------------------------------------------------------
>
>                 Key: COUCHDB-620
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-620
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Infrastructure
>    Affects Versions: 0.10
>         Environment: Ubuntu 9.10 64 bit, CouchDB 0.10
>            Reporter: Roger Binns
>            Assignee: Damien Katz
>         Attachments: pipelining.jpg
>
>
> Generating views is extremely slow.  For example adding 10 million documents 
> takes less than 10 minutes but generating some simple views on the same docs 
> takes over 4 hours.
> Using top you can see that CouchDB (erlang) and couchjs between them cannot 
> even saturate a single CPU let alone the I/O system.  Under ideal conditions 
> performance should be limited by cpu, disk or memory.  This implies that the 
> processes are doing simple things in lockstep accumulating latencies in each 
> process as well as the communication between them which when multiplied by 
> the number of documents can amount to a lot.
> Some suggestions:
> * Run as many couchjs instances as there are processor cores and scatter work 
> amongst them
> * Have some sort of pipelining in the erlang so that the moment the first 
> byte of response is received from couchjs the data is sent for the next 
> request (the JSON conversion, HTTP headers etc should all have been assembled 
> already) to reduce latencies.  Do whatever is most similar in couchjs (eg use 
> separate threads to read requests, process them and write responses).
> * Use the equivalent of HTTP pipelining when talking to couchjs so that it 
> always has a doc ready to work on rather than having to transmit an entire 
> response and then wait for erlang to think and provide an entire new request
> A simple test of success is to have a database with a million or so documents 
> with a trivial view and have view creation max out the CPU, memory or disk.
> Some things in CouchDB make this a particularly nasty problem.  View data is 
> not replicated so replicating documents can lead the view data by a large 
> margin on the recipient database.  This can lead to inconsistencies.  You 
> also can't expect users to then wait minutes (or hours) for a request to 
> complete because the view generation got that far behind.  (My own plans now 
> are to not use replication and instead create the database file on another 
> couchdb instance and then rsync the binary database file over instead!)
> Although stale=ok is available, you still have no idea if the response will 
> be quick or take however long view generation does.  (Sure I could add some 
> sort of timeout and complicate the code but then what value do I pick?  If I 
> have a user waiting I want an answer ASAP or I have to give them some 
> horrible error message.  Taking a long wait and then giving a timeout is even 
> worse!)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
