Hello all,

Last weekend we ran a mashup site [0] for the Dutch Pinkpop music festival. A backend harvesting process stored all data in CouchDB, and we used couchdb-lucene for full-text indexing. (There are a lot of other moving parts; it's not all run on CouchDB.)

All individual parts of our setup performed very well, but we still had serious performance problems. They seem to trace back to the fact that CouchDB and external processes communicate via stdin/stdout (a pipe), which AFAIK is a communication channel that does not allow for concurrency. Queries to CouchDB were fast and query times on couchdb-lucene were fast, but the end result was still slow, because all queries to couchdb-lucene had to be serialized and go through the pipe to couchdb-lucene's handler script.

We have discussed this with Robert Newson of couchdb-lucene, and he suggested going around CouchDB and talking HTTP to couchdb-lucene directly. While this may work, I thought I'd join the dev list, bring up the issue here, and ask whether there might be a way to allow concurrent access to external processes. This was a performance bottleneck we hadn't accounted for, and I feel others may run into it as well at some point.

Nils.

[0] http://pinkpop.vpro.nl/
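
P.S. To make the serialization effect described above concrete, here is a minimal Python sketch. It is a toy model, not CouchDB or couchdb-lucene code: the lock stands in for the single stdin/stdout pipe, `time.sleep` stands in for one index lookup, and every name and number in it (`query_via_pipe`, `query_direct`, `QUERY_SECONDS`, `N_REQUESTS`) is a hypothetical chosen for illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

QUERY_SECONDS = 0.05  # assumed cost of one index lookup (made-up number)
N_REQUESTS = 8        # assumed number of simultaneous client requests

# Stand-in for the single pipe: only one request may be in flight at a time.
pipe_lock = threading.Lock()


def query_via_pipe(q):
    """Model a lookup that has to pass through the one shared pipe."""
    with pipe_lock:
        time.sleep(QUERY_SECONDS)
        return f"result:{q}"


def query_direct(q):
    """Model a lookup sent straight to a backend that serves requests concurrently."""
    time.sleep(QUERY_SECONDS)
    return f"result:{q}"


def run(handler):
    """Fire N_REQUESTS lookups at once and return (results, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=N_REQUESTS) as pool:
        results = list(pool.map(handler, range(N_REQUESTS)))
    return results, time.perf_counter() - start


piped_results, piped_time = run(query_via_pipe)
direct_results, direct_time = run(query_direct)

print(f"through the pipe: {piped_time:.2f}s")
print(f"direct:           {direct_time:.2f}s")
```

In this model each individual lookup is equally fast on both paths, yet the piped run takes roughly `N_REQUESTS` times as long as one lookup, because the requests queue up behind each other. That matches what we saw: fast components, slow end result.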
