On Wed, Aug 24, 2011 at 9:41 PM, Jason Smith <[email protected]> wrote:
> (Migrated to user@)
>
> On Thu, Aug 25, 2011 at 4:05 AM, Chris Stockton
> <[email protected]> wrote:
>> Hello,
>>
>> On Wed, Aug 24, 2011 at 1:53 PM, Paul Davis <[email protected]>
>> wrote:
>>> I bet you're hitting a bug we just recently fixed on trunk. Basically,
>>> there was a possibility that errors in some of the JS functions would
>>> end up causing a couchjs process to become unusable without removing
>>> it from the list. Eventually there wouldn't be any spots left and
>>> clients would get timeouts like you see.
>>>
>>> Patch is at [1]. If it doesn't apply cleanly, you really only need the
>>> bits from couch_os_process.erl and couch_query_servers.erl. The rest
>>> is just test code.
>>>
>>> https://github.com/apache/couchdb/commit/95da6f6f4246d2e8e86a3cf92ddf6487e46c10e9
>>>
>>
>> Right after sending this I finally saw what the issue was, our bug
>> report here:
>> https://issues.apache.org/jira/browse/COUCHDB-1257?focusedCommentId=13090484#comment-13090484
>> as a side effect was leaving lingering processes ultimately leading to
>> instability of couch.
>>
>> We are working to patch our reduce issue, and will look at applying
>> that commit perhaps once it hits mainline couch?
>
> Chris, I'm glad those bugs are identified and will be fixed soon. But
> in the meantime, perhaps you can change your code to add robustness? I
> can think of two ideas.
>
> 1. CouchDB has an odd, idiosyncratic, feature where GET queries
> produce side-effects. From HTTP, there are no side-effects, but as you
> can see, GETting a view can spawn couchjs processes and write files to
> the disk. Perhaps you could add ?stale=ok to all of your queries used
> in production. To my knowledge, stale=ok guarantees that couchjs will
> not be involved in servicing that query. This protects your users from
> seeing map/reduce errors. The down-side is that you must of course
> query the views yourself to keep them current.
>

No, couchjs will still be required to service reductions. stale=ok just
means you don't have to wait for a possibly lengthy view build.

Also, "GETs cause side effects" is a bit misleading, I think. A GET on a
view by default tries to return data from something equal to or newer
than the current database update_seq. If the current view state hasn't
caught up to the db's update_seq, it waits. Classifying this as "GETs
have side effects" is really like saying "a GET to a Rails app causes
side effects because you have to wait for the template to render."

For those familiar with the internals, it's quite true that if the
indexer isn't running, a new GET request will trigger an update. But say
we change that so that view updates are tried every N seconds, and a GET
itself will never trigger an update but might endure a delay of N-1
seconds before anything starts happening. You'd be hard pressed to say
that the "GET caused side effects" in that case, yet the observable
behavior is the same: "sometimes it takes a while."

> 2. Design documents should never be published (used in production)
> until their views are fully built. This is not a CouchDB bug, but
> rather a lack of tooling. The technique is pretty simple. Publish your
> design document under an alternative id: _design/example_staging.
> Next, query the views (which you are conveniently already doing in
> step 1!). When the views are fresh, with no bugs and everything looks
> good, query with HTTP COPY to promote _design/example_staging to
> _design/example. That is an atomic software upgrade; and views will be
> ready instantly. Not bad!
>

"Should never be ..." is a bit too prescriptive for my taste. Bottom
line: views can take a nontrivial amount of time to build. Beware.
That said, the advice to pre-build views and then promote them is spot
on if you anticipate long view builds.

> Perhaps these ideas will help you work around your bugs. IMO, they are
> good general policies anyway.
>
> --
> Iris Couch
>
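
For what it's worth, here is a rough sketch of the two requests discussed
above: a view query with stale=ok, and the HTTP COPY that promotes
_design/example_staging to _design/example. It only builds the requests
and sends nothing. The host, the database name "mydb", the view name
"by_date", and both helper function names are made up for illustration.
The Destination header (with ?rev= when the target document already
exists) is how CouchDB's COPY is documented, but please check it against
the version you run. Keep in mind that stale=ok only skips the wait for a
view build; as noted above, it does not keep couchjs out of reductions.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def view_url(base, db, ddoc, view, stale_ok=False, **params):
    """Build a view query URL (hypothetical helper; no network access)."""
    if stale_ok:
        # Return whatever the index currently holds instead of waiting
        # for it to catch up with the database update_seq.
        params["stale"] = "ok"
    path = "{}/{}/_design/{}/_view/{}".format(
        base.rstrip("/"),
        quote(db, safe=""),
        quote(ddoc, safe=""),
        quote(view, safe=""),
    )
    query = urlencode(sorted(params.items()))
    return path + "?" + query if query else path


def promote_request(base, db, staging, live, live_rev=None):
    """Build the COPY request that promotes a staging design document.

    live_rev is the current revision of the live design document; it is
    assumed to be required when the destination already exists.
    """
    destination = "_design/" + live
    if live_rev:
        destination += "?rev=" + live_rev
    url = "{}/{}/_design/{}".format(
        base.rstrip("/"), quote(db, safe=""), quote(staging, safe="")
    )
    return Request(url, method="COPY", headers={"Destination": destination})


if __name__ == "__main__":
    print(view_url("http://localhost:5984", "mydb", "example_staging",
                   "by_date", stale_ok=True))
    req = promote_request("http://localhost:5984", "mydb",
                          "example_staging", "example")
    print(req.get_method(), req.full_url, req.get_header("Destination"))
```

To actually send the COPY you would hand the request to
urllib.request.urlopen(), after querying the staging views without
stale=ok until they are fully built.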
