On Tue, Mar 12, 2013 at 2:01 AM, Jeff Charette <[email protected]> wrote:
> What is your CouchDB host preference? Here has been my experience, which
> leaves me at a loss for hosted services.
>
> Cloudant
> - doesn't support newest couch techniques like require and I can't find a
> tutorial to port my couch app.
>
> Iriscouch (currently using)
> - I have nothing but love for these guys, but have had a lot of issues
> lately. I've requested an upgrade with no response unfortunately.
> - they are on 1.2.1 which would be great, but 1.2.1 has a big issue which
> has been fixed for 1.2.2
> https://issues.apache.org/jira/browse/COUCHDB-1651

Thanks for your love. Regarding Iris Couch, I am biased; but I myself have nothing but love for the people at Cloudant, too. Of course, ultimately, you don't need people, you need the stuff they make and do (i.e. CouchDB service).

You are right that we have had issues lately. We've always had random failures; but this is the first time things have gotten bad enough that general users felt prolonged slowness or unavailability. Long story short: these issues are behind us and we are back to our well-known quality of service.

I thought our failure would be a boring story, but maybe I'll tell it anyway. The big problem was that we failed to support people, not that we failed to run software.

Do you know how lots of stuff runs just fine from 0% to about 90% or 95% capacity, then collapses horribly (e.g. memory, filesystems, disk I/O)? We experienced a similar collapse with customer support.

For the past two weeks, due to vacations and traveling engineers, we were doing less regular maintenance than usual. Then, also randomly, a few machines crashed badly. As a sysadmin I like CouchDB, because only safe operations are allowed. (For example, CouchDB has no JOINs, therefore every read operation is guaranteed to complete in logarithmic time.) That is usually the situation; however, there is still the occasional memory leak or out-of-control process or whatever.
Anyway, we exhausted memory on several machines, which crashed many people's couches. That's fine; but the real collapse happened when everybody began to inquire about their server. Fixing stuff over SSH is quick, but supporting people takes much more time.

When we saw the support volume spike, I decided to enter triage mode: make a priority list of technical and personal obligations and work from the top down. All software has real-time constraints. In fact, all human activity has real-time constraints. Right? Right? Hello? Hello! Can you hear me? After a certain time, if something is not done, it may as well never be done. That is how I approached our support load.

I have learned from many trusted advisors (Hi, Jan and Noah and everyone!) that "support load" is a terrible phrase. CPU load is CPU load; but "support load" is people. So, I have learned my lesson, and we are now working through the entire backlog.

Some people emailed to tell us never mind, they had moved to Cloudant. I think they wanted to twist the knife a bit, to blow off steam. Okay, but that put them near the bottom of our priority list (they are no longer using the service; outstanding issues are moot). However, they are still people. We will be emailing even them, to say the issue has been resolved. If you ask a question, I should respond; otherwise it's rude.

--
Iris Couch
