Hrm, well, this certainly isn't normal, and I've not seen this behaviour at Cloudant either. Not that this helps you much...
I would suggest recording /_stats periodically, but nothing in there seems helpful for this; it doesn't show the current connection count.

Are all requests mediated by Tornado? Is it possible to see if these symptoms manifest without it interceding?

B.

> On 23 Aug 2016, at 21:02, Ian Danforth <[email protected]> wrote:
>
> Robert,
>
> I haven't done this comparison, was hoping to avoid it :) Couch.log has
> nothing but 200's up until the point where it becomes unresponsive.
>
> Ian
>
> On Tue, Aug 23, 2016 at 12:56 PM, Robert Samuel Newson <[email protected]>
> wrote:
>
>>
>> Are you able to compare with couchdb 1.6.1 (1.5.0 is fairly old, though I
>> don't recall a fix between 1.5.0 and 1.6.1 that matches your symptoms)?
>>
>> couch.log has nothing interesting to say leading up to this point?
>>
>>
>>> On 23 Aug 2016, at 20:45, Ian Danforth <[email protected]> wrote:
>>>
>>> Robert,
>>>
>>> Yes. We had encountered ulimit issues previously with our setup because we
>>> weren't properly closing client connections to couch, so we are well aware
>>> of that possibility. Thanks again for continuing to think about this!
>>>
>>> On Tue, Aug 23, 2016 at 12:42 PM, Robert Samuel Newson <[email protected]>
>>> wrote:
>>>
>>>> sorry for not getting back to you.
>>>>
>>>> Do you have enough monitoring here to rule out things like hitting a file
>>>> descriptor ulimit or ephemeral ports?
>>>>
>>>>
>>>>> On 9 Aug 2016, at 00:05, Ian Danforth <[email protected]> wrote:
>>>>>
>>>>> Robert,
>>>>>
>>>>> Sorry that error code is thrown by tornado-couch (a python library we use
>>>>> to make async requests to couch from our Tornado server). That is the error
>>>>> of last resort when no response is forthcoming.
>>>>>
>>>>> curl (or any) requests to couch endpoints simply do not return.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Ian
>>>>>
>>>>> On Mon, Aug 8, 2016 at 3:59 PM, Robert Samuel Newson <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> I am pretty sure couchdb does not send 599 status code. can you show a
>>>>>> full request/response please (a curl -v would do it)?
>>>>>>
>>>>>>> On 8 Aug 2016, at 23:32, Ian Danforth <[email protected]> wrote:
>>>>>>>
>>>>>>> Hello!
>>>>>>>
>>>>>>> First post to the list so please forgive any faux-pas. I'm running couchdb
>>>>>>> 1.5.0 on Ubuntu 14.04 and I am consistently running into a state where,
>>>>>>> after 14 days of uptime on the computer, couchdb becomes unresponsive.
>>>>>>> Requests to the db start queueing up until all I'm getting from the python
>>>>>>> client are 599 relax exceptions.
>>>>>>>
>>>>>>> couch.couch.CouchException: HTTP 599: Unknown
>>>>>>>
>>>>>>> Asking the service to stop and restart does not recover and
>>>>>>> /var/log/couchdb/couch.log doesn't have any errors.
>>>>>>>
>>>>>>> I have been unable to find reports of similar errors in various Google
>>>>>>> searches, so I thought I'd ask here. Additional debugging and logging
>>>>>>> suggestions are welcome!
>>>>>>>
>>>>>>> --
>>>>>>> Ian Danforth
>>>>>>> Fetch Robotics
>>>>>>> Lead Robotics Engineer
>>>>>>> 650-391-4467
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Ian Danforth
>>>>> Fetch Robotics
>>>>> Lead Robotics Engineer
>>>>> 650-391-4467
>>>>
>>>>
>>>
>>>
>>> --
>>> Ian Danforth
>>> Fetch Robotics
>>> Lead Robotics Engineer
>>> 650-391-4467
>>
>>
>
>
> --
> Ian Danforth
> Fetch Robotics
> Lead Robotics Engineer
> 650-391-4467
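
For reference, the suggestion at the top of this message to record /_stats periodically could look roughly like the sketch below. It is only an illustration, not something taken from the setup discussed in this thread: the URL assumes CouchDB listening on its default local port 5984, and the output file name, the 60-second interval and the function names (fetch_stats, record_sample, poll) are all made up for the example. It bypasses Tornado and talks to CouchDB directly with the Python standard library, and it records a failure to respond as well, which may help pin down when the node stops answering.

```python
import json
import time
import urllib.request

# Assumption: CouchDB on the default local port; adjust for your setup.
STATS_URL = "http://127.0.0.1:5984/_stats"


def fetch_stats(url=STATS_URL, timeout=10):
    """Fetch and decode the JSON document served at /_stats."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def record_sample(out, fetch=fetch_stats, now=time.time):
    """Append one timestamped sample, or the error seen, as a JSON line."""
    record = {"ts": now()}
    try:
        record["stats"] = fetch()
    except Exception as exc:
        # An unresponsive server shows up here as a timeout or refused
        # connection; keep the error so the moment of failure is on record.
        record["error"] = repr(exc)
    out.write(json.dumps(record) + "\n")
    out.flush()
    return record


def poll(path="couch_stats.jsonl", interval=60):
    """Sample forever, one JSON line per interval, until interrupted."""
    with open(path, "a") as out:
        while True:
            record_sample(out)
            time.sleep(interval)
```

Calling poll() from a small script left running on the CouchDB host produces one line per minute in couch_stats.jsonl. As noted above, /_stats does not report the current connection count, so this would need to sit alongside whatever file descriptor and port monitoring is already in place.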
