Do you have the full stacktrace from couch.log?
On 17 October 2011 13:04, Paolo Negri <[email protected]> wrote:
> On Mon, Oct 17, 2011 at 1:57 PM, Robert Newson <[email protected]> wrote:
>> Compaction is an online process, there should be no expectation of 500
>> responses before, during, or after compaction.
>>
>> In this case, it seems the couch_server process is blocked for more
>> than five seconds performing I/O and the gen_server:call from
>> couch_server:open times out. This timeout has been increased to
>> infinity since 1.0.0.
>>
>> What version are you running?
>
> I compiled master from github here are the details
>
> "CouchDB/1.3.0a-74613f5-git (Erlang OTP/R14B04)"},
>
> The reason to use master is that we wanted to benefit from the
> ejson/snappy adoption so I guess I could actually also use the 1.2
> branch
>
> Paolo
>
>>
>> B.
>>
>> On 17 October 2011 12:05, Martin Hewitt <[email protected]> wrote:
>>> I disagree, it makes sense as the 5xx error code range is for responses
>>> where the server can't fulfil a well-formed, valid client request.
>>>
>>> Your GET is well-formed, but the server can't process it as it's working on
>>> the previous action, so a 500 is perfectly valid. Perhaps a 503 would be
>>> more accurate, but the 5xx prefix is certainly correct.
>>>
>>> Martin
>>>
>>> Sent from my iPhone
>>>
>>> On 17 Oct 2011, at 09:29, Paolo Negri <[email protected]> wrote:
>>>
>>>> I agree on the fact that what happens is pretty clear to explain, I
>>>> still thought it would be useful for the developers to know since
>>>> offering a 500 status code for a known system condition is probably
>>>> something that can be improved.
>>>>
>>>> Thanks,
>>>>
>>>> Paolo
>>>>
>>>> On Mon, Oct 17, 2011 at 10:24 AM, CGS <[email protected]> wrote:
>>>>> I am not developer, but it's quite logic, I may say. Once you started the
>>>>> compaction, your CouchDB is not responsive while the database is preparing
>>>>> for compaction. Triggering immediately GET, the web instance responds with
>>>>> status code 500 (internal server error, meaning unresponsive server in
>>>>> this case). So, nothing unusual in my opinion.
>>>>>
>>>>> Cheers,
>>>>> CGS
>>>>>
>>>>> On 10/17/2011 09:57 AM, Paolo Negri wrote:
>>>>>>
>>>>>> IO activity is not monitored, there's only one db on the couchdb
>>>>>> instance and the described job is the only activity executed on this
>>>>>> machine.
>>>>>> Delaying the first request on the database url by 30 seconds did
>>>>>> actually prevent the problem from happening again.
>>>>>> So the issue seems to happen specifically at the moment right after
>>>>>> compaction is started.
>>>>>> The database is about 7GB big once compressed, the server is hosted on
>>>>>> ec2 with the database directory placed on his own dedicated ephemeral
>>>>>> storage.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Paolo
>>>>>>
>>>>>> On Fri, Oct 14, 2011 at 9:05 PM, Paul Davis <[email protected]> wrote:
>>>>>>>
>>>>>>> Do you monitor IO activity or system responsiveness when you're doing
>>>>>>> this. I've seen some compactions wallop a system when it switches over
>>>>>>> due to removing large old files and such. It doesn't sound like this
>>>>>>> is big enough for that case but it might be something worth checking.
>>>>>>>
>>>>>>> On Fri, Oct 14, 2011 at 3:41 AM, Paolo Negri <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Dear list,
>>>>>>>>
>>>>>>>> We have a script that does the following (strictly sequentially)
>>>>>>>>
>>>>>>>> 1) update 300K docs in a db
>>>>>>>> 2) launch compaction of the db
>>>>>>>> 3) poll at a 30 sec frequency http://127.0.0.1:5984/database to know
>>>>>>>> when compaction completed
>>>>>>>>
>>>>>>>> Last night we got a timeout error during 3, we think that this might
>>>>>>>> be because the first polling (GET http://127.0.0.1:5984/database) is
>>>>>>>> done right after triggering compaction
>>>>>>>>
>>>>>>>> I thought the dev team might be interested in knowing that this is
>>>>>>>> happening
>>>>>>>>
>>>>>>>> There's no other activity on the couchdb instance other than what
>>>>>>>> described in this email.
>>>>>>>>
>>>>>>>> ERROR unexpectd response checking compaction db: {ok,"500",
>>>>>>>> [{"Server",
>>>>>>>> "CouchDB/1.3.0a-74613f5-git (Erlang OTP/R14B04)"},
>>>>>>>> {"Date",
>>>>>>>> "Fri, 14 Oct 2011 01:46:37 GMT"},
>>>>>>>> {"Content-Type",
>>>>>>>> "text/plain; charset=utf-8"},
>>>>>>>> {"Content-Length","350"},
>>>>>>>> {"Cache-Control",
>>>>>>>> "must-revalidate"}],
>>>>>>>>
>>>>>>>> <<"{\"error\":\"{timeout,{gen_server,call,[<0.21934.9>,{open_ref_count,<0.4090.13>}]}}\",\"reason\":\"{gen_server,call,\\n
>>>>>>>> [couch_server,\\n {open,<<\\\"backup\\\">>,\\n
>>>>>>>> [{user_ctx,\\n {user_ctx,null,\\n
>>>>>>>> [<<\\\"_admin\\\">>],\\n<<\\\"{couch_httpd_auth,
>>>>>>>> default_authentication_handler}\\\">>}}]},\\n infinity]}\"}\n">>}
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Paolo
>>>>>>>>
>>>>>>
>>>>>
>>>>
>>>> --
>>>> Engineering
>>>> http://www.wooga.com | phone +49-30-8962 5058 | fax +49-30-8964 9064
>>>>
>>>> wooga GmbH | Saarbruecker Str. 38 | 10405 Berlin | Germany
>>>> Sitz der Gesellschaft: Berlin; HRB 117846 B
>>>> Registergericht Berlin-Charlottenburg
>>>> Geschaeftsfuehrung: Jens Begemann, Philipp Moeser
>>>
>>
>
> --
> Engineering
> http://www.wooga.com | phone +49-30-8962 5058 | fax +49-30-8964 9064
>
> wooga GmbH | Saarbruecker Str. 38 | 10405 Berlin | Germany
> Sitz der Gesellschaft: Berlin; HRB 117846 B
> Registergericht Berlin-Charlottenburg
> Geschaeftsfuehrung: Jens Begemann, Philipp Moeser
>
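
For reference, here is a rough sketch of how the polling step (3) from the original script could be made tolerant of a transient 500 right after compaction is triggered. It is only an illustration, not what the original script does: the function names, the retry limit and the 60 second request timeout are all assumptions. The only CouchDB behaviour it relies on is that GET on the database URL returns a JSON object with a "compact_running" field. It waits one full interval before the first GET, which matches the 30 second delay that was reported to avoid the problem.

```python
import json
import time
import urllib.error
import urllib.request


def fetch_db_info(url):
    """GET the database URL; return (status, parsed JSON body or None)."""
    try:
        with urllib.request.urlopen(url, timeout=60) as resp:
            return resp.status, json.load(resp)
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; report only the status.
        return err.code, None


def wait_for_compaction(url, fetch=fetch_db_info, interval=30.0,
                        max_server_errors=3, sleep=time.sleep):
    """Poll until compact_running is false, tolerating a few 5xx responses.

    max_server_errors (an assumed limit) is the number of consecutive 5xx
    responses accepted before giving up.
    """
    server_errors = 0
    while True:
        # Sleep before every poll, including the first, so the first GET
        # does not land right after compaction was triggered.
        sleep(interval)
        status, info = fetch(url)
        if status == 200:
            if not info.get("compact_running", False):
                return info
            server_errors = 0
        elif 500 <= status < 600 and server_errors < max_server_errors:
            server_errors += 1
        else:
            raise RuntimeError("unexpected status %d from %s" % (status, url))
```

It would be called as wait_for_compaction("http://127.0.0.1:5984/database") once the POST to /database/_compact has returned. Socket-level timeouts are deliberately not caught here, so a server that stops answering altogether still surfaces as an error.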
