Do you have the full stacktrace from couch.log?

On 17 October 2011 13:04, Paolo Negri <[email protected]> wrote:
> On Mon, Oct 17, 2011 at 1:57 PM, Robert Newson <[email protected]> wrote:
>> Compaction is an online process, there should be no expectation of 500
>> responses before, during, or after compaction.
>>
>> In this case, it seems the couch_server process is blocked for more
>> than five seconds performing I/O and the gen_server:call from
>> couch_server:open times out. This timeout has been increased to
>> infinity since 1.0.0.
>>
>> What version are you running?
>
> I compiled master from GitHub; here are the details:
>
> "CouchDB/1.3.0a-74613f5-git (Erlang OTP/R14B04)"},
>
> The reason for using master is that we wanted to benefit from the
> ejson/snappy adoption, so I guess I could also use the 1.2
> branch.
>
> Paolo
>
>>
>> B.
>>
>> On 17 October 2011 12:05, Martin Hewitt <[email protected]> wrote:
>>> I disagree, it makes sense as the 5xx error code range is for responses 
>>> where the server can't fulfil a well-formed, valid client request.
>>>
>>> Your GET is well-formed, but the server can't process it as it's working on 
>>> the previous action, so a 500 is perfectly valid. Perhaps a 503 would be 
>>> more accurate, but the 5xx prefix is certainly correct.
>>>
>>> Martin
>>>
>>> On 17 Oct 2011, at 09:29, Paolo Negri <[email protected]> wrote:
>>>
>>>> I agree that what happens is easy enough to explain. I
>>>> still thought it would be useful for the developers to know, since
>>>> returning a 500 status code for a known system condition is probably
>>>> something that can be improved.
>>>>
>>>> Thanks,
>>>>
>>>> Paolo
>>>>
>>>> On Mon, Oct 17, 2011 at 10:24 AM, CGS <[email protected]> wrote:
>>>>> I am not a developer, but it seems quite logical to me. Once you start the
>>>>> compaction, your CouchDB is not responsive while the database is preparing
>>>>> for compaction. If you trigger a GET immediately, the web instance responds
>>>>> with status code 500 (internal server error, meaning an unresponsive server
>>>>> in this case). So, nothing unusual in my opinion.
>>>>>
>>>>> Cheers,
>>>>> CGS
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 10/17/2011 09:57 AM, Paolo Negri wrote:
>>>>>>
>>>>>> IO activity is not monitored. There's only one db on the couchdb
>>>>>> instance, and the described job is the only activity executed on this
>>>>>> machine.
>>>>>> Delaying the first request on the database url by 30 seconds did
>>>>>> actually prevent the problem from happening again.
>>>>>> So the issue seems to happen specifically at the moment right after
>>>>>> compaction is started.
>>>>>> The database is about 7GB once compressed; the server is hosted on
>>>>>> ec2 with the database directory placed on its own dedicated ephemeral
>>>>>> storage.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Paolo
>>>>>>
>>>>>> On Fri, Oct 14, 2011 at 9:05 PM, Paul Davis<[email protected]>
>>>>>>  wrote:
>>>>>>>
>>>>>>> Do you monitor IO activity or system responsiveness when you're doing
>>>>>>> this? I've seen some compactions wallop a system when it switches over
>>>>>>> due to removing large old files and such. It doesn't sound like this
>>>>>>> is big enough for that case but it might be something worth checking.
>>>>>>>
>>>>>>> On Fri, Oct 14, 2011 at 3:41 AM, Paolo Negri<[email protected]>
>>>>>>>  wrote:
>>>>>>>>
>>>>>>>> Dear list,
>>>>>>>>
>>>>>>>> We have a script that does the following (strictly sequentially):
>>>>>>>>
>>>>>>>> 1) update 300K docs in a db
>>>>>>>> 2) launch compaction of the db
>>>>>>>> 3) poll http://127.0.0.1:5984/database every 30 seconds to find out
>>>>>>>> when compaction has completed
>>>>>>>>
>>>>>>>> Last night we got a timeout error during step 3. We think this might
>>>>>>>> be because the first poll (GET  http://127.0.0.1:5984/database) is
>>>>>>>> done right after triggering compaction.
>>>>>>>>
>>>>>>>> I thought the dev team might be interested in knowing that this is
>>>>>>>> happening.
>>>>>>>>
>>>>>>>> There's no activity on the couchdb instance other than what is
>>>>>>>> described in this email.
>>>>>>>>
>>>>>>>> ERROR unexpectd response checking compaction db: {ok,"500",
>>>>>>>>   [{"Server","CouchDB/1.3.0a-74613f5-git (Erlang OTP/R14B04)"},
>>>>>>>>    {"Date","Fri, 14 Oct 2011 01:46:37 GMT"},
>>>>>>>>    {"Content-Type","text/plain; charset=utf-8"},
>>>>>>>>    {"Content-Length","350"},
>>>>>>>>    {"Cache-Control","must-revalidate"}],
>>>>>>>>
>>>>>>>> <<"{\"error\":\"{timeout,{gen_server,call,[<0.21934.9>,{open_ref_count,<0.4090.13>}]}}\",\"reason\":\"{gen_server,call,\\n
>>>>>>>>   [couch_server,\\n     {open,<<\\\"backup\\\">>,\\n
>>>>>>>> [{user_ctx,\\n              {user_ctx,null,\\n
>>>>>>>> [<<\\\"_admin\\\">>],\\n<<\\\"{couch_httpd_auth,
>>>>>>>> default_authentication_handler}\\\">>}}]},\\n     infinity]}\"}\n">>}
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Paolo
>>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Engineering
>>>> http://www.wooga.com | phone +49-30-8962 5058  | fax +49-30-8964 9064
>>>>
>>>> wooga GmbH | Saarbruecker Str. 38 | 10405 Berlin | Germany
>>>> Sitz der Gesellschaft: Berlin; HRB 117846 B
>>>> Registergericht Berlin-Charlottenburg
>>>> Geschaeftsfuehrung: Jens Begemann, Philipp Moeser
>>>
>>
>

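For what it's worth, the polling step from the original report (step 3) could be made tolerant of a transient 5xx like the one quoted above. What follows is only a minimal sketch in Python, not the script from the report. The function names (fetch_db_info, wait_for_compaction) and the retry limit are hypothetical; the one CouchDB detail it relies on is the compact_running field in the database info returned by a GET on the database URL. Connection failures and socket timeouts are not handled here.

```python
import json
import time
import urllib.error
import urllib.request


def fetch_db_info(url, timeout=10.0):
    """GET the database URL and return (status, parsed JSON or None)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, json.loads(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # Non-2xx replies (such as the 500 in the report) arrive as HTTPError.
        return err.code, None


def wait_for_compaction(fetch, interval=30.0, max_server_errors=5,
                        sleep=time.sleep):
    """Poll until compact_running is false and return the number of polls.

    A 5xx reply is treated as transient and retried after `interval`
    seconds. More than `max_server_errors` consecutive 5xx replies, or any
    other unexpected status, raises RuntimeError.
    """
    server_errors = 0
    polls = 0
    while True:
        polls += 1
        status, info = fetch()
        if status == 200:
            server_errors = 0
            if not info.get("compact_running", False):
                return polls
        elif 500 <= status < 600:
            server_errors += 1
            if server_errors > max_server_errors:
                raise RuntimeError(
                    "too many consecutive %d responses" % status)
        else:
            raise RuntimeError("unexpected status %d" % status)
        sleep(interval)
```

Against the setup in the report it would be called as wait_for_compaction(lambda: fetch_db_info("http://127.0.0.1:5984/database")). Passing the fetch function in as an argument is just a convenience so the retry logic can be exercised without a running server.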