I agree that what happens is easy to explain. I still thought it would be useful for the developers to know, since returning a 500 status code for a known system condition is probably something that can be improved.
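The client-side workaround described further down in the thread (delay the first poll, then keep polling the database URL) can be sketched roughly as below. This is only an illustration in Python, not the script that produced the error. It assumes that completion is detected through the `compact_running` field of the database info document, and the function names, retry limits and delays are hypothetical choices rather than anything taken from the original job.

```python
import json
import time
import urllib.error
import urllib.request


def fetch_db_info(db_url, timeout=10):
    """GET the database URL and return (status_code, parsed_body_or_None)."""
    try:
        with urllib.request.urlopen(db_url, timeout=timeout) as resp:
            return resp.status, json.loads(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # Non-2xx responses, such as the 500 in the report, land here.
        return err.code, None


def wait_for_compaction(db_url, fetch=fetch_db_info, sleep=time.sleep,
                        initial_delay=30, interval=30, max_polls=120,
                        max_errors=3):
    """Poll until compaction is reported as finished.

    Returns True when compact_running is false, False if max_polls is
    exhausted. A limited number of consecutive 5xx answers is tolerated,
    on the assumption that they are transient while compaction starts.
    """
    sleep(initial_delay)  # do not hit the db right after triggering compaction
    consecutive_errors = 0
    for _ in range(max_polls):
        status, info = fetch(db_url)
        if status == 200 and info is not None:
            consecutive_errors = 0
            if not info.get("compact_running", False):
                return True
        elif 500 <= status < 600:
            consecutive_errors += 1
            if consecutive_errors > max_errors:
                raise RuntimeError("too many consecutive %d responses" % status)
        else:
            raise RuntimeError("unexpected status %d" % status)
        sleep(interval)
    return False
```

Calling `wait_for_compaction("http://127.0.0.1:5984/database")` would wait 30 seconds and then poll every 30 seconds. The `fetch` and `sleep` parameters exist only so the loop can be exercised without a running server. Connection failures and socket timeouts are not handled here and would propagate to the caller.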
Thanks,
Paolo

On Mon, Oct 17, 2011 at 10:24 AM, CGS <[email protected]> wrote:
> I am not developer, but it's quite logic, I may say. Once you started the
> compaction, your CouchDB is not responsive while the database is preparing
> for compaction. Triggering immediately GET, the web instance responds with
> status code 500 (internal server error, meaning unresponsive server in this
> case). So, nothing unusual in my opinion.
>
> Cheers,
> CGS
>
>
> On 10/17/2011 09:57 AM, Paolo Negri wrote:
>>
>> IO activity is not monitored, there's only one db on the couchdb
>> instance and the described job is the only activity executed on this
>> machine.
>> Delaying the first request on the database url by 30 seconds did
>> actually prevent the problem from happening again.
>> So the issue seems to happen specifically at the moment right after
>> compaction is started.
>> The database is about 7GB big once compressed, the server is hosted on
>> ec2 with the database directory placed on his own dedicated ephemeral
>> storage.
>>
>> Thanks,
>>
>> Paolo
>>
>> On Fri, Oct 14, 2011 at 9:05 PM, Paul Davis<[email protected]>
>> wrote:
>>>
>>> Do you monitor IO activity or system responsiveness when you're doing
>>> this. I've seen some compactions wallop a system when it switches over
>>> due to removing large old files and such. It doesn't sound like this
>>> is big enough for that case but it might be something worth checking.
>>>
>>> On Fri, Oct 14, 2011 at 3:41 AM, Paolo Negri<[email protected]>
>>> wrote:
>>>>
>>>> Dear list,
>>>>
>>>> We have a script that does the following (strictly sequentially)
>>>>
>>>> 1) update 300K docs in a db
>>>> 2) launch compaction of the db
>>>> 3) poll at a 30 sec frequency http://127.0.0.1:5984/database to know
>>>> when compaction completed
>>>>
>>>> Last night we got a timeout error during 3, we think that this might
>>>> be because the first polling (GET http://127.0.0.1:5984/database) is
>>>> done right after triggering compaction
>>>>
>>>> I thought the dev team might be interested in knowing that this is
>>>> happening
>>>>
>>>> There's no other activity on the couchdb instance other than what
>>>> described in this email.
>>>>
>>>> ERROR unexpectd response checking compaction db: {ok,"500",
>>>>  [{"Server",
>>>>    "CouchDB/1.3.0a-74613f5-git (Erlang OTP/R14B04)"},
>>>>   {"Date",
>>>>    "Fri, 14 Oct 2011 01:46:37 GMT"},
>>>>   {"Content-Type",
>>>>    "text/plain; charset=utf-8"},
>>>>   {"Content-Length","350"},
>>>>   {"Cache-Control",
>>>>    "must-revalidate"}],
>>>>
>>>> <<"{\"error\":\"{timeout,{gen_server,call,[<0.21934.9>,{open_ref_count,<0.4090.13>}]}}\",\"reason\":\"{gen_server,call,\\n
>>>> [couch_server,\\n {open,<<\\\"backup\\\">>,\\n
>>>> [{user_ctx,\\n {user_ctx,null,\\n
>>>> [<<\\\"_admin\\\">>],\\n<<\\\"{couch_httpd_auth,
>>>> default_authentication_handler}\\\">>}}]},\\n infinity]}\"}\n">>}
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Paolo
>>>>
>>
>>
>
>

--
Engineering
http://www.wooga.com | phone +49-30-8962 5058 | fax +49-30-8964 9064
wooga GmbH | Saarbruecker Str. 38 | 10405 Berlin | Germany
Sitz der Gesellschaft: Berlin; HRB 117846 B
Registergericht Berlin-Charlottenburg
Geschaeftsfuehrung: Jens Begemann, Philipp Moeser
