wohali opened a new issue #1383: couch_compaction_daemon dies on busy node
URL: https://github.com/apache/couchdb/issues/1383

# Current Behaviour

When couch_compaction_daemon fails to spawn a compactor with `start_compact` and receives a timeout from `gen_server:call/2`, the entire compaction daemon fails and restarts.

Sample log excerpt:

```
[error] 2018-05-29T12:41:51.965000Z [email protected] <0.4859.674> -------- gen_server couch_compaction_daemon terminated with reason: {compaction_loop_died,{timeout,{gen_server,call,[<0.21919.5399>,start_compact]}}} last msg: {'EXIT',<0.23803.535>,{timeout,{gen_server,call,[<0.21919.5399>,start_compact]}}} state: {state,<0.23803.535>,[<<"shards/c0000000-dfffffff/bb.1519652899">>]}
[error] 2018-05-29T12:41:51.965000Z [email protected] <0.4859.674> -------- CRASH REPORT Process couch_compaction_daemon (<0.4859.674>) with 0 neighbors exited with reason: {compaction_loop_died,{timeout,{gen_server,call,[<0.21919.5399>,start_compact]}}} at gen_server:terminate/7(line:826) <= proc_lib:init_p_do_apply/3(line:240); initial_call: {couch_compaction_daemon,init,['Argument__1']}, ancestors: [couch_secondary_services,couch_sup,<0.209.0>], messages: [], links: [<0.17698.21>], dictionary: [], trap_exit: true, status: running, heap_size: 987, stack_size: 27, reductions: 2899
[error] 2018-05-29T12:41:51.965000Z [email protected] <0.17698.21> -------- Supervisor couch_secondary_services had child compaction_daemon started with couch_compaction_daemon:start_link() at <0.4859.674> exit with reason {compaction_loop_died,{timeout,{gen_server,call,[<0.21919.5399>,start_compact]}}} in context child_terminated
```

If the machine is especially busy, this can lead to restart throttling:

```
[error] 2018-05-29T12:45:56.635000Z [email protected] <0.17698.21> -------- Supervisor couch_secondary_services had child compaction_daemon started with couch_compaction_daemon:start_link() at <0.14978.429> exit with reason {compaction_loop_died,{timeout,{gen_server,call,[<0.21919.5399>,start_compact]}}} in context child_terminated
[error] 2018-05-29T12:45:56.635000Z [email protected] <0.17698.21> -------- Supervisor couch_secondary_services had child compaction_daemon started with couch_compaction_daemon:start_link() at <0.14978.429> exit with reason reached_max_restart_intensity in context shutdown
```

# Expected Behaviour

The compaction daemon should handle timeouts gracefully. Ideally this would start a sleep cycle before trying to start another compaction process.
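
As a rough illustration of the behaviour requested above, the timeout could be caught at the call site and followed by a sleep. This is only a sketch, not the actual couch_compaction_daemon code: the module name `compact_retry_sketch`, the function `start_compact_safely/3`, and both millisecond arguments are hypothetical. It uses `gen_server:call/3` so the timeout is explicit, whereas the log above shows `gen_server:call/2`, which uses the default timeout of 5000 ms.

```erlang
-module(compact_retry_sketch).
-export([start_compact_safely/3]).

%% Hypothetical helper, not part of CouchDB.
%% Sends start_compact to DbPid. If the call times out, sleep for
%% SleepMs and return {error, timeout} instead of letting the exit
%% propagate and take down the calling process.
start_compact_safely(DbPid, CallTimeoutMs, SleepMs) ->
    try gen_server:call(DbPid, start_compact, CallTimeoutMs) of
        Reply ->
            {ok, Reply}
    catch
        %% gen_server:call exits with {timeout, {gen_server, call, Args}}
        %% when no reply arrives in time; other exits are not caught here.
        exit:{timeout, _} ->
            timer:sleep(SleepMs),
            {error, timeout}
    end.
```

A caller could treat `{error, timeout}` as a signal to skip this database and move on to the next one. On older Erlang/OTP releases a late reply may still arrive in the caller's mailbox after the timeout is caught, so a long-lived caller would need to tolerate or discard such stray messages.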
