Ori.livneh has submitted this change and it was merged.

Change subject: Make respawn behavior of EventLogging consumers more resilient
......................................................................
Make respawn behavior of EventLogging consumers more resilient

If the database goes into read-only mode (as happened today), the MySQL
consumer on vanadium fails to insert records. The time from service start to
crash is very quick, so it hits Upstart's default respawn thresholds and
Upstart gives up before the database has had a chance to recover. This patch
specifies a 5-second sleep interval between restarts and sets the respawn
threshold to >30 attempts in a five-minute period.

Change-Id: Idaefb8958cd996a5581d708e4a596f56dbdcb599
---
M modules/eventlogging/files/init/consumer.conf
1 file changed, 2 insertions(+), 0 deletions(-)

Approvals:
  Ori.livneh: Looks good to me, approved
  jenkins-bot: Verified

diff --git a/modules/eventlogging/files/init/consumer.conf b/modules/eventlogging/files/init/consumer.conf
index 0773276..025a17f 100644
--- a/modules/eventlogging/files/init/consumer.conf
+++ b/modules/eventlogging/files/init/consumer.conf
@@ -17,3 +17,5 @@
 exec eventlogging-consumer "@$CONFIG"
 respawn
+respawn limit 30 300    # Give up if started >30 times in last 5 minutes.
+post-stop exec sleep 5  # Sleep 5 seconds between attempts to start.

--
To view, visit https://gerrit.wikimedia.org/r/91580

Gerrit-MessageType: merged
Gerrit-Change-Id: Idaefb8958cd996a5581d708e4a596f56dbdcb599
Gerrit-PatchSet: 1
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: Ori.livneh <[email protected]>
Gerrit-Reviewer: Ori.livneh <[email protected]>
Gerrit-Reviewer: jenkins-bot
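
The interaction between the respawn limit and the post-stop sleep can be checked with a small simulation. The sketch below is a simplified model, not Upstart's actual implementation: it assumes a job that crashes a fixed time after each start, a fixed pause before each respawn, and a supervisor that gives up once more than `limit` respawns fall inside any window of `interval_ms`. The function name, the 100 ms crash time, and the sliding-window rule are all assumptions made for illustration; the 10-respawns-in-5-seconds figure is Upstart's documented default.

```python
from collections import deque


def time_until_give_up_ms(limit, interval_ms, run_ms, pause_ms,
                          horizon_ms=3_600_000):
    """Return the time (ms) at which the modelled supervisor gives up,
    or None if the job is still being respawned after horizon_ms.

    Simplified model (an assumption, not Upstart's real code):
    the job starts at t=0, crashes run_ms after every start, waits
    pause_ms, and is then respawned. The supervisor gives up when more
    than `limit` respawns fall within the last interval_ms.
    """
    if run_ms + pause_ms <= 0:
        raise ValueError("run_ms + pause_ms must be positive")

    respawns = deque()  # timestamps of recent respawns
    t = 0
    while t <= horizon_ms:
        t += run_ms + pause_ms  # crash, pause, then respawn at time t
        respawns.append(t)
        # Forget respawns that have fallen out of the window.
        while respawns[0] <= t - interval_ms:
            respawns.popleft()
        if len(respawns) > limit:
            return t
    return None


# Default-like thresholds (10 respawns in 5 s), no pause between attempts.
print(time_until_give_up_ms(10, 5_000, 100, 0))
# Thresholds from this patch: 30 respawns in 300 s, 5 s pause.
print(time_until_give_up_ms(30, 300_000, 100, 5_000))
```

Under these assumptions the default-like thresholds are exhausted about one second after the first crash, while the patched settings keep retrying for roughly two and a half minutes. In this model the job is respawned indefinitely only when each crash-and-pause cycle takes at least `interval_ms / limit`, which is 10 seconds for the values in this patch.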
