Ori.livneh has submitted this change and it was merged.

Change subject: Make respawn behavior of EventLogging consumers more resilient
......................................................................


Make respawn behavior of EventLogging consumers more resilient

If the database goes into read-only mode (as happened today), the MySQL
consumer on vanadium fails to insert records and crashes. It crashes very
soon after starting, so it quickly exceeds Upstart's default respawn limit
(10 respawns within 5 seconds) and Upstart gives up before the database has
had a chance to recover. This patch adds a 5-second sleep between restarts
and raises the respawn limit so that Upstart gives up only after more than
30 respawns in a five-minute period.
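
To illustrate the arithmetic, here is a minimal sketch in Python. It is a
simplified model, not Upstart's actual bookkeeping: it assumes respawns occur
at a fixed spacing (crash time plus any sleep) and that Upstart gives up as
soon as more than `limit` respawns fall within `interval` seconds. The
function name and its `cycle` parameter are hypothetical.

```python
from typing import Optional


def seconds_until_give_up(limit: int, interval: float, cycle: float) -> Optional[float]:
    """Estimate when a crash-looping job exceeds its respawn limit.

    limit, interval: the two arguments of "respawn limit".
    cycle: assumed seconds between consecutive respawns
           (time to crash plus any sleep between attempts).
    Returns the elapsed seconds at which respawn number limit + 1
    happens, or None if the limit is never exceeded in this model.
    """
    if cycle <= 0:
        raise ValueError("cycle must be positive")
    # Respawns 1 .. limit + 1 span limit * cycle seconds. If that span
    # does not fit inside the interval, the limit is never exceeded.
    if limit * cycle >= interval:
        return None
    return (limit + 1) * cycle


# With this patch: limit 30, interval 300 s, about 5 s between attempts.
print(seconds_until_give_up(30, 300, 5.0))
# Default limit (10 in 5 s) with a fast crash loop, assumed 0.25 s per cycle.
print(seconds_until_give_up(10, 5, 0.25))
# Default limit combined with a 5 s sleep: never exceeded in this model.
print(seconds_until_give_up(10, 5, 5.0))
```

Under these assumptions, the patched job keeps retrying for roughly two and a
half minutes of continuous failure before Upstart gives up. With the default
limit and a fast crash loop, it gives up within a few seconds.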

Change-Id: Idaefb8958cd996a5581d708e4a596f56dbdcb599
---
M modules/eventlogging/files/init/consumer.conf
1 file changed, 2 insertions(+), 0 deletions(-)

Approvals:
  Ori.livneh: Looks good to me, approved
  jenkins-bot: Verified



diff --git a/modules/eventlogging/files/init/consumer.conf b/modules/eventlogging/files/init/consumer.conf
index 0773276..025a17f 100644
--- a/modules/eventlogging/files/init/consumer.conf
+++ b/modules/eventlogging/files/init/consumer.conf
@@ -17,3 +17,5 @@
 exec eventlogging-consumer "@$CONFIG"
 
 respawn
+respawn limit 30 300     # Give up if started >30 times in last 5 minutes.
+post-stop exec sleep 5   # Sleep 5 seconds between attempts to start.

-- 
To view, visit https://gerrit.wikimedia.org/r/91580
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Idaefb8958cd996a5581d708e4a596f56dbdcb599
Gerrit-PatchSet: 1
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: Ori.livneh <[email protected]>
Gerrit-Reviewer: Ori.livneh <[email protected]>
Gerrit-Reviewer: jenkins-bot

_______________________________________________
MediaWiki-commits mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
