Ori.livneh has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/91580


Change subject: Make respawn behavior of EventLogging consumers more resilient
......................................................................

Make respawn behavior of EventLogging consumers more resilient

If the database goes into read-only mode (as happened today), the MySQL
consumer on vanadium fails to insert records and crashes. Because it crashes
almost immediately after starting, it quickly exceeds Upstart's default
respawn limit (10 respawns in 5 seconds), and Upstart gives up before the
database has had a chance to recover. This patch adds a 5-second sleep
between restarts and raises the respawn limit, so that Upstart gives up only
after more than 30 restarts within a five-minute period.

Change-Id: Idaefb8958cd996a5581d708e4a596f56dbdcb599
---
M modules/eventlogging/files/init/consumer.conf
1 file changed, 2 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/80/91580/1

diff --git a/modules/eventlogging/files/init/consumer.conf b/modules/eventlogging/files/init/consumer.conf
index 0773276..025a17f 100644
--- a/modules/eventlogging/files/init/consumer.conf
+++ b/modules/eventlogging/files/init/consumer.conf
@@ -17,3 +17,5 @@
 exec eventlogging-consumer "@$CONFIG"
 
 respawn
+respawn limit 30 300     # Give up if started >30 times in last 5 minutes.
+post-stop exec sleep 5   # Sleep 5 seconds between attempts to start.
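
To make the interaction between the two added stanzas concrete, here is a rough Python sketch of how long a consumer that crashes immediately would keep being restarted. It assumes a simplified model of the respawn limit (a counter that resets once `interval` seconds have passed since the first respawn in the current window); this is an illustration, not Upstart's actual code, and the function name and the `horizon` parameter are hypothetical.

```python
def seconds_until_give_up(limit, interval, cycle_seconds, horizon=3600):
    """Simplified model of a 'respawn limit LIMIT INTERVAL' stanza.

    cycle_seconds is the assumed time between consecutive respawns
    (time to crash plus any post-stop sleep). Returns the time of the
    respawn at which the model gives up, or None if it never gives up
    within `horizon` seconds.
    """
    window_start = None  # time of the first respawn in the current window
    count = 0            # respawns counted in the current window
    t = 0.0
    while t <= horizon:
        t += cycle_seconds  # one crash-and-restart cycle elapses
        if window_start is not None and t - window_start < interval:
            count += 1
            if count > limit:
                return t    # more than `limit` respawns inside the window
        else:
            # Window expired (or first respawn): start counting afresh.
            window_start = t
            count = 1
    return None


# Values from this patch: limit 30 in 300 s, one respawn every ~5 s.
print(seconds_until_give_up(30, 300, 5))     # 155.0
# Upstart's default limit (10 in 5 s) with an assumed 0.25 s crash cycle.
print(seconds_until_give_up(10, 5, 0.25))    # 2.75
# If each cycle took 10 s, the 300 s window would expire first.
print(seconds_until_give_up(30, 300, 10))    # None
```

Under these assumptions the patched job tolerates roughly two and a half minutes of continuous failure before Upstart gives up, compared with a few seconds under the default limit.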

-- 
To view, visit https://gerrit.wikimedia.org/r/91580
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Idaefb8958cd996a5581d708e4a596f56dbdcb599
Gerrit-PatchSet: 1
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: Ori.livneh <[email protected]>

_______________________________________________
MediaWiki-commits mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
