Ori.livneh has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/300327

Change subject: Add alerting for MediaWiki exceptions and fatals
......................................................................

Add alerting for MediaWiki exceptions and fatals

Watch the combined per-minute rate of MediaWiki exceptions and fatals. Warn if
it exceeds 15; go critical if it exceeds 25. I hope we can gradually lower
these thresholds.
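
To make the intended behaviour concrete, here is a rough Python sketch of the logic the metric expression and thresholds describe. It is a simplified model only, not the code of the actual Graphite check: the function names `combined_rate` and `classify` are hypothetical, and the sketch assumes that "exceeds" means strictly greater than the threshold and that each datapoint is judged on its own.

```python
from typing import List, Optional, Sequence

WARNING = 15   # from this change: warn above 15 per minute
CRITICAL = 25  # from this change: critical above 25 per minute


def combined_rate(fatals: Sequence[Optional[float]],
                  exceptions: Sequence[Optional[float]]) -> List[float]:
    """Model transformNull(sumSeries(fatals, exceptions), 0).

    sumSeries adds the series point by point and skips missing values;
    transformNull then turns any point that is still missing into 0.
    """
    combined = []
    for fatal, exception in zip(fatals, exceptions):
        if fatal is None and exception is None:
            combined.append(0.0)  # no data in either series: treat as 0
        else:
            combined.append((fatal or 0.0) + (exception or 0.0))
    return combined


def classify(rate: float) -> str:
    """Map one per-minute rate to an alert state (assumed strict comparison)."""
    if rate > CRITICAL:
        return "CRITICAL"
    if rate > WARNING:
        return "WARNING"
    return "OK"


if __name__ == "__main__":
    # Hypothetical sample data: one value per minute, None = no datapoint.
    rates = combined_rate([10, None, None], [8, 30, None])
    for rate in rates:
        print(rate, classify(rate))
```

Treating a missing datapoint as 0 means a quiet minute reads as healthy instead of unknown, which is presumably why the expression wraps the sum in transformNull.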

Bug: T140942
Change-Id: I638d270e52a559a5b6bc0f68788172869ca2d888
---
M modules/role/manifests/graphite/alerts.pp
1 file changed, 8 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/27/300327/1

diff --git a/modules/role/manifests/graphite/alerts.pp b/modules/role/manifests/graphite/alerts.pp
index c3031ca..c643d0d 100644
--- a/modules/role/manifests/graphite/alerts.pp
+++ b/modules/role/manifests/graphite/alerts.pp
@@ -46,5 +46,13 @@
         percentage  => 40,
     }
 
+    # Monitor MediaWiki fatals and exceptions.
+    monitoring::graphite_threshold { 'mediawiki_error_rate':
+        description => 'MediaWiki exceptions and fatals per minute',
+        metric      => 'transformNull(sumSeries(logstash.rate.mediawiki.fatal.ERROR.sum, logstash.rate.mediawiki.exception.ERROR.sum), 0)',
+        warning     => 15,
+        critical    => 25,
+        from        => '5min',
+    }
 }
 

-- 
To view, visit https://gerrit.wikimedia.org/r/300327
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I638d270e52a559a5b6bc0f68788172869ca2d888
Gerrit-PatchSet: 1
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: Ori.livneh <[email protected]>

_______________________________________________
MediaWiki-commits mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
