Ori.livneh has uploaded a new change for review.
https://gerrit.wikimedia.org/r/300327
Change subject: Add alerting for MediaWiki exceptions and fatals
......................................................................
Add alerting for MediaWiki exceptions and fatals
Watch the rate per minute of MediaWiki exceptions and fatals. Warn if it
exceeds 15; panic if it exceeds 25. I hope we can gradually make these numbers
lower.
Bug: T140942
Change-Id: I638d270e52a559a5b6bc0f68788172869ca2d888
---
M modules/role/manifests/graphite/alerts.pp
1 file changed, 8 insertions(+), 0 deletions(-)
git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/27/300327/1
diff --git a/modules/role/manifests/graphite/alerts.pp b/modules/role/manifests/graphite/alerts.pp
index c3031ca..c643d0d 100644
--- a/modules/role/manifests/graphite/alerts.pp
+++ b/modules/role/manifests/graphite/alerts.pp
@@ -46,5 +46,13 @@
percentage => 40,
}
+ # Monitor MediaWiki fatals and exceptions.
+ monitoring::graphite_threshold { 'mediawiki_error_rate':
+ description => 'MediaWiki exceptions and fatals per minute',
+ metric => 'transformNull(sumSeries(logstash.rate.mediawiki.fatal.ERROR.sum, logstash.rate.mediawiki.exception.ERROR.sum), 0)',
+ warning => 15,
+ critical => 25,
+ from => '5min',
+ }
}
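
The check added above can be read as: sum the two per-minute error series, treat missing datapoints as zero, then compare the last five minutes against the warning (15) and critical (25) thresholds. The following is a minimal Python sketch of that logic, not the actual implementation behind monitoring::graphite_threshold. The helper names (sum_series, transform_null, classify), the sample datapoints, and the "percentage of datapoints over the limit" rule are assumptions made for illustration; only the thresholds and the metric expression come from the change itself.

```python
from typing import List, Optional, Sequence


def sum_series(*series: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Point-wise sum of several series, skipping missing (None) values.

    A point stays None only when every input is missing at that position,
    which is how this sketch assumes Graphite's sumSeries behaves.
    """
    summed: List[Optional[float]] = []
    for points in zip(*series):
        present = [p for p in points if p is not None]
        summed.append(sum(present) if present else None)
    return summed


def transform_null(series: Sequence[Optional[float]],
                   default: float = 0.0) -> List[float]:
    """Replace missing datapoints with a default, like transformNull(..., 0)."""
    return [default if p is None else p for p in series]


def classify(series: Sequence[float], warning: float, critical: float,
             percentage: float = 1.0) -> str:
    """Return 'CRITICAL', 'WARNING', 'OK' or 'UNKNOWN' for a window of points.

    `percentage` is a hypothetical knob: the minimum share (in percent) of
    datapoints that must exceed a threshold before that state is reported.
    """
    if not series:
        return "UNKNOWN"

    def percent_over(limit: float) -> float:
        return 100.0 * sum(1 for p in series if p > limit) / len(series)

    if percent_over(critical) >= percentage:
        return "CRITICAL"
    if percent_over(warning) >= percentage:
        return "WARNING"
    return "OK"


# Invented sample data: five one-minute datapoints (from => '5min').
fatals = [3, None, 10, 20, None]
exceptions = [2, None, 8, 9, None]

error_rate = transform_null(sum_series(fatals, exceptions), 0)
state = classify(error_rate, warning=15, critical=25)
print(error_rate, state)
```

With the sample data, one minute sums to 29 errors, which is above the critical threshold of 25, so the sketch reports CRITICAL. Without the transformNull step, minutes with no logged errors would be missing rather than zero, and a threshold check over the window could not tell "quiet" from "no data".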
--
To view, visit https://gerrit.wikimedia.org/r/300327
Gerrit-MessageType: newchange
Gerrit-Change-Id: I638d270e52a559a5b6bc0f68788172869ca2d888
Gerrit-PatchSet: 1
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: Ori.livneh <[email protected]>