Ottomata has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/328239 )

Change subject: Alert on EventBus service HTTP error rate
......................................................................

Alert on EventBus service HTTP error rate

Bug: T153034
Change-Id: Id8701a8ef08512488bd316b8b34872980dfa6cfe
---
M modules/role/manifests/graphite/alerts.pp
1 file changed, 11 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/39/328239/1

diff --git a/modules/role/manifests/graphite/alerts.pp b/modules/role/manifests/graphite/alerts.pp
index d44f4c8..5e05451 100644
--- a/modules/role/manifests/graphite/alerts.pp
+++ b/modules/role/manifests/graphite/alerts.pp
@@ -55,5 +55,16 @@
         from        => '10min',
         percentage  => 70,
     }
+
+    # Monitor EventBus 4xx and 5xx HTTP response rate.
+    monitoring::graphite_threshold { 'eventbus_http_error_rate':
+        description => 'EventBus HTTP Error Rate (4xx + 5xx)',
+        metric      => 'transformNull(sumSeries(eventbus.counters.eventlogging.service.EventHandler.POST.[45]*.rate))',
+        # Alert if > 50% of datapoints over the last 10 minutes exceed a threshold.
+        warning     => 1,
+        critical    => 10,
+        from        => '10min',
+        percentage  => 50,
+    }
 }
 

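The threshold rule described in the comment above (alert when more than `percentage` percent of the datapoints in the window exceed a threshold) can be sketched roughly as follows. This is only an illustration of that rule, not the code of the actual Graphite check used by `monitoring::graphite_threshold`; the function name `evaluate_threshold`, the returned state strings, and the handling of an empty window are assumptions made for the example.

```python
def evaluate_threshold(datapoints, warning, critical, percentage):
    """Classify a window of datapoints (hypothetical helper, not the real check).

    datapoints: values from the 'from' window; None marks a missing point.
    warning, critical: per-datapoint thresholds.
    percentage: share of datapoints (0-100) that must exceed a threshold.
    """
    # Treat missing points as 0, mirroring what transformNull() does by default.
    values = [0.0 if v is None else float(v) for v in datapoints]
    if not values:
        # Assumption: an empty window is reported as UNKNOWN.
        return "UNKNOWN"

    def percent_over(limit):
        # Percentage of datapoints strictly above the given limit.
        over = sum(1 for v in values if v > limit)
        return 100.0 * over / len(values)

    # Check the more severe threshold first.
    if percent_over(critical) > percentage:
        return "CRITICAL"
    if percent_over(warning) > percentage:
        return "WARNING"
    return "OK"


if __name__ == "__main__":
    # Values from the change: warning => 1, critical => 10, percentage => 50.
    window = [0, None, 2, 3, 12, 15]
    print(evaluate_threshold(window, warning=1, critical=10, percentage=50))
```

With the values from this change, a single spike in the error rate within the 10 minute window would not alert under this rule; more than half of the datapoints would have to exceed 1 (warning) or 10 (critical).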
-- 
To view, visit https://gerrit.wikimedia.org/r/328239
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Id8701a8ef08512488bd316b8b34872980dfa6cfe
Gerrit-PatchSet: 1
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: Ottomata <[email protected]>

_______________________________________________
MediaWiki-commits mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
