Filippo Giunchedi has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/205603

Change subject: logging: update CirrusSearch thresholds
......................................................................

logging: update CirrusSearch thresholds

adjust thresholds to improve SNR

metric history is also available at
https://ganglia.wikimedia.org/latest/graph_all_periods.php?c=Miscellaneous%20eqiad&h=fluorine.eqiad.wmnet&r=hour&z=default&jr=&js=&st=1429622767&v=0.0167224080268&m=CirrusSearch-slow.log_line_rate&vl=lines%20per%20sec&ti=&z=large

Bug: T84163
Change-Id: I71c527168aa18ffd44e386fa59e159612944cb20
---
M manifests/role/logging.pp
1 file changed, 7 insertions(+), 11 deletions(-)

  git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/03/205603/1

diff --git a/manifests/role/logging.pp b/manifests/role/logging.pp
index b8e2891..246c4bb 100644
--- a/manifests/role/logging.pp
+++ b/manifests/role/logging.pp
@@ -111,23 +111,19 @@
         logster_options => '--output ganglia --metric-prefix CirrusSearch-slow.log',
         minute => "*/${cirrussearch_slow_log_check_interval}"
     }
-    # Alert if CirrusSearch-slow.log shows more than
-    # 10 slow searches within an hour. The logster
-    # job runs every $cirrussearch_slow_log_check_interval
+    # The logster job runs every $cirrussearch_slow_log_check_interval
     # minutes. We set retries to
     # 60 minutes / cirrussearch_slow_log_check_interval minutes)
-    # This should keep icinga from alerting
-    # us unless the alert thresholds are exceeded
-    # for more than an hour.
+    # This should keep icinga from alerting us unless the alert thresholds are
+    # exceeded for more than an hour.
     monitoring::ganglia { 'CirrusSearch-slow-queries':
         description => 'Slow CirrusSearch query rate',
         # this metric is output to ganglia by logster
         metric => 'CirrusSearch-slow.log_line_rate',
-        # line_rate metric is per second, so we need to alert if this
-        # metric goes over 0.000046296 / second. Let's round
-        # down to warning on 0.00004, or critical on 0.00008.
-        warning => '0.00004',
-        critical => '0.00008',
+        # warning -> 36 queries/h
+        # critical -> 360 queries/h
+        warning => '0.01',
+        critical => '0.1',
         normal_check_interval => $cirrussearch_slow_log_check_interval,
         retry_check_interval => $cirrussearch_slow_log_check_interval,
         retries => (60/$cirrussearch_slow_log_check_interval),

--
To view, visit https://gerrit.wikimedia.org/r/205603
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I71c527168aa18ffd44e386fa59e159612944cb20
Gerrit-PatchSet: 1
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: Filippo Giunchedi <[email protected]>

_______________________________________________
MediaWiki-commits mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
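
Reviewer's aid, not part of the change: the new comments in the diff convert the per-second line_rate thresholds into hourly counts, and the retries expression divides 60 minutes by the check interval. A minimal Python sketch of that arithmetic follows. The helper names (per_hour, retries_for) and the 5-minute interval are hypothetical; the actual value of $cirrussearch_slow_log_check_interval is not shown in this diff, and integer division is assumed for the retries expression.

```python
# Sketch of the threshold arithmetic; helper names are illustrative only.
SECONDS_PER_HOUR = 3600


def per_hour(rate_per_second: float) -> float:
    """Convert a per-second line rate into lines per hour."""
    return rate_per_second * SECONDS_PER_HOUR


def retries_for(check_interval_minutes: int, window_minutes: int = 60) -> int:
    """Number of consecutive checks needed to span the alert window.

    Mirrors retries => (60/$cirrussearch_slow_log_check_interval),
    assuming integer division.
    """
    return window_minutes // check_interval_minutes


# Thresholds taken from the diff: new values and the removed ones.
new_thresholds = {"warning": 0.01, "critical": 0.1}
old_thresholds = {"warning": 0.00004, "critical": 0.00008}

for name, rate in new_thresholds.items():
    print(f"new {name}: {rate}/s is about {per_hour(rate):.0f} queries/h")

for name, rate in old_thresholds.items():
    print(f"old {name}: {rate}/s is about {per_hour(rate):.2f} queries/h")

# Hypothetical 5-minute check interval, chosen only for illustration.
print("retries at a 5-minute interval:", retries_for(5))
```

The new values come out at 36 and 360 queries/h, matching the comments added in the diff. By the same conversion the removed values come to well under one line per hour, so the old alert would trip on very little traffic.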
