Dzahn has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/327695 )
Change subject: contint/zuul: skip Icinga monitoring if server not master
......................................................................
contint/zuul: skip Icinga monitoring if server not master
For zuul, if the FQDN of the current server does not match
the one set as "master" in Hiera, then skip the Icinga monitoring
part since the zuul service will not be running.
If one of the servers becomes active, the monitoring is added there
automatically, without having to care about hostnames.
Bug: T150771
Change-Id: I50db45303a6b466abb15d7a8a5f61e75d2947cc7
---
M hieradata/role/common/ci/master.yaml
M modules/zuul/manifests/monitoring/server.pp
2 files changed, 24 insertions(+), 11 deletions(-)
git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/95/327695/1
diff --git a/hieradata/role/common/ci/master.yaml b/hieradata/role/common/ci/master.yaml
index c3813354..87e8621 100644
--- a/hieradata/role/common/ci/master.yaml
+++ b/hieradata/role/common/ci/master.yaml
@@ -8,3 +8,5 @@
 debdeploy::grains:
   debdeploy-ci:
     value: standard
+
+contint::master_host: 'contint1001.wikimedia.org'
diff --git a/modules/zuul/manifests/monitoring/server.pp b/modules/zuul/manifests/monitoring/server.pp
index 8eccc2d..ccbf8fe 100644
--- a/modules/zuul/manifests/monitoring/server.pp
+++ b/modules/zuul/manifests/monitoring/server.pp
@@ -4,18 +4,29 @@
 #
 class zuul::monitoring::server {
-    nrpe::monitor_service { 'zuul':
-        description   => 'zuul_service_running',
-        contact_group => 'contint',
-        # Zuul has a main process and a fork which is the gearman
-        # server. Thus we need two process running.
-        nrpe_command  => "/usr/lib/nagios/plugins/check_procs -w 2:2 -c 2:2 --ereg-argument-array '^/usr/share/python/zuul/bin/python /usr/bin/zuul-server'",
-    }
+    $master_host = hiera(contint::master_host)
-    nrpe::monitor_service { 'zuul_gearman':
-        description   => 'zuul_gearman_service',
-        contact_group => 'contint',
-        nrpe_command  => '/usr/lib/nagios/plugins/check_tcp -H 127.0.0.1 -p 4730 --timeout=2',
+    $monitoring_active = $master_host ? {
+        $::fqdn => true,
+        default => false
     }
+    # only monitor these on the active master host
+    # zuul service will be stopped on the warm standby server
+    if $monitoring_active {
+        nrpe::monitor_service { 'zuul':
+            description   => 'zuul_service_running',
+            contact_group => 'contint',
+            # Zuul has a main process and a fork which is the gearman
+            # server. Thus we need two process running.
+            nrpe_command  => "/usr/lib/nagios/plugins/check_procs -w 2:2 -c 2:2 --ereg-argument-array '^/usr/share/python/zuul/bin/python /usr/bin/zuul-server'",
+        }
+
+        nrpe::monitor_service { 'zuul_gearman':
+            description   => 'zuul_gearman_service',
+            contact_group => 'contint',
+            nrpe_command  => '/usr/lib/nagios/plugins/check_tcp -H 127.0.0.1 -p 4730 --timeout=2',
+        }
+
+    }
}
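
A possible simplification, offered only as a sketch and not part of the uploaded patch set: the selector that assigns $monitoring_active maps a match to true and everything else to false, so a direct comparison in the if condition should behave the same way. The quoted Hiera key is a style choice assumed here, and only the gearman check is shown for brevity.

```puppet
# Sketch only: same gating idea as the patch, written as a direct comparison.
class zuul::monitoring::server {

    # Key added by this change in hieradata/role/common/ci/master.yaml
    $master_host = hiera('contint::master_host')

    # Only monitor on the active master host; the zuul service is
    # stopped on the warm standby server.
    if $::fqdn == $master_host {
        nrpe::monitor_service { 'zuul_gearman':
            description   => 'zuul_gearman_service',
            contact_group => 'contint',
            nrpe_command  => '/usr/lib/nagios/plugins/check_tcp -H 127.0.0.1 -p 4730 --timeout=2',
        }
    }
}
```

With this form, switching the active server still only requires changing contint::master_host in Hiera; the intermediate $monitoring_active variable is dropped.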
--
To view, visit https://gerrit.wikimedia.org/r/327695
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings
Gerrit-MessageType: newchange
Gerrit-Change-Id: I50db45303a6b466abb15d7a8a5f61e75d2947cc7
Gerrit-PatchSet: 1
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: Dzahn <[email protected]>