[MediaWiki-commits] [Gerrit] twmeproxy: monitor the memcache port as well - change (operations/puppet)
Faidon Liambotis has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/120774

Change subject: twmeproxy: monitor the memcache port as well
......................................................................

twmeproxy: monitor the memcache port as well

We just had a complicated failure scenario where the root cause was
that twemproxy had stopped listening on the 11211 port but kept running
and listening on the 2 management socket. twemproxy didn't log anything
anywhere and strace/gdb didn't reveal more, so this remains a puzzle
for now.

In the meantime, add an NRPE check to monitor 11211 as well so that at
least we can be alerted if/when that happens again.

Change-Id: I09c0fce64160e6d58f2a85599dd17c57e4aa923d
---
M manifests/role/applicationserver.pp
1 file changed, 4 insertions(+), 0 deletions(-)

  git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/74/120774/1

diff --git a/manifests/role/applicationserver.pp b/manifests/role/applicationserver.pp
index d11d434..9aabe97 100644
--- a/manifests/role/applicationserver.pp
+++ b/manifests/role/applicationserver.pp
@@ -70,6 +70,10 @@
             description  => 'twemproxy process',
             nrpe_command => '/usr/lib/nagios/plugins/check_procs -c 1:1 -u nobody -C nutcracker'
         }
+        nrpe::monitor_service { 'twemproxy port':
+            description  => 'twemproxy port',
+            nrpe_command => '/usr/lib/nagios/plugins/check_tcp -H 127.0.0.1 -p 11211 --timeout=2',
+        }
     }
 
     if $::realm == 'labs' {

-- 
To view, visit https://gerrit.wikimedia.org/r/120774
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I09c0fce64160e6d58f2a85599dd17c57e4aa923d
Gerrit-PatchSet: 1
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: Faidon Liambotis fai...@wikimedia.org

___
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
[MediaWiki-commits] [Gerrit] twmeproxy: monitor the memcache port as well - change (operations/puppet)
Faidon Liambotis has submitted this change and it was merged.

Change subject: twmeproxy: monitor the memcache port as well
......................................................................

twmeproxy: monitor the memcache port as well

We just had a complicated failure scenario where the root cause was
that twemproxy had stopped listening on the 11211 port but kept running
and listening on the 2 management socket. twemproxy didn't log anything
anywhere and strace/gdb didn't reveal more, so this remains a puzzle
for now.

In the meantime, add an NRPE check to monitor 11211 as well so that at
least we can be alerted if/when that happens again.

Change-Id: I09c0fce64160e6d58f2a85599dd17c57e4aa923d
---
M manifests/role/applicationserver.pp
1 file changed, 4 insertions(+), 0 deletions(-)

Approvals:
  Faidon Liambotis: Looks good to me, approved
  jenkins-bot: Verified

diff --git a/manifests/role/applicationserver.pp b/manifests/role/applicationserver.pp
index d11d434..9aabe97 100644
--- a/manifests/role/applicationserver.pp
+++ b/manifests/role/applicationserver.pp
@@ -70,6 +70,10 @@
             description  => 'twemproxy process',
             nrpe_command => '/usr/lib/nagios/plugins/check_procs -c 1:1 -u nobody -C nutcracker'
         }
+        nrpe::monitor_service { 'twemproxy port':
+            description  => 'twemproxy port',
+            nrpe_command => '/usr/lib/nagios/plugins/check_tcp -H 127.0.0.1 -p 11211 --timeout=2',
+        }
     }
 
     if $::realm == 'labs' {

-- 
To view, visit https://gerrit.wikimedia.org/r/120774
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I09c0fce64160e6d58f2a85599dd17c57e4aa923d
Gerrit-PatchSet: 1
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: Faidon Liambotis fai...@wikimedia.org
Gerrit-Reviewer: Faidon Liambotis fai...@wikimedia.org
Gerrit-Reviewer: jenkins-bot
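
For illustration only: the failure described in the commit message (process alive, port no longer accepting connections) is invisible to a process-count check, which is why the change adds a TCP probe. The Python sketch below shows roughly what such a probe does. It is not the check_tcp plugin's implementation; the function names `port_is_listening` and `nagios_status` are hypothetical, and the only values taken from the change are the host 127.0.0.1, the port 11211, and the 2-second timeout. The exit codes follow the usual Nagios plugin convention (0 = OK, 2 = CRITICAL).

```python
import socket


def port_is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout.

    Hypothetical helper for illustration; not part of the change itself.
    """
    try:
        # create_connection performs the full TCP handshake, so a process that
        # is running but no longer accepting on the port makes this fail.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def nagios_status(host: str, port: int, timeout: float = 2.0) -> int:
    """Map the probe result to a Nagios plugin exit code: 0 = OK, 2 = CRITICAL."""
    return 0 if port_is_listening(host, port, timeout) else 2
```

Calling `nagios_status("127.0.0.1", 11211, timeout=2.0)` on an application server would mirror the parameters of the NRPE command in the diff. A process check and a port check are complementary: the first catches a dead daemon, the second catches a daemon that is up but not serving.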