Perhaps the nagios charm could implement a form of "dead man's switch" in a cron job or similar? I looked online but found nothing ready-made; perhaps the following could work.
You'd set up pagerduty with a new service where the first level of on-call is "off" (nobody is paged) and the next levels are the normal rotation. Alerts escalate after X minutes (say, 15). A cron job, every 15 minutes, looks for an alert named "NAGIOS PAGERDUTY E2E CHECK" or similar and creates it if it's not there. Another cron job, every 5 minutes, acks the alert. Then, if pagerduty isn't reachable from the nagios unit, the acks stop arriving, so the E2E check alert escalates and pages the first person in the rotation.

This may be doable with rulesets (https://support.pagerduty.com/docs/rulesets), but I haven't immediately found a way to do so.

-- 
https://bugs.launchpad.net/bugs/1902142

Title:
  Nagios check for unreachable pagerduty

Status in Nagios Charm:
  New

Bug description:
  If enable_pagerduty=True but nagios cannot reach pagerduty, there
  should be a new CRITICAL alert that pagerduty isn't reachable. Be
  sure to attempt to reach pagerduty through whatever proxies the
  nagios+pagerduty services are configured with.
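For what it's worth, here is a rough, untested sketch of the two cron jobs from my proposal above. It assumes the PagerDuty Events API v2 rather than a look-up-then-create step: sending a "trigger" with a fixed dedup_key should attach to the already-open alert instead of opening a second one, so no lookup is needed. The dedup key, the "nagios" source name and the routing key handling are placeholders I made up, and I have not verified that repeated acks actually hold off escalation the way the scheme needs.

```python
import json
import urllib.request

# PagerDuty Events API v2 endpoint.
EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"

# Hypothetical fixed key so every trigger/ack refers to the same alert.
DEDUP_KEY = "nagios-pagerduty-e2e-check"
SUMMARY = "NAGIOS PAGERDUTY E2E CHECK"


def build_event(routing_key, action, source="nagios"):
    """Build the JSON body for a trigger or acknowledge event."""
    if action not in ("trigger", "acknowledge"):
        raise ValueError("unsupported action: %s" % action)
    event = {
        "routing_key": routing_key,  # integration key of the E2E service
        "event_action": action,
        "dedup_key": DEDUP_KEY,
    }
    if action == "trigger":
        # Only trigger events need a payload.
        event["payload"] = {
            "summary": SUMMARY,
            "source": source,
            "severity": "critical",
        }
    return event


def send_event(event, timeout=10):
    """POST the event and return the HTTP status code.

    urllib honours https_proxy from the environment, so the request
    goes through whatever proxy the unit is configured with.
    """
    request = urllib.request.Request(
        EVENTS_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

The 15-minute cron job would call send_event(build_event(key, "trigger")) and the 5-minute one send_event(build_event(key, "acknowledge")). If the unit loses its route to pagerduty, both calls fail, and the escalation policy on the pagerduty side is what ends up paging someone.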

