Re: [Nagios-users] Nagios is ignoring the retry_interval setting

2012-12-07 Thread FTL Nagios
Hi,

Apologies for the delay; I've been very busy with other things.

I put Nagios into debug mode this morning and reran the tests.

I let it get a couple of successful pings to the server, then pulled the
network cable from it.

Behaviour is completely different this morning.

The host check is behaving now and rechecking every 3 minutes, as it's told
to in the host template. I got my text and email alert to say the host was
down when I expected it!

But now it's the service check that is running every 1 minute, which it's
not told to do when in a problem state.

My service template clearly states a retry_interval of 3 minutes when in a
problem state:

define service{
        name                          service-server  ; The name of this host template (used above in the checks)
        check_period                  server_24x7     ; Servers are monitored at all times
        check_interval                1               ; Servers are checked every 1 minute when in OK state
        retry_interval                3               ; Servers are checked every 3 minutes if in problem state
        max_check_attempts            3               ; Servers are checked 3 times to determine if they are in Up or Down state
        notification_period           server_24x7     ; Emails and texts are sent out any time of day
        notification_interval         3               ; Resend notifications every 3 minutes
        notification_options          c,r             ; Only send alerts for servers in CRITICAL or RECOVERY state
        notifications_enabled         0               ; Notifications are disabled
        contact_groups                servers email, servers sms ; Alerts sent to contacts in these groups
        event_handler_enabled         1               ; Host event handler is enabled
        process_perf_data             1               ; Performance data is processed
        retain_status_information     1               ; Status info is kept between server restarts
        retain_nonstatus_information  1               ; Non-status information is kept between server restarts
        passive_checks_enabled        0               ; Passive checks are disabled
        obsess_over_service           0               ; We do not obsess over the server if in problem state
        check_freshness               0               ; We do not check this server for freshness
        flap_detection_enabled        0               ; Flap detection is disabled
        failure_prediction_enabled    0               ; We will wait for it to actually fail, thank you!!
}

And even though it's checking every minute, it went straight to a hard state
on the first check that detected it down, and it has stayed on check 1/3,
hard state, throughout.
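
One possible reading of this, offered as an assumption rather than a
diagnosis of these tests: the Nagios 3 documentation on state types lists a
non-OK service result whose host is DOWN or UNREACHABLE as an immediate hard
state, and hard states are rechecked at check_interval rather than
retry_interval. A minimal Python sketch of that rule (the function name and
arguments are hypothetical, not Nagios code):

```python
def service_recheck(host_up, attempt, max_check_attempts,
                    check_interval, retry_interval):
    """Return (state_type, minutes_to_next_check) for a non-OK service result.

    Assumes one time unit is one minute (interval_length at its default of 60).
    """
    # A non-OK service on a down host is treated as HARD straight away,
    # as is a service that has used up all of its check attempts.
    if not host_up or attempt >= max_check_attempts:
        return ("HARD", check_interval)
    # Otherwise the problem is still SOFT and is retried at retry_interval.
    return ("SOFT", retry_interval)


# Values taken from the service template above: check 1, retry 3, attempts 3.
print(service_recheck(False, 1, 3, 1, 3))  # cable pulled, host is down
print(service_recheck(True, 1, 3, 1, 3))   # host still up, service failing
```

If that reading applies here, pulling the network cable would take the
service down the first branch, which would match both the 1/3 hard state and
the one-minute rechecks.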


I really don't understand what is happening here.

The only difference between this setup and my old Nagios box is the
version: the old box was 3.31 and this new server is 3.4.1. I am using the
same config files that worked fine before.

Here are the debug log files from the above testing.

http://dl.dropbox.com/u/895609/nagios.debug1
http://dl.dropbox.com/u/895609/nagios.debug2


If you see anything, please let me know. I'm getting angry with all the
alerts!!! :-)

Thank you










Re: [Nagios-users] Nagios is ignoring the retry_interval setting

2012-12-07 Thread FTL Nagios
Re-tested after changing the max file size of the debug file.

This one should contain everything from the moment I started Nagios to the
moment I stopped it during testing (approx. 10 minutes):

http://dl.dropbox.com/u/895609/nagios.debug

Thank you


Re: [Nagios-users] Nagios is ignoring the retry_interval setting

2012-12-03 Thread FTL Nagios
Hi Giorgio,

Apologies for the delay.

I am doing this first thing tomorrow morning (Tue 4th Dec). I will post the
debug log then.

Thank you


-Original Message-
From: Giorgio Zarrelli [mailto:zarre...@linux.it] 
Sent: 29 November 2012 19:24
To: Nagios Users List
Subject: Re: [Nagios-users] Nagios is ignoring the retry_interval setting

Hi,

I do not see anything wrong. Could you set debug=-1, repeat the problem and
put the log online?

Giorgio

Andrew Thompson wrote:
 Hi Giorgio,

 The whole test cfg I am using to try to troubleshoot this can be found at:

 http://dl.dropbox.com/u/895609/test.cfg

 This is a direct copy of my main servers config, but with the rest of
 the servers and some templates for other server checks taken out.



 Kind Regards
 Andrew

 From: Andrew Thompson
 Sent: 29 November 2012 16:11
 To: nagios-users@lists.sourceforge.net
 Subject: Nagios is ignoring the retry_interval setting

 Hi,

 My nagios box has decided to stop listening to the retry_interval 
 entry in my templates.

 My server template reads:

 define host{
  name                          host-server
  check_period                  server_24x7
  check_interval                1
  retry_interval                3
  max_check_attempts            3
  notification_period           server_24x7
  notification_interval         3
  notification_options          d,r
  notifications_enabled         1
  contact_groups                servers email, servers sms
  event_handler_enabled         1
  process_perf_data             1
  retain_status_information     1
  retain_nonstatus_information  1
  passive_checks_enabled        0
  obsess_over_host              0
  check_freshness               0
  flap_detection_enabled        0
  failure_prediction_enabled    0
  }

 Now this is what happens:


 * Server goes down at 1pm.

 * I check the next scheduled check and it clearly states 1.03pm

 * But at 1.01pm it checks again and then spits out an email and
 text message saying the server is down.

 Completely ignoring the retry_interval setting!!!

 I'd expect from the above:


 * 1pm server goes down

 * 1.03pm check 2 is done

 * 1.06pm check 3 is done and determined hard state.

 * At 1.06pm the notification should be sent out.
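
The expected timeline above is simple arithmetic on retry_interval and
max_check_attempts. A minimal Python sketch, assuming one time unit is one
minute (interval_length at its default of 60) and that the first failed check
lands exactly at 1pm; the function name is hypothetical:

```python
from datetime import datetime, timedelta


def expected_check_times(first_failure, retry_interval, max_check_attempts):
    """Times of checks 1..max_check_attempts, retry_interval minutes apart."""
    return [first_failure + timedelta(minutes=retry_interval * n)
            for n in range(max_check_attempts)]


# Host template values: retry_interval 3, max_check_attempts 3.
times = expected_check_times(datetime(2012, 11, 29, 13, 0), 3, 3)
for attempt, when in enumerate(times, start=1):
    print(f"check {attempt}/3 at {when:%H:%M}")
```

The last entry is the check on which a hard state, and therefore the
notification, would be expected.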

 Why is this, is something in my config wrong?

 Ubuntu 12.04 desktop and Nagios 3.4.1

 Thanks


 --
  Keep yourself connected to Go Parallel:
 VERIFY Test and improve your parallel project with help from experts 
 and peers.
 http://goparallel.sourceforge.net_
 __
 Nagios-users mailing list
 Nagios-users@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/nagios-users
 ::: Please include Nagios version, plugin version (-v) and OS when 
 reporting any issue.
 ::: Messages without supporting info will risk being sent to /dev/null






Re: [Nagios-users] Nagios is ignoring the retry_interval setting

2012-11-29 Thread Gary Every
Your check_interval is set to 1, that takes precedence over retry_interval

g.;


Re: [Nagios-users] Nagios is ignoring the retry_interval setting

2012-11-29 Thread Giorgio Zarrelli
Hi,

Wrong.

retry_interval comes into play when there is a state change. check_interval
is the interval for normal checks. When there is a status change,
retry_interval applies ** until ** max_check_attempts is reached, then
check_interval kicks in again.
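
The rule described above can be sketched as a small simulation. This is a
simplified model for illustration only, not Nagios code: the function name is
hypothetical, every OK result is treated as a hard state, and intervals are
in minutes.

```python
def simulate(results, check_interval, retry_interval, max_check_attempts):
    """Return (attempt, state_type, minutes_to_next_check) per check result.

    results is a sequence of booleans, True meaning the check came back OK.
    """
    timeline = []
    attempt = 1
    previous_ok = True
    for ok in results:
        if ok:
            # Normal operation: back to attempt 1, checked at check_interval.
            attempt = 1
            state_type, wait = "HARD", check_interval
        else:
            # First failure after OK starts at attempt 1, later ones count up.
            attempt = 1 if previous_ok else min(attempt + 1, max_check_attempts)
            if attempt >= max_check_attempts:
                # Attempts exhausted: hard problem, back to check_interval.
                state_type, wait = "HARD", check_interval
            else:
                # Still retrying: soft problem, rechecked at retry_interval.
                state_type, wait = "SOFT", retry_interval
        timeline.append((attempt, state_type, wait))
        previous_ok = ok
    return timeline


# check_interval 1, retry_interval 3, max_check_attempts 3, as in the template.
for row in simulate([True, False, False, False, False], 1, 3, 3):
    print(row)
```

With these values the two soft failures are spaced 3 minutes apart, and the
1-minute interval only returns once the third attempt makes the state hard.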





Re: [Nagios-users] Nagios is ignoring the retry_interval setting

2012-11-29 Thread Giorgio Zarrelli
Hi,

Please post the actual host definition here.

Moreover, if the define host in your email is a template, why do I not see
register 0?
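
In Nagios object configuration, register 0 marks a definition as a template
that is not registered as a real object. A rough Python sketch of checking a
single define block for it; the parser is a simplification written for this
example, and the register line in the sample is a hypothetical addition that
is not in the posted config:

```python
def parse_define(text):
    """Parse one 'define <type>{ ... }' block into (type, {directive: value})."""
    object_type, directives = None, {}
    for raw in text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop ';' comments
        if not line or line == "}":
            continue
        if line.startswith("define"):
            object_type = line[len("define"):].strip().rstrip("{").strip()
            continue
        parts = line.split(None, 1)  # directive name, then the rest as value
        directives[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return object_type, directives


def is_template(directives):
    """True when the block carries 'register 0'."""
    return directives.get("register") == "0"


SAMPLE = """
define host{
  name            host-server
  check_interval  1
  retry_interval  3
  register        0   ; hypothetical addition, not in the posted config
  }
"""

kind, directives = parse_define(SAMPLE)
print(kind, is_template(directives))
```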




Re: [Nagios-users] Nagios is ignoring the retry_interval setting

2012-11-29 Thread Andrew Thompson
Hi Giorgio,

The whole test cfg I am using to try to troubleshoot this can be found at:

http://dl.dropbox.com/u/895609/test.cfg

This is a direct copy of my main servers config, but with the rest of the
servers and some templates for other server checks taken out.



Kind Regards
Andrew


Re: [Nagios-users] Nagios is ignoring the retry_interval setting

2012-11-29 Thread Giorgio Zarrelli
Hi,

I do not see anything wrong. Could you set debug=-1, repeat the problem and
put the log online?

Giorgio
