Re: [Nagios-users] Nagios checks and DNS queries
Dirk H. Schulz wrote:

> Hi folks,
>
> I have two servers running Nagios, one is 2.3.1 on Debian, the other 3.0.5 on CentOS. With both I have a peculiar problem: both servers have 3 different nameservers in /etc/resolv.conf, but when the first nameserver fails, more than half of the service checks fail (plugin timed out). The failure is not momentary: it persists for as long as the first nameserver is not running.
>
> This first nameserver in /etc/resolv.conf is not the master nameserver (all of them are slaves), so it is not a case of a slave ceasing to answer when the master fails, nor of any misconfiguration between the nameservers.
>
> This should not be occurring, but it can be reproduced reliably. I hope there is some configuration item I overlooked, but googling did not turn up any hint.
>
> Any help is appreciated.
>
> Dirk

Dirk, my solution was to run a slave name server on the Nagios server itself, restricted to only answer queries from localhost.

Steve.

__
This email has been scanned by the MessageLabs Email Security System.
For more information please visit http://www.messagelabs.com/email
__

-
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100url=/
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
Re: [Nagios-users] Nagios checks and DNS queries
Steve Burton wrote:

> Dirk, my solution was to run a slave name server on the Nagios server itself, restricted to only answer queries from localhost.
>
> Steve.

Why not set something like

    options timeout:1 attempts:1

in resolv.conf? From man resolv.conf:

timeout:n
    sets the amount of time the resolver will wait for a response from a remote name server before retrying the query via a different name server. Measured in seconds, the default is RES_TIMEOUT (currently 5, see resolv.h, http://linux.die.net/include/resolv.h).

attempts:n
    sets the number of times the resolver will send a query to its name servers before giving up and returning an error to the calling application. The default is RES_DFLRETRY (currently 2, see resolv.h, http://linux.die.net/include/resolv.h).

With the defaults, you're looking at 10 seconds (2 attempts, 5s apart) before it moves on to the next server. Since 10 seconds is the default timeout for those checks, you'll always hit a timeout unless the DNS server becomes responsive again.

--
Sean McAfee
System Engineer
Collaborative Fusion, Inc.
[EMAIL PROTECTED]
412-422-3463 x 4025
5849 Forbes Avenue
Pittsburgh, PA 15217

IMPORTANT: This message contains confidential information and is intended only for the individual named. If the reader of this message is not an intended recipient (or the individual responsible for the delivery of this message to an intended recipient), please be advised that any re-use, dissemination, distribution or copying of this message is prohibited. Please notify the sender immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system. E-mail transmission cannot be guaranteed to be secure or error-free as information could be intercepted, corrupted, lost, destroyed, arrive late or incomplete, or contain viruses. The sender therefore does not accept liability for any errors or omissions in the contents of this message, which arise as a result of e-mail transmission.
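As a rough illustration of the resolv.conf options discussed above, the following Python sketch parses resolv.conf-style text and reports the effective timeout and attempts values, falling back to the defaults quoted from the man page (5 and 2). It is only a sketch under stated assumptions: `parse_resolv_conf` and `SAMPLE_RESOLV_CONF` are hypothetical names, the nameserver addresses are documentation placeholders rather than anything from this thread, and the comment handling is a simplification of what a real resolver does.

```python
# Defaults as quoted from the man page excerpt (RES_TIMEOUT, RES_DFLRETRY).
DEFAULT_TIMEOUT = 5
DEFAULT_ATTEMPTS = 2

# Hypothetical resolv.conf content; the addresses are placeholders.
SAMPLE_RESOLV_CONF = """\
nameserver 192.0.2.1
nameserver 192.0.2.2
nameserver 192.0.2.3
options timeout:1 attempts:1
"""


def parse_resolv_conf(text):
    """Return (nameservers, timeout, attempts) from resolv.conf-style text."""
    nameservers = []
    timeout = DEFAULT_TIMEOUT
    attempts = DEFAULT_ATTEMPTS
    for raw_line in text.splitlines():
        # Simplification: treat anything after '#' or ';' as a comment.
        line = raw_line.split("#", 1)[0].split(";", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) > 1:
            nameservers.append(fields[1])
        elif fields[0] == "options":
            for option in fields[1:]:
                name, _, value = option.partition(":")
                if name == "timeout" and value.isdigit():
                    timeout = int(value)
                elif name == "attempts" and value.isdigit():
                    attempts = int(value)
    return nameservers, timeout, attempts


if __name__ == "__main__":
    servers, timeout, attempts = parse_resolv_conf(SAMPLE_RESOLV_CONF)
    print(f"{len(servers)} nameservers, timeout={timeout}s, attempts={attempts}")
```

Running the same function against the real /etc/resolv.conf on the Nagios host would show whether the options line is present or the defaults are still in effect.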
Re: [Nagios-users] Nagios checks and DNS queries
Sean McAfee wrote:

> Steve Burton wrote:
>
>> Dirk, my solution was to run a slave name server on the Nagios server itself, restricted to only answer queries from localhost.
>>
>> Steve.
>
> Why not set something like
>
>     options timeout:1 attempts:1
>
> in resolv.conf? From man resolv.conf:
>
> timeout:n
>     sets the amount of time the resolver will wait for a response from a remote name server before retrying the query via a different name server. Measured in seconds, the default is RES_TIMEOUT (currently 5, see resolv.h, http://linux.die.net/include/resolv.h).
>
> attempts:n
>     sets the number of times the resolver will send a query to its name servers before giving up and returning an error to the calling application. The default is RES_DFLRETRY (currently 2, see resolv.h, http://linux.die.net/include/resolv.h).
>
> With the defaults, you're looking at 10 seconds (2 attempts, 5s apart) before it moves on to the next server. Since 10 seconds is the default timeout for those checks, you'll always hit a timeout unless the DNS server becomes responsive again.

Sean, the reason I set up the slave server was so my Nagios instance could monitor the 'real' DNS servers by name and check the host and other services on those hosts (they're Windows DCs) even if (or especially if) the DNS service had failed.

Steve.
Re: [Nagios-users] Nagios checks and DNS queries
Steve Burton wrote:

> Sean, the reason I set up the slave server was so my Nagios instance could monitor the 'real' DNS servers by name and check the host and other services on those hosts (they're Windows DCs) even if (or especially if) the DNS service had failed.

That makes sense. We make use of the options in resolv.conf to allow us to reboot the DNS servers without causing any interruptions.

For Dirk though, this isn't so much a Nagios or monitoring issue as it is a general *nix one. The real question here is why the DNS servers become unavailable so frequently.

--
Sean McAfee
System Engineer
Collaborative Fusion, Inc.
[EMAIL PROTECTED]
412-422-3463 x 4025
5849 Forbes Avenue
Pittsburgh, PA 15217
[Nagios-users] Nagios checks and DNS queries
Hi folks,

I have two servers running Nagios, one is 2.3.1 on Debian, the other 3.0.5 on CentOS. With both I have a peculiar problem: both servers have 3 different nameservers in /etc/resolv.conf, but when the first nameserver fails, more than half of the service checks fail (plugin timed out). The failure is not momentary: it persists for as long as the first nameserver is not running.

This first nameserver in /etc/resolv.conf is not the master nameserver (all of them are slaves), so it is not a case of a slave ceasing to answer when the master fails, nor of any misconfiguration between the nameservers.

This should not be occurring, but it can be reproduced reliably. I hope there is some configuration item I overlooked, but googling did not turn up any hint.

Any help is appreciated.

Dirk
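One way to check whether slow name resolution is behind the plugin timeouts described in this message is to time a lookup from the Nagios host, once with all nameservers up and once with the first one stopped. The Python sketch below is a minimal illustration under that assumption; `timed_lookup` is a hypothetical helper, not part of Nagios or its plugins, and "localhost" stands in for whichever monitored hostname is of interest.

```python
import socket
import time


def timed_lookup(hostname):
    """Resolve hostname via the system resolver; return (seconds, addresses)."""
    start = time.monotonic()
    try:
        results = socket.getaddrinfo(hostname, None)
        # Each result is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the address string.
        addresses = sorted({entry[4][0] for entry in results})
    except socket.gaierror:
        # Resolution failed (or the resolver gave up); report no addresses.
        addresses = []
    elapsed = time.monotonic() - start
    return elapsed, addresses


if __name__ == "__main__":
    seconds, addresses = timed_lookup("localhost")
    print(f"lookup took {seconds:.2f}s, addresses: {addresses}")
```

If the measured time jumps by several seconds per lookup while the first nameserver is down, that would be consistent with the checks running out of time waiting on the resolver rather than on the monitored service.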