[Nagios-users] accommodate 7426 passive checks on nagios 3.0.3
Hi all,

I have 7426 incoming passive checks on my nagios server. I turned on freshness check at every 60 seconds, check_result_reaper_frequency at 60 and max_check_result_reaper_time at 90. I am getting a lot of stale passive results. Anything off with these settings, or the rest of my config settings?

passive checks: 7426

object_cache_file=/var/log/nagios/objects.cache
precached_object_file=/var/log/nagios/objects.precache
resource_file=/etc/nagios/resource.cfg
status_file=/var/log/nagios/status.dat
status_update_interval=20
nagios_user=nagios
nagios_group=nagios
check_external_commands=1
command_check_interval=-1
command_file=/var/log/nagios/rw/nagios.cmd
external_command_buffer_slots=8192
lock_file=/var/run/nagios.pid
temp_file=/var/log/nagios/nagios.tmp
temp_path=/tmp
event_broker_options=-1
log_rotation_method=d
log_archive_path=/var/log/nagios/archives
use_syslog=0
log_notifications=1
log_service_retries=1
log_host_retries=1
log_event_handlers=1
log_initial_states=0
log_external_commands=1
log_passive_checks=1
service_inter_check_delay_method=s
max_service_check_spread=30
service_interleave_factor=s
host_inter_check_delay_method=s
max_host_check_spread=30
max_concurrent_checks=0
check_result_reaper_frequency=60
max_check_result_reaper_time=90
# check_result_path=/var/log/nagios/spool/checkresults
max_check_result_file_age=3600
cached_host_check_horizon=15
cached_service_check_horizon=15
enable_predictive_host_dependency_checks=1
enable_predictive_service_dependency_checks=1
soft_state_dependencies=0
auto_reschedule_checks=0
auto_rescheduling_interval=30
auto_rescheduling_window=180
sleep_time=0.125
service_check_timeout=60
host_check_timeout=30
event_handler_timeout=30
notification_timeout=30
ocsp_timeout=5
perfdata_timeout=5
retain_state_information=1
state_retention_file=/var/log/nagios/retention.dat
retention_update_interval=60
use_retained_program_state=0
use_retained_scheduling_info=1
retained_host_attribute_mask=0
retained_service_attribute_mask=0
retained_process_host_attribute_mask=0
retained_process_service_attribute_mask=0
retained_contact_host_attribute_mask=0
retained_contact_service_attribute_mask=0
interval_length=60
use_aggressive_host_checking=0
execute_service_checks=1
accept_passive_service_checks=1
execute_host_checks=1
accept_passive_host_checks=1
enable_notifications=1
enable_event_handlers=1
process_performance_data=0
obsess_over_services=0
obsess_over_hosts=0
translate_passive_host_checks=0
passive_host_checks_are_soft=0
check_for_orphaned_services=1
check_for_orphaned_hosts=1
check_service_freshness=1
service_freshness_check_interval=360
check_host_freshness=1
host_freshness_check_interval=60
additional_freshness_latency=15
enable_flap_detection=1
low_service_flap_threshold=5.0
high_service_flap_threshold=20.0
low_host_flap_threshold=5.0
high_host_flap_threshold=20.0
date_format=us
p1_file=/usr/local/nagios/sbin/p1.pl
enable_embedded_perl=1
use_embedded_perl_implicitly=1
illegal_object_name_chars=`~!$%^*|'?,()=
illegal_macro_output_chars=`~$|'
use_regexp_matching=1
use_true_regexp_matching=0
[EMAIL PROTECTED]
[EMAIL PROTECTED]
daemon_dumps_core=0
use_large_installation_tweaks=1
enable_environment_macros=0
free_child_process_memory=0
child_processes_fork_twice=0
debug_level=0
debug_verbosity=1
debug_file=/var/log/nagios/nagios.debug
max_debug_file_size=100

TIA,
Marc

- This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100url=/
___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
Re: [Nagios-users] accommodate 7426 passive checks on nagios 3.0.3
Hi Mark,

Thanks for the response. I just realized that I had opened the old templates.cfg file (apologies, I'm an idiot). My freshness_threshold is actually 600 (10 minutes), but I'm still seeing this issue.

template.cfg snippet
===
define service{
        name                            e_passive
        active_checks_enabled           0
        passive_checks_enabled          1
        parallelize_check               0
        obsess_over_service             0
        check_freshness                 1
        freshness_threshold             600
        check_command                   check_stale_passive
        notifications_enabled           1
        event_handler_enabled           0
        flap_detection_enabled          1
        failure_prediction_enabled      0
        process_perf_data               0
        retain_status_information       1
        retain_nonstatus_information    1
        is_volatile                     0
        check_period                    e_reboots
        max_check_attempts              1
        normal_check_interval           1
        retry_check_interval            1
        contact_groups                  e_server_team
        notification_options            c
        notification_interval           0
        notification_period             e_reboots
        register                        0
        }

Thanks,
Marc

On Mon, Nov 17, 2008 at 11:25 PM, Mark Young [EMAIL PROTECTED] wrote:
> On Nov 16, 2008, at 9:16 PM, Marc Ismael wrote:
>> Hi all, I have 7426 incoming passive checks on my nagios server. I turned on freshness check at every 60 seconds, check_result_reaper_frequency at 60 and max_check_result_reaper_time at 90. I am getting a lot of stale passive results. Anything off with these settings, or the rest of my config settings?
>
> You have some interesting choices with your settings. You have the freshness and the reaper_frequency both set to 60 seconds. The freshness threshold is the time after which Nagios should consider a check to be stale. This is done by looking at the last check's timestamp and comparing it to the threshold you set (60 seconds). The reaper_frequency, meanwhile, is how often Nagios takes all the collected passive results and processes them, which you also have set at 60 seconds. You are setting up a condition where most of your checks are running close to stale and, given any processing time, will give you many stale results.
>
> Depending on how powerful your system is, you will need to either increase your freshness threshold (try 300 seconds), decrease the reaper frequency setting (so results are processed more often), or do both. You may have to play around with the exact settings that will work with your system and the number of checks you are performing. I would recommend you start with increasing the freshness threshold.
>
> Good Luck!
> Mark Young
> ___
> Nagios Enterprises, LLC
> Web: www.nagios.com
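Mark's timing argument can be put as rough arithmetic. The sketch below is a simplified model, not Nagios's actual freshness logic: it assumes a result may wait up to one full reaper interval before it is processed, and `submit_interval` and `processing_delay` are hypothetical inputs that the thread does not state.

```python
def worst_case_age(submit_interval, reaper_frequency, processing_delay=0):
    """Rough upper bound (seconds) on how old the last processed result can look.

    Simplified model: a result may arrive just after a reaper run and then
    wait one full reaper interval, plus any processing delay.
    """
    return submit_interval + reaper_frequency + processing_delay


def can_go_stale(freshness_threshold, submit_interval, reaper_frequency,
                 processing_delay=0):
    """True if the worst-case age exceeds the freshness threshold."""
    return worst_case_age(submit_interval, reaper_frequency,
                          processing_delay) > freshness_threshold


# Assumed 60 s submit interval; 60 s reaper frequency as in the first post.
print(can_go_stale(60, 60, 60))    # True: 120 s worst case vs 60 s threshold
print(can_go_stale(300, 60, 60))   # False: 120 s worst case vs 300 s threshold
```

Under this model, raising the threshold or lowering the reaper interval both widen the margin, which matches the direction of Mark's advice.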
[Nagios-users] no used external command buffer slots
Hi,

I have around 7362 incoming passive checks, and I'm trying to figure out how to get rid of stale check results by graphing with mrtg and working from there. I am under the impression that with this many passive checks, a generous number of external command buffer slots would be in use. When I checked with nagiostats, the number in use was zero. Could I be thinking wrong, or am I missing something?

[EMAIL PROTECTED] nagios]# /usr/local/nagios/sbin/nagiostats --mrtg --data=TOTCMDBUF,USEDCMDBUF,PROGRUNTIME,NAGIOSVERPID
8192
0
0d 15h 34m 52s
Nagios 3.0.3 (pid=3888)

nagios 3.0.3 on redhat linux 5.1 x86
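For graphing or sanity-checking these numbers, the nagiostats output can be turned into named values. This is a minimal sketch that assumes the `--mrtg` output has one value per line, in the same order as the `--data` list; the function name is hypothetical and the sample text is the output shown in the post.

```python
def parse_mrtg_output(text, names):
    """Pair each output line with the variable name requested via --data."""
    lines = [line.strip() for line in text.strip().splitlines()]
    if len(lines) != len(names):
        raise ValueError("expected %d values, got %d" % (len(names), len(lines)))
    return dict(zip(names, lines))


# Values copied from the post above.
sample = "8192\n0\n0d 15h 34m 52s\nNagios 3.0.3 (pid=3888)\n"
names = ["TOTCMDBUF", "USEDCMDBUF", "PROGRUNTIME", "NAGIOSVERPID"]
stats = parse_mrtg_output(sample, names)
print(int(stats["USEDCMDBUF"]), "of", int(stats["TOTCMDBUF"]), "slots in use")
```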
[Nagios-users] nagios read nagios.cmd (Resource temporarily unavailable)
Hello Mailing list,

This issue has been bothering me for quite some time: I'm getting a high number of stale passive check alerts. It seems like some passive checks are not being processed. I currently have 6596 incoming passive checks every 5 minutes. The rest of the relevant configuration is as follows:

define service{
        name                            template_passive
        active_checks_enabled           0
        passive_checks_enabled          1
        parallelize_check               0
        obsess_over_service             0
        check_freshness                 1
        freshness_threshold             600
        check_command                   check_stale_passive
        notifications_enabled           1
        event_handler_enabled           0
        flap_detection_enabled          1
        failure_prediction_enabled      0
        process_perf_data               0
        retain_status_information       1
        retain_nonstatus_information    1
        is_volatile                     0
        check_period                    24x7
        max_check_attempts              1
        normal_check_interval           1
        retry_check_interval            1
        contact_groups                  admin
        notification_options            c
        notification_interval           0
        notification_period             24x7
        register                        0
        }

# nagios.cfg
max_check_result_reaper_time=15
check_result_reaper_frequency=5
service_freshness_check_interval=780
host_freshness_check_interval=90
status_update_interval=20
check_external_commands=1
command_check_interval=-1
external_command_buffer_slots=8192
event_broker_options=-1
use_syslog=0
log_notifications=1
log_service_retries=1
log_host_retries=1
log_event_handlers=1
log_initial_states=0
log_external_commands=1
log_passive_checks=1
max_service_check_spread=30
max_host_check_spread=30
max_concurrent_checks=0
max_check_result_file_age=3600
cached_host_check_horizon=15
cached_service_check_horizon=15
enable_predictive_host_dependency_checks=1
enable_predictive_service_dependency_checks=1
soft_state_dependencies=0
auto_reschedule_checks=0
auto_rescheduling_interval=30
auto_rescheduling_window=180
sleep_time=0.125
service_check_timeout=60
host_check_timeout=30
event_handler_timeout=30
notification_timeout=30
ocsp_timeout=5
perfdata_timeout=5
retain_state_information=1
retention_update_interval=60
use_retained_program_state=0
use_retained_scheduling_info=1
retained_host_attribute_mask=0
retained_service_attribute_mask=0
retained_process_host_attribute_mask=0
retained_process_service_attribute_mask=0
retained_contact_host_attribute_mask=0
retained_contact_service_attribute_mask=0
interval_length=60
use_aggressive_host_checking=0
execute_service_checks=1
accept_passive_service_checks=1
execute_host_checks=1
accept_passive_host_checks=1
enable_notifications=1
enable_event_handlers=1
process_performance_data=0
obsess_over_services=0
obsess_over_hosts=0
translate_passive_host_checks=0
passive_host_checks_are_soft=0
check_for_orphaned_services=1
check_for_orphaned_hosts=1
check_service_freshness=1
check_host_freshness=1
additional_freshness_latency=15
enable_flap_detection=1
low_service_flap_threshold=5.0
high_service_flap_threshold=20.0
low_host_flap_threshold=5.0
high_host_flap_threshold=20.0
p1_file=/usr/local/nagios/sbin/p1.pl
enable_embedded_perl=1
use_embedded_perl_implicitly=1
use_regexp_matching=1
use_true_regexp_matching=0
daemon_dumps_core=0
use_large_installation_tweaks=1
enable_environment_macros=0
free_child_process_memory=0
child_processes_fork_twice=0
debug_level=0
debug_verbosity=1
max_debug_file_size=100

My current situation: Nagios fails to process an average of roughly 600 out of 6596 passive check results every 5 minutes. I admit I don't know Nagios that well; I started installing and using it only recently, and I don't know where or how to start troubleshooting this. I did install mrtg and did a good amount of trial and error with the config, especially max_check_result_reaper_time and check_result_reaper_frequency, but increasing or decreasing the values of these variables only worsens the situation.

However, this pstree and strace output looks like a reasonable starting point:

[EMAIL PROTECTED] nagios]# pstree -cpG | grep nagios
├─nagios(7943)───{nagios}(7944)
[EMAIL PROTECTED] tmp]# strace -s50 -p 7944
Process 7944 attached - interrupt to quit
poll([{fd=4, events=POLLIN}], 1, 500) = 0
poll([{fd=4, events=POLLIN}], 1, 500) = 0
poll([{fd=4, events=POLLIN}], 1, 500) = 0
poll([{fd=4, events=POLLIN}], 1, 500) = 0
poll([{fd=4, events=POLLIN}], 1, 500) = 0
poll([{fd=4, events=POLLIN}], 1, 500) = 0
poll([{fd=4, events=POLLIN}], 1, 500) = 0
poll([{fd=4, events=POLLIN}], 1, 500) = 0
poll([{fd=4, events=POLLIN}], 1, 500) = 0
poll([{fd=4, events=POLLIN}], 1, 500) = 0
poll([{fd=4, events=POLLIN}], 1, 500) = 0
poll([{fd=4, events=POLLIN, revents=POLLIN}], 1, 500) = 1
read(4,
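For readers following the thread: passive service results are normally submitted as one-line external commands written to the file named by command_file (nagios.cmd in the subject line). The sketch below only builds such a line using the documented PROCESS_SERVICE_CHECK_RESULT format; the function name and the host and service names are hypothetical, and nothing here is taken from Marc's actual submission scripts, which the post does not show.

```python
import time


def passive_service_result(host, service, return_code, output, timestamp=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT line for the external command file."""
    if return_code not in (0, 1, 2, 3):  # OK, WARNING, CRITICAL, UNKNOWN
        raise ValueError("return_code must be 0, 1, 2 or 3")
    if timestamp is None:
        timestamp = int(time.time())
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s\n" % (
        timestamp, host, service, return_code, output)


# Hypothetical host and service names, fixed timestamp for a stable example.
line = passive_service_result("remotehost", "disk_usage", 0, "OK - 40% used",
                              timestamp=1227312000)
print(line, end="")
```

Counting how many such lines the senders write per 5-minute window, and comparing that with the passive check entries in nagios.log (log_passive_checks=1 is set above), is one way to see where results go missing.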
Re: [Nagios-users] nagios read nagios.cmd (Resource temporarily unavailable)
Hi,

I have decreased the number of incoming passive checks every 5 minutes to 3206, but I'm still seeing the "Resource temporarily unavailable" messages. Not all of the results are being processed yet. Any ideas?

TIA,
Marc

2008/11/22 Marc Ismael [EMAIL PROTECTED]
> Hello Mailing list, This issue has been bothering me for quite some time, I'm getting a high number of stale passive check alerts. It seems like some passive checks are not being processed. I currently have 6596 incoming passive checks every 5 minutes.
> [...]
[Nagios-users] Monitor lun statistics
Has anyone implemented / is there any effective way to monitor lun activity / health?

Thank you,
Marc
[Nagios-users] lun monitoring
Hi,

Anyone implemented any sort of lun monitoring plugin? Just gathering ideas on what is already out there before I get my hands dirty.

Thanks.
Marc
Re: [Nagios-users] lun monitoring
Andreas,

I have absolutely *no clue* what you are talking about; this is the first email I have sent to nagios-users this month. I'm guilty on the other point you raised. I'm interested in whether there's something I can do to effectively measure how busy a LUN is, e.g. is running iostat against a LUN and looking at %b as reliable as running it against a local disk? Nor am I anxious to let others solve my problems: I have been reading, and am currently reading, some writeups on LUN performance monitoring, and I thought it wouldn't be a bad idea to throw a question onto this thread. I'm sorry if you saw it that way.

On 2/3/09, Andreas Ericsson a...@op5.se wrote:
> Marc Ismael wrote:
>> Hi, Anyone implemented any sort of lun monitoring plugin? Just gathering ideas on what is already out there before I get my hands dirty. Thanks.
>
> You're far too anxious to let others solve your problems. It was less than two hours ago you sent your earlier email, and you still haven't told us what google queries you (presumably unsuccessfully) tried or which other places you've looked for information. If you appear as a timesink, people will pour anything but time your way.
>
> --
> Andreas Ericsson andreas.erics...@op5.se
> OP5 AB www.op5.se
> Tel: +46 8-230225 Fax: +46 8-230231
Re: [Nagios-users] lun monitoring
Thanks Russell,

I've already got the path and target visibility monitoring covered. How about in terms of performance though? Is there value in monitoring io, e.g. via iostat or another utility?

On 2/3/09, Russell Adams rlad...@adamsinfoserv.com wrote:
> What would you monitor? Path availability would be the only item of note, and querying that information will vary by SAN driver and OS. Otherwise a LUN should show up as a disk with a filesystem that could be monitored with existing tools.
>
> On Tue, Feb 03, 2009 at 12:14:57AM +0800, Marc Ismael wrote:
>> Hi, Anyone implemented any sort of lun monitoring plugin? Just gathering ideas on what is already out there before I get my hands dirty. Thanks. Marc
>
> --
> Russell Adams rlad...@adamsinfoserv.com
> PGP Key ID: 0x1160DCB3 http://www.adamsinfoserv.com/
> Fingerprint: 1723 D8CA 4280 1EC9 557F 66E8 1154 E018 1160 DCB3
[Nagios-users] how macros on event handlers work
Hi,

I'm slightly unsure about how I used event_handler in this situation:

[Fri Apr 10 05:38:16 2009];Nagios 3.0.6 starting... (PID=1017)
[Fri Apr 10 05:38:16 2009];Local time is Fri Apr 10 05:38:16 EDT 2009
[Fri Apr 10 05:38:16 2009];LOG VERSION: 2.0
[Fri Apr 10 05:38:16 2009];Finished daemonizing... (New PID=1018)
[Fri Apr 10 05:38:16 2009];INITIAL HOST STATE: remotehost;UP;HARD;1;
[Fri Apr 10 05:38:16 2009];INITIAL HOST STATE: mynagios;UP;HARD;1;
[Fri Apr 10 05:38:16 2009];INITIAL SERVICE STATE: remotehost;check_nagios_cronjob;OK;HARD;1;(null)
[Fri Apr 10 05:38:16 2009];INITIAL SERVICE STATE: mynagios;Current Load;OK;HARD;1;(null)
[Fri Apr 10 05:51:16 2009];Warning: The results of service 'check_nagios_cronjob' on host 'remotehost' are stale by 0d 0h 3m 0s (threshold=0d 0h 10m 0s). I'm forcing an immediate check of the service.
[Fri Apr 10 05:51:21 2009];SERVICE ALERT: remotehost;check_nagios_cronjob;WARNING;HARD;1;WARNING: no information received from passive check (stale)
[Fri Apr 10 05:51:21 2009];SERVICE EVENT HANDLER: remotehost;check_nagios_cronjob;(null);(null);(null);check_by_ssh

define command{
        command_name    check_by_ssh
        command_line    $USER1$/eventhandlers/check_by_ssh.pl $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ $HOSTNAME$
        }

define service{
        use                     passive_template
        service_description     check_nagios_cronjob
        hostgroup_name          testgroup
        check_command           i_am_stale
        event_handler           check_by_ssh
        freshness_threshold     600
        }

define command{
        command_name    i_am_stale
        command_line    $USER1$/check_dummy 1 no information received from passive check (stale)
        }

Based on the configuration above, I'm expecting a passive check to time out after 10 minutes. Then check_command will kick in, but it just returns WARNING, which is HARD (I've set max_check_attempts to 1). It will then execute the event_handler, passing the macros $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ $HOSTNAME$ as arguments to check_by_ssh.pl. But based on the log output above, these macros contain '(null)'? I'm not sure why, but I'm pretty sure macros are working since $HOSTNAME$ was passed correctly. Please give me a hint on what I'm missing.

Thanks,
Marc
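One way to separate what the log line prints from what the handler actually receives is to have the handler record its own arguments. The following is a minimal sketch in Python rather than the poster's Perl script; `describe_args` is a hypothetical helper, and the argument order is assumed to match the command_line shown in the post.

```python
# Argument order assumed from the command_line in the post:
# $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ $HOSTNAME$
ARG_NAMES = ("SERVICESTATE", "SERVICESTATETYPE", "SERVICEATTEMPT", "HOSTNAME")


def describe_args(argv):
    """Label each positional argument so empty or missing macros stand out."""
    pairs = []
    for index, name in enumerate(ARG_NAMES):
        value = argv[index] if index < len(argv) else "<missing>"
        pairs.append("%s=%r" % (name, value))
    return " ".join(pairs)


# Example with the values the post expects the handler to receive.
print(describe_args(["WARNING", "HARD", "1", "remotehost"]))
```

A real handler would pass `sys.argv[1:]` to `describe_args` and append the result to a file writable by the nagios user before doing anything else.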
Re: [Nagios-users] Is Clientless monitoring possible
On Tue, Apr 14, 2009 at 7:22 PM, sudhaka...@i2.com wrote:
> Hi, I have installed nagios on one of our systems running redhat. Our network has around 25-30 servers, including windows, and other network devices such as switches etc. Please let me know if we can monitor the servers (including windows) without installing any client on the remote host. Regards, Sudhakar

You can also take a look at nsca. It does require you to install and run a daemon on the nagios server, but at least you don't need to worry about running daemons on the clients. You'll need to schedule the checks on the clients via cron or something else.
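To illustrate the cron-driven approach: a client-side job runs its check and hands send_nsca a single tab-delimited line (host, service description, return code, plugin output). The sketch below only builds that line; the function name, host name, and service name are hypothetical, and how the line is piped to send_nsca depends on the local installation.

```python
def nsca_service_line(host, service, return_code, output, delim="\t"):
    """Build one service check result line in send_nsca's default input format."""
    fields = (host, service, str(return_code), output)
    for field in fields:
        # A delimiter or newline inside a field would corrupt the record.
        if delim in field or "\n" in field:
            raise ValueError("field contains delimiter or newline: %r" % field)
    return delim.join(fields) + "\n"


# Hypothetical host and service; 1 is the WARNING return code.
print(repr(nsca_service_line("winserver01", "Disk C", 1, "WARNING - 85% used")))
```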
[Nagios-users] forcing a check script once every startup
Hi,

Is there any way for me to force an active check when nagios starts up? I don't need to execute this check at regular intervals, just once every nagios startup.

Thanks,
Marc
Re: [Nagios-users] parsing log files on remote hosts
On Fri, Apr 17, 2009 at 10:35 PM, kaouther mechri kmec...@gmail.com wrote:
> Hello All, I am searching for a way to parse some application log files on remote hosts and grep for specific words. I need to add this as a nagios check. Can anyone help me? Kind regards, kaouther

I would imagine doing this via a simple shell or perl script and executing it via nrpe / nsca.
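As a rough illustration of such a script (written here in Python rather than shell or perl), the sketch below scans lines for a pattern and maps the result to the standard plugin return codes, 0 for OK and 2 for CRITICAL. The function name, the sample lines, and the pattern are hypothetical.

```python
import re

OK, CRITICAL = 0, 2  # standard Nagios plugin return codes


def check_lines(lines, pattern):
    """Return (return_code, message) depending on whether any line matches."""
    regex = re.compile(pattern)
    hits = [line.rstrip("\n") for line in lines if regex.search(line)]
    if hits:
        return CRITICAL, "CRITICAL - %d matching line(s), last: %s" % (
            len(hits), hits[-1])
    return OK, "OK - no lines matching %r" % pattern


# Hypothetical log content and pattern.
code, message = check_lines(["app started\n", "ERROR disk full\n"], r"ERROR|FATAL")
print(code, message)  # 2 CRITICAL - 1 matching line(s), last: ERROR disk full
```

A real plugin would open the log file, print the message, and exit with the return code; it could then be run on the remote host through nrpe, or from cron with the result submitted through nsca. This version rescans whatever lines it is given, so remembering the last read position between runs is left out.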
Re: [Nagios-users] Precompiled Solaris Binaries
On Fri, Apr 17, 2009 at 10:08 PM, Christopher McAtackney crist...@gmail.com wrote:
> Hi everyone, I was wondering if there were pre-compiled binaries of Nagios 3.0.6, NRPE 2.12 and Plugins 1.4.13 for Solaris 10 x86 available anywhere? Cheers, Chris

Just throwing myself on the thread, I also was looking for these.

Marc.I
Re: [Nagios-users] Odd errors in nagios.log
On Thu, Apr 16, 2009 at 5:43 PM, Martin A. Brooks mar...@antibodymx.net wrote:

> Jim Avery wrote:
>
>> You don't have more than one instance of the Nagios daemon running, do you? Try stopping the Nagios daemon, make sure all instances of nagios are stopped (using ps -ef | grep nagios), and kill any which remain, then start the Nagios daemon using /etc/init.d/nagios start .
>
> I looked for that before, and there's definitely only one nagios instance running. The problem persists across nagios restarts and reboots.
>
> Thanks
>
> --
> Martin A. Brooks | http://www.antibodymx.net/ | Anti-spam anti-virus
> Consultant       | mar...@antibodymx.net      | filtering. Inoculate
> antibodymx.net   | m: +447792493388           | your mail system.

Try playing around with your check_result_reaper_* timings. I have a feeling that your check results are already stale before the reaper gets a chance to process them.

Tip: for a while, try setting max_check_result_file_age to 0 and see if you still get these errors. If not, then it's a timing misconfiguration.

cheers,
Marc.I
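Before changing the timings suggested above, it can help to see what is currently set. The following is a small sketch; the helper name `show_reaper_settings` and the default path `/etc/nagios/nagios.cfg` are assumptions, so pass the path of your own main config file.

```shell
#!/bin/sh
# show_reaper_settings: hypothetical helper that prints the active
# (uncommented) check-result timing directives from a Nagios main config.
show_reaper_settings() {
    cfg=${1:-/etc/nagios/nagios.cfg}   # default path is an assumption
    grep -E '^[[:space:]]*(check_result_reaper_frequency|max_check_result_reaper_time|max_check_result_file_age)[[:space:]]*=' "$cfg"
}

# Print the settings only when a config path is given on the command line.
if [ "$#" -gt 0 ]; then
    show_reaper_settings "$1"
fi
```

If `max_check_result_file_age` is small relative to how often the reaper runs, result files may be discarded as too old before they are processed, which is the situation the tip above is meant to rule out.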
Re: [Nagios-users] forcing a check script once every startup
On Fri, Apr 17, 2009 at 11:55 PM, Jim Avery j...@jimavery.me.uk wrote:

> 2009/4/17 Marc Ismael marcism...@gmail.com:
>
>> Hi,
>>
>> Is there any way for me to force an active check when nagios starts up? I don't need to execute this check at regular intervals, just once every nagios startup.
>>
>> Thanks,
>> Marc
>
> Forgive me if I'm missing something here, but can you not just add the relevant command to the /etc/init.d/nagios startup script?

I don't want to go that route. If there's a way / workaround to have it as a nagios-executable plugin, that'd be great.
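For reference, a one-off check can also be requested through Nagios's external command file, which is a different technique from the one discussed above. This sketch assumes `check_external_commands=1`; the helper name `force_service_check`, the host and service names, and the default command file path are all placeholders for illustration.

```shell
#!/bin/sh
# force_service_check: hypothetical helper that asks Nagios to run one
# service check immediately via the SCHEDULE_FORCED_SVC_CHECK external command.
# Usage: force_service_check <host_name> <service_description> [command_file]
force_service_check() {
    host=$1
    service=$2
    cmdfile=${3:-/var/log/nagios/rw/nagios.cmd}   # path is an assumption
    now=$(date +%s)   # epoch seconds; %s is supported by GNU and BSD date

    # Format: [timestamp] SCHEDULE_FORCED_SVC_CHECK;host;service;check_time
    printf '[%s] SCHEDULE_FORCED_SVC_CHECK;%s;%s;%s\n' \
        "$now" "$host" "$service" "$now" > "$cmdfile"
}

# Send the command only when host and service are given on the command line.
if [ "$#" -ge 2 ]; then
    force_service_check "$@"
fi
```

The command file is normally a named pipe, so the write only completes while the Nagios daemon is running and reading from it.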
Re: [Nagios-users] New to Nagios -- simple problem with data from nrpe
On Fri, Apr 17, 2009 at 11:39 PM, Brian O'Mahony brian.omah...@curamsoftware.com wrote:

> There seems to be a lot of documentation on Nagios all over the web. I must say I'm impressed with it, even though I'm still installing it.
>
> Basically I have the server set up and monitoring itself, and I have one remote machine, which I have set up 6 tests on just for testing. I have the nagios plugins and nrpe installed and everything is working in that regard. If I run the check_nrpe command with any of the base remote commands in the /etc/nagios/nrpe.cfg file, I get the output to my screen.
>
> [nag...@mido ~]$ /usr/lib/nagios/plugins/check_nrpe -H 172.16.165.248 -c check_root_disk
> DISK OK - free space: / 4460 MB (95% inode=98%);| /=208MB;4898;4908;0;4918
>
> (IP is the remote machine)
>
> However, when these tests are added to server.cfg files, I get Status==UNKNOWN and Status Information=(No output returned from plugin).
>
> I reckon it's something simple that I've missed, but I've gotten a bit lost looking for it.
>
> Server:
> Rhel5.1
> Nagios-3.0.6 installed from source
> Plugins 1.4.13-4 from rpm
> Plugins-nrpe 2.12-6 from rpm
>
> Remote Server:
> Rhel4u6
> nagios-nrpe-2.5.2-1
> nagios-plugins-nrpe-2.12-6
> nagios-plugins-1.4.13-1.el4
>
> Thanks
> B

I would do the following:

- make sure that nrpe on the nagios server is connecting to the correct port at the nrpe server by comparing nrpe config files (check encryption, port, privs, etc).
- make sure that the nrpe service on the clients is actually running and listening on the configured port.
- check service and command definitions.

Also, I would double-check the nrpe PDF documentation -- there are examples there to get you up and running with nrpe.
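The first step in the list above, comparing config values, can be sketched as a small helper that pulls one directive out of an nrpe.cfg. The name `nrpe_cfg_value` is hypothetical; `server_port` and `allowed_hosts` are standard nrpe.cfg directives, and the file path is whatever your installation uses.

```shell
#!/bin/sh
# nrpe_cfg_value: hypothetical helper that prints the value of the last
# uncommented "key=value" line for <key> in an nrpe.cfg-style file.
# Usage: nrpe_cfg_value <config_file> <key>
nrpe_cfg_value() {
    cfg=$1
    key=$2
    awk -F= -v k="$key" '
        /^[ \t]*#/ { next }                         # skip comment lines
        { name = $1; gsub(/[ \t]/, "", name) }      # directive name, no blanks
        name == k {
            val = substr($0, index($0, "=") + 1)    # everything after first "="
            sub(/^[ \t]+/, "", val)                 # trim leading blanks
        }
        END { if (val != "") print val }
    ' "$cfg"
}

# Print a value only when file and key are given on the command line.
if [ "$#" -ge 2 ]; then
    nrpe_cfg_value "$1" "$2"
fi
```

For example, `nrpe_cfg_value /etc/nagios/nrpe.cfg server_port` on the remote host should agree with the port check_nrpe connects to (5666 unless overridden with `-p`), and `allowed_hosts` should include the address of the Nagios server. Running check_nrpe with only `-H <host>` and no `-c` is a quick connectivity test that should report the NRPE version.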
Re: [Nagios-users] forcing a check script once every startup
On Sat, Apr 18, 2009 at 1:19 AM, Marc Powell m...@ena.com wrote:

> On Apr 17, 2009, at 9:47 AM, Marc Ismael wrote:
>
>> Hi,
>>
>> Is there any way for me to force an active check when nagios starts up? I don't need to execute this check at regular intervals, just once every nagios startup.
>
> Not directly, but I'd try setting a really, really long check_interval combined with retain_status_information 0, retain_nonstatus_information 0.
>
> --
> Marc

(I just kicked myself..) I have had this problem for a while, why didn't I think of that. Thanks Marc!

Marc
[Nagios-users] unsubscribe
On Mon, Oct 25, 2010 at 5:10 PM, Cosmin Neagu cosmin.ne...@omnilogic.ro wrote:

> Hello,
>
> I'm trying to use the plugin check_telnet from Nagios Exchange, but I'm having this error.
>
> nag...@mon2:/usr/local/nagios/libexec$ ./check_telnet.pl -H 172.31.1.211
> Can't call method "close" on an undefined value at ./check_telnet.pl line 120.
>
> ...and this is line 120:
>
> $telnet->close;
>
> Is anyone using this plugin with success? I know that I can check tcp port 23, but I was curious to see how this plugin works. Can someone help me?
>
> The OS is ubuntu 10.4, nagios v3.2.3
>
> --
> Cosmin Neagu