FYI, while waiting for nagios to start polling anything after startup, I decided to look at what it's doing...

The backtrace below shows the main thread sitting in nanosleep, which would explain why it starts up and sits around, consuming no cycles but not polling either. Was a sleep left in the code? The log entries below arrive a few minutes apart (119 and 175 seconds).

This is running on 2.0b6, x86_64 arch, compiled from source with perlcache.

/eli

###FILE: nagios.log:
[1134076786] Finished daemonizing... (New PID=11914)
[1134076905] service_result_worker_thread(): poll(): EINTR (impossible)
[1134077080] service_result_worker_thread(): poll(): EINTR (impossible)
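
For anyone who wants to double-check the gaps quoted above, here is a minimal sketch that pulls the epoch timestamps out of the three log lines and prints the seconds between consecutive entries. The helper name `entry_gaps` is made up for illustration and is not part of Nagios.

```python
import re

# The three log lines from nagios.log, copied verbatim.
LOG_LINES = [
    "[1134076786] Finished daemonizing... (New PID=11914)",
    "[1134076905] service_result_worker_thread(): poll(): EINTR (impossible)",
    "[1134077080] service_result_worker_thread(): poll(): EINTR (impossible)",
]


def entry_gaps(lines):
    """Return the seconds elapsed between consecutive timestamped lines."""
    stamps = []
    for line in lines:
        match = re.match(r"\[(\d+)\]", line)  # leading [epoch] field
        if match:
            stamps.append(int(match.group(1)))
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


print(entry_gaps(LOG_LINES))  # [119, 175]
```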


### GDB info:
Attaching to program: /usr/local/nagios/bin/nagios, process 11914
Reading symbols from /usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi/CORE/libperl.so...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi/CORE/libperl.so
Reading symbols from /lib64/libnsl.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libnsl.so.1
Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/tls/libm.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/tls/libm.so.6
Reading symbols from /lib64/libcrypt.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libcrypt.so.1
Reading symbols from /lib64/libutil.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libutil.so.1
Reading symbols from /lib64/tls/libpthread.so.0...
(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
[New Thread 182894164416 (LWP 11914)]
[New Thread 1094719840 (LWP 11917)]
[New Thread 1084229984 (LWP 11915)]
Loaded symbols for /lib64/tls/libpthread.so.0
Reading symbols from /lib64/tls/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/tls/libc.so.6
Reading symbols from /usr/lib64/libltdl.so.3...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libltdl.so.3
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
0x000000364700b9c5 in __nanosleep_nocancel ()
   from /lib64/tls/libpthread.so.0

(gdb) where
#0  0x000000364700b9c5 in __nanosleep_nocancel () from /lib64/tls/libpthread.so.0
#1  0x00000000004209aa in event_execution_loop ()
#2  0x000000000040efa0 in main ()

(gdb) info registers
rax            0xfffffffffffffdfc       -516
rbx            0x861bb0 8788912
rcx            0xffffffffffffffff       -1
rdx            0x2      2
rsi            0x0      0
rdi            0x7fbffff450     548682069072
rbp            0x0      0x0
rsp            0x7fbffff410     0x7fbffff410
r8             0x0      0
r9             0x2e8a   11914
r10            0x7fbffff301     548682068737
r11            0x202    514
r12            0x7fbffff450     548682069072
r13            0xffffffff       4294967295
r14            0xffffffff       4294967295
r15            0x7fbffffa08     548682070536
rip            0x364700b9c5     0x364700b9c5 <__nanosleep_nocancel+60>
eflags         0x202    514
cs             0x33     51
ss             0x2b     43
ds             0x0      0
es             0x0      0
fs             0x0      0
gs             0x0      0
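
A side note on the register dump: gdb already shows rax as -516, and the sketch below only confirms that this is the two's-complement reading of 0xfffffffffffffdfc. As far as I know (this is an assumption, not something stated in the post), 516 is the Linux kernel-internal code ERESTART_RESTARTBLOCK, which is what a restartable call such as nanosleep leaves behind when it is interrupted, for example by a debugger attaching. If so, the value reflects the gdb attach rather than a fault in nagios. The helper name `to_signed64` is made up for illustration.

```python
def to_signed64(value):
    """Interpret an unsigned 64-bit register value as two's complement."""
    value &= (1 << 64) - 1          # clamp to 64 bits
    if value & (1 << 63):           # sign bit set: value is negative
        return value - (1 << 64)
    return value


rax = 0xfffffffffffffdfc            # copied from "info registers"
print(to_signed64(rax))             # -516
```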


Fred wrote:
I do the same thing with check_icmp, except that I use sudo and create
a simple sudoers entry like this (see the CHECKICMP alias):

Cmnd_Alias CHECKALLSSHKEYS = /opt/hptc/nagios/libexec/check_keys # HP-HPTC-KeySync
Cmnd_Alias CHECKSYSLOGALERTS = /opt/hptc/nagios/libexec/check_syslogalerts # HP-HPTC-SysLog
Cmnd_Alias CHECKSFS = /opt/hptc/nagios/libexec/check_sfs # HP-HPTC-SysLog
Cmnd_Alias CHECKLSF = /opt/hptc/nagios/libexec/check_lsf # HP-HPTC-CheckLSF
Cmnd_Alias CHECKICMP = /opt/hptc/nagios/libexec/check_icmp # HP-HPTC-CheckICMP
nagios ALL = NOPASSWD: CHECKALLSSHKEYS,CHECKSYSLOGALERTS,CHECKSFS,CHECKLSF,CHECKICMP # HP-HPTC-Nagios

I just built the 2.0b5 and hope to give it a try in the next few days on a
700+ node system ... I am hoping that this *solves* the delay problem
that existed in the previous releases.

-FredC


*/Eli Stair <[EMAIL PROTECTED]>/* wrote:


    I'm running a fresh build of 2.0b5 on x86_64. After an initial start of
    nagios, it can take up to 10 minutes for the first host or service
    checks to begin. There is no CPU load by the nagios process during this
    time. I have over 1000 hosts to check, and have reduced the max
    host/service check spread in order to ensure that it is not "evening"
    out the time.

    This problem is NOT occurring on a 2.0b3 build with the exact same
    configuration.

    After the checks DO start, it can take hours to finish. I've changed
    the user to root so that I can have the host check be
    check_icmp -t 1 -p 1.

    Unfortunately, even with this situation, having anywhere between 4 and
    64 hosts go down can make the "monitoring" aspect effectively useless.

    Any suggestions on the problem of startup lag?
    Any ways to further speed up the host check runs, aside from using
    check_icmp?

    Thanks,

    /eli

    ### inline nagios.cfg:


    [EMAIL PROTECTED] etc]# cat nagios.cfg | egrep -v "^#|^$"
    log_file=/var/log/nagios/nagios.log
    cfg_file=/usr/local/nagios/etc/checkcommands.cfg
    cfg_file=/usr/local/nagios/etc/misccommands.cfg
    cfg_dir=/usr/local/nagios/etc/config
    cfg_file=/usr/local/nagios/etc/timeperiods.cfg
    cfg_file=/usr/local/nagios/etc/contacts.cfg
    cfg_file=/usr/local/nagios/etc/contactgroups.cfg
    cfg_file=/usr/local/nagios/etc/hosts.cfg
    cfg_file=/usr/local/nagios/etc/hostgroups.cfg
    cfg_file=/usr/local/nagios/etc/customcommands.cfg
    cfg_file=/usr/local/nagios/etc/services.cfg
    object_cache_file=/usr/local/nagios/var/objects.cache
    resource_file=/usr/local/nagios/etc/resource.cfg
    status_file=/usr/local/nagios/var/status.dat
    nagios_user=root
    nagios_group=root
    check_external_commands=1
    command_check_interval=-1
    command_file=/usr/local/nagios/var/rw/nagios.cmd
    comment_file=/usr/local/nagios/var/comments.dat
    downtime_file=/usr/local/nagios/var/downtime.dat
    lock_file=/usr/local/nagios/var/nagios.lock
    temp_file=/usr/local/nagios/var/nagios.tmp
    event_broker_options=-1
    log_rotation_method=d
    log_archive_path=/var/log/nagios/archives
    use_syslog=1
    log_notifications=1
    log_service_retries=1
    log_host_retries=1
    log_event_handlers=1
    log_initial_states=0
    log_external_commands=1
    log_passive_checks=1
    service_inter_check_delay_method=s
    max_service_check_spread=15
    service_interleave_factor=s
    host_inter_check_delay_method=s
    max_host_check_spread=10
    max_concurrent_checks=0
    service_reaper_frequency=15
    auto_reschedule_checks=0
    auto_rescheduling_interval=30
    auto_rescheduling_window=180
    sleep_time=0.25
    service_check_timeout=60
    host_check_timeout=30
    event_handler_timeout=30
    notification_timeout=30
    ocsp_timeout=5
    perfdata_timeout=5
    retain_state_information=1
    state_retention_file=/usr/local/nagios/var/retention.dat
    retention_update_interval=0
    use_retained_program_state=1
    use_retained_scheduling_info=0
    interval_length=60
    use_aggressive_host_checking=0
    execute_service_checks=1
    accept_passive_service_checks=0
    execute_host_checks=1
    accept_passive_host_checks=1
    enable_notifications=1
    enable_event_handlers=1
    process_performance_data=0
    obsess_over_services=0
    check_for_orphaned_services=0
    check_service_freshness=1
    service_freshness_check_interval=60
    check_host_freshness=1
    host_freshness_check_interval=60
    aggregate_status_updates=1
    status_update_interval=15
    enable_flap_detection=0
    low_service_flap_threshold=5.0
    high_service_flap_threshold=20.0
    low_host_flap_threshold=5.0
    high_host_flap_threshold=20.0
    date_format=iso8601
    illegal_object_name_chars=`~!$%^&*|'"<>?,()=
    illegal_macro_output_chars=`~$&|'"<>
    use_regexp_matching=0
    use_true_regexp_matching=0
    admin_email=nagios
    admin_pager=pagenagios
    daemon_dumps_core=0



    -------------------------------------------------------
    This SF.net email is sponsored by: Splunk Inc. Do you grep through
    log files
    for problems? Stop! Download the new AJAX search engine that makes
    searching your log files as easy as surfing the web. DOWNLOAD SPLUNK!
    http://ads.osdn.com/?ad_id=7637&alloc_id=16865&op=click
    _______________________________________________
    Nagios-users mailing list
    [email protected]
    https://lists.sourceforge.net/lists/listinfo/nagios-users
    ::: Please include Nagios version, plugin version (-v) and OS when
    reporting any issue.
    ::: Messages without supporting info will risk being sent to /dev/null






