On 10/09/2014 10:52 AM, Michael Friedrich wrote:

On 09.10.2014 at 15:55, Jim Miller wrote:


I've inherited an Icinga 1.8.4 and Check_MK 1.2.4b6 server. I don't
know if this is a new or an existing issue, but the icinga daemon crashes
any time I check "Host notifications" or "Host/Svc notification" within
Check_MK for any host or service.

I am _truly_ struggling with how to troubleshoot this issue. I've
enabled debugging and core dumps (where are the core dumps written and
what are they named? I can't find any). I've also run 'strace -f -o
/tmp/strace_icinga.strace /usr/bin/icinga /etc/icinga/icinga.cfg', but I
can't spot a common pattern in the output.
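
A minimal sketch for the core dump question, assuming a Linux host: the kernel writes a core file only when the crashing process has a non-zero core size limit, and it names the file according to kernel.core_pattern. With a relative pattern such as "core", the file typically lands in the working directory of the crashing process. The function name show_core_settings is made up for this example.

```shell
# Print the settings that decide whether and where a core file is written.
show_core_settings() {
    # Soft core size limit of this shell; 0 means no core file is written.
    echo "core size limit: $(ulimit -c)"
    # Kernel template for core file names and locations (Linux only).
    if [ -r /proc/sys/kernel/core_pattern ]; then
        echo "core pattern: $(cat /proc/sys/kernel/core_pattern)"
    else
        echo "core pattern: unavailable"
    fi
}

show_core_settings
```

Note that this reports the limit of the current shell. The limit that matters is the one of the running daemon, which may differ if the init script sets its own; `grep -i core /proc/<pid>/limits` shows it for a given process ID.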



Most likely the recent mk_livestatus versions require a more up-to-date
Icinga version (1.11+). My guess is that livestatus tries to access
symbols that are undefined in the older core, which causes the core
process to crash.

You could either try an upgrade, or fetch a gdb backtrace by running
Icinga core in the foreground and then firing a query through livestatus.
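
The backtrace suggestion could be carried out roughly as follows. This is a sketch, not a verified procedure: the binary and config paths are the ones from the strace command earlier in this thread, the function name icinga_backtrace is made up, and it assumes gdb is installed and that Icinga stays in the foreground when started without -d. Without debug symbols the frames show up as "??"; on RPM-based systems the yum-utils tool debuginfo-install may be able to fetch them if a debuginfo repository is available.

```shell
# Run Icinga in the foreground under gdb and print a backtrace when it stops.
# Paths default to the ones used in the strace command earlier in this thread.
icinga_backtrace() {
    bin="${1:-/usr/bin/icinga}"
    cfg="${2:-/etc/icinga/icinga.cfg}"
    if ! command -v gdb >/dev/null 2>&1; then
        echo "gdb not found" >&2
        return 1
    fi
    # -batch exits once the listed commands finish;
    # 'bt full' prints the stack including local variables.
    gdb -batch -ex run -ex "bt full" --args "$bin" "$cfg"
}

# Usage (stop the regular daemon first so two cores do not run side by side):
#   icinga_backtrace
# Then, from a second terminal, trigger the crash, for example by opening
# the notification view in Check_MK.
```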


I don't have a build with debug symbols included; Icinga was installed
from EPEL. Here's what GDB shows when icinga crashes:

Program received signal SIGSEGV, Segmentation fault.
0x0000003848432925 in ?? ()
(gdb) bt
#0  0x0000003848432925 in ?? ()
#1  0x00000000004046ac in ?? ()
#2  0x0000000000000002 in ?? ()
#3  0x00007fff41168b00 in ?? ()
#4  0x0000000000659040 in ?? ()
#5  0x00007fff41168a70 in ?? ()
#6  0x0000000000000000 in ?? ()
(gdb) quit
A debugging session is active.


Here's what strace looks like when icinga crashes:
nanosleep({0, 250000000}, NULL)         = ? ERESTART_RESTARTBLOCK (To be restarted)
--- SIGCHLD (Child exited) @ 0 (0) ---
restart_syscall(<... resuming interrupted call ...>) = 0
write(3, "[1413033030.967753] [008.1] [pid"..., 59) = 59
lseek(3, 0, SEEK_CUR)                   = 67151841
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3543, ...}) = 0
write(3, "[1413033030.968324] [008.1] [pid"..., 95) = 95
lseek(3, 0, SEEK_CUR)                   = 67151936
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3543, ...}) = 0
write(3, "[1413033030.968773] [008.1] [pid"..., 95) = 95
lseek(3, 0, SEEK_CUR)                   = 67152031
write(3, "[1413033030.969052] [008.1] [pid"..., 74) = 74
lseek(3, 0, SEEK_CUR)                   = 67152105
wait4(-1, NULL, WNOHANG, NULL)          = 15959
wait4(-1, NULL, WNOHANG, NULL)          = -1 ECHILD (No child processes)
write(3, "[1413033030.969594] [008.2] [pid"..., 96) = 96
lseek(3, 0, SEEK_CUR)                   = 67152201
write(3, "[1413033030.969836] [001.0] [pid"..., 69) = 69
lseek(3, 0, SEEK_CUR)                   = 67152270
write(3, "[1413033030.970163] [064.1] [pid"..., 68) = 68
lseek(3, 0, SEEK_CUR)                   = 67152338
write(3, "[1413033030.970421] [064.2] [pid"..., 76) = 76
lseek(3, 0, SEEK_CUR)                   = 67152414
write(3, "[1413033030.970685] [064.2] [pid"..., 76) = 76
lseek(3, 0, SEEK_CUR)                   = 67152490
nanosleep({0, 250000000},  <unfinished ...>
+++ killed by SIGSEGV +++
Segmentation fault








I'm not sure if this is related or not, but the /var/log logical volume
filled up (yes, the irony is not lost on me), and after increasing the
volume size and poking around the website I noticed the issue.

Here's the config file for icinga:

log_file=/var/log/icinga/icinga.log
cfg_dir=/etc/icinga/conf.d
cfg_file=/etc/icinga/objects/localhost.cfg
cfg_dir=/etc/icinga/modules
object_cache_file=/var/spool/icinga/objects.cache
precached_object_file=/var/spool/icinga/objects.precache
resource_file=/etc/icinga/resource.cfg
status_file=/var/spool/icinga/status.dat
status_update_interval=10
icinga_user=icinga
icinga_group=icinga
check_external_commands=1
command_check_interval=-1
command_file=/var/spool/icinga/cmd/icinga.cmd
external_command_buffer_slots=32768
lock_file=/var/run/icinga.pid
temp_file=/tmp/icinga.tmp
temp_path=/tmp
log_rotation_method=d
log_archive_path=/var/log/icinga/archives
use_daemon_log=1
use_syslog=0
use_syslog_local_facility=0
syslog_local_facility=5
log_notifications=1
log_service_retries=1
log_host_retries=1
log_event_handlers=1
log_initial_states=1
log_current_states=1
log_external_commands=1
log_passive_checks=0
log_long_plugin_output=0
service_inter_check_delay_method=s
max_service_check_spread=5
service_interleave_factor=s
host_inter_check_delay_method=s
max_host_check_spread=5
max_concurrent_checks=1000
check_result_reaper_frequency=1
max_check_result_reaper_time=600
check_result_path=/var/spool/icinga/checkresults
max_check_result_file_age=3600
cached_host_check_horizon=15
cached_service_check_horizon=15
enable_predictive_host_dependency_checks=1
enable_predictive_service_dependency_checks=1
soft_state_dependencies=0
auto_reschedule_checks=0
auto_rescheduling_interval=30
auto_rescheduling_window=180
sleep_time=0.25
service_check_timeout=60
host_check_timeout=30
event_handler_timeout=30
notification_timeout=30
ocsp_timeout=5
perfdata_timeout=5
retain_state_information=1
state_retention_file=/var/spool/icinga/retention.dat
retention_update_interval=60
use_retained_program_state=1
dump_retained_host_service_states_to_neb=1
use_retained_scheduling_info=1
retained_host_attribute_mask=0
retained_service_attribute_mask=0
retained_process_host_attribute_mask=0
retained_process_service_attribute_mask=0
retained_contact_host_attribute_mask=0
retained_contact_service_attribute_mask=0
interval_length=60
use_aggressive_host_checking=0
execute_service_checks=1
accept_passive_service_checks=1
execute_host_checks=1
accept_passive_host_checks=1
enable_notifications=1
enable_event_handlers=1
process_performance_data=1
obsess_over_services=0
obsess_over_hosts=0
translate_passive_host_checks=0
passive_host_checks_are_soft=0
check_for_orphaned_services=1
check_for_orphaned_hosts=1
service_check_timeout_state=u
check_service_freshness=1
service_freshness_check_interval=60
check_host_freshness=1
host_freshness_check_interval=60
additional_freshness_latency=15
enable_flap_detection=1
low_service_flap_threshold=5.0
high_service_flap_threshold=20.0
low_host_flap_threshold=5.0
high_host_flap_threshold=20.0
date_format=iso8601
p1_file=/usr/lib64/icinga/p1.pl
enable_embedded_perl=0
use_embedded_perl_implicitly=1
stalking_event_handlers_for_hosts=0
stalking_event_handlers_for_services=0
stalking_notifications_for_hosts=0
stalking_notifications_for_services=0
illegal_object_name_chars=`~!$%^&*|'"<>?,()=
illegal_macro_output_chars=`~$&|'"<>
keep_unknown_macros=0
use_regexp_matching=0
use_true_regexp_matching=0
admin_email=icinga@localhost
admin_pager=pageicinga@localhost
daemon_dumps_core=1
use_large_installation_tweaks=1
enable_environment_macros=0
child_processes_fork_twice=0
debug_level=-1
debug_verbosity=2
debug_file=/var/log/icinga/icinga.debug
max_debug_file_size=100000000
event_profiling_enabled=0
broker_module=/usr/lib/check_mk/livestatus.o /var/spool/icinga/cmd/live
event_broker_options=-1

_______________________________________________
icinga-users mailing list
[email protected]<mailto:[email protected]>
https://lists.icinga.org/mailman/listinfo/icinga-users





-- 
Michael Friedrich, DI (FH)
Application Developer

NETWAYS GmbH | Deutschherrnstr. 15-19 | D-90429 Nuernberg
Tel: +49 911 92885-0 | Fax: +49 911 92885-77
GF: Julian Hein, Bernd Erk | AG Nuernberg HRB18461
http://www.netways.de | 
[email protected]<mailto:[email protected]>
