Hi list,
 
We are about to migrate from Nagios to Icinga. After working through
several smaller issues, we are now stuck on a problem that we have not
been able to solve on our own. Before opening a new issue, I would like
to ask whether there is an error in our setup, whether I have missed an
earlier mail about this on the list, or whether it is in fact a (new)
bug in Icinga:
 
Most of our host and service checks are passive and receive results at
a regular interval (distributed monitoring). Some of our passive checks,
however, receive no regular check results; they only get a result when
the state of the particular host or service changes. In many cases there
is no state change, and therefore no check result update, for about two
weeks or even longer.
 
The ClassicUI displays everything correctly: last check, duration,
number of host and service checks, etc. (in this case, the last check
was on 30-08-2011). All entries in status.dat and retention.dat are okay
and up to date.
 
If we activate the "glorious three" (idomod/ido2db/mysql) and run
'/etc/init.d/icinga start', the following happens: the retained
information only seems to be transferred to the database if it is not
too old (my guess is about a week or so, and this affects both host and
service checks). Older entries are not written to our database (I have
checked the tables icinga_servicechecks and icinga_servicestatus; in
icinga_objects everything is correct), so Icinga-web does not display
them and calculates a wrong number of host and service checks. Even more
curious: if we submit passive service and host check results via the
interface or the command line, the results are displayed correctly for
about a week, and then they are "gone" (although max_age for service
and host checks is one day)... At the moment we have a difference of
about 800 service checks between ClassicUI and Icinga-web (or rather,
between status.dat/retention.dat and IDO), and this is the only problem
that prevents us from migrating to Icinga.
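
To make the discrepancy easier to pin down, the services whose last
check is older than a given age can be listed straight from status.dat.
The following is only a rough sketch: it assumes the usual block layout
("servicestatus {" ... "}" with key=value lines and an epoch last_check
field), and the function names parse_status_blocks and stale_services
are made up for illustration, not part of Icinga.

```python
import re
import time


def parse_status_blocks(text):
    """Split status.dat-style text into (block_type, fields) pairs.

    Hypothetical helper; assumes blocks open with 'type {' and close
    with a line containing only '}'.
    """
    blocks = []
    current_type, current = None, None
    for raw in text.splitlines():
        line = raw.strip()
        if current is None:
            match = re.match(r"^(\w+)\s*\{$", line)
            if match:
                current_type, current = match.group(1), {}
        elif line == "}":
            blocks.append((current_type, current))
            current_type, current = None, None
        elif "=" in line:
            # Split on the first '=' only, plugin output may contain more.
            key, value = line.split("=", 1)
            current[key] = value
    return blocks


def stale_services(text, max_age_seconds, now=None):
    """Return (host, service, last_check) for services checked too long ago."""
    now = time.time() if now is None else now
    stale = []
    for block_type, fields in parse_status_blocks(text):
        if block_type != "servicestatus":
            continue
        last_check = int(fields.get("last_check", "0"))
        if now - last_check > max_age_seconds:
            stale.append(
                (fields.get("host_name"),
                 fields.get("service_description"),
                 last_check)
            )
    return stale
```

Called with the contents of the file named by status_file in icinga.cfg
and max_age_seconds=7 * 86400, this would list every service not checked
within the last week; that list can then be compared with the rows
present in icinga_servicestatus.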
 
To give a better picture of our setup, here are some extracts from our
config:
 
Ido2db:
max_timedevents_age=60
max_systemcommands_age=1440
max_servicechecks_age=1440
max_hostchecks_age=1440
max_eventhandlers_age=10080
max_externalcommands_age=0
max_logentries_age=0
max_acknowledgements_age=0
clean_realtime_tables_on_core_startup=1
clean_config_tables_on_core_startup=1
trim_db_interval=60
housekeeping_thread_startup_delay=60
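
As a quick cross-check of these values against the observed "about a
week", the age limits can be converted to days. This is a small sketch
that assumes the max_*_age options are given in minutes; the helper name
minutes_to_days is made up for illustration.

```python
# Values copied from the ido2db extract above.
# Assumption: the unit of the max_*_age options is minutes.
MAX_AGE_MINUTES = {
    "max_timedevents_age": 60,
    "max_systemcommands_age": 1440,
    "max_servicechecks_age": 1440,
    "max_hostchecks_age": 1440,
    "max_eventhandlers_age": 10080,
    "max_externalcommands_age": 0,
    "max_logentries_age": 0,
    "max_acknowledgements_age": 0,
}


def minutes_to_days(minutes):
    """Convert a max_*_age value from minutes to days (0 stays 0)."""
    return minutes / (60 * 24)


if __name__ == "__main__":
    for option, minutes in MAX_AGE_MINUTES.items():
        print(f"{option}: {minutes} min = {minutes_to_days(minutes):.2f} days")
```

Under that assumption, the check-related limits come out at one day, and
the only setting that corresponds to a week is max_eventhandlers_age.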
 
 
idomod:
output_buffer_items=5000
file_rotation_interval=14400
file_rotation_timeout=60
reconnect_interval=1
reconnect_warning_interval=15
data_processing_options=-1
config_output_options=2
 
 
Any ideas?
 
Thanks in advance and kind regards
 
Klaus
 
 
 