Hi. I'm running Icinga 1.7.0 with IDOUtils compiled in, all from the FreeBSD ports on FreeBSD 8.2. idomod is regularly logging connection errors. ido2db child processes are also segfaulting, but I can't find the cause and don't know whether it's even related. Some googling turned up similar errors from a couple of years ago, but those threads all seem to conclude with a fix being applied to the code at much earlier revisions than this.
In my icinga.log (debug level 7) I'm getting the following:

[1342572821] idomod: Successfully reconnected to data sink! 0 items lost, 10 queued items to flush.
[1342572821] idomod: Successfully flushed 10 queued items to data sink.
[1342572843] idomod: Error writing to data sink! Some output may get lost...
[1342572843] idomod: Please check remote ido2db log, database connection or SSL Parameters

In my idomod.debug (debug level 3) at the same time:

[1342572843.163290] [001.2] [pid=37454] idomod_write_to_sink( 202: 1=300 2=0 3=0 4=1342572843.163263 73=1342572843 74=262144 72=idomod: Error writing to data sink! Some output may get lost... 999 )

ido2db.log is set with debug_level 3 as well, but isn't logging anything with that same timestamp (even just rounding to the nearest second). The timestamp itself just appears in lines like this slightly after the event:

[1342572863.110588] [001.2] [pid=37907] [tid=34380726720] ido2db_handle_client_input() line: 18, type: 4, VAL: 1342572843.163263
[1342572863.110600] [001.2] [pid=37907] [tid=34380726720] ido2db_add_input_data_item() start
[1342572863.110611] [001.2] [pid=37907] [tid=34380726720] ido2db_add_input_data_item(1342572843.163263)
[1342572863.110623] [001.2] [pid=37907] [tid=34380726720] ido2db_start_input_data() end
[1342572863.110634] [001.2] [pid=37907] [tid=34380726720] ido2db_handle_client_input() end

It does occasionally log these, although I haven't found any where the timestamps match up to other logs:

[1342572863.110887] [001.2] [pid=37907] [tid=34380726720] ido2db_add_input_data_item(idomod: Error writing to data sink! Some output may get lost...)
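For lining the logs up by time, a rough sketch like the one below might help. It only assumes that each line begins with a bracketed epoch stamp, with or without a fractional part, as in the excerpts above. The names `log_time` and `within` are made up for illustration and are not part of Icinga or IDOUtils.

```python
import re
from datetime import datetime, timezone

# Leading "[<epoch>]" or "[<epoch>.<fraction>]" stamp, as seen in the excerpts above.
STAMP = re.compile(r"^\[(\d+)(?:\.(\d+))?\]")


def log_time(line):
    """Return the line's leading epoch stamp as a UTC datetime, or None if absent."""
    m = STAMP.match(line.strip())
    if m is None:
        return None
    seconds = int(m.group(1))
    # Pad or truncate the fractional part to exactly six digits (microseconds).
    micros = int((m.group(2) or "0").ljust(6, "0")[:6])
    stamp = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return stamp.replace(microsecond=micros)


def within(lines, target_epoch, window=30):
    """Yield the lines stamped within `window` seconds of `target_epoch`."""
    for line in lines:
        t = log_time(line)
        if t is not None and abs(t.timestamp() - target_epoch) <= window:
            yield line
```

Feeding the lines of icinga.log, idomod.debug and ido2db.log through `within(..., 1342572843)` would pull out everything logged near the write error. The syslog-style postgres lines carry local wall-clock time rather than epoch seconds, so comparing them against the UTC values from `log_time` also needs the host's timezone.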
And I'm periodically seeing segfaults of child processes:

[1342572868.037778] [001.2] [pid=37907] [tid=34380726720] Child caught signal '11' exiting
[1342572868.037789] [001.2] [pid=37907] [tid=34380726720] ido2db_child_sighandler() end
[1342572868.037804] [001.2] [pid=37907] [tid=34380726720] ido2db_db_disconnect() start
[1342572868.037817] [001.2] [pid=37907] [tid=34380726720] ido2db_db_disconnect() already disconnected
[1342572868.037828] [001.2] [pid=37907] [tid=34380726720] ido2db_db_deinit() start
[1342572868.037843] [001.2] [pid=37907] [tid=34380726720] ido2db_free_cached_object_ids() start
[1342572868.037880] [001.2] [pid=37907] [tid=34380726720] ido2db_free_cached_object_ids() end
[1342572868.037894] [001.2] [pid=37907] [tid=34380726720] ido2db_db_deinit() end
[1342572868.038158] [001.2] [pid=37164] [tid=34380726720] ido2db_parent_sighandler() start
[1342572868.038196] [001.2] [pid=37164] [tid=34380726720] processing signal '20'
[1342572868.038210] [001.2] [pid=37164] [tid=34380726720] cleanup children that exit, so we don't have zombies

From postgresql I'm getting these, every time there's a flush of data:

Jul 18 01:45:17 mon1 postgres[2132]: [2-1] LOG: connection received: host=127.0.0.1 port=38972
Jul 18 01:45:17 mon1 postgres[2132]: [3-1] LOG: connection authorized: user=icinga database=icinga
Jul 18 01:45:17 mon1 postgres[2132]: [4-1] LOG: unexpected EOF on client connection
Jul 18 01:45:17 mon1 postgres[2132]: [5-1] LOG: disconnection: session time: 0:00:00.117 user=icinga database=icinga host=127.0.0.1 port=38972

I'm not able to piece all of this together into a coherent root cause. Is anyone able to explain what I'm seeing here, and what the real problem is? Is there further information collection I can do?
_______________________________________________
icinga-users mailing list
icinga-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/icinga-users