Hi.  I'm running Icinga 1.7.0 with IDOUtils compiled in, all from the FreeBSD
ports on FreeBSD 8.2.  idomod is regularly logging errors writing to the data
sink, and ido2db child processes are segfaulting, but I can't find the cause of
either and don't know whether the two are even related.  Some googling turned
up similar errors from a couple of years ago, but those threads all seem to
conclude with a fix being applied to the code at much earlier revisions than
this.

In my icinga.log (debug level 7) I'm getting the following:
[1342572821] idomod: Successfully reconnected to data sink!  0 items lost, 10 queued items to flush.
[1342572821] idomod: Successfully flushed 10 queued items to data sink.
[1342572843] idomod: Error writing to data sink!  Some output may get lost...
[1342572843] idomod: Please check remote ido2db log, database connection or SSL Parameters

In my idomod.debug (debug level 3) at the same time:
[1342572843.163290] [001.2] [pid=37454] idomod_write_to_sink(
202:
1=300
2=0
3=0
4=1342572843.163263
73=1342572843
74=262144
72=idomod: Error writing to data sink!  Some output may get lost...
999

)

ido2db.log is set to debug_level 3 as well, but it logs nothing at that same
timestamp (even rounding to the nearest second).  The timestamp only shows up
as a value in lines like these, logged about 20 seconds after the event:
[1342572863.110588] [001.2] [pid=37907] [tid=34380726720] ido2db_handle_client_input() line: 18, type: 4, VAL: 1342572843.163263
[1342572863.110600] [001.2] [pid=37907] [tid=34380726720] ido2db_add_input_data_item() start
[1342572863.110611] [001.2] [pid=37907] [tid=34380726720] ido2db_add_input_data_item(1342572843.163263)
[1342572863.110623] [001.2] [pid=37907] [tid=34380726720] ido2db_start_input_data() end
[1342572863.110634] [001.2] [pid=37907] [tid=34380726720] ido2db_handle_client_input() end
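
To measure that delay across the whole log rather than by eye, something like
the following might help.  It is only a minimal sketch in Python (not part of
Icinga): the regular expressions assume the line layout shown above, and the
names log_time and event_lag are made up for illustration.

```python
import re
from decimal import Decimal

# Leading "[epoch]" or "[epoch.micros]" stamp, as seen in the logs above.
STAMP = re.compile(r"^\[(\d+(?:\.\d+)?)\]")
# Embedded event time, e.g. "VAL: 1342572843.163263" (assumed layout).
VAL = re.compile(r"VAL:\s*(\d+\.\d+)")


def log_time(line):
    """Return the leading timestamp of a log line as a Decimal, or None."""
    m = STAMP.match(line)
    return Decimal(m.group(1)) if m else None


def event_lag(line):
    """Return (logged_at, event_time, lag_seconds), or None if either is missing."""
    logged = log_time(line)
    m = VAL.search(line)
    if logged is None or m is None:
        return None
    event = Decimal(m.group(1))
    return logged, event, logged - event


# One ido2db.log line from above, joined back onto a single line.
sample = (
    "[1342572863.110588] [001.2] [pid=37907] [tid=34380726720] "
    "ido2db_handle_client_input() line: 18, type: 4, VAL: 1342572843.163263"
)

if __name__ == "__main__":
    logged, event, lag = event_lag(sample)
    print(f"logged={logged} event={event} lag={lag}s")
```

Decimal is used instead of float so the microsecond digits survive the
subtraction unchanged.  Running event_lag over every line of ido2db.log would
show whether the delay is constant or grows before each write error.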

It does occasionally log these, although I haven't found any where the 
timestamps match up to other logs:
[1342572863.110887] [001.2] [pid=37907] [tid=34380726720] ido2db_add_input_data_item(idomod: Error writing to data sink!  Some output may get lost...)

And I'm periodically seeing segfaults of child processes:
[1342572868.037778] [001.2] [pid=37907] [tid=34380726720] Child caught signal '11' exiting
[1342572868.037789] [001.2] [pid=37907] [tid=34380726720] ido2db_child_sighandler() end
[1342572868.037804] [001.2] [pid=37907] [tid=34380726720] ido2db_db_disconnect() start
[1342572868.037817] [001.2] [pid=37907] [tid=34380726720] ido2db_db_disconnect() already disconnected
[1342572868.037828] [001.2] [pid=37907] [tid=34380726720] ido2db_db_deinit() start
[1342572868.037843] [001.2] [pid=37907] [tid=34380726720] ido2db_free_cached_object_ids() start
[1342572868.037880] [001.2] [pid=37907] [tid=34380726720] ido2db_free_cached_object_ids() end
[1342572868.037894] [001.2] [pid=37907] [tid=34380726720] ido2db_db_deinit() end
[1342572868.038158] [001.2] [pid=37164] [tid=34380726720] ido2db_parent_sighandler() start
[1342572868.038196] [001.2] [pid=37164] [tid=34380726720] processing signal '20'
[1342572868.038210] [001.2] [pid=37164] [tid=34380726720] cleanup children that exit, so we don't have zombies

From PostgreSQL I'm getting these every time there's a flush of data:
Jul 18 01:45:17 mon1 postgres[2132]: [2-1] LOG:  connection received: host=127.0.0.1 port=38972
Jul 18 01:45:17 mon1 postgres[2132]: [3-1] LOG:  connection authorized: user=icinga database=icinga
Jul 18 01:45:17 mon1 postgres[2132]: [4-1] LOG:  unexpected EOF on client connection
Jul 18 01:45:17 mon1 postgres[2132]: [5-1] LOG:  disconnection: session time: 0:00:00.117 user=icinga database=icinga host=127.0.0.1 port=38972


I'm not able to piece all of this together into a coherent root cause.  Is
anyone able to explain what I'm seeing here, and what the real problem is?  Is
there further information I can collect?



_______________________________________________
icinga-users mailing list
icinga-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/icinga-users