Hi Claudio,
Yeah, that's all I see in /var/log/messages.
/var/log/daemon had:
2014-04-28T01:02:21.238360+01:00 mg1311 ospf6d[25154]: send_rtmsg:
action 1, prefix ::/0: File exists
2014-04-28T01:02:22.048344+01:00 mg1311 ospfd[19386]: desync;
scheduling fib reload
2014-04-28T01:02:32.518186+01:00 mg1311 ospfd[19386]: reloading
interface list and routing table
2014-04-28T01:02:59.077656+01:00 mg1311 bgpd[18490]: nexthop
185.25.32.22 now valid: directly connected
2014-04-28T01:03:01.762457+01:00 mg1311 bgpd[18490]: dispatch_imsg in
main: pipe closed
2014-04-28T01:03:01.762473+01:00 mg1311 bgpd[18490]: Lost child: route
decision engine exited
2014-04-28T01:03:01.762756+01:00 mg1311 bgpd[18490]: kernel routing
table 0 (Loc-RIB) decoupled
2014-04-28T01:03:01.778424+01:00 mg1311 bgpd[18490]: Terminating
It had been up for about a month before this. bgpd on the carp backup
is still up and running fine and never dies.
It's always been the master that falls over. I might make the other
firewall the master and see if it still crashes on the same box.
Is there any way I can increase the session engine logging?
Running OpenBSD 5.4 release.
Thanks, Andy.
On Mon 28 Apr 2014 10:02:27 BST, Claudio Jeker wrote:
On Mon, Apr 28, 2014 at 09:47:51AM +0100, Andy wrote:
Hi,
We have an issue where every now and then OpenBGPD dies unexpectedly.
2014-04-28T00:00:53.860621+01:00 mg1311 bgpd[18490]: nexthop
2001:7f8:17::a571:1 now valid: via fe80:c::7e69:f6ff:fe68:2210
2014-04-28T00:01:02.241936+01:00 mg1311 bgpd[18490]: nexthop
2001:7f8:17::a571:1 now valid: via fe80:c::7e69:f6ff:fe68:2210
2014-04-28T00:01:25.538996+01:00 mg1311 bgpd[18490]: nexthop
2001:7f8:17::a571:1 now valid: via fe80:c::7e69:f6ff:fe68:2210
2014-04-28T00:01:42.241747+01:00 mg1311 bgpd[18490]: nexthop
2001:7f8:17::a571:1 now valid: via fe80:c::7e69:f6ff:fe68:2210
2014-04-28T00:01:52.241976+01:00 mg1311 bgpd[18490]: nexthop
2001:7f8:17::a571:1 now valid: via fe80:c::7e69:f6ff:fe68:2210
2014-04-28T00:01:55.690285+01:00 mg1311 bgpd[18490]: nexthop
2001:7f8:17::a571:1 now valid: via fe80:c::7e69:f6ff:fe68:2210
2014-04-28T00:30:20.227396+01:00 mg1311 bgpd[18490]: nexthop 5.57.80.89 now
valid: via 185.25.32.2
2014-04-28T01:02:59.077656+01:00 mg1311 bgpd[18490]: nexthop 185.25.32.22
now valid: directly connected
2014-04-28T01:03:01.762457+01:00 mg1311 bgpd[18490]: dispatch_imsg in main:
pipe closed
2014-04-28T01:03:01.762473+01:00 mg1311 bgpd[18490]: Lost child: route
decision engine exited
2014-04-28T01:03:01.762756+01:00 mg1311 bgpd[18490]: kernel routing table 0
(Loc-RIB) decoupled
2014-04-28T01:03:01.778424+01:00 mg1311 bgpd[18490]: Terminating
This happens maybe once a month on only one of our two firewalls (carp
pair).
We run an iBGP full mesh between both OpenBSD servers and our two Cisco
routers. We also run an iBGP session between the two OpenBSD firewalls.
Does anyone have any ideas how we can start debugging this?
This is all the log you get? Nothing from the session engine about
quitting? The RDE exited via a regular exit() call and therefore I think
that it smells like an issue with the SE. Now why is there no log about
the SE crashing or exiting... that should not be possible.
Also what version of OpenBSD and therefore bgpd are you using?
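
For anyone trying to line up what the other daemons logged just before bgpd went down, here is a minimal sketch in Python. It is not part of bgpd or any OpenBSD tool. It only assumes the ISO-timestamp syslog format seen in the excerpts above and that the mail client wrapped long records onto a second line. The names parse_records and events_before are hypothetical helpers made up for this illustration, and the 60-second window is an arbitrary choice.

```python
import re
from datetime import datetime, timedelta

# Start of one syslog record, assuming the ISO-timestamp format quoted above:
# "<timestamp> <host> <daemon>[<pid>]: <message>"
RECORD = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+[+-]\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<daemon>[\w.-]+)\[(?P<pid>\d+)\]: (?P<msg>.*)$"
)


def parse_records(lines):
    """Return one dict per record, joining wrapped continuation lines.

    Any non-empty line that does not start a new record is appended to the
    previous record's message, so pass in log excerpts only, not prose.
    """
    records = []
    for line in lines:
        line = line.rstrip("\n")
        match = RECORD.match(line)
        if match:
            rec = match.groupdict()
            rec["ts"] = datetime.fromisoformat(rec["ts"])
            rec["pid"] = int(rec["pid"])
            records.append(rec)
        elif records and line.strip():
            records[-1]["msg"] = (records[-1]["msg"] + " " + line.strip()).strip()
    return records


def events_before(records, daemon, marker, window_seconds=60):
    """Return all records (any daemon) logged in the window that ends at the
    first record from `daemon` whose message contains `marker`."""
    for rec in records:
        if rec["daemon"] == daemon and marker in rec["msg"]:
            cutoff = rec["ts"] - timedelta(seconds=window_seconds)
            return [r for r in records if cutoff <= r["ts"] <= rec["ts"]]
    return []


# The /var/log/daemon excerpt from this thread, wrapped as it arrived.
SAMPLE = """\
2014-04-28T01:02:21.238360+01:00 mg1311 ospf6d[25154]: send_rtmsg:
action 1, prefix ::/0: File exists
2014-04-28T01:02:22.048344+01:00 mg1311 ospfd[19386]: desync;
scheduling fib reload
2014-04-28T01:02:32.518186+01:00 mg1311 ospfd[19386]: reloading
interface list and routing table
2014-04-28T01:02:59.077656+01:00 mg1311 bgpd[18490]: nexthop
185.25.32.22 now valid: directly connected
2014-04-28T01:03:01.762457+01:00 mg1311 bgpd[18490]: dispatch_imsg in
main: pipe closed
2014-04-28T01:03:01.762473+01:00 mg1311 bgpd[18490]: Lost child: route
decision engine exited
2014-04-28T01:03:01.762756+01:00 mg1311 bgpd[18490]: kernel routing
table 0 (Loc-RIB) decoupled
2014-04-28T01:03:01.778424+01:00 mg1311 bgpd[18490]: Terminating
""".splitlines()

if __name__ == "__main__":
    for rec in events_before(parse_records(SAMPLE), "bgpd", "Lost child"):
        print(rec["ts"].isoformat(), rec["daemon"], rec["msg"])
```

Running it against the excerpt lists the ospf6d and ospfd records together with the bgpd ones in the minute leading up to the "Lost child" message, which makes it easier to see whether the routing table reload and the bgpd exit are close in time. It says nothing about cause, only about ordering.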