Re: OpenBGPD crashing

Stuart Henderson Tue, 29 Apr 2014 20:17:16 -0700

On 2014-04-28, Andy <[email protected]> wrote:
>
> Yea thats all I see in /var/log/messages ..
>
> /var/log/daemon had;
> 2014-04-28T01:02:21.238360+01:00 mg1311 ospf6d[25154]: send_rtmsg: action 1, prefix ::/0: File exists
> 2014-04-28T01:02:22.048344+01:00 mg1311 ospfd[19386]: desync; scheduling fib reload
> 2014-04-28T01:02:32.518186+01:00 mg1311 ospfd[19386]: reloading interface list and routing table
> 2014-04-28T01:02:59.077656+01:00 mg1311 bgpd[18490]: nexthop 185.25.32.22 now valid: directly connected
> 2014-04-28T01:03:01.762457+01:00 mg1311 bgpd[18490]: dispatch_imsg in main: pipe closed
> 2014-04-28T01:03:01.762473+01:00 mg1311 bgpd[18490]: Lost child: route decision engine exited
> 2014-04-28T01:03:01.762756+01:00 mg1311 bgpd[18490]: kernel routing table 0 (Loc-RIB) decoupled
> 2014-04-28T01:03:01.778424+01:00 mg1311 bgpd[18490]: Terminating
>
> It had been up for about a month before this. bgpd on the carp backup 
> is still up and running fine and never dies.


It might be interesting to monitor memory use (symon is fairly good
for this). One failure mode is that bgpd can't handle nexthop revalidation
churn fast enough and the rde eats all memory (or at least runs into
the login.conf datasize limits; make sure these are sufficiently high
if you haven't already done so). I don't remember all of the various ways
this can show up in logs, but what you're seeing here does ring a bell.
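
If you don't have symon set up, a rough sketch along these lines might be
enough to watch the bgpd processes over time. It is only an illustration:
the function name bgpd_mem is made up, and it assumes that ps shows the
process titles in the form "bgpd: route decision engine" and reports
rss and vsz in kilobytes.

```shell
#!/bin/sh
# bgpd_mem (hypothetical helper): filter "ps -axo rss,vsz,command" style
# output on stdin and print one line per bgpd process with its memory use.
# Assumption: the command column starts with "bgpd:" for each bgpd process.
bgpd_mem() {
	awk '$3 == "bgpd:" {
		rss = $1; vsz = $2
		# Blank the two numeric columns so $0 holds only the process title.
		$1 = ""; $2 = ""
		sub(/^ +/, "")
		printf "rss=%sKB vsz=%sKB %s\n", rss, vsz, $0
	}'
}
```

It could be run by hand or from cron as "ps -axo rss,vsz,command | bgpd_mem",
appending the output to a file so the rde's growth before a crash is visible.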

I work around this by only feeding a partial table to routers which are
particularly likely to suffer ospf churn ... not ideal but stability
improved a lot after doing that. (It used to happen more often but
there was a change in 5.2 which seriously reduced the situations that
trigger this).

> It's always been the master that falls over. I might make the other 
> firewall the master and see if it still crashes on the same box.
>
> Is there anyway I can increase the session engine logging?

There's bgpd_flags="-v" if you're not already using it - I'm not too
sure if it will actually give you any more information but worth a try.
Make sure syslogd does actually log the messages - I use memory buffer
logs for these,

syslogd_flags="-s /var/run/syslogd.sock"

with this at the top of syslog.conf:

!!bgpd
*.*                                     :256:bgpd
daemon.info                             /var/log/daemon
!*
