On 18/10/2019 at 13:22, Claudio Jeker wrote:
> On Fri, Oct 18, 2019 at 12:55:02AM -0700, Sacha wrote:
>> Dear all,
>>
>>  first of all, sorry if this bug report is incomplete: the issue is on our
>> production firewalls and each test cuts off our whole AS network, so we have
>> to be in the datacenter to go further.
> This is not good.
>
>>  We have 2 firewalls in master/slave CARP failover, with BGPD and OSPF.
>>  After upgrading to 6.6, we have an issue when we reboot one of our two
>> firewalls: it makes the BGPD daemon crash on the other one (our AS is no
>> longer announced).
>>  This occurs on both the master and the slave firewall: when we reboot one,
>> the other loses its BGP.
>>  What we know so far is that if we stop the ospf & ospf6 daemons before the
>> reboot, the issue does not occur.
>>  I'm going to the datacenter this afternoon, I will try to reproduce with
>> more logs.
>>  All ideas for debugging are welcome.
>>
Just back from the datacenter after some tests.

Let's use some names to make it easier: Firewall 1, usually the master, is
Cerbere1; Firewall 2 is Cerbere2.

The issue occurs only if we shut down Cerbere2 (independently of its
CARP master/slave state): the bgpd on Cerbere1 shuts down:

Oct 18 15:16:35 cerbere1 bgpd[74950]: session engine exiting
Oct 18 15:16:41 cerbere1 bgpd[91574]: kernel routing table 0 (Loc-RIB) decoupled
Oct 18 15:16:42 cerbere1 bgpd[91574]: route decision engine terminated; signal 11
Oct 18 15:16:42 cerbere1 bgpd[91574]: terminating

We tried to reproduce the issue when shutting down Cerbere2: no problem.
We will check whether all the configurations are the same.

The strange thing is that when I launch bgpd on Cerbere1 from a shell (bgpd -dv
-c /etc/bgpd.conf) I have no issue (tested twice!).

> Check /var/log/daemon: what did bgpd log before going down?
> I would be interested to see the bgpd related syslog output.
>
> You can increase logging with bgpctl log verbose or just run bgpd
> in debug mode (bgpd -dvv).
>
> If one of the processes crashes (normally by a SIGSEGV or similar signal)
> then set the sysctl kern.nosuidcoredump=3 and create a directory called
> /var/crash/bgpd. Also make sure your limit for the coredumpsize is high
> enough. This should allow you to get a coredump of the crashing process.
> Once you have a core it should be possible to get a backtrace.
>
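The coredump setup quoted above could be sketched as follows, run as root on
the affected firewall. The sysctl value and the /var/crash/bgpd directory come
from the advice itself. Using "ulimit -c unlimited" is an assumption about how
to raise the coredumpsize limit, and it only applies to a bgpd started from the
same shell; a bgpd started at boot would take its limit from its login class
instead (to be checked in login.conf).

```sh
# Let processes that changed credentials (such as bgpd) dump core
# into a per-program subdirectory of /var/crash.
sysctl kern.nosuidcoredump=3

# Directory that receives the core file for bgpd.
mkdir -p /var/crash/bgpd

# Assumption: lift the core size limit for this shell only.
ulimit -c unlimited

# Start bgpd in the foreground with verbose debug output from this same
# shell, so it inherits the limit set above.
bgpd -dvv
```

If bgpd then dies on a signal, a core file should appear under
/var/crash/bgpd and can be loaded in a debugger to get the backtrace.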
In the end, it's not a process crash, just a clean shutdown, but it is not
expected and we don't know why.


Sacha.
