On Wed, Jan 31, 2018 at 09:26:51PM +0100, Markus Berner wrote:
> > I'm running into a NULL pointer dereference after updating from Linux
> > 4.1.6 to 4.14.11 (see kernel log below).
> We are running into the same problem on our production machine, running
> CoreOS 1576.5.0 Stable with the 4.14.11 kernel on a KVM Cloud VM. It is not
> as easy to reproduce in our case, though: we observed a total of 5 crashes
> in the last 2 weeks, all except one on the production machine.
> > I still can't reproduce it with my tests. This is probably some race
> > triggered due to your aggressive roadwarrior setup which I don't have.
> We have a similar setup to Tobias:
> - 2 Network Interfaces (KVM/virtio): Public and local VLAN
> - Strongswan VPN in Tunnel mode between local VLAN and on-premise network,
> running in a Docker container
> - Quite a few iptables NAT and forwarding rules regarding other local Docker
> Some Observations:
> - The workaround of locking the IRQs of the Rx/Tx queues of all network
> interfaces to CPU0 Tobias described a while back did not prevent the crashes
> in our case
> - The bug does not seem to correlate with load in our case, but load in
> general is quite low.
> I am happy to help if I can, but unfortunately our options are a bit
> limited, both by our lack of kernel dev know-how and by the difficulty of
> trying out configuration changes on the production machine. I subscribed to
> LKML only now to respond, so I hope the reply works (and goes to the correct
> message).
Thanks for offering help, but I fear we have to wait until
Tobias has bisected it.