Public bug reported:
Hey!
After upgrading a few VPNs to 4.15.0-38.41 (on either Xenial or Bionic),
we get random crashes. This also happens with the 4.18 kernel in
bionic-proposed. These crashes didn't happen with 4.4 from Xenial. Here
is a stack trace:
[ 31.154360] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
[ 31.162233] PGD 0 P4D 0
[ 31.164786] Oops: 0000 [#1] SMP PTI
[ 31.168291] CPU: 5 PID: 42 Comm: ksoftirqd/5 Not tainted 4.18.0-11-generic #12~18.04.1-Ubuntu
[ 31.176854] Hardware name: Supermicro Super Server/X10SDV-4C-7TP4F, BIOS 1.0b 11/21/2016
[ 31.184980] RIP: 0010:vti_rcv_cb+0xb9/0x1a0 [ip_vti]
[ 31.189962] Code: 8b 44 24 70 0f c8 89 87 b4 00 00 00 48 8b 86 20 05 00 00 8b 80 f8 14 00 00 85 c0 75 05 48 85 d2 74 0e 48 8b 43 58 48 83 e0 fe <f6> 40 38 04 74 7d 44 89 b3 b4 00 00 00 49 8b 44 24 20 48 39 86 20
[ 31.208916] RSP: 0018:ffffbc61832e3920 EFLAGS: 00010246
[ 31.214160] RAX: 0000000000000000 RBX: ffff9a3504964a00 RCX: 0000000000000002
[ 31.221328] RDX: ffff9a351add4080 RSI: ffff9a351aa08000 RDI: ffff9a3504964a00
[ 31.228485] RBP: ffffbc61832e3940 R08: 0000000000000004 R09: ffffffffc0aa612b
[ 31.235643] R10: 0008f09b99881884 R11: 1884bd4e2d6b1fac R12: ffff9a3507b31900
[ 31.242803] R13: ffff9a3507b31000 R14: 0000000000000000 R15: ffff9a3504964a00
[ 31.249964] FS: 0000000000000000(0000) GS:ffff9a35bfd40000(0000) knlGS:0000000000000000
[ 31.258077] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 31.263848] CR2: 0000000000000038 CR3: 000000041a40a003 CR4: 00000000003606e0
[ 31.271004] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 31.278163] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 31.285320] Call Trace:
[ 31.287789] xfrm4_rcv_cb+0x4a/0x70
[ 31.291297] xfrm_input+0x58f/0x8f0
[ 31.294807] vti_input+0xaa/0x110 [ip_vti]
[ 31.298926] vti_rcv+0x33/0x3c [ip_vti]
[ 31.302783] xfrm4_esp_rcv+0x39/0x50
[ 31.306375] ip_local_deliver_finish+0x62/0x200
[ 31.310923] ip_local_deliver+0xdf/0xf0
[ 31.314775] ? ip_rcv_finish+0x420/0x420
[ 31.318718] ip_rcv_finish+0x126/0x420
[ 31.322486] ip_rcv+0x28f/0x360
[ 31.325655] ? inet_del_offload+0x40/0x40
[ 31.329686] __netif_receive_skb_core+0x48c/0xb70
[ 31.334413] ? kmem_cache_alloc+0xb4/0x1d0
[ 31.338532] ? __build_skb+0x2b/0xf0
[ 31.342128] __netif_receive_skb+0x18/0x60
[ 31.346244] ? __netif_receive_skb+0x18/0x60
[ 31.350536] netif_receive_skb_internal+0x45/0xe0
[ 31.355263] napi_gro_receive+0xc5/0xf0
[ 31.359141] mlx5e_handle_rx_cqe+0x1b2/0x5d0 [mlx5_core]
[ 31.364476] ? skb_release_all+0x24/0x30
[ 31.368430] mlx5e_poll_rx_cq+0xd3/0x990 [mlx5_core]
[ 31.373432] mlx5e_napi_poll+0x9b/0xc60 [mlx5_core]
[ 31.378333] ? __switch_to_asm+0x34/0x70
[ 31.382270] ? __switch_to_asm+0x40/0x70
[ 31.386214] ? __switch_to_asm+0x34/0x70
[ 31.391056] ? __switch_to_asm+0x40/0x70
[ 31.395905] ? __switch_to_asm+0x34/0x70
[ 31.400743] net_rx_action+0x140/0x3a0
[ 31.405379] ? __switch_to+0xad/0x500
[ 31.409887] __do_softirq+0xe4/0x2bb
[ 31.414448] run_ksoftirqd+0x2b/0x40
[ 31.418862] smpboot_thread_fn+0xfc/0x170
[ 31.423700] kthread+0x121/0x140
[ 31.427701] ? sort_range+0x30/0x30
[ 31.432040] ? kthread_create_worker_on_cpu+0x70/0x70
[ 31.437816] ret_from_fork+0x35/0x40
[ 31.442219] Modules linked in: esp6 authenc echainiv xfrm6_mode_tunnel
xfrm4_mode_tunnel xfrm_user xfrm4_tunnel tunnel4 ipcomp xfrm_ipcomp esp4 ah4
af_key xfrm_algo ip_vti ip_tunnel ip6_vti ip6_tunnel tunnel6 8021q garp mrp stp
llc bonding ipt_REJECT nf_reject_ipv4 nfnetlink_log
nfnetlink xt_NFLOG xt_hl xt_limit xt_nat xt_TCPMSS xt_HL xt_comment xt_tcpudp
xt_multiport xt_conntrack iptable_filter iptable_nat nf_conntrack_ipv4
nf_defrag_ipv4 nf_nat_ipv4 nf_nat xt_connmark xt_mark iptable_mangle xt_CT
nf_conntrack xt_addrtype iptable_raw bpfilter ipmi_ssif
gpio_ich intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel
kvm irqbypass intel_cstate intel_rapl_perf input_leds joydev mei_me
intel_pch_thermal ioatdma mei lpc_ich ipmi_si ipmi_devintf ipmi_msghandler
acpi_pad mac_hid sch_fq_codel
[ 31.519488] ib_iser rdma_cm iw_cm ib_cm iscsi_tcp libiscsi_tcp libiscsi
scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10
raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq
libcrc32c raid0 multipath linear mlx5_ib ib_uverbs
ib_core raid1 hid_generic usbhid hid crct10dif_pclmul crc32_pclmul
ghash_clmulni_intel ast pcbc ttm drm_kms_helper aesni_intel syscopyarea
aes_x86_64 sysfillrect mxm_wmi crypto_simd sysimgblt cryptd glue_helper
fb_sys_fops mlx5_core ixgbe igb mpt3sas drm ahci tls libahci i2c_algo_bit
mlxfw raid_class dca devlink mdio scsi_transport_sas wmi
[ 31.578877] CR2: 0000000000000038
[ 31.583249] ---[ end trace c4bada38847a0075 ]---
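To make the receive path in the trace above easier to read, here is a
minimal Python sketch that keeps only the frames the kernel did not mark
with "?" (those are the ones the unwinder considers reliable) and notes
the owning module. The function name reliable_frames, the regular
expression, and the "vmlinux" label for frames without a module tag are
illustrative assumptions, not part of any kernel tooling. The sample
lines are copied from the trace above.

```python
import re

# Matches a call-trace line such as
#   "[ 31.294807] vti_input+0xaa/0x110 [ip_vti]"
# The "guess" group captures the "?" prefix of unreliable frames.
FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(?P<guess>\?\s+)?"
    r"(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?\s*$"
)


def reliable_frames(lines):
    """Return (function, module) pairs for frames not marked with '?'."""
    frames = []
    for line in lines:
        m = FRAME_RE.match(line.strip())
        if m and not m.group("guess"):
            # "vmlinux" is only a label chosen here for frames that
            # carry no [module] tag in the trace.
            frames.append((m.group("func"), m.group("module") or "vmlinux"))
    return frames


SAMPLE = """\
[ 31.287789] xfrm4_rcv_cb+0x4a/0x70
[ 31.291297] xfrm_input+0x58f/0x8f0
[ 31.294807] vti_input+0xaa/0x110 [ip_vti]
[ 31.298926] vti_rcv+0x33/0x3c [ip_vti]
[ 31.302783] xfrm4_esp_rcv+0x39/0x50
[ 31.306375] ip_local_deliver_finish+0x62/0x200
[ 31.310923] ip_local_deliver+0xdf/0xf0
[ 31.314775] ? ip_rcv_finish+0x420/0x420
[ 31.318718] ip_rcv_finish+0x126/0x420
""".splitlines()

for func, module in reliable_frames(SAMPLE):
    print(f"{func:28s} {module}")
```

Every frame this keeps from the sample belongs to the receive side
(ip_rcv_finish up to xfrm_input and xfrm4_rcv_cb), with vti_rcv and
vti_input in ip_vti in between.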
Upgrading to mainline 4.18.17 seems to solve the issue. It's difficult
to bisect as the crash doesn't happen often. 4.18.17 contains
c473a489d4098969ffafda913e1ad71da31b1104 (xfrm: Fix NULL pointer
dereference when skb_dst_force clears the dst_entry), but that patch
doesn't match the stack trace (the stack trace is on the input path,
while the patch touches the output and forward paths). There is also
fdb06c787b34fd397f28f515105627307d615025 (xfrm: Fix NULL pointer
dereference when skb_dst_force clears the dst_entry), which is also in
4.17 and may better match the problem, but I am unsure what it means to
have several transformations (we use VTI interfaces, but other than
that, we don't do anything fancy).
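The register dump and the "Code:" line in the oops can be cross-checked
with a small Python sketch. It decodes only the single instruction
marked <f6> and checks that the faulting address equals RAX plus the
instruction's displacement. The helper name decode_test_byte_rax_disp8
is hypothetical and handles just this one x86-64 encoding. Reading the
NULL pointer as the skb's dst entry is an assumption that fits the patch
titles above; the struct layout of this kernel build was not verified.

```python
# Values copied from the oops above.
CR2 = 0x0000000000000038  # faulting address
RAX = 0x0000000000000000  # RAX in the register dump

# Bytes from the "Code:" line: the two instructions before the fault
# and the faulting instruction itself (the one marked <f6>).
#   48 8b 43 58   mov  rax, [rbx+0x58]
#   48 83 e0 fe   and  rax, 0xfffffffffffffffe
#   f6 40 38 04   test byte [rax+0x38], 0x4
CODE = bytes.fromhex("488b4358" "4883e0fe" "f6403804")


def decode_test_byte_rax_disp8(insn):
    """Decode 'f6 40 <disp8> <imm8>', i.e. test byte [rax+disp8], imm8.

    Handles only this one encoding and returns (displacement, mask).
    """
    if len(insn) != 4 or insn[0] != 0xF6 or insn[1] != 0x40:
        raise ValueError("not a 'test byte [rax+disp8], imm8' encoding")
    disp = insn[2] - 0x100 if insn[2] >= 0x80 else insn[2]  # disp8 is signed
    return disp, insn[3]


disp, mask = decode_test_byte_rax_disp8(CODE[8:12])
print(f"test byte [rax+{disp:#x}], {mask:#x}")
print("fault address matches RAX + disp:", RAX + disp == CR2)
```

In other words, a pointer loaded from [rbx+0x58] with its low bit
cleared was NULL, and the crash came from testing a flag bit 0x38 bytes
into the object it should have pointed to.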
Hardware is Mellanox ConnectX-4 Lx (no ESP offload).
May I suggest upgrading the 4.18 kernel to 4.18.17 and backporting these
two patches to Bionic 4.15?
Thanks.
** Affects: linux (Ubuntu)
Importance: Undecided
Status: Incomplete
** Tags: cosmic
--
https://bugs.launchpad.net/bugs/1802480
Title:
Crash when using IPsec VTI interfaces on 4.15 and 4.18.