This bug is missing log files that will aid in diagnosing the problem.
While running an Ubuntu kernel (not a mainline or third-party kernel)
please enter the following command in a terminal window:

apport-collect 1802480

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable
to run this command, please add a comment stating that fact and change
the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the
Ubuntu Kernel Team.

** Changed in: linux (Ubuntu)
       Status: New => Incomplete

** Tags added: cosmic

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1802480

Title:
  Crash when using IPsec VTI interfaces on 4.15 and 4.18.

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  Hey!

  After upgrading a few VPNs to 4.15.0-38.41 (on either Xenial or
  Bionic), we get random crashes. This also happens with the 4.18 kernel
  in bionic-proposed. These crashes didn't happen with 4.4 from Xenial.
  Here is a stack trace:

  [   31.154360] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
  [   31.162233] PGD 0 P4D 0
  [   31.164786] Oops: 0000 [#1] SMP PTI
  [   31.168291] CPU: 5 PID: 42 Comm: ksoftirqd/5 Not tainted 4.18.0-11-generic #12~18.04.1-Ubuntu
  [   31.176854] Hardware name: Supermicro Super Server/X10SDV-4C-7TP4F, BIOS 1.0b 11/21/2016
  [   31.184980] RIP: 0010:vti_rcv_cb+0xb9/0x1a0 [ip_vti]
  [   31.189962] Code: 8b 44 24 70 0f c8 89 87 b4 00 00 00 48 8b 86 20 05 00 00 8b 80 f8 14 00 00 85 c0 75 05 48 85 d2 74 0e 48 8b 43 58 48 83 e0 fe <f6> 40 38 04 74 7d 44 89 b3 b4 00 00 00 49 8b 44 24 20 48 39 86 20
  [   31.208916] RSP: 0018:ffffbc61832e3920 EFLAGS: 00010246
  [   31.214160] RAX: 0000000000000000 RBX: ffff9a3504964a00 RCX: 0000000000000002
  [   31.221328] RDX: ffff9a351add4080 RSI: ffff9a351aa08000 RDI: ffff9a3504964a00
  [   31.228485] RBP: ffffbc61832e3940 R08: 0000000000000004 R09: ffffffffc0aa612b
  [   31.235643] R10: 0008f09b99881884 R11: 1884bd4e2d6b1fac R12: ffff9a3507b31900
  [   31.242803] R13: ffff9a3507b31000 R14: 0000000000000000 R15: ffff9a3504964a00
  [   31.249964] FS:  0000000000000000(0000) GS:ffff9a35bfd40000(0000) knlGS:0000000000000000
  [   31.258077] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [   31.263848] CR2: 0000000000000038 CR3: 000000041a40a003 CR4: 00000000003606e0
  [   31.271004] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [   31.278163] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [   31.285320] Call Trace:
  [   31.287789]  xfrm4_rcv_cb+0x4a/0x70
  [   31.291297]  xfrm_input+0x58f/0x8f0
  [   31.294807]  vti_input+0xaa/0x110 [ip_vti]
  [   31.298926]  vti_rcv+0x33/0x3c [ip_vti]
  [   31.302783]  xfrm4_esp_rcv+0x39/0x50
  [   31.306375]  ip_local_deliver_finish+0x62/0x200
  [   31.310923]  ip_local_deliver+0xdf/0xf0
  [   31.314775]  ? ip_rcv_finish+0x420/0x420
  [   31.318718]  ip_rcv_finish+0x126/0x420
  [   31.322486]  ip_rcv+0x28f/0x360
  [   31.325655]  ? inet_del_offload+0x40/0x40
  [   31.329686]  __netif_receive_skb_core+0x48c/0xb70
  [   31.334413]  ? kmem_cache_alloc+0xb4/0x1d0
  [   31.338532]  ? __build_skb+0x2b/0xf0
  [   31.342128]  __netif_receive_skb+0x18/0x60
  [   31.346244]  ? __netif_receive_skb+0x18/0x60
  [   31.350536]  netif_receive_skb_internal+0x45/0xe0
  [   31.355263]  napi_gro_receive+0xc5/0xf0
  [   31.359141]  mlx5e_handle_rx_cqe+0x1b2/0x5d0 [mlx5_core]
  [   31.364476]  ? skb_release_all+0x24/0x30
  [   31.368430]  mlx5e_poll_rx_cq+0xd3/0x990 [mlx5_core]
  [   31.373432]  mlx5e_napi_poll+0x9b/0xc60 [mlx5_core]
  [   31.378333]  ? __switch_to_asm+0x34/0x70
  [   31.382270]  ? __switch_to_asm+0x40/0x70
  [   31.386214]  ? __switch_to_asm+0x34/0x70
  [   31.391056]  ? __switch_to_asm+0x40/0x70
  [   31.395905]  ? __switch_to_asm+0x34/0x70
  [   31.400743]  net_rx_action+0x140/0x3a0
  [   31.405379]  ? __switch_to+0xad/0x500
  [   31.409887]  __do_softirq+0xe4/0x2bb
  [   31.414448]  run_ksoftirqd+0x2b/0x40
  [   31.418862]  smpboot_thread_fn+0xfc/0x170
  [   31.423700]  kthread+0x121/0x140
  [   31.427701]  ? sort_range+0x30/0x30
  [   31.432040]  ? kthread_create_worker_on_cpu+0x70/0x70
  [   31.437816]  ret_from_fork+0x35/0x40
  [   31.442219] Modules linked in: esp6 authenc echainiv xfrm6_mode_tunnel xfrm4_mode_tunnel xfrm_user xfrm4_tunnel tunnel4 ipcomp xfrm_ipcomp esp4 ah4 af_key xfrm_algo ip_vti ip_tunnel ip6_vti ip6_tunnel tunnel6 8021q garp mrp stp llc bonding ipt_REJECT nf_reject_ipv4 nfnetlink_log nfnetlink xt_NFLOG xt_hl xt_limit xt_nat xt_TCPMSS xt_HL xt_comment xt_tcpudp xt_multiport xt_conntrack iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat xt_connmark xt_mark iptable_mangle xt_CT nf_conntrack xt_addrtype iptable_raw bpfilter ipmi_ssif gpio_ich intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass intel_cstate intel_rapl_perf input_leds joydev mei_me intel_pch_thermal ioatdma mei lpc_ich ipmi_si ipmi_devintf ipmi_msghandler acpi_pad mac_hid sch_fq_codel
  [   31.519488]  ib_iser rdma_cm iw_cm ib_cm iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear mlx5_ib ib_uverbs ib_core raid1 hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ast pcbc ttm drm_kms_helper aesni_intel syscopyarea aes_x86_64 sysfillrect mxm_wmi crypto_simd sysimgblt cryptd glue_helper fb_sys_fops mlx5_core ixgbe igb mpt3sas drm ahci tls libahci i2c_algo_bit mlxfw raid_class dca devlink mdio scsi_transport_sas wmi
  [   31.578877] CR2: 0000000000000038
  [   31.583249] ---[ end trace c4bada38847a0075 ]---

  Upgrading to mainline 4.18.17 seems to solve the issue. It's difficult
  to bisect as it doesn't happen often. 4.18.17 contains
  c473a489d4098969ffafda913e1ad71da31b1104 (xfrm: Fix NULL pointer
  dereference when skb_dst_force clears the dst_entry) but it doesn't
  match the stacktrace (stacktrace is input, patch is output and
  forward). There is also fdb06c787b34fd397f28f515105627307d615025
  (xfrm: Fix NULL pointer dereference when skb_dst_force clears the
  dst_entry) which is also in 4.17 and may better match the problem but
  I am unsure what it means to have several transformations (we use VTI
  interfaces, but other than that, we don't do anything fancy).
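
The point about the trace being on the input path can be checked mechanically. What follows is only a rough sketch, not a tool used in this report: `summarize_oops` and its regular expressions are hypothetical helpers, and they assume the oops lines have been rejoined so that each log entry sits on one line. The sample text is a subset of the trace above.

```python
import re

# One call-trace entry, e.g. "[   31.287789]  xfrm4_rcv_cb+0x4a/0x70".
# A leading "?" marks a frame the unwinder considers unreliable.
FRAME_RE = re.compile(r"\]\s+(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")
RIP_RE = re.compile(r"RIP:\s+[0-9a-f]+:([A-Za-z_][\w.]*)\+0x")
CR2_RE = re.compile(r"CR2:\s+([0-9a-f]{16})")


def summarize_oops(text):
    """Return the faulting function, the fault address and the reliable frames."""
    rip = RIP_RE.search(text)
    cr2 = CR2_RE.search(text)
    frames = []
    in_trace = False
    for line in text.splitlines():
        if "Call Trace:" in line:
            in_trace = True
            continue
        if not in_trace:
            continue
        match = FRAME_RE.search(line)
        if match is None:
            break  # the first non-frame line ends the call trace
        if match.group(1) is None:  # keep only frames without "?"
            frames.append(match.group(2))
    return {
        "rip": rip.group(1) if rip else None,
        "fault_addr": int(cr2.group(1), 16) if cr2 else None,
        "frames": frames,
    }


SAMPLE = """\
[   31.184980] RIP: 0010:vti_rcv_cb+0xb9/0x1a0 [ip_vti]
[   31.263848] CR2: 0000000000000038 CR3: 000000041a40a003 CR4: 00000000003606e0
[   31.285320] Call Trace:
[   31.287789]  xfrm4_rcv_cb+0x4a/0x70
[   31.291297]  xfrm_input+0x58f/0x8f0
[   31.294807]  vti_input+0xaa/0x110 [ip_vti]
[   31.314775]  ? ip_rcv_finish+0x420/0x420
[   31.442219] Modules linked in: esp6 authenc
"""

if __name__ == "__main__":
    summary = summarize_oops(SAMPLE)
    print(summary)
    print("input path:", "xfrm_input" in summary["frames"])
```

On the sample, the faulting function is `vti_rcv_cb`, the fault address is 0x38, and `xfrm_input` appears among the reliable frames, which is consistent with the crash being on receive rather than on output or forward.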

  Hardware is Mellanox ConnectX-4 Lx (no ESP offload).

  May I suggest upgrading 4.18 to 4.18.17 and backporting these two
  patches to Bionic 4.15?

  Thanks.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1802480/+subscriptions
