** Description changed:

+ BugLink: https://bugs.launchpad.net/bugs/2139322
+ 
+ [Impact]
+ 
+ 
  With mlx5 OVS hardware offload enabled on the 6.8 kernel, we see several
  different issues in our production environment. They occur only under real,
  heavy workloads.
  
  Issue 1, general protection fault:
  
  [75202.650580] general protection fault, probably for non-canonical address 
0x9cad655f9b42c237: 0000 [#1] PREEMPT SMP NOPTI
  [75202.661464] CPU: 29 PID: 0 Comm: swapper/29 Kdump: loaded Not tainted 
6.8.0-51-generic #52~22.04.1-Ubuntu
  [75202.671039] Hardware name: Dell Inc. PowerEdge R7525/0H3K7P, BIOS 2.15.2 
04/02/2024
  [75202.678701] RIP: 0010:kmalloc_trace+0xd7/0x360
  [75202.683158] Code: 83 78 10 00 48 8b 38 0f 84 36 02 00 00 48 85 ff 0f 84 2d 
02 00 00 41 8b 44 24 28 49 8b 9c 24 b8 00 00 00 49 8b 34 24 48 01 f8 <48> 33 18 
48 89 c1 48 89 f8 48 0f c9 48 31 cb 48 8d 8a 00 20 00 00
  [75202.701933] RSP: 0018:ffffabfc19a08990 EFLAGS: 00010282
  [75202.707166] RAX: 9cad655f9b42c237 RBX: 1c00e25717636e48 RCX: 
0000000000000000
  [75202.714310] RDX: 000000bec1e5c01d RSI: 000000000003b980 RDI: 
9cad655f9b42c1b7
  [75202.721449] RBP: ffffabfc19a089e0 R08: 0000000000000000 R09: 
0000000000000000
  [75202.728593] R10: ffffabfc19a08a00 R11: 0000000000000000 R12: 
ffff94db00050c00
  [75202.735735] R13: 0000000000000920 R14: 00000000000000d8 R15: 
0000000000000000
  [75202.742876] FS:  0000000000000000(0000) GS:ffff95da7cc80000(0000) 
knlGS:0000000000000000
  [75202.750971] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [75202.756722] CR2: 00007a5f6af90010 CR3: 0000010263b44002 CR4: 
0000000000f70ef0
  [75202.763866] PKRU: 55555554
  [75202.766581] Call Trace:
  [75202.769033]  <IRQ>
  [75202.771053]  ? show_regs+0x6d/0x80
  [75202.774483]  ? die_addr+0x37/0xa0
  [75202.777807]  ? exc_general_protection+0x1db/0x480
  [75202.782525]  ? asm_exc_general_protection+0x27/0x30
  [75202.787412]  ? kmalloc_trace+0xd7/0x360
  [75202.791261]  ? flow_offload_alloc+0x64/0x120 [nf_flow_table]
  [75202.796938]  flow_offload_alloc+0x64/0x120 [nf_flow_table]
  [75202.802431]  ? nf_conntrack_in+0x113/0x360 [nf_conntrack]
  [75202.807846]  ? flow_offload_alloc+0x64/0x120 [nf_flow_table]
  [75202.813517]  tcf_ct_flow_table_process_conn+0xc2/0x1e0 [act_ct]
  [75202.819444]  tcf_ct_act+0x6c8/0xae0 [act_ct]
  [75202.823726]  tcf_action_exec+0xbc/0x190
  [75202.827571]  __tcf_classify+0xcb/0x1f0
  [75202.831332]  tcf_classify+0xff/0x260
  [75202.834920]  tc_run+0xa3/0x110
  [75202.837987]  __netif_receive_skb_core.constprop.0+0x459/0xf90
  [75202.843744]  ? dev_gro_receive+0xc0/0x350
  [75202.847763]  ? srso_alias_return_thunk+0x5/0xfbef5
  [75202.852565]  ? napi_gro_receive+0x73/0x220
  [75202.856675]  __netif_receive_skb_list_core+0xfd/0x250
  [75202.861736]  netif_receive_skb_list_internal+0x1a3/0x2d0
  [75202.867056]  ? srso_alias_return_thunk+0x5/0xfbef5
  [75202.871858]  ? mlx5e_rx_cq_process_basic_cqe_comp+0x2f7/0x310 [mlx5_core]
  [75202.878752]  napi_complete_done+0x74/0x1c0
  [75202.882855]  mlx5e_napi_poll+0x190/0x7b0 [mlx5_core]
  [75202.887911]  __napi_poll+0x33/0x200
  [75202.891753]  net_rx_action+0x181/0x2e0
  [75202.895849]  handle_softirqs+0xdb/0x340
  [75202.900027]  __irq_exit_rcu+0xd9/0x100
  [75202.904103]  irq_exit_rcu+0xe/0x20
  [75202.907828]  common_interrupt+0xa4/0xb0
  [75202.911983]  </IRQ>
  [75202.914387]  <TASK>
  [75202.916786]  asm_common_interrupt+0x27/0x40
  [75202.921258] RIP: 0010:mwait_idle+0x50/0x80
  
  This is caused by a use-after-free in the slab allocator (kmalloc-256).
  
- 
  Issue 2, soft lockup:
  
  [148720.717134] watchdog: BUG: soft lockup - CPU#3 stuck for 7923s! 
[swapper/3:0]
- [148720.725207] Modules linked in: act_csum act_pedit act_tunnel_key 
vhost_net vhost tap vfio_pci vfio_pci_core vfio_iommu_type1 vfio iommufd xt_CT 
xt_tcpudp nft_compat nf_tables veth 
+ [148720.725207] Modules linked in: act_csum act_pedit act_tunnel_key 
vhost_net vhost tap vfio_pci vfio_pci_core vfio_iommu_type1 vfio iommufd xt_CT 
xt_tcpudp nft_compat nf_tables veth
  act_ct nf_flow_table nf_conntrack_netlink nvme_fabrics nvme_keyring xfs 
dm_crypt act_skbedit act_vlan act_mirred cls_matchall geneve ip6_udp_tunnel 
udp_tunnel nfnetlink_cttimeout nfnet
  link act_gact cls_flower sch_ingress openvswitch nsh nf_conncount nf_nat 
8021q garp mrp stp llc bonding sunrpc binfmt_misc nls_iso8859_1 mlx5_vdpa 
vringh vhost_iotlb vdpa intel_rapl_ms
  r intel_rapl_common amd64_edac edac_mce_amd kvm_amd kvm irqbypass rapl 
dell_wmi video ledtrig_audio sparse_keymap dell_smbios dcdbas 
dell_wmi_descriptor wmi_bmof ipmi_ssif ccp ptdma k1
  0temp acpi_power_meter ipmi_si acpi_ipmi ipmi_devintf ipmi_msghandler mac_hid 
dm_service_time sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua 
nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 msr efi_pstore ip_tables x_tables 
autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov
  [148720.725328]  async_memcpy async_pq async_xor async_tx xor raid6_pq 
libcrc32c mlx5_ib ib_uverbs macsec ib_core ses enclosure raid1 raid0 bcache 
mlx5_core crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic 
ghash_clmulni_intel mlxfw mpt3sas sha256_ssse3 nvme psample ahci sha1_ssse3 
raid_class tg3 nvme_core tls libahci xhci_pci mgag200 nvme_auth 
scsi_transport_sas i2c_algo_bit pci_hyperv_intf i2c_piix4 xhci_pci_renesas wmi 
aesni_intel crypto_simd cryptd
  [148720.725385] CPU: 3 PID: 0 Comm: swapper/3 Kdump: loaded Tainted: G        
     L     6.8.0-57-generic #59~22.04.1-Ubuntu
  [148720.725388] Hardware name: Dell Inc. PowerEdge R7525/0H3K7P, BIOS 2.16.3 
09/10/2024
  [148720.725390] RIP: 0010:flow_offload_hash_cmp+0x1f/0x40 [nf_flow_table]
  [148720.725398] Code: 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 55 48 
8b 47 08 ba 32 00 00 00 48 8d 7e 08 48 89 c6 48 89 e5 e8 62 4a b6 fa 5d <85> c0 
0f 95 c0 0f b6 c0 31 d2 31 f6 31 ff e9 b9 3b ee fa 66 66 2e
  [148720.725401] RSP: 0018:ffffad9f403fc928 EFLAGS: 00000246
  [148720.725404] RAX: 0000000000000004 RBX: ffff8a8f9a3c3a40 RCX: 
0000000000000000
  [148720.725406] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 
0000000000000000
  [148720.725409] RBP: ffffad9f403fc990 R08: 0000000000000000 R09: 
000000000000003c
  [148720.725411] R10: 000000000000003c R11: 0000000000000000 R12: 
ffff89b49b080000
  [148720.725413] R13: 0000000000000000 R14: ffff89b49b09e6b8 R15: 
ffff89b2ba69ea58
  [148720.725415] FS:  0000000000000000(0000) GS:ffff8a8f3bf80000(0000) 
knlGS:0000000000000000
  [148720.725417] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [148720.725419] CR2: 000056c0ae793900 CR3: 000000021d904002 CR4: 
0000000000f70ef0
  [148720.725421] PKRU: 55555554
  [148720.725423] Call Trace:
  [148720.725426]  <IRQ>
  [148720.725428]  ? show_regs+0x6d/0x80
  [148720.725435]  ? watchdog_timer_fn+0x206/0x290
  [148720.725441]  ? __pfx_watchdog_timer_fn+0x10/0x10
  [148720.725445]  ? __hrtimer_run_queues+0x112/0x2a0
  [148720.725450]  ? srso_alias_return_thunk+0x5/0xfbef5
  [148720.725457]  ? hrtimer_interrupt+0xf6/0x250
  [148720.725462]  ? __sysvec_apic_timer_interrupt+0x51/0x120
  [148720.725467]  ? sysvec_apic_timer_interrupt+0x3b/0xd0
  [148720.725473]  ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
  [148720.725479]  ? flow_offload_hash_cmp+0x1f/0x40 [nf_flow_table]
  [148720.725484]  ? flow_offload_lookup+0xb2/0x180 [nf_flow_table]
  [148720.725491]  tcf_ct_flow_table_lookup.isra.0+0x244/0x6b0 [act_ct]
  [148720.725494]  ? srso_alias_return_thunk+0x5/0xfbef5
  [148720.725499]  ? ovs_dp_process_packet+0x1af/0x220 [openvswitch]
  [148720.725518]  tcf_ct_act+0x23d/0xae0 [act_ct]
  [148720.725524]  tcf_action_exec+0xbc/0x190
  [148720.725531]  __tcf_classify+0xcb/0x1f0
  [148720.725535]  tcf_classify+0xff/0x260
  [148720.725539]  tc_run+0xa3/0x110
  [148720.725543]  ? srso_alias_return_thunk+0x5/0xfbef5
  [148720.725547]  __netif_receive_skb_core.constprop.0+0x459/0xf90
  [148720.725552]  ? dev_gro_receive+0x150/0x350
  [148720.725557]  ? srso_alias_return_thunk+0x5/0xfbef5
  [148720.725560]  ? napi_gro_receive+0x73/0x220
  [148720.725564]  __netif_receive_skb_list_core+0xfd/0x250
  [148720.725569]  netif_receive_skb_list_internal+0x1a3/0x2d0
  [148720.725573]  ? srso_alias_return_thunk+0x5/0xfbef5
  [148720.725578]  ? mlx5e_rx_cq_process_basic_cqe_comp+0x2f7/0x310 [mlx5_core]
  [148720.725688]  napi_complete_done+0x74/0x1c0
  [148720.725693]  mlx5e_napi_poll+0x190/0x7b0 [mlx5_core]
  [148720.725782]  __napi_poll+0x33/0x200
  [148720.725786]  net_rx_action+0x181/0x2e0
  [148720.725792]  handle_softirqs+0xdb/0x340
  [148720.725799]  __irq_exit_rcu+0xd9/0x100
  [148720.725802]  irq_exit_rcu+0xe/0x20
  
  Before the soft lockup, we see error messages from mlx5, e.g.:
  
  [486111.016058] mlx5_core 0000:41:00.1 ens3f1: NETDEV WATCHDOG: CPU: 119: 
transmit queue 0 timed out 17547 ms
  [486111.025773] mlx5_core 0000:41:00.1 ens3f1: TX timeout detected
  [486111.031726] mlx5_core 0000:41:00.1 ens3f1: TX timeout on queue: 0, SQ: 
0x11d0, CQ: 0x1487, SQ Cons: 0xae7a SQ Prod: 0xaec3, usecs since last trans: 
17562000
  [486111.045845] mlx5_core 0000:41:00.1 ens3f1: EQ 0x7: Cons = 0x8ac57014, 
irqn = 0x5f5
  
- 
  Kernel cmdline:
  GRUB_CMDLINE_LINUX_DEFAULT="console=tty0 console=ttyS0,115200n8 
nvme_core.multipath=0 amd_iommu=on iommu=pt probe_vf=0 
transparent_hugepage=never hugepagesz=1G hugepages=1536 default_hugepagesz=1G"
+ 
+ [Fix]
+ 
+ This upstream commit fixes it:
+ 
+ commit 03428ca5cee9f0792edc996c06ce4514816af1fb
+ Author: Florian Westphal <[email protected]>
+ Date:   Tue Jan 14 00:50:36 2025 +0100
+ 
+     netfilter: conntrack: rework offload nf_conn timeout extension logic
+ 
+ This patch fixes a ct use-after-free and stuck-packet issues, which are
+ likely related to the two call traces above.
+ 
+ 
+ [Test Plan]
+ 
+ This issue can only be reproduced in our production environment, with an mlx5
+ NIC and OVS hw-offload enabled.
+ We need to run the kernel in that environment for a few weeks to confirm that
+ it is fixed.
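+ 
+ During the soak period, the kernel log can be checked for the signatures
+ quoted in this report. The following is only a minimal sketch: the function
+ name scan() and the three patterns are assumptions chosen for illustration,
+ and a real check may need to match more of each trace.

```python
import re

# Patterns copied from the log excerpts in this report. Which lines are
# worth matching is an assumption made for this illustration.
SIGNATURES = {
    "gpf": re.compile(r"general protection fault"),
    "soft_lockup": re.compile(r"watchdog: BUG: soft lockup"),
    "tx_timeout": re.compile(r"mlx5_core .*TX timeout detected"),
}


def scan(lines):
    """Return {signature name: [matching lines]} for an iterable of log lines."""
    hits = {name: [] for name in SIGNATURES}
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                hits[name].append(line.rstrip("\n"))
    return hits
```

+ For example, passing the lines of saved dmesg output to scan() gives a
+ per-signature list of matches; empty lists for the whole soak period would
+ be consistent with the fix working, though not proof of it.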
+ 
+ [Where problems could occur]
+ 
+ The patch takes a refcount on the ct and tests the offload bits, which should
+ prevent the ct from being used after it has been removed.
+ It also modifies the flow offload teardown logic; if anything is wrong there,
+ OVS flow offload might break.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2139322

Title:
  Enable mlx5 ovs hardware offload causes multiple issues

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2139322/+subscriptions


-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
