Public bug reported:

Hi, cross posting this from
https://github.com/canonical/lxd/issues/12161

I've got an LXD cluster running across 3 VMs using the fan bridge. I'm
using a dev revision of LXD based on 6413a948. Creating a container
triggers the kernel warning in the attached syslog snippet, after which
the container creation process hangs indefinitely. SSH logins,
`lxc shell cluster1`, and `ps -aux` also hang.

Apr 29 17:15:01 cluster1 kernel: [  161.250951] ------------[ cut here ]------------
Apr 29 17:15:01 cluster1 kernel: [  161.250957] Voluntary context switch within RCU read-side critical section!
Apr 29 17:15:01 cluster1 kernel: [  161.250990] WARNING: CPU: 2 PID: 510 at kernel/rcu/tree_plugin.h:320 rcu_note_context_switch+0x2a7/0x2f0
Apr 29 17:15:01 cluster1 kernel: [  161.251003] Modules linked in: nft_masq nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 vxlan ip6_udp_tunnel udp_tunnel dummy bridge stp llc zfs(PO) spl(O) nf_tables libcrc32c nfnetlink vhost_vsock vhost vhost_iotlb binfmt_misc nls_iso8859_1 intel_rapl_msr intel_rapl_common kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul virtio_gpu polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 sha1_ssse3 virtio_dma_buf aesni_intel vmw_vsock_virtio_transport 9pnet_virtio xhci_pci drm_shmem_helper i2c_i801 ahci 9pnet vmw_vsock_virtio_transport_common xhci_pci_renesas drm_kms_helper libahci crypto_simd joydev virtio_input cryptd lpc_ich virtiofs i2c_smbus vsock psmouse input_leds mac_hid serio_raw rapl qemu_fw_cfg vmgenid nfsd dm_multipath auth_rpcgss scsi_dh_rdac nfs_acl lockd scsi_dh_emc scsi_dh_alua grace sch_fq_codel drm sunrpc efi_pstore virtio_rng ip_tables x_tables autofs4
Apr 29 17:15:01 cluster1 kernel: [  161.251085] CPU: 2 PID: 510 Comm: nmbd Tainted: P           O       6.5.0-28-generic #29~22.04.1-Ubuntu
Apr 29 17:15:01 cluster1 kernel: [  161.251089] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009)/LXD, BIOS unknown 2/2/2022
Apr 29 17:15:01 cluster1 kernel: [  161.251091] RIP: 0010:rcu_note_context_switch+0x2a7/0x2f0
Apr 29 17:15:01 cluster1 kernel: [  161.251095] Code: 08 f0 83 44 24 fc 00 48 89 de 4c 89 f7 e8 d1 af ff ff e9 1e fe ff ff 48 c7 c7 d0 60 56 88 c6 05 e6 27 40 02 01 e8 79 b2 f2 ff <0f> 0b e9 bd fd ff ff a9 ff ff ff 7f 0f 84 75 fe ff ff 65 48 8b 3c
Apr 29 17:15:01 cluster1 kernel: [  161.251098] RSP: 0018:ffffb9cbc11dbbc8 EFLAGS: 00010046
Apr 29 17:15:01 cluster1 kernel: [  161.251101] RAX: 0000000000000000 RBX: ffff941ef7cb3f80 RCX: 0000000000000000
Apr 29 17:15:01 cluster1 kernel: [  161.251103] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Apr 29 17:15:01 cluster1 kernel: [  161.251104] RBP: ffffb9cbc11dbbe8 R08: 0000000000000000 R09: 0000000000000000
Apr 29 17:15:01 cluster1 kernel: [  161.251106] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Apr 29 17:15:01 cluster1 kernel: [  161.251111] R13: ffff941d893e9980 R14: 0000000000000000 R15: ffff941d80ad7a80
Apr 29 17:15:01 cluster1 kernel: [  161.251113] FS:  00007c7dcbdb8a00(0000) GS:ffff941ef7c80000(0000) knlGS:0000000000000000
Apr 29 17:15:01 cluster1 kernel: [  161.251115] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 29 17:15:01 cluster1 kernel: [  161.251117] CR2: 00005a30877ae488 CR3: 0000000105888003 CR4: 0000000000170ee0
Apr 29 17:15:01 cluster1 kernel: [  161.251122] Call Trace:
Apr 29 17:15:01 cluster1 kernel: [  161.251128]  <TASK>
Apr 29 17:15:01 cluster1 kernel: [  161.251133]  ? show_regs+0x6d/0x80
Apr 29 17:15:01 cluster1 kernel: [  161.251145]  ? __warn+0x89/0x160
Apr 29 17:15:01 cluster1 kernel: [  161.251152]  ? rcu_note_context_switch+0x2a7/0x2f0
Apr 29 17:15:01 cluster1 kernel: [  161.251155]  ? report_bug+0x17e/0x1b0
Apr 29 17:15:01 cluster1 kernel: [  161.251172]  ? handle_bug+0x46/0x90
Apr 29 17:15:01 cluster1 kernel: [  161.251187]  ? exc_invalid_op+0x18/0x80
Apr 29 17:15:01 cluster1 kernel: [  161.251190]  ? asm_exc_invalid_op+0x1b/0x20
Apr 29 17:15:01 cluster1 kernel: [  161.251202]  ? rcu_note_context_switch+0x2a7/0x2f0
Apr 29 17:15:01 cluster1 kernel: [  161.251205]  ? rcu_note_context_switch+0x2a7/0x2f0
Apr 29 17:15:01 cluster1 kernel: [  161.251208]  __schedule+0xcc/0x750
Apr 29 17:15:01 cluster1 kernel: [  161.251218]  schedule+0x63/0x110
Apr 29 17:15:01 cluster1 kernel: [  161.251222]  schedule_hrtimeout_range_clock+0xbc/0x130
Apr 29 17:15:01 cluster1 kernel: [  161.251238]  ? __pfx_hrtimer_wakeup+0x10/0x10
Apr 29 17:15:01 cluster1 kernel: [  161.251245]  schedule_hrtimeout_range+0x13/0x30
Apr 29 17:15:01 cluster1 kernel: [  161.251248]  ep_poll+0x33f/0x390
Apr 29 17:15:01 cluster1 kernel: [  161.251254]  ? __pfx_ep_autoremove_wake_function+0x10/0x10
Apr 29 17:15:01 cluster1 kernel: [  161.251257]  do_epoll_wait+0xdb/0x100
Apr 29 17:15:01 cluster1 kernel: [  161.251259]  __x64_sys_epoll_wait+0x6f/0x110
Apr 29 17:15:01 cluster1 kernel: [  161.251265]  do_syscall_64+0x5b/0x90
Apr 29 17:15:01 cluster1 kernel: [  161.251270]  ? do_epoll_ctl+0x3cb/0x860
Apr 29 17:15:01 cluster1 kernel: [  161.251273]  ? __task_pid_nr_ns+0x6c/0xc0
Apr 29 17:15:01 cluster1 kernel: [  161.251279]  ? exit_to_user_mode_prepare+0x30/0xb0
Apr 29 17:15:01 cluster1 kernel: [  161.251284]  ? syscall_exit_to_user_mode+0x37/0x60
Apr 29 17:15:01 cluster1 kernel: [  161.251286]  ? do_syscall_64+0x67/0x90
Apr 29 17:15:01 cluster1 kernel: [  161.251288]  ? syscall_exit_to_user_mode+0x37/0x60
Apr 29 17:15:01 cluster1 kernel: [  161.251300]  ? do_syscall_64+0x67/0x90
Apr 29 17:15:01 cluster1 kernel: [  161.251304]  ? syscall_exit_to_user_mode+0x37/0x60
Apr 29 17:15:01 cluster1 kernel: [  161.251306]  ? do_syscall_64+0x67/0x90
Apr 29 17:15:01 cluster1 kernel: [  161.251309]  ? do_syscall_64+0x67/0x90
Apr 29 17:15:01 cluster1 kernel: [  161.251313]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
Apr 29 17:15:01 cluster1 kernel: [  161.251316] RIP: 0033:0x7c7dcf325dea
Apr 29 17:15:01 cluster1 kernel: [  161.251333] Code: 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 e8 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 5e c3 0f 1f 44 00 00 48 83 ec 28 89 54 24 18
Apr 29 17:15:01 cluster1 kernel: [  161.251335] RSP: 002b:00007ffdde5e0278 EFLAGS: 00000246 ORIG_RAX: 00000000000000e8
Apr 29 17:15:01 cluster1 kernel: [  161.251338] RAX: ffffffffffffffda RBX: 00005a30877a2ea0 RCX: 00007c7dcf325dea
Apr 29 17:15:01 cluster1 kernel: [  161.251340] RDX: 0000000000000001 RSI: 00007ffdde5e02ac RDI: 0000000000000005
Apr 29 17:15:01 cluster1 kernel: [  161.251341] RBP: 00005a3087794590 R08: 00000000000f423f R09: 00007ffdde5e0357
Apr 29 17:15:01 cluster1 kernel: [  161.251343] R10: 00000000000003e8 R11: 0000000000000246 R12: 00005a30877a2f30
Apr 29 17:15:01 cluster1 kernel: [  161.251345] R13: 00000000000003e8 R14: 0000000000000090 R15: 000000000000000a
Apr 29 17:15:01 cluster1 kernel: [  161.251348]  </TASK>
Apr 29 17:15:01 cluster1 kernel: [  161.251349] ---[ end trace 0000000000000000 ]---
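
For anyone triaging from a wrapped copy of this trace, the key facts (the
task that hit the warning, the warning site, and the kernel release) can be
pulled out with a small script. This is only a rough sketch: the helper name
`summarize_warning` and both regular expressions are my own assumptions, not
part of LXD or any kernel tooling, and they are written against the two
header lines shown above rather than every warning format the kernel can
emit.

```python
import re

# Hypothetical helper for reading a hard-wrapped trace; not part of LXD
# or of any kernel tooling. Patterns assume single spaces, so the input
# is whitespace-normalized first.
WARN_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at (?P<site>\S+) (?P<func>[^\s+]+)"
)
TASK_RE = re.compile(
    r"Comm: (?P<comm>\S+) (?:Tainted: .*?|Not tainted) (?P<release>\d+\.\d+\.\S+)"
)


def summarize_warning(log_text):
    """Return the task, warning site, and kernel release, or None if absent."""
    # Collapse all runs of whitespace (including the mail wrapping) to one space.
    flat = " ".join(log_text.split())
    warn = WARN_RE.search(flat)
    task = TASK_RE.search(flat)
    if warn is None or task is None:
        return None
    return {
        "cpu": int(warn.group("cpu")),
        "pid": int(warn.group("pid")),
        "comm": task.group("comm"),
        "site": warn.group("site"),
        "function": warn.group("func"),
        "kernel": task.group("release"),
    }


# Two header lines copied from the trace above, still wrapped as mailed.
SAMPLE = """\
Apr 29 17:15:01 cluster1 kernel: [  161.250990] WARNING: CPU: 2 PID: 510 at
kernel/rcu/tree_plugin.h:320 rcu_note_context_switch+0x2a7/0x2f0
Apr 29 17:15:01 cluster1 kernel: [  161.251085] CPU: 2 PID: 510 Comm: nmbd
Tainted: P           O       6.5.0-28-generic #29~22.04.1-Ubuntu
"""

if __name__ == "__main__":
    print(summarize_warning(SAMPLE))
```

Normalizing whitespace before matching is what lets the same patterns work on
both the raw syslog and a re-wrapped mail copy.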

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

** Attachment added: "syslog"
   https://bugs.launchpad.net/bugs/2064176/+attachment/5772646/+files/syslog

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2064176

Title:
  LXD fan bridge causes blocked tasks

Status in linux package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2064176/+subscriptions


