> With the mainline kernel 6.16.rc2, we are unable to test due to other
> issues.

> We are experiencing continuous drive reconnects just after running the
> test.

Can you dump the complete system log, starting from boot, using
`journalctl -b > syslog.log`? The clue may lie at an earlier point than
the logs you provided (apport collects only "recent enough" data, not
the complete log).
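
Once the full log is available, a small filter can narrow it down. The
following is only a sketch: it assumes every line starts with an ISO 8601
timestamp, as in the syslog excerpts in this bug (default `journalctl`
output uses a different timestamp format and would need adapting). The
pattern list and the name `interesting_lines` are illustrative, not an
exhaustive or official set. The function keeps hung-task, nvme and
unit-failure lines and sorts them by timestamp, since the excerpts show
kernel and systemd-networkd lines interleaved out of order.

```python
import re
from datetime import datetime

# Illustrative patterns taken from the excerpts in this bug (an assumption,
# not a complete list of relevant messages).
PATTERNS = re.compile(r"blocked for more than|nvme nvme\d+:|Failed with result")


def interesting_lines(lines):
    """Return matching lines, sorted by their leading ISO 8601 timestamp."""
    hits = []
    for line in lines:
        text = line.strip()
        stamp, _, rest = text.partition(" ")
        if not PATTERNS.search(rest):
            continue
        try:
            when = datetime.fromisoformat(stamp)
        except ValueError:
            continue  # skip lines without a parsable timestamp prefix
        hits.append((when, text))
    hits.sort(key=lambda item: item[0])
    return [text for _, text in hits]


# Three lines from the bug description, joined back onto single lines.
sample = [
    "2025-05-18T03:01:44.164706+05:30 blr-r29-26u kernel: nvme nvme42: "
    "I/O tag 0 (e000) type 4 opcode 0x18 (Admin Cmd) QID 0 timeout",
    "2025-05-18T03:01:05.264346+05:30 blr-r29-26u systemd[1]: "
    "systemd-networkd.service: Failed with result 'timeout'.",
    "2025-05-18T03:01:05.418246+05:30 blr-r29-26u systemd-networkd[1010564]: "
    "lo: Link UP",
]
for text in interesting_lines(sample):
    print(text)
```

To run it against a real file, pass the open file object instead of
`sample`, for example `interesting_lines(open("/var/log/syslog"))`.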

By the way, is the rootfs of this system also on the NVMeOF remote
mount?
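
One quick way to check is to look at what is mounted at `/`. This is a
minimal sketch, assuming the usual `/proc/mounts` layout (source, mount
point, filesystem type, ...). The helper name `root_mount` and the sample
text are made up for illustration; only the device name is taken from
the kernel command line in the report, and the `ext4` type is an
assumption.

```python
def root_mount(mounts_text):
    """Return (source, fstype) of the last entry mounted at "/", or None."""
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == "/":
            found = (fields[0], fields[2])  # later mounts shadow earlier ones
    return found


# Hypothetical sample; on the affected host use open("/proc/mounts").read().
sample = """sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
/dev/mapper/ubuntu--vg-ubuntu--lv / ext4 rw,relatime 0 0
"""
print(root_mount(sample))
```

A device-mapper source like this one still has to be traced back to its
physical volume (for example with `lsblk`) to see whether it sits on an
NVMe-oF namespace.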

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2112304

Title:
  All Drives disconnected for both the paths when maim (Medusa IO) is
  blocked continuously followed by automatic network config reset

Status in linux package in Ubuntu:
  New

Bug description:
  Steps:

  Connect the hosts to the NVMeOF enclosure and run the IOs on the
  drives.

  Observation :

  The IO tool process was blocked continuously, after which all drives
  were disconnected.

  *From Syslog :*
  {noformat}
  2025-05-18T02:49:10.268427+05:30 blr-r29-26u kernel: INFO: task maim:1009904 blocked for more than 122 seconds.
  2025-05-18T02:49:10.268464+05:30 blr-r29-26u kernel:       Tainted: P           O       6.8.0-55-generic #57-Ubuntu
  2025-05-18T02:49:10.268490+05:30 blr-r29-26u kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  2025-05-18T02:49:10.270665+05:30 blr-r29-26u kernel: task:maim            state:D stack:0     pid:1009904 tgid:1004274 ppid:975377 flags:0x00004002
  2025-05-18T02:49:10.270691+05:30 blr-r29-26u kernel: Call Trace:
  2025-05-18T02:49:10.270695+05:30 blr-r29-26u kernel:  <TASK>
  2025-05-18T02:49:10.270698+05:30 blr-r29-26u kernel:  __schedule+0x27c/0x6b0
  2025-05-18T02:49:10.270700+05:30 blr-r29-26u kernel:  schedule+0x33/0x110
  2025-05-18T02:49:10.270703+05:30 blr-r29-26u kernel:  io_schedule+0x46/0x80
  2025-05-18T02:49:10.270709+05:30 blr-r29-26u kernel:  folio_wait_bit_common+0x136/0x330
  2025-05-18T02:49:10.270713+05:30 blr-r29-26u kernel:  ? __pfx_wake_page_function+0x10/0x10
  2025-05-18T02:49:10.270715+05:30 blr-r29-26u kernel:  folio_wait_bit+0x18/0x30
  2025-05-18T02:49:10.270718+05:30 blr-r29-26u kernel:  folio_wait_writeback+0x2b/0xa0
  2025-05-18T02:49:10.270720+05:30 blr-r29-26u kernel:  __filemap_fdatawait_range+0x93/0x110
  2025-05-18T02:49:10.270723+05:30 blr-r29-26u kernel:  file_write_and_wait_range+0x93/0xc0
  2025-05-18T02:49:10.270727+05:30 blr-r29-26u kernel:  ext4_sync_file+0x8d/0x380
  2025-05-18T02:49:10.270730+05:30 blr-r29-26u kernel:  vfs_fsync_range+0x4b/0xa0
  2025-05-18T02:49:10.270733+05:30 blr-r29-26u kernel:  ? __fdget+0xc7/0xf0
  2025-05-18T02:49:10.270736+05:30 blr-r29-26u kernel:  __x64_sys_fsync+0x3c/0x70
  2025-05-18T02:49:10.270738+05:30 blr-r29-26u kernel:  x64_sys_call+0x2550/0x25a0
  2025-05-18T02:49:10.270741+05:30 blr-r29-26u kernel:  do_syscall_64+0x7f/0x180
  2025-05-18T02:49:10.270745+05:30 blr-r29-26u kernel:  ? filemap_get_entry+0xe5/0x160
  2025-05-18T02:49:10.270748+05:30 blr-r29-26u kernel:  ? __block_commit_write+0x82/0xc0
  2025-05-18T02:49:10.270752+05:30 blr-r29-26u kernel:  ? block_write_end+0x4a/0xd0
  2025-05-18T02:49:10.270756+05:30 blr-r29-26u kernel:  ? copy_page_from_iter_atomic+0xed/0x690
  2025-05-18T02:49:10.270758+05:30 blr-r29-26u kernel:  ? radix_tree_lookup+0xd/0x20
  2025-05-18T02:49:10.270760+05:30 blr-r29-26u kernel:  ? balance_dirty_pages_ratelimited_flags+0x140/0x3b0
  2025-05-18T02:49:10.270763+05:30 blr-r29-26u kernel:  ? balance_dirty_pages_ratelimited+0x10/0x20
  2025-05-18T02:49:10.270765+05:30 blr-r29-26u kernel:  ? generic_perform_write+0x155/0x230
  2025-05-18T02:49:10.270766+05:30 blr-r29-26u kernel:  ? vfs_write+0x325/0x480
  2025-05-18T02:49:10.270768+05:30 blr-r29-26u kernel:  ? vfs_write+0x325/0x480
  2025-05-18T02:49:10.270773+05:30 blr-r29-26u kernel:  ? __f_unlock_pos+0x12/0x20
  2025-05-18T02:49:10.270774+05:30 blr-r29-26u kernel:  ? ksys_write+0xe6/0x100
  2025-05-18T02:49:10.270776+05:30 blr-r29-26u kernel:  ? syscall_exit_to_user_mode+0x86/0x260
  2025-05-18T02:49:10.270779+05:30 blr-r29-26u kernel:  ? do_syscall_64+0x8c/0x180
  2025-05-18T02:49:10.270780+05:30 blr-r29-26u kernel:  ? irqentry_exit_to_user_mode+0x7b/0x260
  2025-05-18T02:49:10.270781+05:30 blr-r29-26u kernel:  ? irqentry_exit+0x43/0x50
  2025-05-18T02:49:10.270783+05:30 blr-r29-26u kernel:  ? common_interrupt+0x54/0xb0
  2025-05-18T02:49:10.270785+05:30 blr-r29-26u kernel:  entry_SYSCALL_64_after_hwframe+0x78/0x80
  2025-05-18T02:49:10.270789+05:30 blr-r29-26u kernel: RIP: 0033:0x7207f251ee0c
  2025-05-18T02:49:10.270791+05:30 blr-r29-26u kernel: RSP: 002b:000072068b7fc3c0 EFLAGS: 00000293 ORIG_RAX: 000000000000004a
  2025-05-18T02:49:10.270793+05:30 blr-r29-26u kernel: RAX: ffffffffffffffda RBX: 000072068b7fc440 RCX: 00007207f251ee0c
  2025-05-18T02:49:10.270797+05:30 blr-r29-26u kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000308
  2025-05-18T02:49:10.270799+05:30 blr-r29-26u kernel: RBP: 000072068b7fc3d0 R08: 0000000000000000 R09: 000072068b7fc440
  2025-05-18T02:49:10.270803+05:30 blr-r29-26u kernel: R10: 00000000196e4390 R11: 0000000000000293 R12: 000072068b7fc4f0
  2025-05-18T02:49:10.270806+05:30 blr-r29-26u kernel: R13: 00000000196e4390 R14: 00000000196e4398 R15: 000000000000004f
  2025-05-18T02:49:10.270811+05:30 blr-r29-26u kernel:  </TASK>
  {noformat}
  After the maim process had been blocked continuously, the network
  configuration was reset, which led to the loss of the RNIC IPs.

  *From Syslog :*
  {noformat}
  2025-05-18T03:01:44.164571+05:30 blr-r29-26u kernel: systemd[1]: systemd-journald.service: Main process exited, code=killed, status=9/KILL
  2025-05-18T03:01:05.264346+05:30 blr-r29-26u systemd[1]: systemd-networkd.service: Failed with result 'timeout'.
  2025-05-18T03:01:05.264988+05:30 blr-r29-26u systemd[1]: Failed to start systemd-networkd.service - Network Configuration.
  2025-05-18T03:01:05.266544+05:30 blr-r29-26u systemd[1]: systemd-networkd.service: Scheduled restart job, restart counter is at 2.
  2025-05-18T03:01:05.266758+05:30 blr-r29-26u systemd[1]: netplan-ovs-cleanup.service - OpenVSwitch configuration for cleanup was skipped because of an unmet condition check (ConditionFileIsExecutable=/usr/bin/ovs-vsctl).
  2025-05-18T03:01:05.290811+05:30 blr-r29-26u systemd[1]: Starting systemd-networkd.service - Network Configuration...
  2025-05-18T03:01:05.418246+05:30 blr-r29-26u systemd-networkd[1010564]: lo: Link UP
  2025-05-18T03:01:05.418503+05:30 blr-r29-26u systemd-networkd[1010564]: lo: Gained carrier
  2025-05-18T03:01:05.418725+05:30 blr-r29-26u systemd-networkd[1010564]: eno1: Link UP
  2025-05-18T03:01:44.164663+05:30 blr-r29-26u kernel: systemd[1]: systemd-journald.service: Failed with result 'timeout'.
  2025-05-18T03:01:44.164665+05:30 blr-r29-26u kernel: systemd[1]: Failed to start systemd-journald.service - Journal Service.
  2025-05-18T03:01:44.164667+05:30 blr-r29-26u kernel: systemd[1]: systemd-journald.service: Scheduled restart job, restart counter is at 2.
  2025-05-18T03:01:05.422331+05:30 blr-r29-26u systemd-networkd[1010564]: eno2: Link UP
  2025-05-18T03:01:44.164671+05:30 blr-r29-26u kernel: systemd[1]: Starting systemd-journald.service - Journal Service...
  2025-05-18T03:01:44.164672+05:30 blr-r29-26u kernel: systemd-journald[1010540]: Collecting audit messages is disabled.
  2025-05-18T03:01:44.164674+05:30 blr-r29-26u kernel: systemd-journald[1010540]: File /var/log/journal/7c9e984161af4278a958cf98d0e66061/system.journal corrupted or uncleanly shut down, renaming and replacing.
  2025-05-18T03:01:05.423549+05:30 blr-r29-26u systemd-networkd[1010564]: eno2: Gained carrier
  2025-05-18T03:01:44.164676+05:30 blr-r29-26u kernel: systemd[1]: systemd-networkd.service: State 'final-sigterm' timed out. Killing.
  2025-05-18T03:01:44.164684+05:30 blr-r29-26u kernel: systemd[1]: systemd-networkd.service: Killing process 1010382 (systemd-network) with signal SIGKILL.
  2025-05-18T03:01:05.426972+05:30 blr-r29-26u systemd-networkd[1010564]: eno3: Link UP
  2025-05-18T03:01:44.164685+05:30 blr-r29-26u kernel: systemd[1]: snapd.service: Processes still around after SIGKILL. Ignoring.
  2025-05-18T03:01:05.430531+05:30 blr-r29-26u systemd-networkd[1010564]: eno4: Link UP
  2025-05-18T03:01:44.164697+05:30 blr-r29-26u kernel: systemd[1]: Started systemd-journald.service - Journal Service.
  2025-05-18T03:01:44.164706+05:30 blr-r29-26u kernel: nvme nvme42: I/O tag 0 (e000) type 4 opcode 0x18 (Admin Cmd) QID 0 timeout
  2025-05-18T03:01:44.164707+05:30 blr-r29-26u kernel: nvme nvme42: starting error recovery
  2025-05-18T03:01:05.432624+05:30 blr-r29-26u systemd-networkd[1010564]: enp130s0f0np0: Link UP
  2025-05-18T03:01:44.164714+05:30 blr-r29-26u kernel: nvme nvme42: failed nvme_keep_alive_end_io error=10
  2025-05-18T03:01:05.432740+05:30 blr-r29-26u systemd-networkd[1010564]: enp130s0f0np0: Gained carrier
  2025-05-18T03:01:44.164724+05:30 blr-r29-26u kernel: nvme nvme42: Reconnecting in 10 seconds...
  {noformat}
  --- 
  ProblemType: Bug
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 Jun 10 06:16 seq
   crw-rw---- 1 root audio 116, 33 Jun 10 06:16 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
  ApportVersion: 2.28.1-0ubuntu3.7
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/timer', 
'/dev/snd/seq'] failed with exit code 1:
  CRDA: N/A
  CasperMD5CheckResult: unknown
  DistroRelease: Ubuntu 24.04
  InstallationDate: Installed on 2024-04-22 (414 days ago)
  InstallationMedia: Ubuntu-Server 24.04 LTS "Noble Numbat" - Beta amd64 
(20240410.1)
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
  Lsusb:
   Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
   Bus 001 Device 002: ID 8087:800a Intel Corp. Hub
   Bus 001 Device 003: ID 413c:a001 Dell Computer Corp. Hub
   Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
   Bus 002 Device 002: ID 8087:8002 Intel Corp. 8 channel internal hub
  Lsusb-t:
   /:  Bus 001.Port 001: Dev 001, Class=root_hub, Driver=ehci-pci/2p, 480M
       |__ Port 001: Dev 002, If 0, Class=Hub, Driver=hub/6p, 480M
           |__ Port 006: Dev 003, If 0, Class=Hub, Driver=hub/6p, 480M
   /:  Bus 002.Port 001: Dev 001, Class=root_hub, Driver=ehci-pci/2p, 480M
       |__ Port 001: Dev 002, If 0, Class=Hub, Driver=hub/8p, 480M
  MachineType: Dell Inc. PowerEdge R730
  NonfreeKernelModules: zfs
  Package: linux (not installed)
  PciMultimedia:
   
  ProcEnviron:
   LANG=en_US.UTF-8
   PATH=(custom, no user)
   SHELL=/bin/bash
   TERM=xterm-256color
  ProcFB: 0 simpledrmdrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-6.8.0-60-generic 
root=/dev/mapper/ubuntu--vg-ubuntu--lv ro intel_iommu=off
  ProcVersionSignature: Ubuntu 6.8.0-60.63-generic 6.8.12
  RebootRequiredPkgs: Error: path contained symlinks.
  RelatedPackageVersions:
   linux-restricted-modules-6.8.0-60-generic N/A
   linux-backports-modules-6.8.0-60-generic  N/A
   linux-firmware                            20240318.git3b128b60-0ubuntu2.12
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
  Tags: noble
  Uname: Linux 6.8.0-60-generic x86_64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups: N/A
  _MarkForUpload: True
  dmi.bios.date: 001/22/2018
  dmi.bios.release: 2.7
  dmi.bios.vendor: Dell Inc.
  dmi.bios.version: 2.7.1
  dmi.board.name: 072T6D
  dmi.board.vendor: Dell Inc.
  dmi.board.version: A08
  dmi.chassis.type: 23
  dmi.chassis.vendor: Dell Inc.
  dmi.modalias: 
dmi:bvnDellInc.:bvr2.7.1:bd001/22/2018:br2.7:svnDellInc.:pnPowerEdgeR730:pvr:rvnDellInc.:rn072T6D:rvrA08:cvnDellInc.:ct23:cvr:skuSKU=NotProvided;ModelName=PowerEdgeR730:
  dmi.product.name: PowerEdge R730
  dmi.product.sku: SKU=NotProvided;ModelName=PowerEdge R730
  dmi.sys.vendor: Dell Inc.
  --- 
  ProblemType: Bug
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 Jun 19 06:15 seq
   crw-rw---- 1 root audio 116, 33 Jun 19 06:15 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
  ApportVersion: 2.28.1-0ubuntu3.7
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', 
'/dev/snd/timer'] failed with exit code 1:
  CRDA: N/A
  CasperMD5CheckResult: unknown
  DistroRelease: Ubuntu 24.04
  InstallationDate: Installed on 2024-04-22 (423 days ago)
  InstallationMedia: Ubuntu-Server 24.04 LTS "Noble Numbat" - Beta amd64 
(20240410.1)
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
  Lsusb:
   Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
   Bus 001 Device 002: ID 8087:800a Intel Corp. Hub
   Bus 001 Device 003: ID 413c:a001 Dell Computer Corp. Hub
   Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
   Bus 002 Device 002: ID 8087:8002 Intel Corp. 8 channel internal hub
  Lsusb-t:
   /:  Bus 001.Port 001: Dev 001, Class=root_hub, Driver=ehci-pci/2p, 480M
       |__ Port 001: Dev 002, If 0, Class=Hub, Driver=hub/6p, 480M
           |__ Port 006: Dev 003, If 0, Class=Hub, Driver=hub/6p, 480M
   /:  Bus 002.Port 001: Dev 001, Class=root_hub, Driver=ehci-pci/2p, 480M
       |__ Port 001: Dev 002, If 0, Class=Hub, Driver=hub/8p, 480M
  MachineType: Dell Inc. PowerEdge R730
  NonfreeKernelModules: zfs
  Package: linux (not installed)
  PciMultimedia:
   
  ProcEnviron:
   LANG=en_US.UTF-8
   PATH=(custom, no user)
   SHELL=/bin/bash
   TERM=xterm-256color
  ProcFB: 0 mgag200drmfb
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-6.8.0-64-generic 
root=/dev/mapper/ubuntu--vg-ubuntu--lv ro intel_iommu=off
  ProcVersionSignature: Ubuntu 6.8.0-64.67-generic 6.8.12
  RelatedPackageVersions:
   linux-restricted-modules-6.8.0-64-generic N/A
   linux-backports-modules-6.8.0-64-generic  N/A
   linux-firmware                            20240318.git3b128b60-0ubuntu2.13
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
  Tags: noble
  Uname: Linux 6.8.0-64-generic x86_64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups: N/A
  _MarkForUpload: True
  dmi.bios.date: 001/22/2018
  dmi.bios.release: 2.7
  dmi.bios.vendor: Dell Inc.
  dmi.bios.version: 2.7.1
  dmi.board.name: 072T6D
  dmi.board.vendor: Dell Inc.
  dmi.board.version: A08
  dmi.chassis.type: 23
  dmi.chassis.vendor: Dell Inc.
  dmi.modalias: 
dmi:bvnDellInc.:bvr2.7.1:bd001/22/2018:br2.7:svnDellInc.:pnPowerEdgeR730:pvr:rvnDellInc.:rn072T6D:rvrA08:cvnDellInc.:ct23:cvr:skuSKU=NotProvided;ModelName=PowerEdgeR730:
  dmi.product.name: PowerEdge R730
  dmi.product.sku: SKU=NotProvided;ModelName=PowerEdge R730
  dmi.sys.vendor: Dell Inc.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2112304/+subscriptions

