Public bug reported:

[Impact]
A system with multiple MD RAIDs sometimes hangs while rebooting because
systemd cannot stop and close the MD devices.

[Fix]
This commit fixes the issue, which was introduced after v6.0 by commit
12a6caf27324 ("md: only delete entries from all_mddevs when the disk is freed").

https://patchwork.kernel.org/project/linux-raid/patch/[email protected]/

[Test case]
1. Reboot the system with multiple MD RAIDs at least 10 times.
2. Make sure the system can reboot successfully every time.
3. You should not see error messages like the ones below.

[ 205.360738] systemd-shutdown[1]: Stopping MD devices.
[ 205.366384] systemd-shutdown[1]: sd-device-enumerator: Scan all dirs
[ 205.373327] systemd-shutdown[1]: sd-device-enumerator: Scanning /sys/bus
[ 205.380427] systemd-shutdown[1]: sd-device-enumerator: Scanning /sys/class
[ 205.388257] systemd-shutdown[1]: Stopping MD /dev/md127 (9:127).
[ 205.394880] systemd-shutdown[1]: Failed to sync MD block device /dev/md127, ignoring: Input/output error
[ 205.404975] md: md127 stopped.
[ 205.470491] systemd-shutdown[1]: Stopping MD /dev/md126 (9:126).
[ 205.770179] md: md126: resync interrupted.
[ 205.776258] md126: detected capacity change from 1900396544 to 0
[ 205.783349] md: md126 stopped.
[ 205.862258] systemd-shutdown[1]: Stopping MD /dev/md125 (9:125).
[ 205.862435] md: md126 stopped.
[ 205.868376] systemd-shutdown[1]: Failed to sync MD block device /dev/md125, ignoring: Input/output error
[ 205.872845] block device autoloading is deprecated and will be removed.
[ 205.880955] md: md125 stopped.
[ 205.934349] systemd-shutdown[1]: Stopping MD /dev/md124p2 (259:7).
[ 205.947707] systemd-shutdown[1]: Could not stop MD /dev/md124p2: Device or resource busy
[ 205.957004] systemd-shutdown[1]: Stopping MD /dev/md124p1 (259:6).
[ 205.964177] systemd-shutdown[1]: Could not stop MD /dev/md124p1: Device or resource busy
[ 205.973155] systemd-shutdown[1]: Stopping MD /dev/md124 (9:124).
[ 205.979789] systemd-shutdown[1]: Could not stop MD /dev/md124: Device or resource busy
[ 205.988475] systemd-shutdown[1]: Not all MD devices stopped, 4 left.
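
When repeating the reboot test many times, it may help to check the captured
shutdown output automatically. The following is a minimal sketch, assuming the
console log of each reboot has been saved as text (for example from a serial
console). The function name find_md_stop_failures is hypothetical, and the
patterns are based only on the messages quoted in this report.

```python
import re

# Patterns based on the systemd-shutdown messages quoted in this report.
# They are an assumption about what a failed shutdown looks like, not an
# exhaustive list of possible messages.
FAILURE_PATTERNS = (
    re.compile(r"systemd-shutdown\[1\]: Could not stop MD (\S+):"),
    re.compile(r"systemd-shutdown\[1\]: Not all MD devices stopped, (\d+) left\."),
)


def find_md_stop_failures(log_text):
    """Return the log lines that indicate an MD device could not be stopped."""
    failures = []
    for line in log_text.splitlines():
        if any(pattern.search(line) for pattern in FAILURE_PATTERNS):
            failures.append(line.strip())
    return failures


# Excerpt of the log above, used as sample input.
SAMPLE = """\
[ 205.973155] systemd-shutdown[1]: Stopping MD /dev/md124 (9:124).
[ 205.979789] systemd-shutdown[1]: Could not stop MD /dev/md124: Device or resource busy
[ 205.988475] systemd-shutdown[1]: Not all MD devices stopped, 4 left.
"""

if __name__ == "__main__":
    for hit in find_md_stop_failures(SAMPLE):
        print(hit)
```

An empty result for every captured reboot would be consistent with step 3 of
the test case; any returned line points at a reboot that needs a closer look.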

[Where problems could occur]
The patch fixes a data race and should not introduce any regressions.

** Affects: hwe-next
     Importance: Undecided
         Status: New

** Affects: linux (Ubuntu)
     Importance: Undecided
     Assignee: AceLan Kao (acelankao)
         Status: In Progress

** Affects: linux-oem-6.5 (Ubuntu)
     Importance: Undecided
         Status: Invalid

** Affects: linux (Ubuntu Jammy)
     Importance: Undecided
         Status: Invalid

** Affects: linux-oem-6.5 (Ubuntu Jammy)
     Importance: Undecided
     Assignee: AceLan Kao (acelankao)
         Status: In Progress

** Affects: linux (Ubuntu Mantic)
     Importance: Undecided
     Assignee: AceLan Kao (acelankao)
         Status: In Progress

** Affects: linux-oem-6.5 (Ubuntu Mantic)
     Importance: Undecided
         Status: Invalid


** Tags: oem-priority originate-from-2025253 somerville

** Also affects: linux-oem-6.5 (Ubuntu)
   Importance: Undecided
       Status: New

** Also affects: linux (Ubuntu Mantic)
   Importance: Undecided
       Status: New

** Also affects: linux-oem-6.5 (Ubuntu Mantic)
   Importance: Undecided
       Status: New

** Also affects: linux (Ubuntu Jammy)
   Importance: Undecided
       Status: New

** Also affects: linux-oem-6.5 (Ubuntu Jammy)
   Importance: Undecided
       Status: New

** Changed in: linux (Ubuntu Jammy)
       Status: New => In Progress

** Changed in: linux (Ubuntu Mantic)
       Status: New => In Progress

** Changed in: linux (Ubuntu Mantic)
     Assignee: (unassigned) => AceLan Kao (acelankao)

** Changed in: linux-oem-6.5 (Ubuntu Jammy)
       Status: New => In Progress

** Changed in: linux-oem-6.5 (Ubuntu Mantic)
       Status: New => Invalid

** Changed in: linux-oem-6.5 (Ubuntu Jammy)
     Assignee: (unassigned) => AceLan Kao (acelankao)

** Tags added: oem-priority originate-from-2025253 somerville

** Changed in: linux (Ubuntu Jammy)
       Status: In Progress => Invalid

** Description changed:

  [Fix]
- This commit fixes the issue.
+ This commit fixes the issue, and this issue has been introduced by 12a6caf27324 ("md: only delete entries from all_mddevs when the disk is freed") after v6.0
  

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2036184

Title:
  Infinite systemd loop when powering off the machine with multiple MD
  RAIDs

Status in HWE Next:
  New
Status in linux package in Ubuntu:
  In Progress
Status in linux-oem-6.5 package in Ubuntu:
  Invalid
Status in linux source package in Jammy:
  Invalid
Status in linux-oem-6.5 source package in Jammy:
  In Progress
Status in linux source package in Mantic:
  In Progress
Status in linux-oem-6.5 source package in Mantic:
  Invalid
