The shutdown now also hangs during a resync. So it behaves
inconsistently. I update the subject and the description to reflect
this.

** Summary changed:

- Reboot hangs once RAID1 resynced
+ Shutdown hangs in md kworker after "Reached target Shutdown."

** Description changed:

  I'm booting a fully patched 16.04 from an Intel Rapid Storage Technology
  enterprise RAID1 volume (ThinkServer TS140 with two SATA
  ST1000NM0033-9ZM drives, ext4 root partition, no LVM, UEFI mode).
  
- If the RAID volume is recovering or resyncing for whatever reason, then
- `sudo systemctl reboot` and `sudo systemctl poweroff` work fine (I had
- to `sudo systemctl --now disable lvm2-lvmetad lvm2-lvmpolld
- lvm2-monitor` in order to consistently get that). However, once the
- recovery/resync is complete and clean, the reboot and poweroff commands
- above hang forever after "Reached target Shutdown.". Note that issuing
- `sudo swapoff -a` beforehand (suggested in the bug #1464917) does not
- help.
+ If the RAID volume is recovering or resyncing for whatever reason, then `sudo 
systemctl reboot` and `sudo systemctl poweroff` work fine (I had to `sudo 
systemctl --now disable lvm2-lvmetad lvm2-lvmpolld lvm2-monitor` in order to 
consistently get that). However, once the recovery/resync is complete and 
clean, the reboot and poweroff commands above hang forever after "Reached 
target Shutdown.". Note that issuing `sudo swapoff -a` beforehand (suggested in 
the bug #1464917) does not help.
+ [EDIT]Actually, the shutdown also hangs from time to time during a resync. 
But I've never seen it succeed once the resync is complete.[/EDIT]
  
  Then, if the server has been forcibly restarted with the power button,
  the Intel Matrix Storage Manager indicates a "Normal" status for the
  RAID1 volume, but Ubuntu then resyncs the volume anyway:
  
  [    1.223649] md: bind<sda>
  [    1.228426] md: bind<sdb>
  [    1.230030] md: bind<sdb>
  [    1.230738] md: bind<sda>
  [    1.232985] usbcore: registered new interface driver usbhid
  [    1.233494] usbhid: USB HID core driver
  [    1.234022] md: raid1 personality registered for level 1
  [    1.234876] md/raid1:md126: not clean -- starting background reconstruction
  [    1.234956] input: CHESEN USB Keyboard as 
/devices/pci0000:00/0000:00:14.0/usb3/3-10/3-10:1.0/0003:0A81:0101.0001/input/input5
  [    1.236273] md/raid1:md126: active with 2 out of 2 mirrors
  [    1.236797] md126: detected capacity change from 0 to 1000202043392
  [    1.246271] md: md126 switched to read-write mode.
  [    1.246834] md: resync of RAID array md126
  [    1.247325] md: minimum _guaranteed_  speed: 1000 KB/sec/disk.
  [    1.247503]  md126: p1 p2 p3 p4
  [    1.248269] md: using maximum available idle IO bandwidth (but not more 
than 200000 KB/sec) for resync.
  [    1.248774] md: using 128k window, over a total of 976759940k.
  
  Note that the pain of "resync upon every (re)boot" cannot even be a bit
  relieved thanks to bitmaps because mdadm does not support them for IMSM
  containers:
  
  $ sudo mdadm --grow --bitmap=internal /dev/md126
  mdadm: Cannot add bitmaps to sub-arrays yet
  
  I also get this in syslog during boot when the individual drives are
  detected, but this seems to be harmless:
  
  May 30 17:26:07 wssrv1 systemd-udevd[608]: Process '/sbin/mdadm --incremental 
/dev/sdb --offroot' failed with exit code 1.
  May 30 17:26:07 wssrv1 systemd-udevd[608]: Process '/lib/udev/hdparm' failed 
with exit code 1.
  
  May 30 17:26:07 wssrv1 systemd-udevd[606]: Process '/sbin/mdadm --incremental 
/dev/sda --offroot' failed with exit code 1.
  May 30 17:26:07 wssrv1 systemd-udevd[606]: Process '/lib/udev/hdparm' failed 
with exit code 1.
  
  During a resync, `sudo sh -c 'echo idle >
  /sys/block/md126/md/sync_action'` actually stops it as expected, but it
  restarts immediately though nothing seems to have triggered it:
  
  May 30 18:17:02 wssrv1 kernel: [ 3106.826710] md: md126: resync interrupted.
  May 30 18:17:02 wssrv1 kernel: [ 3106.836320] md: checkpointing resync of 
md126.
  May 30 18:17:02 wssrv1 kernel: [ 3106.836623] md: resync of RAID array md126
  May 30 18:17:02 wssrv1 kernel: [ 3106.836625] md: minimum _guaranteed_  
speed: 1000 KB/sec/disk.
  May 30 18:17:02 wssrv1 kernel: [ 3106.836626] md: using maximum available 
idle IO bandwidth (but not more than 200000 KB/sec) for resync.
  May 30 18:17:02 wssrv1 kernel: [ 3106.836627] md: using 128k window, over a 
total of 976759940k.
  May 30 18:17:02 wssrv1 kernel: [ 3106.836628] md: resuming resync of md126 
from checkpoint.
  May 30 18:17:02 wssrv1 mdadm[982]: RebuildStarted event detected on md device 
/dev/md/Volume0
  
- I attach screenshots of the hanging shutdown log after a `sudo sh -c
- 'echo 8 > /proc/sys/kernel/printk'`. The second screenshot shows that
- the kernel has deadlocked in md_write_start(). Note that `sudo systemctl
- start debug-shell` is unusable on this machine at this point because
- Ctrl+Alt+F9 brings tty9 without any keyboard.
+ I attach screenshots of the hanging shutdown log after a `sudo sh -c 'echo 8 
> /proc/sys/kernel/printk'`. The second screenshot shows that the kernel has 
deadlocked in md_write_start(). Note that `sudo systemctl start debug-shell` is 
unusable on this machine at this point because Ctrl+Alt+F9 brings tty9 without 
any keyboard.
+ [EDIT]But I can still switch back to tty1.[/EDIT]
  
  I have also tried with much lower values for vm.dirty_background_ratio
  and vm.dirty_ratio, but to no avail.
  
  Linux 4.6.0-040600-generic_4.6.0-040600.201605151930_amd64 from
  http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.6-yakkety/ did not help
  either.
  
  More information below:
  
  $ lsb_release -rd
  Description:  Ubuntu 16.04 LTS
  Release:      16.04
  
  $ uname -a
  Linux wssrv1 4.4.0-22-generic #40-Ubuntu SMP Thu May 12 22:03:46 UTC 2016 
x86_64 x86_64 x86_64 GNU/Linux
  
  $ apt-cache policy systemd
  systemd:
-   Installed: 229-4ubuntu6
-   Candidate: 229-4ubuntu6
-   Version table:
-  *** 229-4ubuntu6 500
-         500 http://fr.archive.ubuntu.com/ubuntu xenial-updates/main amd64 
Packages
-         100 /var/lib/dpkg/status
-      229-4ubuntu4 500
-         500 http://fr.archive.ubuntu.com/ubuntu xenial/main amd64 Packages
+   Installed: 229-4ubuntu6
+   Candidate: 229-4ubuntu6
+   Version table:
+  *** 229-4ubuntu6 500
+         500 http://fr.archive.ubuntu.com/ubuntu xenial-updates/main amd64 
Packages
+         100 /var/lib/dpkg/status
+      229-4ubuntu4 500
+         500 http://fr.archive.ubuntu.com/ubuntu xenial/main amd64 Packages
  
  $ cat /proc/mdstat
- Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] 
[raid10] 
+ Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] 
[raid10]
  md126 : active raid1 sda[1] sdb[0]
-       976759808 blocks super external:/md127/0 [2/2] [UU]
-       [>....................]  resync =  3.3% (32651584/976759940) 
finish=85.9min speed=183164K/sec
-       
+       976759808 blocks super external:/md127/0 [2/2] [UU]
+       [>....................]  resync =  3.3% (32651584/976759940) 
finish=85.9min speed=183164K/sec
+ 
  md127 : inactive sdb[1](S) sda[0](S)
-       5288 blocks super external:imsm
-        
+       5288 blocks super external:imsm
+ 
  unused devices: <none>
  
  $ sudo mdadm -D /dev/md127
  /dev/md127:
-         Version : imsm
-      Raid Level : container
-   Total Devices : 2
+         Version : imsm
+      Raid Level : container
+   Total Devices : 2
  
  Working Devices : 2
  
- 
-            UUID : e9bb2216:cb1bbc0f:96943390:bb65943c
-   Member Arrays : /dev/md/Volume0
- 
-     Number   Major   Minor   RaidDevice
- 
-        0       8        0        -        /dev/sda
-        1       8       16        -        /dev/sdb
+            UUID : e9bb2216:cb1bbc0f:96943390:bb65943c
+   Member Arrays : /dev/md/Volume0
+ 
+     Number   Major   Minor   RaidDevice
+ 
+        0       8        0        -        /dev/sda
+        1       8       16        -        /dev/sdb
  
  $ sudo mdadm -D /dev/md126
  /dev/md126:
-       Container : /dev/md/imsm0, member 0
-      Raid Level : raid1
-      Array Size : 976759808 (931.51 GiB 1000.20 GB)
-   Used Dev Size : 976759940 (931.51 GiB 1000.20 GB)
-    Raid Devices : 2
-   Total Devices : 2
- 
-           State : clean, resyncing 
-  Active Devices : 2
+       Container : /dev/md/imsm0, member 0
+      Raid Level : raid1
+      Array Size : 976759808 (931.51 GiB 1000.20 GB)
+   Used Dev Size : 976759940 (931.51 GiB 1000.20 GB)
+    Raid Devices : 2
+   Total Devices : 2
+ 
+           State : clean, resyncing
+  Active Devices : 2
  Working Devices : 2
-  Failed Devices : 0
-   Spare Devices : 0
- 
-   Resync Status : 5% complete
- 
- 
-            UUID : 3d724b1d:ac75cddb:600ac81a:ccdc2090
-     Number   Major   Minor   RaidDevice State
-        1       8        0        0      active sync   /dev/sda
-        0       8       16        1      active sync   /dev/sdb
+  Failed Devices : 0
+   Spare Devices : 0
+ 
+   Resync Status : 5% complete
+ 
+            UUID : 3d724b1d:ac75cddb:600ac81a:ccdc2090
+     Number   Major   Minor   RaidDevice State
+        1       8        0        0      active sync   /dev/sda
+        0       8       16        1      active sync   /dev/sdb
  
  $ sudo mdadm -E /dev/sda
  /dev/sda:
-           Magic : Intel Raid ISM Cfg Sig.
-         Version : 1.1.00
-     Orig Family : 92b6e3e4
-          Family : 92b6e3e4
-      Generation : 00000075
-      Attributes : All supported
-            UUID : e9bb2216:cb1bbc0f:96943390:bb65943c
-        Checksum : 5ad6e3c8 correct
-     MPB Sectors : 2
-           Disks : 2
-    RAID Devices : 1
- 
-   Disk00 Serial : Z1W50P5E
-           State : active
-              Id : 00000000
-     Usable Size : 1953519880 (931.51 GiB 1000.20 GB)
+           Magic : Intel Raid ISM Cfg Sig.
+         Version : 1.1.00
+     Orig Family : 92b6e3e4
+          Family : 92b6e3e4
+      Generation : 00000075
+      Attributes : All supported
+            UUID : e9bb2216:cb1bbc0f:96943390:bb65943c
+        Checksum : 5ad6e3c8 correct
+     MPB Sectors : 2
+           Disks : 2
+    RAID Devices : 1
+ 
+   Disk00 Serial : Z1W50P5E
+           State : active
+              Id : 00000000
+     Usable Size : 1953519880 (931.51 GiB 1000.20 GB)
  
  [Volume0]:
-            UUID : 3d724b1d:ac75cddb:600ac81a:ccdc2090
-      RAID Level : 1 <-- 1
-         Members : 2 <-- 2
-           Slots : [UU] <-- [UU]
-     Failed disk : none
-       This Slot : 0
-      Array Size : 1953519616 (931.51 GiB 1000.20 GB)
-    Per Dev Size : 1953519880 (931.51 GiB 1000.20 GB)
-   Sector Offset : 0
-     Num Stripes : 7630936
-      Chunk Size : 64 KiB <-- 64 KiB
-        Reserved : 0
-   Migrate State : repair
-       Map State : normal <-- normal
-      Checkpoint : 201165 (512)
-     Dirty State : dirty
- 
-   Disk01 Serial : Z1W519DN
-           State : active
-              Id : 00000001
-     Usable Size : 1953519880 (931.51 GiB 1000.20 GB)
+            UUID : 3d724b1d:ac75cddb:600ac81a:ccdc2090
+      RAID Level : 1 <-- 1
+         Members : 2 <-- 2
+           Slots : [UU] <-- [UU]
+     Failed disk : none
+       This Slot : 0
+      Array Size : 1953519616 (931.51 GiB 1000.20 GB)
+    Per Dev Size : 1953519880 (931.51 GiB 1000.20 GB)
+   Sector Offset : 0
+     Num Stripes : 7630936
+      Chunk Size : 64 KiB <-- 64 KiB
+        Reserved : 0
+   Migrate State : repair
+       Map State : normal <-- normal
+      Checkpoint : 201165 (512)
+     Dirty State : dirty
+ 
+   Disk01 Serial : Z1W519DN
+           State : active
+              Id : 00000001
+     Usable Size : 1953519880 (931.51 GiB 1000.20 GB)
  
  $ sudo mdadm -E /dev/sdb
  /dev/sdb:
-           Magic : Intel Raid ISM Cfg Sig.
-         Version : 1.1.00
-     Orig Family : 92b6e3e4
-          Family : 92b6e3e4
-      Generation : 00000075
-      Attributes : All supported
-            UUID : e9bb2216:cb1bbc0f:96943390:bb65943c
-        Checksum : 5ad6e3c8 correct
-     MPB Sectors : 2
-           Disks : 2
-    RAID Devices : 1
- 
-   Disk01 Serial : Z1W519DN
-           State : active
-              Id : 00000001
-     Usable Size : 1953519880 (931.51 GiB 1000.20 GB)
+           Magic : Intel Raid ISM Cfg Sig.
+         Version : 1.1.00
+     Orig Family : 92b6e3e4
+          Family : 92b6e3e4
+      Generation : 00000075
+      Attributes : All supported
+            UUID : e9bb2216:cb1bbc0f:96943390:bb65943c
+        Checksum : 5ad6e3c8 correct
+     MPB Sectors : 2
+           Disks : 2
+    RAID Devices : 1
+ 
+   Disk01 Serial : Z1W519DN
+           State : active
+              Id : 00000001
+     Usable Size : 1953519880 (931.51 GiB 1000.20 GB)
  
  [Volume0]:
-            UUID : 3d724b1d:ac75cddb:600ac81a:ccdc2090
-      RAID Level : 1 <-- 1
-         Members : 2 <-- 2
-           Slots : [UU] <-- [UU]
-     Failed disk : none
-       This Slot : 1
-      Array Size : 1953519616 (931.51 GiB 1000.20 GB)
-    Per Dev Size : 1953519880 (931.51 GiB 1000.20 GB)
-   Sector Offset : 0
-     Num Stripes : 7630936
-      Chunk Size : 64 KiB <-- 64 KiB
-        Reserved : 0
-   Migrate State : repair
-       Map State : normal <-- normal
-      Checkpoint : 201165 (512)
-     Dirty State : dirty
- 
-   Disk00 Serial : Z1W50P5E
-           State : active
-              Id : 00000000
-     Usable Size : 1953519880 (931.51 GiB 1000.20 GB)
+            UUID : 3d724b1d:ac75cddb:600ac81a:ccdc2090
+      RAID Level : 1 <-- 1
+         Members : 2 <-- 2
+           Slots : [UU] <-- [UU]
+     Failed disk : none
+       This Slot : 1
+      Array Size : 1953519616 (931.51 GiB 1000.20 GB)
+    Per Dev Size : 1953519880 (931.51 GiB 1000.20 GB)
+   Sector Offset : 0
+     Num Stripes : 7630936
+      Chunk Size : 64 KiB <-- 64 KiB
+        Reserved : 0
+   Migrate State : repair
+       Map State : normal <-- normal
+      Checkpoint : 201165 (512)
+     Dirty State : dirty
+ 
+   Disk00 Serial : Z1W50P5E
+           State : active
+              Id : 00000000
+     Usable Size : 1953519880 (931.51 GiB 1000.20 GB)

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1587142

Title:
  Shutdown hangs in md kworker after "Reached target Shutdown."

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1587142/+subscriptions

-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

Reply via email to