4.15.0-43-generic #46~16.04.1-Ubuntu SMP x86_64

I have the same problem but the process that gets stuck is dockerd not
systemd.

3,2077,80354225063,-;INFO: task dockerd:2070 blocked for more than 120 seconds.
3,2078,80354230430,-;      Tainted: P           O     4.15.0-43-generic 
#46~16.04.1-Ubuntu
3,2079,80354236566,-;"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
6,2080,80354242978,-;dockerd         D    0  2070      1 0x00000000
4,2081,80354242981,-;Call Trace:
4,2082,80354242984,-; __schedule+0x3d6/0x8b0
4,2083,80354242986,-; ? xen_smp_send_reschedule+0x10/0x20
4,2084,80354242989,-; schedule+0x36/0x80
4,2085,80354242991,-; schedule_timeout+0x1db/0x370
4,2086,80354242993,-; ? try_to_wake_up+0x59/0x4a0
4,2087,80354242995,-; wait_for_completion+0xb4/0x140
4,2088,80354242996,-; ? wake_up_q+0x70/0x70
4,2089,80354242998,-; flush_work+0x129/0x1e0
4,2090,80354242999,-; ? worker_detach_from_pool+0xb0/0xb0
4,2091,80354243001,-; flush_delayed_work+0x3f/0x50
4,2092,80354243004,-; fsnotify_wait_marks_destroyed+0x15/0x20
4,2093,80354243005,-; fsnotify_destroy_group+0x48/0xd0
4,2094,80354243008,-; inotify_release+0x1e/0x50
4,2095,80354243011,-; __fput+0xea/0x220
4,2096,80354243013,-; ____fput+0xe/0x10
4,2097,80354243014,-; task_work_run+0x8a/0xb0
4,2098,80354243016,-; exit_to_usermode_loop+0xc4/0xd0
4,2099,80354243018,-; do_syscall_64+0xf4/0x130
4,2100,80354243020,-; entry_SYSCALL_64_after_hwframe+0x3d/0xa2

The tainted marker is from zfs module since I have a ZFS partition mounted in 
the system.
The / partition however is EXT4, and docker is running from /.

cat /proc/1/stack
[<0>] flush_work+0x129/0x1e0
[<0>] flush_delayed_work+0x3f/0x50
[<0>] fsnotify_wait_marks_destroyed+0x15/0x20
[<0>] fsnotify_destroy_group+0x48/0xd0
[<0>] inotify_release+0x1e/0x50
[<0>] __fput+0xea/0x220
[<0>] ____fput+0xe/0x10
[<0>] task_work_run+0x8a/0xb0
[<0>] exit_to_usermode_loop+0xc4/0xd0
[<0>] do_syscall_64+0xf4/0x130
[<0>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[<0>] 0xffffffffffffffff


6,2111,218820611653,-;  task                        PC stack   pid father
6,2112,218820614372,-;systemd         D    0     1      0 0x00000004
4,2113,218820616975,-;Call Trace:
4,2114,218820618224,-; __schedule+0x3d6/0x8b0
4,2115,218820619977,-; ? xen_smp_send_reschedule+0x10/0x20
4,2116,218820622101,-; schedule+0x36/0x80
4,2117,218820624470,-; schedule_timeout+0x1db/0x370
4,2118,218820627660,-; ? delete_node+0x1a5/0x1f0
4,2119,218820630763,-; wait_for_completion+0xb4/0x140
4,2120,218820634184,-; ? wake_up_q+0x70/0x70
4,2121,218820638148,-; flush_work+0x129/0x1e0
4,2122,218820642490,-; ? worker_detach_from_pool+0xb0/0xb0
4,2123,218820646706,-; flush_delayed_work+0x3f/0x50
4,2124,218820650474,-; fsnotify_wait_marks_destroyed+0x15/0x20
4,2125,218820654789,-; fsnotify_destroy_group+0x48/0xd0
4,2126,218820658838,-; inotify_release+0x1e/0x50
4,2127,218820662400,-; __fput+0xea/0x220
4,2128,218820665486,-; ____fput+0xe/0x10
4,2129,218820668890,-; task_work_run+0x8a/0xb0
4,2130,218820672633,-; exit_to_usermode_loop+0xc4/0xd0
4,2131,218820676852,-; do_syscall_64+0xf4/0x130
4,2132,218820680778,-; entry_SYSCALL_64_after_hwframe+0x3d/0xa2
4,2133,218820686587,-;RIP: 0033:0x7efd22aaf57d
4,2134,218820691555,-;RSP: 002b:00007ffd463e6070 EFLAGS: 00000293 ORIG_RAX: 
0000000000000003
4,2135,218820699907,-;RAX: 0000000000000000 RBX: 000000000000000d RCX: 
00007efd22aaf57d
4,2136,218820708355,-;RDX: 0000000000000000 RSI: 0000000000000000 RDI: 
000000000000000d
4,2137,218820717053,-;RBP: 00007efd241ea708 R08: 00005559085084d0 R09: 
0000000000000001
4,2138,218820726163,-;R10: 0000000000000000 R11: 0000000000000293 R12: 
0000000000000002
4,2139,218820734459,-;R13: 000055590749d580 R14: 00005559084bd9b8 R15: 
000055590749e9c0


Also kworker shows a stack at some point.

3,1861,80233315906,-;INFO: task kworker/u30:3:5705 blocked for more than 120 
seconds.
3,1862,80233321444,-;      Tainted: P           O     4.15.0-43-generic 
#46~16.04.1-Ubuntu
3,1863,80233327648,-;"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
disables this message.
6,1864,80233333902,-;kworker/u30:3   D    0  5705      2 0x80000000
6,1865,80233333909,-;Workqueue: events_unbound fsnotify_mark_destroy_workfn
4,1866,80233333910,-;Call Trace:
4,1867,80233333914,-; __schedule+0x3d6/0x8b0
4,1868,80233333918,-; schedule+0x36/0x80
4,1869,80233333920,-; schedule_timeout+0x1db/0x370
4,1870,80233333927,-; ? __enqueue_entity+0x5c/0x60
4,1871,80233333932,-; ? enqueue_entity+0x112/0x670
4,1872,80233333937,-; wait_for_completion+0xb4/0x140
4,1873,80233333939,-; ? wake_up_q+0x70/0x70
4,1874,80233333944,-; __synchronize_srcu.part.13+0x85/0xb0
4,1875,80233333947,-; ? trace_raw_output_rcu_utilization+0x50/0x50
4,1876,80233333950,-; synchronize_srcu+0xd3/0xe0
4,1877,80233333956,-; ? synchronize_srcu+0xd3/0xe0
4,1878,80233333962,-; fsnotify_mark_destroy_workfn+0x7c/0xe0
4,1879,80233333966,-; process_one_work+0x14d/0x410
4,1880,80233333968,-; worker_thread+0x22b/0x460
4,1881,80233333971,-; kthread+0x105/0x140
4,1882,80233333974,-; ? process_one_work+0x410/0x410
4,1883,80233333976,-; ? kthread_destroy_worker+0x50/0x50
4,1884,80233333979,-; ret_from_fork+0x35/0x40

A note on this is that the issues reappears after a reboot either in a
day or in a few days.

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1798212

Title:
  systemd stuck in uninterruptible sleep state

Status in systemd package in Ubuntu:
  Confirmed

Bug description:
  Ubuntu 16.04 systemd 229 may get stuck in an uninterruptible sleep
  state. Any subsequent systemctl command times out with "Failed to
  execute operation: Connection timed out" error message. gdb and strace
  is unable to attach to systemd process 1.

  $ systemd --version
  systemd 229
  +PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP 
+GCRYPT +GNUTLS +ACL +XZ -LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN

  $ cat /proc/1/stack 
  [<0>] flush_work+0x129/0x1e0
  [<0>] flush_delayed_work+0x3f/0x50
  [<0>] fsnotify_wait_marks_destroyed+0x15/0x20
  [<0>] fsnotify_destroy_group+0x48/0xd0
  [<0>] inotify_release+0x1e/0x50
  [<0>] __fput+0xea/0x220
  [<0>] ____fput+0xe/0x10
  [<0>] task_work_run+0x8a/0xb0
  [<0>] exit_to_usermode_loop+0xc4/0xd0
  [<0>] do_syscall_64+0xf4/0x130
  [<0>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
  [<0>] 0xffffffffffffffff

  $ cat /proc/1/status 
  Name: systemd
  Umask:        0000
  State:        D (disk sleep)
  Tgid: 1
  Ngid: 0
  Pid:  1
  PPid: 0
  TracerPid:    9724
  Uid:  0       0       0       0
  Gid:  0       0       0       0
  FDSize:       128
  Groups:        
  NStgid:       1
  NSpid:        1
  NSpgid:       1
  NSsid:        1
  VmPeak:         251136 kB
  VmSize:         185728 kB
  VmLck:               0 kB
  VmPin:               0 kB
  VmHWM:            6620 kB
  VmRSS:            6264 kB
  RssAnon:          2380 kB
  RssFile:          3884 kB
  RssShmem:            0 kB
  VmData:          18752 kB
  VmStk:             132 kB
  VmExe:            1392 kB
  VmLib:            3692 kB
  VmPTE:             128 kB
  VmSwap:              0 kB
  HugetlbPages:        0 kB
  CoreDumping:  0
  Threads:      1
  SigQ: 1/257098
  SigPnd:       0000000000040000
  ShdPnd:       0000000000010000
  SigBlk:       7be3c0fe28014a03
  SigIgn:       0000000000001000
  SigCgt:       00000001800004ec
  CapInh:       0000000000000000
  CapPrm:       0000003fffffffff
  CapEff:       0000003fffffffff
  CapBnd:       0000003fffffffff
  CapAmb:       0000000000000000
  NoNewPrivs:   0
  Seccomp:      0
  Speculation_Store_Bypass:     vulnerable
  Cpus_allowed: ffffffff
  Cpus_allowed_list:    0-31
  Mems_allowed: 
00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000003
  Mems_allowed_list:    0-1
  voluntary_ctxt_switches:      716968
  nonvoluntary_ctxt_switches:   9565

  $ dmesg
  [...]
  [1255076.993707] INFO: task systemd:1 blocked for more than 120 seconds.
  [1255077.000295]       Not tainted 4.15.0-32-generic #35~16.04.1-Ubuntu
  [1255077.006804] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
  [1255077.014996] systemd         D    0     1      0 0x00000000
  [1255077.015000] Call Trace:
  [1255077.015011]  __schedule+0x3d6/0x8b0
  [1255077.015019]  ? enqueue_entity+0x112/0x670
  [1255077.015021]  schedule+0x36/0x80
  [1255077.015025]  schedule_timeout+0x1db/0x370
  [1255077.015028]  ? check_preempt_curr+0x54/0x90
  [1255077.015030]  ? ttwu_do_wakeup+0x1e/0x150
  [1255077.015033]  wait_for_completion+0xb4/0x140
  [1255077.015035]  ? wake_up_q+0x70/0x70
  [1255077.015040]  flush_work+0x129/0x1e0
  [1255077.015043]  ? worker_detach_from_pool+0xb0/0xb0
  [1255077.015045]  flush_delayed_work+0x3f/0x50
  [1255077.015052]  fsnotify_wait_marks_destroyed+0x15/0x20
  [1255077.015055]  fsnotify_destroy_group+0x48/0xd0
  [1255077.015058]  inotify_release+0x1e/0x50
  [1255077.015063]  __fput+0xea/0x220
  [1255077.015065]  ____fput+0xe/0x10
  [1255077.015068]  task_work_run+0x8a/0xb0
  [1255077.015074]  exit_to_usermode_loop+0xc4/0xd0
  [1255077.015077]  do_syscall_64+0xf4/0x130
  [1255077.015080]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
  [1255077.015084] RIP: 0033:0x7f29c9c3457d
  [1255077.015085] RSP: 002b:00007ffec956fe30 EFLAGS: 00000293 ORIG_RAX: 
0000000000000003
  [1255077.015088] RAX: 0000000000000000 RBX: 0000000000000010 RCX: 
00007f29c9c3457d
  [1255077.015090] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 
0000000000000010
  [1255077.015092] RBP: 00007f29cb36d700 R08: 00005625f0043420 R09: 
0000000000000001
  [1255077.015093] R10: 0000000000000000 R11: 0000000000000293 R12: 
000000000000000b
  [1255077.015095] R13: 00005625eeed6040 R14: 00005625eff179b8 R15: 
00005625eeed7460
  [1255077.015421] INFO: task kworker/u66:2:108669 blocked for more than 120 
seconds.
  [1255077.022960]       Not tainted 4.15.0-32-generic #35~16.04.1-Ubuntu
  [1255077.029469] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
  [1255077.037601] kworker/u66:2   D    0 108669      2 0x80000080
  [1255077.037613] Workqueue: events_unbound fsnotify_mark_destroy_workfn
  [1255077.037615] Call Trace:
  [1255077.037621]  __schedule+0x3d6/0x8b0
  [1255077.037627]  ? __update_load_avg_blocked_se.isra.37+0xc5/0x140
  [1255077.037629]  schedule+0x36/0x80
  [1255077.037633]  schedule_timeout+0x1db/0x370
  [1255077.037636]  ? add_timer+0x125/0x280
  [1255077.037639]  wait_for_completion+0xb4/0x140
  [1255077.037641]  ? wake_up_q+0x70/0x70
  [1255077.037646]  __synchronize_srcu.part.13+0x85/0xb0
  [1255077.037650]  ? trace_raw_output_rcu_utilization+0x50/0x50
  [1255077.037653]  synchronize_srcu+0xd3/0xe0
  [1255077.037656]  ? synchronize_srcu+0xd3/0xe0
  [1255077.037660]  fsnotify_mark_destroy_workfn+0x7c/0xe0
  [1255077.037664]  process_one_work+0x14d/0x410
  [1255077.037678]  worker_thread+0x22b/0x460
  [1255077.037682]  kthread+0x105/0x140
  [1255077.037691]  ? process_one_work+0x410/0x410
  [1255077.037701]  ? kthread_destroy_worker+0x50/0x50
  [1255077.037712]  ret_from_fork+0x35/0x40

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1798212/+subscriptions

-- 
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp

Reply via email to