Public bug reported:

Intially found when running snapd integration test suite on Arch on
5.2.x kernels. The problem was reproduced on daily eoan cloud images
with kernel Ubuntu 5.3.0-10.11-generic 5.3.0-rc8 (as reported in
/proc/version_signature)

Stracing trivial Go binary results in a deadlock when both strace and
the binary being traced end up in D state.

Example code:

```
package main

import (
    "os"
    "syscall"
)

func main() {
    syscall.Exec("/bin/echo", []string{"/bin/echo", "Hello"}, os.Environ())
}
```

$ go build exec.go
$ while true; do echo '- go'; strace -f -e 'execve' ./exec 2>/dev/null ; done
..
Hello
- go
Hello
- go
Hello
- go
^C

At which point strace does not react to ^C. This can only be unlocked by
issuing SIGKILL to either strace or the binary being traced.

Process list:

  PID TTY      STAT   TIME COMMAND
  808 ?        Ss     0:00 sshd: guest [priv]
 1003 ?        S      0:01  \_ sshd: guest@pts/0,pts/1,pts/2
 1038 pts/0    Ss     0:00      \_ -bash
23473 pts/0    D+     0:00      |   \_ strace -f -e execve ./exec
23476 pts/0    Dl+    0:00      |       \_ ./exec

The dump of blocked kernel tasks:

[  414.886023] sysrq: Show Blocked State
[  414.886695]   task                        PC stack   pid father
[  414.886739] strace          D    0 23473   1038 0x00000000
[  414.886741] Call Trace:
[  414.886747]  __schedule+0x2b9/0x6c0
[  414.886748]  schedule+0x42/0xb0
[  414.886750]  schedule_preempt_disabled+0xe/0x10
[  414.886751]  __mutex_lock.isra.0+0x182/0x4f0
[  414.886752]  __mutex_lock_killable_slowpath+0x13/0x20
[  414.886754]  mutex_lock_killable+0x2e/0x40
[  414.886756]  mm_access+0x27/0xa0
[  414.886758]  process_vm_rw_core.isra.0+0x10f/0x590
[  414.886760]  ? __switch_to_asm+0x40/0x70
[  414.886761]  ? __switch_to_asm+0x34/0x70
[  414.886762]  ? __switch_to_asm+0x34/0x70
[  414.886763]  ? __switch_to_asm+0x40/0x70
[  414.886764]  ? reschedule_interrupt+0xa/0x20
[  414.886767]  ? __check_object_size+0x4d/0x150
[  414.886770]  ? _copy_from_user+0x3e/0x60
[  414.886772]  ? rw_copy_check_uvector+0x62/0x110
[  414.886774]  ? __check_object_size+0x4d/0x150
[  414.886774]  ? _copy_from_user+0x3e/0x60
[  414.886776]  process_vm_rw+0x106/0x130
[  414.886778]  ? common_file_perm+0x5e/0x110
[  414.886780]  ? vfs_write+0x173/0x1a0
[  414.886781]  ? ksys_write+0x67/0xe0
[  414.886782]  __x64_sys_process_vm_readv+0x2b/0x30
[  414.886785]  do_syscall_64+0x5a/0x130
[  414.886786]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  414.886788] RIP: 0033:0x7fa7b90ec19e
[  414.886794] Code: Bad RIP value.
[  414.886794] RSP: 002b:00007ffc293ab9f8 EFLAGS: 00000246 ORIG_RAX: 
0000000000000136
[  414.886796] RAX: ffffffffffffffda RBX: 00007ffc293aba90 RCX: 00007fa7b90ec19e
[  414.886796] RDX: 0000000000000001 RSI: 00007ffc293aba00 RDI: 0000000000005bb5
[  414.886797] RBP: 000000c00008a010 R08: 0000000000000001 R09: 0000000000000000
[  414.886797] R10: 00007ffc293aba10 R11: 0000000000000246 R12: 0000000000001000
[  414.886797] R13: 0000000000001000 R14: 0000000000005bb5 R15: 0000000000000000
[  414.886799] exec            D    0 23476  23473 0x80004003
[  414.886800] Call Trace:
[  414.886802]  __schedule+0x2b9/0x6c0
[  414.886803]  schedule+0x42/0xb0
[  414.886805]  de_thread+0x1c3/0x480
[  414.886806]  flush_old_exec+0x2a/0x260
[  414.886808]  ? load_elf_phdrs+0x70/0xb0
[  414.886809]  load_elf_binary+0x344/0x11e0
[  414.886810]  ? get_user_pages_remote+0x146/0x230
[  414.886811]  ? get_acl+0x1d/0x120
[  414.886813]  ? ima_bprm_check+0x87/0xb0
[  414.886814]  search_binary_handler+0x8b/0x1c0
[  414.886816]  __do_execve_file.isra.0+0x4fe/0x860
[  414.886817]  __x64_sys_execve+0x39/0x50
[  414.886818]  do_syscall_64+0x5a/0x130
[  414.886820]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  414.886820] RIP: 0033:0x459ffb
[  414.886821] Code: Bad RIP value.
[  414.886822] RSP: 002b:000000c000082e40 EFLAGS: 00000206 ORIG_RAX: 
000000000000003b
[  414.886822] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000459ffb
[  414.886823] RDX: 000000c000098000 RSI: 000000c000096000 RDI: 000000c00008a010
[  414.886823] RBP: 000000c000082ed0 R08: 0000000000000000 R09: 0000000000000000
[  414.886824] R10: 0000000000000000 R11: 0000000000000206 R12: 000000c00009a848
[  414.886824] R13: 000000000000085b R14: 000000c00009a848 R15: 000000c0000980b0

A patch was proposed on LKML back in 2016
https://lore.kernel.org/patchwork/patch/719314/ but it was not accepted
into mainline.

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: Incomplete

** Attachment added: "version.log"
   
https://bugs.launchpad.net/bugs/1845180/+attachment/5290926/+files/version.log

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1845180

Title:
  strace deadlock when tracing a trivial Go program

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1845180/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

Reply via email to