Hi, I was trying to use strace recently and found that it exhibited some strange behavior. I produced this minimal test case:

#include <unistd.h>

int main()
{
	write(1, "a", 1);
	return 0;
}

which, when run using "gcc test.c && strace ./a.out" produces this strace
output:

[ pre-main omitted ]
write(1, "a", 1) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
write(1, "a", 1) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
write(1, "a", 1) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
write(1, "a", 1) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
write(1, "a", 1) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
write(1, "a", 1) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[ repeats forever ]

The correct result is of course:

[ pre-main omitted ]
write(1, "a", 1) = 1
exit_group(0) = ?
+++ exited with 0 +++

Strangely, this only occurs when outputting to a tty-like output.
Running "strace ./a.out" from a native Linux x86 console or a terminal
emulator causes the abnormal behavior. However, the following commands
work correctly:

- strace ./a.out >/dev/null
- strace ./a.out >/tmp/a # /tmp is a standard tmpfs
- strace ./a.out >&- # causes -1 EBADF (Bad file descriptor)

"strace -o /tmp/a ./a.out" hangs and produces the above (infinite)
output to /tmp/a.

I bisected this to 76f969e, "cgroup: cgroup v2 freezer". I reverted the
entire patchset (reverting only that one caused a conflict), which
resolved the issue. I skimmed the patch and came up with this
workaround, which also resolves the issue.

I am not at all clear on the technical workings of the patchset, but it
seems to me like a process's frozen status is supposed to be "suspended"
when a frozen process is ptraced, and "unsuspended" when ptracing ends.
Therefore, it seems suspicious to always "enter frozen" whether or not
the cgroup is actually frozen. It seems like the code should instead
check if the cgroup is actually frozen, and if so, restore the frozen
status.

I am using systemd but not any other cgroup features.
I tried in an initramfs environment (no systemd, /init -> shell script)
and reproduced the failing test case.

Please CC me on replies.

Thanks,
Alex.

---
 kernel/signal.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index 62f9aea4a15a..47145d9d89ca 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2110,7 +2110,7 @@ static void ptrace_stop(int exit_code, int why, int clear_code, kernel_siginfo_t
 		preempt_disable();
 		read_unlock(&tasklist_lock);
 		preempt_enable_no_resched();
-		cgroup_enter_frozen();
+		//cgroup_enter_frozen();
 		freezable_schedule();
 	} else {
 		/*
@@ -2289,7 +2289,7 @@ static bool do_signal_stop(int signr)
 		}
 
 		/* Now we don't run again until woken by SIGCONT or SIGKILL */
-		cgroup_enter_frozen();
+		//cgroup_enter_frozen();
 		freezable_schedule();
 		return true;
 	} else {
-- 
2.21.0