On Wed, Apr 29, 2020 at 8:00 PM Jann Horn <ja...@google.com> wrote: > > But if we go with Bernd's approach together with your restart > suggestion,
So repeat after me: Bernd's approach _without_ the restart is unacceptable. It's unacceptable because it breaks things that currently work, and returns EAGAIN in situations where that is simple not a valid error code. His original patch also happens to be unacceptable because it's an unmaintainable mess, but that's independent of the bug it introduced. That bug has nothing to do with ptrace(). It's literally a "write()" to a file in /proc. What is so hard to get about this basic thing? > then simply doing PTRACE_ATTACH on two threads A and B > would be enough to livelock, right? The same case that just causes a recursive wait. Yes. No worse off than we were. And the fact is, *THAT* case looks truly trivial to work around. Just make the ptrace() code - but not the fs/proc/base.c code - do something like if (lock_exec_creds(tsk)) return -EINTR; and now ptrace() doesn't repeat (simply because it doesn't return that ERESTARTNOINTR. It would go through that "return through signal handling code" in the kernel, but it wouldn't actually retry the system call). But I'm getting less and less interested in trying to "fix" this problem, when people don't seem to realize that the important case is to not break _good_ programs, and the pointless buggy garbage test-case is entirely uninteresting. It's buggy user code. If it causes a wait or a livelock, nobody sane should care in the least. Fix the bug in user space. Introducing new bugs in the kernel where they didn't exist before - in order to try to work around buggy user-space that has never ever worked - is not acceptable. Linus