On Wed, Apr 29, 2020 at 8:41 PM Jann Horn <ja...@google.com> wrote:
>
> | So a ptrace() user (or [...] wouldn't even see the impossible EAGAIN error.
>
> So I assumed you explicitly wanted ptrace() to restart, too. I was
> just pointing out that that didn't make sense to me.

I'm actually ok with the restart option, simply because I continue to
maintain that the program is buggy. "Anything goes".

To not be buggy, the program needs to install a SIGCHLD handler so
that it can reap its (pseudo-)children. At which point it doesn't
actually make any difference whether we fix the kernel or not, because
then the non-buggy program will just work - even with a non-modified
kernel.

Honestly, the main argument for the kernel doing anything different at
all is that from a user-mode perspective, silently hanging in the
kernel waiting for something to happen is likely the least easy to
debug.

But if you do a return to user space - even if it's to just rinse and
repeat - it's at least not "silent" any more, even if the main noise
it makes is just to waste 100% CPU time. At least that's a big hint to
somebody to take a look.

But yes, we can make ptrace() - and _only_ ptrace() - then not repeat,
and return a new error code that it has never returned before. Like
EAGAIN. Mainly because in that case we're only breaking semantics of
something that was already broken - unlike "write()", which has
perfectly well-defined semantics and wasn't broken.

Linus