On Sat, 12 Feb 2022 at 02:41, Raffaello D. Di Napoli <[email protected]> wrote:
>
>
> On 2/11/22 16:22, Rob Landley wrote:
> > On 2/9/22 11:12 AM, Baruch Siach wrote:
> >> Hi Sun,
> >>
> >> On Wed, Feb 09 2022, סאן עמר wrote:
> >>> Hi, I'm using busybox for a while now (v1.29.2). and I had an issue with
> >>> a sigterm send randomly to a process of mine. I debugged it until I found
> >>> it from the timeout process which was assigned before to another process
> >>> with the same pid. (i'm using a lot of timeouts for a lot of jobs)
> >>> so i looked at the code, "timeout.c" file where it sleep for 1 second in
> >>> each iteration then check the timeout status. I suspect at this time the
> >>> process timeout monitoring is terminated, but another one with the same
> >>> pid is already created. which creates unwanted timeout.
> >>>
> >>> There is a comment in there about sleep for "HUGE NUM" will probably
> >>> result in this issue, but I can't see why it won't occur also in the
> >>> current
> >>> case.
> >>>
> >>> there is no change of this behaviour in the latest master.
> >>> i would appreciate any help, sun.
> >> Any reference to PID number is inherently racy.
> > Not between parent and child.
>
> Except in BB’s timeout, the relationship is not parent/child :)
>
> Much to my surprise, I’ll say that. When I read the bug report the other
> day, I thought to myself well, this one ought to be easy to fix. But no,
> there’s no SIGCHLD to be handled, no relationship between processes to
> be leveraged.
>
> I don’t think this bug can be fixed without a near-complete rewrite, or
> without doing a lot of procfs digging to really validate the waited-on
> process, since kill(pid, 0) only validates a pid, not a process.
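
To make the "procfs digging" idea concrete, here is a minimal sketch,
assuming Linux and a mounted /proc. It is not BusyBox code, and both
function names are hypothetical. The idea is to record the start time of
the watched process (field 22 of /proc/<pid>/stat) once, and later to
signal only if the PID still reports the same start time. A small window
between the check and kill() remains, so this narrows the race rather
than closing it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>	/* getpid(), for callers that watch themselves */

/* Hypothetical helper: read the start time of `pid` (field 22 of
 * /proc/<pid>/stat, in clock ticks since boot).
 * Returns 0 on success, -1 if the process is gone or parsing fails. */
int proc_start_time(pid_t pid, unsigned long long *start)
{
	char path[64], buf[1024];
	size_t n;
	char *p;
	FILE *f;

	assert(start != NULL);
	snprintf(path, sizeof(path), "/proc/%ld/stat", (long)pid);
	f = fopen(path, "r");
	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';

	/* Field 2 (comm) may contain spaces or ')', so resume parsing
	 * after the last ')' in the line. */
	p = strrchr(buf, ')');
	if (!p)
		return -1;
	p++;
	/* Skip fields 3..21; field 22 is the start time. */
	for (int field = 3; field < 22; field++) {
		p += strspn(p, " ");
		p += strcspn(p, " ");
	}
	return sscanf(p, "%llu", start) == 1 ? 0 : -1;
}

/* Hypothetical helper: send signo only if `pid` still has the start
 * time recorded earlier. Returns -1 if the process is gone or the PID
 * now belongs to a different process. */
int kill_if_same_process(pid_t pid, unsigned long long expected_start, int signo)
{
	unsigned long long now;

	if (proc_start_time(pid, &now) != 0 || now != expected_start)
		return -1;
	return kill(pid, signo);
}
```

In the grandchild loop this would mean calling proc_start_time(parent, ...)
once before the loop and replacing both kill() calls with
kill_if_same_process().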

https://github.com/brgl/busybox/blob/master/miscutils/timeout.c

This is the code under inspection:

 grandchild:
	/* Just sleep(HUGE_NUM); kill(parent) may kill wrong process! */
	while (1) {
		sleep(1);
		if (--timeout <= 0)
			break;
		if (kill(parent, 0)) {
			/* process is gone */
			return EXIT_SUCCESS;
		}
	}
	kill(parent, signo);
	return EXIT_SUCCESS;

After all, this can lead to a PID race only if the same PID is reused
within one second, that is, between two consecutive kill(parent, 0)
checks. Assuming sequential PID allocation and the default pid_max of
32768, that means roughly 32768-N processes must be created in less than
a second, where N is the number of processes running on the system.

Best regards,
-- 
Roberto A. Foglietta
+39.349.33.30.697
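
For contrast with the grandchild loop, here is a minimal sketch of the
parent/child arrangement mentioned earlier in the thread. It is not how
BusyBox's timeout is structured, and run_with_timeout is a hypothetical
name; the 100 ms polling interval is an arbitrary choice. The point it
illustrates: a terminated child stays a zombie until its parent calls
waitpid(), so its PID cannot be handed to another process before then,
and kill(child, signo) cannot reach an unrelated process.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: run argv[] as a direct child and send it signo
 * if it is still running after `seconds`.
 * Returns the exit status, 128+signal if the child was killed by a
 * signal, or -1 on error. */
int run_with_timeout(char *const argv[], unsigned seconds, int signo)
{
	const struct timespec tick = { 0, 100 * 1000 * 1000 };	/* 100 ms */
	unsigned long ticks_left = (unsigned long)seconds * 10;
	int status;
	pid_t child;

	assert(argv != NULL && argv[0] != NULL);
	child = fork();
	if (child < 0)
		return -1;
	if (child == 0) {
		execvp(argv[0], argv);
		_exit(127);	/* exec failed */
	}

	for (;;) {
		pid_t r = waitpid(child, &status, WNOHANG);
		if (r == child)
			break;		/* child exited and was reaped */
		if (r < 0)
			return -1;
		if (ticks_left == 0) {
			/* Not reaped yet, so the PID is still ours. */
			kill(child, signo);
			/* Blocks forever if the child ignores signo;
			 * a real tool would escalate to SIGKILL. */
			while (waitpid(child, &status, 0) < 0)
				if (errno != EINTR)
					return -1;
			break;
		}
		nanosleep(&tick, NULL);
		ticks_left--;
	}

	if (WIFEXITED(status))
		return WEXITSTATUS(status);
	if (WIFSIGNALED(status))
		return 128 + WTERMSIG(status);
	return -1;
}
```

The trade-off is that the monitored program no longer keeps the PID of
the timeout invocation itself, since it now runs as the child.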
