On 17/02/25(Mon) 14:45, Claudio Jeker wrote: > On Mon, Feb 17, 2025 at 02:33:21PM +0100, Martin Pieuchot wrote: > > On 17/02/25(Mon) 14:29, Claudio Jeker wrote: > > > On Mon, Feb 17, 2025 at 02:20:57PM +0100, Martin Pieuchot wrote: > > > > On 17/02/25(Mon) 03:16, Claudio Jeker wrote: > > > > > CVSROOT: /cvs > > > > > Module name: src > > > > > Changes by: clau...@cvs.openbsd.org 2025/02/17 03:16:05 > > > > > > > > > > Modified files: > > > > > sys/kern : kern_exit.c > > > > > > > > > > Log message: > > > > > Use ps_mtx to lock the child process that is being checked by dowait6. > > > > > > > > > > The checks of the various ps_flags field is all but atomic so we need > > > > > to lock the process before doing all these checks. Only that way a > > > > > reliable result is seen. It also ensures that the child process did > > > > > finish the transition before signaling up to the parent. > > > > > > > > Why not read `ps_flags' only once? > > > > > > We could do that. Will it help, not sure. I fought enough with dowait6() > > > and just want to be sure to not trip over it all the time. > > > > I feel your pain. Why a mutex help and not reading `ps_flags' only once? > > > > > > Anyway I don't understand what is being protected here. It is really > > > > hard to follow without documentation. Can you tells us which fields > > > > you're serializing access with `ps_mtx' in sys/proc.h? > > > > > > > > This should be like the manual rule. Locking MUST BE documented. > > > > > > The problem is that the atomic flags are not atomic with the process > > > state. So right now I prefer to use a mutex and make sure that there are > > > no concurrency issues. The wait process wants to inspect the current state > > > no concurrency issues. > > > > Which fields describe this "current state"? They are the one we want to > > document. > > > > ps_xsig and every thread in that process so ps_threads. ps_xsig is > currently unmarked and we should change that once the last few places are > fixed.
That's inconsistent with your diff which release the lock before reading ps_xsig. I also don't understand why `ps_flags' needs to be read with a lock when, for example PS_STOPPED is not set with the lock.