On 01/12/20(Tue) 10:21, Claudio Jeker wrote: > On Mon, Nov 30, 2020 at 07:19:28PM -0300, Martin Pieuchot wrote: > > On 04/11/20(Wed) 11:19, Martin Pieuchot wrote: > > > Here's a 3rd approach to solve the TOCTOU race in single_thread_set(). > > > The issue being that the lock serializing access to `ps_single' is not > > > held when calling single_thread_check(). > > > > > > The approach below is controversial because it extends the scope of the > > > SCHED_LOCK(). On the other hand, the two others approaches that both > > > add a new lock to avoid this race ignore the fact that accesses to > > > `ps_single' are currently not clearly serialized w/o KERNEL_LOCK(). > > > > > > So the diff below improves the situation in that regard and do not add > > > more complexity due to the use of multiple locks. After having looked > > > for a way to split the SCHED_LOCK() I believe this is the simplest > > > approach. > > > > > > I deliberately used a *_locked() function to avoid grabbing the lock > > > recursively as I'm trying to get rid of the recursion, see the other > > > thread on tech@. > > > > > > That said the uses of `ps_single' in ptrace_ctrl() are not covered by > > > this diff and I'd be glad to hear some comments about them. This is > > > fine as long as all the code using `ps_single' runs under KERNEL_LOCK() > > > but since we're trying to get the single_thread_* API out of it, this > > > need to be addressed. > > > > > > Note that this diff introduces a helper for initializing ps_single* > > > values in order to keep all the accesses of those fields in the same > > > file. > > > > Anyone? With this only the `ps_threads' iteration must receive some > > love in order to take the single_thread_* API out of the KERNEL_LOCK(). > > For that I just sent a SMR_TAILQ conversion diff. > > > > Combined with the diff to remove the recursive attribute of the > > SCHED_LOCK() we're ready to split it into multiple mutexes. > > > > Does that make any sense? Comments? Oks? > > > > > Index: kern/kern_fork.c > > > =================================================================== > > > RCS file: /cvs/src/sys/kern/kern_fork.c,v > > > retrieving revision 1.226 > > > diff -u -p -r1.226 kern_fork.c > > > --- kern/kern_fork.c 25 Oct 2020 01:55:18 -0000 1.226 > > > +++ kern/kern_fork.c 4 Nov 2020 12:52:54 -0000 > > > @@ -563,10 +563,7 @@ thread_fork(struct proc *curp, void *sta > > > * if somebody else wants to take us to single threaded mode, > > > * count ourselves in. > > > */ > > > - if (pr->ps_single) { > > > - atomic_inc_int(&pr->ps_singlecount); > > > - atomic_setbits_int(&p->p_flag, P_SUSPSINGLE); > > > - } > > > + single_thread_init(p); > > This is not the right name for this function. It does not initalize > anything. Why is this indirection needed? I would just put the > SCHED_LOCK() around this bit. It makes more sense especially with the > comment above.
Updated diff does that. I introduced a function because it helps me having all the locking in the same place. > > > Index: kern/kern_sig.c > > > =================================================================== > > > RCS file: /cvs/src/sys/kern/kern_sig.c,v > > > retrieving revision 1.263 > > > diff -u -p -r1.263 kern_sig.c > > > --- kern/kern_sig.c 16 Sep 2020 13:50:42 -0000 1.263 > > > +++ kern/kern_sig.c 4 Nov 2020 12:38:35 -0000 > > > @@ -1932,11 +1932,27 @@ userret(struct proc *p) > > > p->p_cpu->ci_schedstate.spc_curpriority = p->p_usrpri; > > > } > > > > > > +void > > > +single_thread_init(struct proc *p) > > > +{ > > > + struct process *pr = p->p_p; > > > + int s; > > > + > > > + SCHED_LOCK(s); > > > + if (pr->ps_single) { > > > + atomic_inc_int(&pr->ps_singlecount); > > > + atomic_setbits_int(&p->p_flag, P_SUSPSINGLE); > > > + } > > > + SCHED_UNLOCK(s); > > > +} > > > + > > > int > > > -single_thread_check(struct proc *p, int deep) > > > +_single_thread_check_locked(struct proc *p, int deep) > > Please don't add the leading _ to this function. There is no need for it. Done. > > > @@ -2014,7 +2043,6 @@ single_thread_set(struct proc *p, enum s > > > panic("single_thread_mode = %d", mode); > > > #endif > > > } > > > - SCHED_LOCK(s); > > > pr->ps_singlecount = 0; > > > membar_producer(); > > > pr->ps_single = p; > > I think you can remove the membar_producer() here. It is of no use > anymore. I can indeed because SCHED_LOCK() is currently taken by sleep_setup(). I'd argue this is an implementation detail and I hope to change that soon. So unless I'm completely mistaken, I'd keep it. But again I don't have a strong opinion. Index: kern/kern_fork.c =================================================================== RCS file: /cvs/src/sys/kern/kern_fork.c,v retrieving revision 1.226 diff -u -p -r1.226 kern_fork.c --- kern/kern_fork.c 25 Oct 2020 01:55:18 -0000 1.226 +++ kern/kern_fork.c 1 Dec 2020 13:04:37 -0000 @@ -516,7 +516,7 @@ thread_fork(struct proc *curp, void *sta struct proc *p; pid_t tid; vaddr_t uaddr; - int error; + int s, error; if (stack == NULL) return EINVAL; @@ -563,10 +563,12 @@ thread_fork(struct proc *curp, void *sta * if somebody else wants to take us to single threaded mode, * count ourselves in. */ + SCHED_LOCK(s); if (pr->ps_single) { atomic_inc_int(&pr->ps_singlecount); atomic_setbits_int(&p->p_flag, P_SUSPSINGLE); } + SCHED_UNLOCK(s); /* * Return tid to parent thread and copy it out to userspace Index: kern/kern_sig.c =================================================================== RCS file: /cvs/src/sys/kern/kern_sig.c,v retrieving revision 1.264 diff -u -p -r1.264 kern_sig.c --- kern/kern_sig.c 8 Nov 2020 20:37:24 -0000 1.264 +++ kern/kern_sig.c 1 Dec 2020 13:04:48 -0000 @@ -1941,10 +1941,12 @@ userret(struct proc *p) } int -single_thread_check(struct proc *p, int deep) +single_thread_check_locked(struct proc *p, int deep) { struct process *pr = p->p_p; + SCHED_ASSERT_LOCKED(); + if (pr->ps_single != NULL && pr->ps_single != p) { do { int s; @@ -1957,14 +1959,12 @@ single_thread_check(struct proc *p, int return (EINTR); } - SCHED_LOCK(s); - if (pr->ps_single == NULL) { - SCHED_UNLOCK(s); + if (pr->ps_single == NULL) continue; - } if (atomic_dec_int_nv(&pr->ps_singlecount) == 0) wakeup(&pr->ps_singlecount); + if (pr->ps_flags & PS_SINGLEEXIT) { SCHED_UNLOCK(s); KERNEL_LOCK(); @@ -1975,13 +1975,24 @@ single_thread_check(struct proc *p, int /* not exiting and don't need to unwind, so suspend */ p->p_stat = SSTOP; mi_switch(); - SCHED_UNLOCK(s); } while (pr->ps_single != NULL); } return (0); } +int +single_thread_check(struct proc *p, int deep) +{ + int s, error; + + SCHED_LOCK(s); + error = single_thread_check_locked(p, deep); + SCHED_UNLOCK(s); + + return error; +} + /* * Stop other threads in the process. The mode controls how and * where the other threads should stop: @@ -2003,8 +2014,12 @@ single_thread_set(struct proc *p, enum s KERNEL_ASSERT_LOCKED(); KASSERT(curproc == p); - if ((error = single_thread_check(p, deep))) + SCHED_LOCK(s); + error = single_thread_check_locked(p, deep); + if (error) { + SCHED_UNLOCK(s); return error; + } switch (mode) { case SINGLE_SUSPEND: @@ -2022,7 +2037,6 @@ single_thread_set(struct proc *p, enum s panic("single_thread_mode = %d", mode); #endif } - SCHED_LOCK(s); pr->ps_singlecount = 0; membar_producer(); pr->ps_single = p;