> Date: Fri, 12 May 2017 13:34:38 +0000 > From: Visa Hankala <v...@openbsd.org> > > On Mon, May 01, 2017 at 06:02:24PM +0200, Mark Kettenis wrote: > > The futex(2) syscall needs to be able to atomically copy the futex in > > and out of userland. The current implementation uses copyin(9) and > > copyout(9) for that. The futex is a 32-bit integer, and currently our > > copyin(9) and copyout(9) don't guarantee an atomic 32-bit access. > > Previously mpi@ and I discussed implementing new interfaces that do > > guarantee the required atomicity. However, it oocurred to me that we > > could simply change our copyin implementations such that they > > guarantee atomicity of a properly aligned 32-bit copyin and copyout. > > > > The i386 version of these calls uses "rep movsl", which means it is > > already atomic. At least that is how I interpret 8.2.4 in Volume 3A > > of the Intel SDM. The diff below makes the amd64 version safe as > > well. This does introduce a few additional instructions in the loop. > > Apparently modern Intel CPUs optimize the string loops. If we can > > rely on the hardware to turn 32-bit moves into 64-bit moves, we could > > simplify the code by using "rep movsl" instead of "rep movsq". > > I submit to this approach. OK visa@
So I'm partly backtracking here... I tried to convert a few more architectures, and things get a bit messier than I envisioned on architectures that have an optimized copy function and care about alignment. Especially when copyin(9) is implemented in assembly. So I implemented a new function called copyin_futex(9), which is all we really need. The implementations for powerpc and sparc64 below have been (lightly) tested. The regression test that mpi@ created passes and doesn't crash the kernel! I have an (untested) implementation for hppa as well. The amd64 and i386 implementation can simply be a wrapper around copyin(9). The same holds for all out non-MULTIPROCESSOR architectures. So my idea is to let kern/sys_futex.c provide that wrapper if MULTIPROCESSOR isn't defined and add an #ifdef MULTIPROCESSOR around all MD copyin_futex() implementations. Thoughts? Index: arch/powerpc/powerpc/pmap.c =================================================================== RCS file: /cvs/src/sys/arch/powerpc/powerpc/pmap.c,v retrieving revision 1.166 diff -u -p -r1.166 pmap.c --- arch/powerpc/powerpc/pmap.c 19 Oct 2016 08:28:20 -0000 1.166 +++ arch/powerpc/powerpc/pmap.c 14 May 2017 19:39:54 -0000 @@ -1792,6 +1802,30 @@ copyout(const void *kaddr, void *udaddr, kaddr += l; len -= l; } + curpcb->pcb_onfault = oldh; + return 0; +} + +int +copyin_futex(const uint32_t *udaddr, uint32_t *kaddr) +{ + volatile uint32_t *p; + u_int32_t oldsr; + faultbuf env; + void *oldh = curpcb->pcb_onfault; + + if ((u_int)udaddr & 0x3) + return EFAULT; + + p = PPC_USER_ADDR + ((u_int)udaddr & ~PPC_SEGMENT_MASK); + oldsr = pmap_setusr(curpcb->pcb_pm, (vaddr_t)udaddr); + if (setfault(&env)) { + pmap_popusr(oldsr); + curpcb->pcb_onfault = oldh; + return EFAULT; + } + *kaddr = *p; + pmap_popusr(oldsr); curpcb->pcb_onfault = oldh; return 0; } Index: arch/sparc64/sparc64/locore.s =================================================================== RCS file: /cvs/src/sys/arch/sparc64/sparc64/locore.s,v retrieving revision 1.185 diff -u -p -r1.185 locore.s --- arch/sparc64/sparc64/locore.s 30 Apr 2017 16:45:45 -0000 1.185 +++ arch/sparc64/sparc64/locore.s 14 May 2017 19:39:54 -0000 @@ -5968,6 +5968,21 @@ Lcopyout_done: retl ! New instr clr %o0 ! return 0 +ENTRY(copyin_futex) + andcc %o0, 0x3, %g0 + bnz,pn %xcc, Lcopyfault + nop + GET_CPCB(%o3) + set Lcopyfault, %o4 + membar #Sync + stx %o4, [%o3 + PCB_ONFAULT] + lduwa [%o0] ASI_AIUS, %o2 + stw %o2, [%o1] + membar #Sync + stx %g0, [%o3 + PCB_ONFAULT] + retl + clr %o0 + ! Copyin or copyout fault. Clear cpcb->pcb_onfault and return EFAULT. ! Note that although we were in bcopy, there is no state to clean up; ! the only special thing is that we have to return to [g7 + 8] rather than Index: kern/sys_futex.c =================================================================== RCS file: /cvs/src/sys/kern/sys_futex.c,v retrieving revision 1.2 diff -u -p -r1.2 sys_futex.c --- kern/sys_futex.c 30 Apr 2017 10:10:21 -0000 1.2 +++ kern/sys_futex.c 14 May 2017 19:39:54 -0000 @@ -31,6 +31,8 @@ #include <sys/ktrace.h> #endif +int copyin_futex(const uint32_t *, uint32_t *); + /* * Kernel representation of a futex. */ @@ -187,10 +189,8 @@ futex_wait(uint32_t *uaddr, uint32_t val /* * Read user space futex value - * - * XXX copyin(9) is not guaranteed to be atomic. */ - if ((error = copyin(uaddr, &cval, sizeof(cval)))) + if ((error = copyin_futex(uaddr, &cval))) return error; /* If the value changed, stop here. */