> Date: Tue, 2 May 2017 15:52:56 +0000
> From: Visa Hankala <v...@openbsd.org>
>
> On Mon, May 01, 2017 at 06:02:24PM +0200, Mark Kettenis wrote:
> > The futex(2) syscall needs to be able to atomically copy the futex in
> > and out of userland. The current implementation uses copyin(9) and
> > copyout(9) for that. The futex is a 32-bit integer, and currently our
> > copyin(9) and copyout(9) don't guarantee an atomic 32-bit access.
> > Previously mpi@ and I discussed implementing new interfaces that do
> > guarantee the required atomicity. However, it occurred to me that we
> > could simply change our copyin implementations such that they
> > guarantee atomicity of a properly aligned 32-bit copyin and copyout.
> >
> > The i386 version of these calls uses "rep movsl", which means it is
> > already atomic. At least that is how I interpret 8.2.4 in Volume 3A
> > of the Intel SDM. The diff below makes the amd64 version safe as
> > well. This does introduce a few additional instructions in the loop.
> > Apparently modern Intel CPUs optimize the string loops. If we can
> > rely on the hardware to turn 32-bit moves into 64-bit moves, we could
> > simplify the code by using "rep movsl" instead of "rep movsq".
> >
> > Thoughts?
>
> I would add separate routines for atomic copying because they should
> fail if atomicity cannot be guaranteed. copyin(9) and copyout(9) do
> not catch memory alignment issues, for example. Forcing them to handle
> a special case like this does not look nice.

Hmm, that's not a very convincing argument. Normal loads and stores are only guaranteed to be atomic if they happen to be properly aligned as well.
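
To make the alignment point concrete, here is a minimal userland sketch of what a separate "fail if not atomic" routine might look like. It is not the kernel code and not part of any diff in this thread: the name copy32_atomic is hypothetical, returning EFAULT for a misaligned address is an assumption, and the __atomic builtins assume GCC or Clang.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper, not an existing kernel interface: copy one
 * 32-bit word with a single load and a single store, and refuse
 * misaligned addresses instead of silently losing atomicity.
 */
static int
copy32_atomic(const void *src, void *dst)
{
	/* A 32-bit access is only guaranteed atomic if naturally aligned. */
	if ((((uintptr_t)src | (uintptr_t)dst) & 3) != 0)
		return EFAULT;	/* assumed error code for this sketch */

	/* GCC/Clang builtins: one 32-bit load, then one 32-bit store. */
	uint32_t v = __atomic_load_n((const uint32_t *)src, __ATOMIC_RELAXED);
	__atomic_store_n((uint32_t *)dst, v, __ATOMIC_RELAXED);
	return 0;
}
```

The alignment test is the only thing such a routine adds over a plain aligned load and store. The same check could just as well select an atomic path inside copyin(9) and copyout(9) for aligned 32-bit copies.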