Re: new semapahore implementation using atomics and futexes

Martin Pieuchot Tue, 08 May 2018 05:52:15 -0700

On 07/05/18(Mon) 14:01, Paul Irofti wrote:
> > > > > The reason we need this is to make semaphores safe for asynchronous
> > > > > signals.
> > 
> > Could you describe with words what is currently broken and how the
> > version below fixes it?
> 
> POSIX dictates that sem_post() needs to be async-safe here[0] and is
> thus included in the list of safe functions to call from within a signal
> handler here[1].
> 
> The old semaphore implementation is using spinlocks and __thrsleep to
> synchronize between threads. 
> 
> Let's say there are two threads: T0 and T1 and the semaphore has V=0.
> T1 calls sem_wait() and it will now sleep (spinlock) until someone else
> sem_post()'s. Let's say T0 sends a signal to T1 and exits.
> The signal handler calls sem_post() which is meant to unblock T1 by
> incrementing V. With the old semaphore implementation we we are now in a
> deadlock as sem_post spinlocks on the same lock.
> 
> The implementation I am proposing does not suffer from this defect as it
> uses futexes to resolve locking and thus sem_post does not need to spin.
> Besides fixing this defect and making us POSIX compliant, this should
> also improve performance as there should be less context switching and
> thus less time spent in the kernel.


Nice!

> > >   The barriers bit is mostly from visa@, thanks!
> > > 
> > >   tb@ found offlineimap faulting because the futex syscall returned EAGAIN
> > >   and sem_wait exited. Loop again on EAGAIN.
> > > 
> > >   Debug printfs are for future debugging.
> > > 
> > > Martin, is handling EAGAIN like this correct?
> > 
> > I don't know, what is it supposed to do? 
> 
> It is supposed to recall _twait is the futex syscall returned EAGAIN.

Yes, it is correct.  If the semaphore value changed between the read and
the syscall you should try to swap it again.

> > > new file mode 100644
> > > index 00000000000..e5c8015d27c
> > > --- /dev/null
> > > +++ lib/librthread/rthread_sem_atomic.c
> > 
> > I'm not fan of the _atomic suffix.  What about rthread_semaphore.c?
> 
> I am not set on the name, but I do think rthread_semaphore is not a good
> option as it only adds confusion. Why is one _sem and another
> _semaphore? There is no info in that naming scheme.

I'm hoping that in the long run we get rid of all __thrsleep(2) based
functions.  So I don't see a point of naming the file based on which
underlying syscall it uses.

Maybe we should take the other approach and copy the existing
implementation into a rthread_sem_compat.c and use rthread_sem.c for
your new implementation.

> > > + for (;;) {
> > > +         while ((v = sem->value) > 0) {
> > > +                 ov = atomic_cas_uint(&sem->value, v, v - 1);
> > > +                 if (ov == v) {
> > > +                         membar_enter_after_atomic();
> > > +                         return 0;
> > > +                 }
> > > +         }
> > > +         if (r)
> > > +                 break;
> > > +
> > > +         atomic_inc_int(&sem->waitcount);
> > 
> > Can you keep the destroy check without adding two atomic operations here?
> 
> I don't understand what you mean here. What destroy check? How can I get
> rid of the atomic inc/dec? Should I do the operations without atomics?
> Is that what you mean?

I'm saying that increasing/decreasing `waitcount' now serves a single
purpose: the check in sem_destroy().  However with your implementation
this check is racy and adds two atomic operation on top of every
syscall.  So my question is could you rewrite the check in sem_destroy()
differently?

> > > + }
> > > + sem->lock = _SPINLOCK_UNLOCKED;
> > 
> > `sem->lock' is no longer unused and can be #ifdef out.
> 
> Good catch!

But you forgot the thread_private.h diff in your new version :)

> > > + //membar_exit_before_atomic();
> > > + //*sval = atomic_add_int_nv(&sem->value, 0);
> > 
> > What does that mean?
> 
> I wasn't sure that I can do just a read of the semaphore value (without 
> locking)
> so I left the potential locking code commented out in case the review
> process will point out that it is indeed needed.

'struct pthread_mutex' has its `lock' member as first argument to help
thinking about alignment for this reason.  I'd suggest you do the same
;)

Re: new semapahore implementation using atomics and futexes

Reply via email to