On Mon, Dec 10, 2018 at 04:23:06PM +0100, Thomas Gleixner wrote:

>  kernel/futex.c |   57 +++++++++++++++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 53 insertions(+), 4 deletions(-)
> 
> --- a/kernel/futex.c
> +++ b/kernel/futex.c
> @@ -1148,11 +1148,60 @@ static int attach_to_pi_state(u32 __user
>       return ret;
>  }
>  
> +static int handle_exit_race(u32 __user *uaddr, u32 uval, struct task_struct *tsk)
> +{
> +     u32 uval2;
> +
> +     /*
> +      * If PF_EXITPIDONE is not yet set try again.
> +      */
> +     if (!(tsk->flags & PF_EXITPIDONE))
> +             return -EAGAIN;
> +
> +     /*
> +      * Reread the user space value to handle the following situation:
> +      *
> +      * CPU0                         CPU1
> +      *
> +      * sys_exit()                   sys_futex()
> +      *  do_exit()                    futex_lock_pi()
> +      *   exit_signals(tsk)            No waiters:
> +      *    tsk->flags |= PF_EXITING;   *uaddr == 0x00000PID
> +      *  mm_release(tsk)               Set waiter bit
> +      *   exit_robust_list(tsk) {      *uaddr = 0x80000PID;

Just to clarify: the call chain here is sys_futex() -> futex_lock_pi() ->
futex_lock_pi_atomic(), where we do:

  lock_pi_update_atomic(); // changes the futex word
  attach_to_pi_owner(); // possibly returns ESRCH after changing the word


> +      *      Set owner died            attach_to_pi_owner() {
> +      *    *uaddr = 0xC0000000;         tsk = get_task(PID);
> +      *   }                             if (!tsk->flags & PF_EXITING) {
> +      *  ...                              attach();
> +      *  tsk->flags |= PF_EXITPIDONE;   } else {
> +      *                                   if (!(tsk->flags & PF_EXITPIDONE))
> +      *                                     return -EAGAIN;
> +      *                                   return -ESRCH; <--- FAIL
> +      *                                 }
> +      *
> +      * Returning ESRCH unconditionally is wrong here because the
> +      * user space value has been changed by the exiting task.
> +      */
> +     if (get_futex_value_locked(&uval2, uaddr))
> +             return -EFAULT;
> +
> +     /* If the user space value has changed, try again. */
> +     if (uval2 != uval)
> +             return -EAGAIN;

And this then goes back to futex_lock_pi(), which does a retry loop.

> +     /*
> +      * The exiting task did not have a robust list, the robust list was
> +      * corrupted or the user space value in *uaddr is simply bogus.
> +      * Give up and tell user space.
> +      */
> +     return -ESRCH;

If the value is unchanged, -ESRCH is a valid return value.

> +}

There is another caller of futex_lock_pi_atomic():
futex_proxy_trylock_atomic(), which is part of futex_requeue(), and that too
does a retry loop on -EAGAIN.

And there is another caller of attach_to_pi_owner(): lookup_pi_state(),
and that too is in futex_requeue() and handles the retry case properly.
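
The caller-side pattern these paths rely on can be sketched roughly as
below. This is a simplified userspace model under stated assumptions: the
model_* names and the callback interface are hypothetical, and whatever the
real callers do between attempts (dropping locks, rescheduling) is not
modelled. The only point is that -EAGAIN means "try again", while any other
result, including -ESRCH, is final.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback: one atomic lock attempt, returns 0 or -errno. */
typedef int (*model_attempt_fn)(void *ctx);

/* Illustrative context: how many -EAGAIN results precede the final one. */
struct model_race_ctx {
	int pending_retries;
	int final_result;
};

/* Example attempt that races a few times before settling. */
static int model_attempt(void *ctx)
{
	struct model_race_ctx *c = ctx;

	if (c->pending_retries > 0) {
		c->pending_retries--;
		return -EAGAIN;
	}
	return c->final_result;
}

/* Repeat the attempt until it returns something other than -EAGAIN. */
static int model_lock_with_retry(model_attempt_fn attempt, void *ctx,
				 int *tries)
{
	int ret;
	int n = 0;

	assert(attempt != NULL);

	do {
		ret = attempt(ctx);
		n++;
	} while (ret == -EAGAIN);

	if (tries != NULL)
		*tries = n;
	return ret;
}
```

Calling model_lock_with_retry(model_attempt, &ctx, &tries) with two pending
retries and a final result of -ESRCH takes three attempts and hands -ESRCH
back to the caller.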

Yes, this all looks good.

Acked-by: Peter Zijlstra (Intel) <[email protected]>
