On Mon, Jun 20, 2016 at 04:26:52PM +0200, Matthieu CASTET wrote:
> Hi,
> 
> the commit 337f13046ff03717a9e99675284a817527440a49 is saying that it
> change to syscall to an equivalent to FUTEX_WAIT_BITSET |
> FUTEX_CLOCK_REALTIME with a bitset of FUTEX_BITSET_MATCH_ANY.
> 
> It seems wrong to me, because in case of FUTEX_WAIT, in
> "SYSCALL_DEFINE6(futex", we convert relative timeout to absolute
> timeout [1].
> 
> So FUTEX_CLOCK_REALTIME | FUTEX_WAIT is expecting a relative timeout
> when FUTEX_WAIT_BITSET take an absolute timeout.
> 
> To make it work you have to use something like the (untested) attached
> patch.

+Eric Dumazet

Thanks for reporting Matthieu,

FUTEX_WAIT traditionally used a relative timeout with CLOCK_MONOTONIC while
FUTEX_WAIT_BITSET could use either ??? based on the FUTEX_CLOCK_ flag used. The
man page is not particularly clear on this:

http://man7.org/linux/man-pages/man2/futex.2.html

"
The FUTEX_WAIT_BITSET operation also interprets the timeout argument
differently from FUTEX_WAIT.  See the discussion of FUTEX_CLOCK_REALTIME,
above.
"

Matthew Kerrisk:
I think this language could be removed now that we support the
FUTEX_CLOCK_REALTIME flag for both futex ops.

As for the intended behavior of the FUTEX_CLOCK_REALTIME flag:


FUTEX_CLOCK_REALTIME (since Linux 2.6.28)
    This option bit can be employed only with the FUTEX_WAIT_BITSET,
    FUTEX_WAIT_REQUEUE_PI, and FUTEX_WAIT (since Linux 4.5) operations.

(NOTE: FUTEX_WAIT was recently added after the patch in question here)

    If this option is set, the kernel treats timeout as an absolute time based
    on CLOCK_REALTIME.

    If this option is not set, the kernel treats timeout as a relative time,
    measured against the CLOCK_MONOTONIC clock.


This supports your argument Matthieu. The assumption of a relative timeout for
FUTEX_WAIT in SYSCALL_DEFINE6 needs to be updated to account for FUTEX_WAIT now
honoring the FUTEX_CLOCK_REALTIME flag, which treats the timeout as absolute.

However, I don't think the patch below is correct. The existing logic
determines the type of timeout based on the futex_op when it should instead
determine the type of timeout based on the FUTEX_CLOCK_REALTIME flag.

My reading of the man page is that FUTEX_WAIT_BITSET abides by the timeout
interpretation defined by the FUTEX_CLOCK_REALTIME attribute, so
SYSCALL_DEFINE6 was misbehaving for FUTEX_WAIT|FUTEX_CLOCK_REALTIME (where the
timeout should have been treated as absolute) as well as for
FUTEX_WAIT_BITSET|FUTEX_CLOCK_MONOTONIC (where the timeout should have been
treated as relative).

Consider the following:

diff --git a/kernel/futex.c b/kernel/futex.c
index 33664f7..fa2af29 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -3230,7 +3230,7 @@ SYSCALL_DEFINE6(futex, u32 __user *, uaddr, int, op, u32, 
val,
                        return -EINVAL;
 
                t = timespec_to_ktime(ts);
-               if (cmd == FUTEX_WAIT)
+               if (!(cmd & FUTEX_CLOCK_REALTIME))
                        t = ktime_add_safe(ktime_get(), t);
                tp = &t;
        }

The concern for me is whether the code is incorrect, or if the man page is
incorrect. Does existing userspace code expect the FUTEX_WAIT_BITSET op to
always use an absolute timeout, regardless of the CLOCK used?





> 
> Matthieu
> 
> [1]
>         if (cmd == FUTEX_WAIT)
>             t = ktime_add_safe(ktime_get(), t);

> diff --git a/kernel/futex.c b/kernel/futex.c
> index 33664f7..4bee915 100644
> --- a/kernel/futex.c
> +++ b/kernel/futex.c
> @@ -3230,7 +3230,7 @@ SYSCALL_DEFINE6(futex, u32 __user *, uaddr, int, op, 
> u32, val,
>                       return -EINVAL;
>  
>               t = timespec_to_ktime(ts);
> -             if (cmd == FUTEX_WAIT)
> +             if (cmd == FUTEX_WAIT && !(op & FUTEX_CLOCK_REALTIME))
>                       t = ktime_add_safe(ktime_get(), t);
>               tp = &t;
>       }


-- 
Darren Hart
Intel Open Source Technology Center

Reply via email to