Re: [LTP] [patch V2 00/20] timer: Refactor the timer wheel

2016-06-23 Thread Cyril Hrubis
Hi!
> > That is likely a POSIX conformance bug, since POSIX explicitly states that
> > sigtimedwait() shall use CLOCK_MONOTONIC to measure the timeout.
> > 
> > "If the Monotonic Clock option is supported, the CLOCK_MONOTONIC clock
> > shall be used to measure the time interval specified by the timeout
> > argument."
> 
> That's fine because jiffies is a less granular form of CLOCK_MONOTONIC.

Looking into the POSIX Realtime Clock and Timers section, it seems to
allow a time service based on CLOCK_* clocks to have a different
resolution, as long as it is less than or equal to 20ms and this fact is
documented. If we wanted to be pedantic about this, the man page should
be patched...

Also, this gives us a reasonably safe upper bound on timer expiration,
something like:

sleep_time * 1.125 + 20ms

Does this sound reasonable now?
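
A minimal sketch of that bound in C, assuming times are kept as integer
nanoseconds. The helper name timeout_upper_bound_ns() is hypothetical and
is not taken from the LTP sources:

```c
#include <stdint.h>

#define NSEC_PER_MSEC 1000000LL

/*
 * Hypothetical helper: upper bound on the wakeup time, in nanoseconds,
 * computed as sleep_time * 1.125 + 20ms.
 */
int64_t timeout_upper_bound_ns(int64_t sleep_ns)
{
    /* sleep_ns / 8 is the 12.5% slack; 20ms is the fixed allowance. */
    return sleep_ns + sleep_ns / 8 + 20 * NSEC_PER_MSEC;
}
```

For a 1s sleep this evaluates to 1.145s.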

-- 
Cyril Hrubis
chru...@suse.cz


Re: [LTP] [patch V2 00/20] timer: Refactor the timer wheel

2016-06-23 Thread Thomas Gleixner
On Thu, 23 Jun 2016, Cyril Hrubis wrote:
> > 1) sigtimedwait() is unusual in that it uses the jiffies timer.  Most
> >    system call timeouts (including specifically the one in FUTEX_WAIT)
> >    use the high-resolution timer subsystem, which is a whole different
> >    animal with tighter guarantees, and
> 
> That is likely a POSIX conformance bug, since POSIX explicitly states that
> sigtimedwait() shall use CLOCK_MONOTONIC to measure the timeout.
> 
> "If the Monotonic Clock option is supported, the CLOCK_MONOTONIC clock
> shall be used to measure the time interval specified by the timeout
> argument."

That's fine because jiffies is a less granular form of CLOCK_MONOTONIC.
 
Thanks,

tglx


Re: [LTP] [patch V2 00/20] timer: Refactor the timer wheel

2016-06-23 Thread Cyril Hrubis
Hi!
> Two points:
> 1) sigtimedwait() is unusual in that it uses the jiffies timer.  Most
>    system call timeouts (including specifically the one in FUTEX_WAIT)
>    use the high-resolution timer subsystem, which is a whole different
>    animal with tighter guarantees, and

That is likely a POSIX conformance bug, since POSIX explicitly states that
sigtimedwait() shall use CLOCK_MONOTONIC to measure the timeout.

"If the Monotonic Clock option is supported, the CLOCK_MONOTONIC clock
shall be used to measure the time interval specified by the timeout
argument."

> 2) The worst-case error in tglx's proposal is 1/8 of the requested
>    timeout: the wakeup is after 112.5% of the requested time, plus
>    one tick.  This is well within your requested accuracy.  (For very
>    short timeouts, the "plus one tick" can dominate the percentage error.)

Hmm, that still does not add up to the number in the original email,
where it says time_elapsed: 1.197057. As far as I can tell, the worst
case for a tick is CONFIG_HZ=100, so one tick is 0.01s, and even after
subtracting that we get 118.7%, since we requested 1s. But that may be
caused by the fact that the test uses gettimeofday() to measure the
elapsed time; it should use CLOCK_MONOTONIC instead.
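
A minimal sketch of measuring the elapsed time with CLOCK_MONOTONIC
rather than gettimeofday(). The helper names are hypothetical, and
nanosleep() only stands in for the call under test:

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Difference b - a in seconds, as a double. */
double timespec_diff_sec(const struct timespec *a, const struct timespec *b)
{
    return (double)(b->tv_sec - a->tv_sec) +
           (double)(b->tv_nsec - a->tv_nsec) / 1e9;
}

/*
 * Sleep for *req and return the elapsed time in seconds as seen by
 * CLOCK_MONOTONIC, or -1.0 if the clock could not be read.
 * Unlike gettimeofday(), CLOCK_MONOTONIC is not affected by the wall
 * clock being set while the measurement is in progress.
 */
double measure_sleep_monotonic(const struct timespec *req)
{
    struct timespec start, end;

    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1.0;
    nanosleep(req, NULL);
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1.0;

    return timespec_diff_sec(&start, &end);
}
```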

-- 
Cyril Hrubis
chru...@suse.cz


Re: [LTP] [patch V2 00/20] timer: Refactor the timer wheel

2016-06-23 Thread Thomas Gleixner
On Thu, 23 Jun 2016, George Spelvin wrote:
> Cyril Hrubis wrote:
> > Thomas Gleixner wrote:
> >> Err. You know that the timer expired because sigtimedwait() returns
> >> EAGAIN. And the only thing you can reliably check for is that the timer did
> >> not expire too early. Anything else is guesswork and voodoo programming.
> 
> > But seriously, is there a reason
> > why an OS that is not under heavy load cannot expire timers with reasonable
> > overruns? I.e. if I ask for a second of sleep, can I expect to be woken
> > up not much more than half a second late?
> 
> > If we stick only to guarantees that are defined in POSIX playing music
> > with mplayer would not be possible since it sleeps in futex() and if it
> > wakes too late it will fail to fill buffers. In practice this worked
> > fine for me for years.
> 
> Two points:
> 1) sigtimedwait() is unusual in that it uses the jiffies timer.  Most
>    system call timeouts (including specifically the one in FUTEX_WAIT)
>    use the high-resolution timer subsystem, which is a whole different
>    animal with tighter guarantees, and

As Peter said, we want to convert sigtimedwait() to use hrtimers as well. We
converted almost all syscalls with timeouts (futex, poll, select) to
hrtimers years ago, but somehow we missed doing the same for sigtimedwait().

Thanks,

tglx


Re: [LTP] [patch V2 00/20] timer: Refactor the timer wheel

2016-06-23 Thread George Spelvin
Cyril Hrubis wrote:
> Thomas Gleixner wrote:
>> Err. You know that the timer expired because sigtimedwait() returns
>> EAGAIN. And the only thing you can reliably check for is that the timer did
>> not expire too early. Anything else is guesswork and voodoo programming.

> But seriously, is there a reason
> why an OS that is not under heavy load cannot expire timers with reasonable
> overruns? I.e. if I ask for a second of sleep, can I expect to be woken
> up not much more than half a second late?

> If we stick only to guarantees that are defined in POSIX playing music
> with mplayer would not be possible since it sleeps in futex() and if it
> wakes too late it will fail to fill buffers. In practice this worked
> fine for me for years.

Two points:
1) sigtimedwait() is unusual in that it uses the jiffies timer.  Most
   system call timeouts (including specifically the one in FUTEX_WAIT)
   use the high-resolution timer subsystem, which is a whole different
   animal with tighter guarantees, and
2) The worst-case error in tglx's proposal is 1/8 of the requested
   timeout: the wakeup is after 112.5% of the requested time, plus
   one tick.  This is well within your requested accuracy.  (For very
   short timeouts, the "plus one tick" can dominate the percentage error.)
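
A minimal sketch of the arithmetic in point 2, assuming the worst case is
modelled as 1/8 of the timeout plus one tick of 1/hz seconds. The function
names and the hz parameter (standing in for CONFIG_HZ) are hypothetical:

```c
/*
 * Worst-case overrun in microseconds under the model above:
 * 1/8 of the requested timeout plus one tick.
 */
long worst_case_overrun_us(long timeout_us, long hz)
{
    return timeout_us / 8 + 1000000L / hz;
}

/* The same overrun expressed as a percentage of the requested timeout. */
double worst_case_overrun_pct(long timeout_us, long hz)
{
    return 100.0 * (double)worst_case_overrun_us(timeout_us, hz) /
           (double)timeout_us;
}
```

With hz = 100, a 1s timeout gives a worst-case overrun of 135ms (13.5%),
while a 1ms timeout gives 10.125ms, roughly ten times the requested
timeout, which is the case where the tick dominates.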


Re: [LTP] [patch V2 00/20] timer: Refactor the timer wheel

2016-06-23 Thread Cyril Hrubis
Hi!
> > While this is true, checking with a reasonable error margin works just
> > fine 99% of the time. You cannot really test that the timer expires
> > without setting an arbitrary margin.
> 
> Err. You know that the timer expired because sigtimedwait() returns
> EAGAIN. And the only thing you can reliably check for is that the timer did
> not expire too early. Anything else is guesswork and voodoo programming.

There are quite a lot of things that can happen on a multitasking OS, and
there are even NMIs in hardware, etc. But seriously, is there a reason
why an OS that is not under heavy load cannot expire timers with reasonable
overruns? I.e. if I ask for a second of sleep, can I expect to be woken
up not much more than half a second late?

If we stick only to guarantees that are defined in POSIX, playing music
with mplayer would not be possible, since it sleeps in futex() and, if it
wakes too late, it will fail to fill buffers. In practice this has worked
fine for me for years.

-- 
Cyril Hrubis
chru...@suse.cz


Re: [LTP] [patch V2 00/20] timer: Refactor the timer wheel

2016-06-23 Thread Thomas Gleixner
On Wed, 22 Jun 2016, Cyril Hrubis wrote:
> Hi!
> > > rtbox:~ # /usr/local/ltp/conformance/interfaces/sigtimedwait/sigtimedwait_1-1.run-test
> > > Test FAILED: sigtimedwait() did not return in the required time
> > > time_elapsed: 1.197057
> > > ...come on, you can do it...
> > > rtbox:~ # /usr/local/ltp/conformance/interfaces/sigtimedwait/sigtimedwait_1-1.run-test
> > > Test PASSED
> > > 
> > > #define ERRORMARGIN 0.1
> > > ...
> > > if ((time_elapsed > SIGTIMEDWAITSEC + ERRORMARGIN)
> > >     || (time_elapsed < SIGTIMEDWAITSEC - ERRORMARGIN)) {
> > >         printf("Test FAILED: sigtimedwait() did not return in "
> > >                "the required time\n");
> > >         printf("time_elapsed: %lf\n", time_elapsed);
> > >         return PTS_FAIL;
> > > }
> > > 
> > > Looks hohum to me, but gripe did arrive with patch set, so you get a note.
> > 
> > hohum is a euphemism. That's completely bogus.
> > 
> > The only guarantee a syscall with timers has is: timer does not fire early.
> 
> While this is true, checking with a reasonable error margin works just
> fine 99% of the time. You cannot really test that the timer expires
> without setting an arbitrary margin.

Err. You know that the timer expired because sigtimedwait() returns
EAGAIN. And the only thing you can reliably check for is that the timer did
not expire too early. Anything else is guesswork and voodoo programming.
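
A minimal sketch of such a check, assuming SIGUSR1 is never sent while it
runs. The function name check_sigtimedwait_timeout() is hypothetical and
this is not the actual LTP test; it only verifies the EAGAIN return and
that the wakeup was not early:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <time.h>

/*
 * Returns 0 if sigtimedwait() timed out with EAGAIN and did not return
 * early, 1 if it returned early or with an unexpected result, and -1 if
 * the setup failed.
 */
int check_sigtimedwait_timeout(long timeout_ms)
{
    sigset_t set;
    struct timespec timeout, start, end;
    long long elapsed_ns, timeout_ns;
    int ret, saved_errno;

    timeout.tv_sec = timeout_ms / 1000;
    timeout.tv_nsec = (timeout_ms % 1000) * 1000000L;
    timeout_ns = (long long)timeout_ms * 1000000LL;

    /* Block SIGUSR1 so that only sigtimedwait() could consume it. */
    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &set, NULL) != 0)
        return -1;

    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1;
    errno = 0;
    ret = sigtimedwait(&set, NULL, &timeout);
    saved_errno = errno;
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1;

    /* The timer expired if and only if the call failed with EAGAIN. */
    if (ret != -1 || saved_errno != EAGAIN)
        return 1;

    /* The only timing property checked: the wakeup was not early. */
    elapsed_ns = (long long)(end.tv_sec - start.tv_sec) * 1000000000LL +
                 (end.tv_nsec - start.tv_nsec);
    return elapsed_ns >= timeout_ns ? 0 : 1;
}
```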

Thanks,

tglx




Re: [LTP] [patch V2 00/20] timer: Refactor the timer wheel

2016-06-22 Thread Cyril Hrubis
Hi!
> > rtbox:~ # /usr/local/ltp/conformance/interfaces/sigtimedwait/sigtimedwait_1-1.run-test
> > Test FAILED: sigtimedwait() did not return in the required time
> > time_elapsed: 1.197057
> > ...come on, you can do it...
> > rtbox:~ # /usr/local/ltp/conformance/interfaces/sigtimedwait/sigtimedwait_1-1.run-test
> > Test PASSED
> > 
> > #define ERRORMARGIN 0.1
> > ...
> > if ((time_elapsed > SIGTIMEDWAITSEC + ERRORMARGIN)
> >     || (time_elapsed < SIGTIMEDWAITSEC - ERRORMARGIN)) {
> >         printf("Test FAILED: sigtimedwait() did not return in "
> >                "the required time\n");
> >         printf("time_elapsed: %lf\n", time_elapsed);
> >         return PTS_FAIL;
> > }
> > 
> > Looks hohum to me, but gripe did arrive with patch set, so you get a note.
> 
> hohum is a euphemism. That's completely bogus.
> 
> The only guarantee a syscall with timers has is: timer does not fire early.

While this is true, checking with a reasonable error margin works just
fine 99% of the time. You cannot really test that the timer expires
without setting an arbitrary margin.

Looking into POSIX, the sigtimedwait() timer should run on CLOCK_MONOTONIC,
so we can call clock_getres(CLOCK_MONOTONIC, ...), double or triple the
value, and use it as the error margin. We should also fix the test to
measure the elapsed time with the CLOCK_MONOTONIC clock.

And of course the error margin must not be used when we check that the
elapsed time wasn't shorter than we expected.

Does that sound reasonable?
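
A minimal sketch of that idea, assuming times are kept as integer
nanoseconds. The helper names are hypothetical. Note that the resolution
reported by clock_getres() may be much finer than the actual wakeup
granularity, in which case this margin is very tight:

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/*
 * Error margin in nanoseconds: multiplier times the resolution reported
 * for CLOCK_MONOTONIC, or -1 if clock_getres() fails.
 */
long long monotonic_margin_ns(int multiplier)
{
    struct timespec res;

    if (clock_getres(CLOCK_MONOTONIC, &res) != 0)
        return -1;

    return multiplier *
           ((long long)res.tv_sec * 1000000000LL + res.tv_nsec);
}

/*
 * Returns 1 if the elapsed time is acceptable, 0 otherwise.
 * The margin applies to the upper bound only: an elapsed time shorter
 * than the requested one always fails.
 */
int elapsed_within_bounds(long long elapsed_ns, long long requested_ns,
                          long long margin_ns)
{
    if (elapsed_ns < requested_ns)
        return 0;

    return elapsed_ns <= requested_ns + margin_ns;
}
```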

-- 
Cyril Hrubis
chru...@suse.cz

