Re: [LTP] [patch V2 00/20] timer: Refactor the timer wheel
Hi!
> > That is likely a POSIX conformance bug, since POSIX explicitly states that
> > sigtimedwait() shall use CLOCK_MONOTONIC to measure the timeout.
> >
> > "If the Monotonic Clock option is supported, the CLOCK_MONOTONIC clock
> > shall be used to measure the time interval specified by the timeout
> > argument."
>
> That's fine because jiffies is a less granular form of CLOCK_MONOTONIC.

Looking into POSIX Realtime Clock and Timers, it seems to allow a time
service based on CLOCK_* clocks to have a different resolution, provided
the resolution is less than or equal to 20ms and this fact is documented.
If we wanted to be pedantic about this, the man page should be patched...

Also, this gives us a reasonably safe upper bound on timer expiration,
something like:

sleep_time * 1.125 + 20ms

Does this sound reasonable now?

-- 
Cyril Hrubis
chru...@suse.cz
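For illustration, the proposed bound can be written in integer nanoseconds so that the 1/8 term needs no floating point. This is only a sketch: the names max_expiry_ns and RES_SLACK_NS are hypothetical and do not come from the LTP sources.

```c
#include <stdint.h>

/* 20ms of allowed clock resolution, in nanoseconds (assumed constant). */
#define RES_SLACK_NS 20000000LL

/*
 * Latest acceptable expiry for a requested sleep, following the
 * proposal above: sleep_time * 1.125 + 20ms. The factor 1.125 is
 * expressed as sleep + sleep / 8 to stay in integer arithmetic.
 */
static int64_t max_expiry_ns(int64_t sleep_ns)
{
    return sleep_ns + sleep_ns / 8 + RES_SLACK_NS;
}
```

For a 1s request this gives 1.145s, so the 1.197057s reported in the original failure would still exceed the bound.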
Re: [LTP] [patch V2 00/20] timer: Refactor the timer wheel
On Thu, 23 Jun 2016, Cyril Hrubis wrote:
> > 1) sigtimedwait() is unusual in that it uses the jiffies timer. Most
> >    system call timeouts (including specifically the one in FUTEX_WAIT)
> >    use the high-resolution timer subsystem, which is a whole different
> >    animal with tighter guarantees, and
>
> That is likely a POSIX conformance bug, since POSIX explicitly states that
> sigtimedwait() shall use CLOCK_MONOTONIC to measure the timeout.
>
> "If the Monotonic Clock option is supported, the CLOCK_MONOTONIC clock
> shall be used to measure the time interval specified by the timeout
> argument."

That's fine because jiffies is a less granular form of CLOCK_MONOTONIC.

Thanks,

	tglx
Re: [LTP] [patch V2 00/20] timer: Refactor the timer wheel
Hi!
> Two points:
> 1) sigtimedwait() is unusual in that it uses the jiffies timer. Most
>    system call timeouts (including specifically the one in FUTEX_WAIT)
>    use the high-resolution timer subsystem, which is a whole different
>    animal with tighter guarantees, and

That is likely a POSIX conformance bug, since POSIX explicitly states that
sigtimedwait() shall use CLOCK_MONOTONIC to measure the timeout.

"If the Monotonic Clock option is supported, the CLOCK_MONOTONIC clock
shall be used to measure the time interval specified by the timeout
argument."

> 2) The worst-case error in tglx's proposal is 1/8 of the requested
>    timeout: the wakeup is after 112.5% of the requested time, plus
>    one tick. This is well within your requested accuracy. (For very
>    short timeouts, the "plus one tick" can dominate the percentage error.)

Hmm, that still does not add up to the number in the original email,
which says time_elapsed: 1.197057. As far as I can tell, the worst case
for a tick is CONFIG_HZ=100, so one tick is 0.01s, and even after
subtracting that we are at 118.7% of the 1s we requested, well above
112.5%.

But that may be caused by the fact that the test uses gettimeofday() to
measure the elapsed time; it should use CLOCK_MONOTONIC instead.

-- 
Cyril Hrubis
chru...@suse.cz
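A measurement based on CLOCK_MONOTONIC rather than gettimeofday() might look roughly like the sketch below. The helper names timespec_diff_sec and monotonic_now are hypothetical and chosen only for this example; they are not taken from the test suite.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Difference b - a in seconds, as a double. */
static double timespec_diff_sec(const struct timespec *a,
                                const struct timespec *b)
{
    return (double)(b->tv_sec - a->tv_sec) +
           (double)(b->tv_nsec - a->tv_nsec) / 1e9;
}

/*
 * Read CLOCK_MONOTONIC, which is not affected by wall-clock
 * adjustments. Returns 0 on success, -1 on failure (errno is set).
 */
static int monotonic_now(struct timespec *ts)
{
    return clock_gettime(CLOCK_MONOTONIC, ts);
}
```

The test would call monotonic_now() before and after sigtimedwait() and pass both readings to timespec_diff_sec() to obtain time_elapsed.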
Re: [LTP] [patch V2 00/20] timer: Refactor the timer wheel
On Thu, 23 Jun 2016, George Spelvin wrote:
> Cyril Hrubis wrote:
> > Thomas Gleixner wrote:
> >> Err. You know that the timer expired because sigtimedwait() returns
> >> EAGAIN. And the only thing you can reliably check for is that the timer did
> >> not expire too early. Anything else is guesswork and voodoo programming.
> >
> > But seriously, is there a reason
> > why an OS that is not under heavy load cannot expire timers with reasonable
> > overruns? I.e. if I ask for a second of sleep and expect it to be woken
> > up not much more than half of a second later?
> >
> > If we stick only to guarantees that are defined in POSIX, playing music
> > with mplayer would not be possible since it sleeps in futex() and if it
> > wakes too late it will fail to fill buffers. In practice this worked
> > fine for me for years.
>
> Two points:
> 1) sigtimedwait() is unusual in that it uses the jiffies timer. Most
>    system call timeouts (including specifically the one in FUTEX_WAIT)
>    use the high-resolution timer subsystem, which is a whole different
>    animal with tighter guarantees, and

As Peter said, we want to convert sigtimedwait() to use hrtimers as well.
We converted almost all syscalls with timeouts (futex, poll, select) to
hrtimers years ago, but somehow we missed doing the same for
sigtimedwait().

Thanks,

	tglx
Re: [LTP] [patch V2 00/20] timer: Refactor the timer wheel
Cyril Hrubis wrote:
> Thomas Gleixner wrote:
>> Err. You know that the timer expired because sigtimedwait() returns
>> EAGAIN. And the only thing you can reliably check for is that the timer did
>> not expire too early. Anything else is guesswork and voodoo programming.
>
> But seriously, is there a reason
> why an OS that is not under heavy load cannot expire timers with reasonable
> overruns? I.e. if I ask for a second of sleep and expect it to be woken
> up not much more than half of a second later?
>
> If we stick only to guarantees that are defined in POSIX, playing music
> with mplayer would not be possible since it sleeps in futex() and if it
> wakes too late it will fail to fill buffers. In practice this worked
> fine for me for years.

Two points:

1) sigtimedwait() is unusual in that it uses the jiffies timer. Most
   system call timeouts (including specifically the one in FUTEX_WAIT)
   use the high-resolution timer subsystem, which is a whole different
   animal with tighter guarantees, and

2) The worst-case error in tglx's proposal is 1/8 of the requested
   timeout: the wakeup is after 112.5% of the requested time, plus
   one tick. This is well within your requested accuracy. (For very
   short timeouts, the "plus one tick" can dominate the percentage error.)
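To make point 2 concrete, the sketch below computes the worst-case wakeup and the resulting relative overshoot for a given tick rate. The function names and the hz parameter (standing in for CONFIG_HZ) are assumptions made for this example, not kernel interfaces.

```c
/*
 * Latest wakeup, in seconds, under the bound stated above:
 * 112.5% of the requested time plus one tick of 1/hz seconds.
 */
static double worst_case_wakeup(double requested, int hz)
{
    return requested * 1.125 + 1.0 / hz;
}

/* Worst-case overshoot as a fraction of the requested timeout. */
static double worst_case_overshoot(double requested, int hz)
{
    return (worst_case_wakeup(requested, hz) - requested) / requested;
}
```

With hz = 100, a 1s request can overshoot by 13.5%, while a 10ms request can overshoot by 112.5% of the requested time, because the single 10ms tick is as large as the timeout itself.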
Re: [LTP] [patch V2 00/20] timer: Refactor the timer wheel
Hi!
> > While this is true, checking with reasonable error margin works just
> > fine 99% of the time. You cannot really test that timer expires, without
> > setting arbitrary margin.
>
> Err. You know that the timer expired because sigtimedwait() returns
> EAGAIN. And the only thing you can reliably check for is that the timer did
> not expire too early. Anything else is guesswork and voodoo programming.

There are quite a lot of things that can happen on a multitasking OS, and
there are even NMIs in hardware, etc. But seriously, is there a reason
why an OS that is not under heavy load cannot expire timers with
reasonable overruns? I.e. if I ask for a second of sleep and expect it to
be woken up not much more than half of a second later?

If we stick only to guarantees that are defined in POSIX, playing music
with mplayer would not be possible, since it sleeps in futex() and if it
wakes too late it will fail to fill buffers. In practice this worked
fine for me for years.

-- 
Cyril Hrubis
chru...@suse.cz
Re: [LTP] [patch V2 00/20] timer: Refactor the timer wheel
On Wed, 22 Jun 2016, Cyril Hrubis wrote:
> Hi!
> > > rtbox:~ # /usr/local/ltp/conformance/interfaces/sigtimedwait/sigtimedwait_1-1.run-test
> > > Test FAILED: sigtimedwait() did not return in the required time
> > > time_elapsed: 1.197057
> > > ...come on, you can do it...
> > > rtbox:~ # /usr/local/ltp/conformance/interfaces/sigtimedwait/sigtimedwait_1-1.run-test
> > > Test PASSED
> > >
> > > #define ERRORMARGIN 0.1
> > > ...
> > >     if ((time_elapsed > SIGTIMEDWAITSEC + ERRORMARGIN)
> > >         || (time_elapsed < SIGTIMEDWAITSEC - ERRORMARGIN)) {
> > >         printf("Test FAILED: sigtimedwait() did not return in "
> > >                "the required time\n");
> > >         printf("time_elapsed: %lf\n", time_elapsed);
> > >         return PTS_FAIL;
> > >     }
> > >
> > > Looks hohum to me, but gripe did arrive with patch set, so you get a note.
> >
> > hohum is a euphemism. That's completely bogus.
> >
> > The only guarantee a syscall with timers has is: timer does not fire early.
>
> While this is true, checking with reasonable error margin works just
> fine 99% of the time. You cannot really test that timer expires, without
> setting arbitrary margin.

Err. You know that the timer expired because sigtimedwait() returns
EAGAIN. And the only thing you can reliably check for is that the timer
did not expire too early. Anything else is guesswork and voodoo
programming.

Thanks,

	tglx
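A test restricted to those two checks (the call timed out with EAGAIN, and it did not return early) could be sketched as follows. The function name check_sigtimedwait_not_early and the choice of SIGUSR1 are assumptions made for this example; this is not the LTP test itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <time.h>

/*
 * Wait for SIGUSR1 (which nobody sends) with the given timeout and
 * verify only what can be relied on: sigtimedwait() failed with
 * EAGAIN, and no less than the requested time elapsed on
 * CLOCK_MONOTONIC. Returns 0 on pass, -1 on fail.
 */
static int check_sigtimedwait_not_early(const struct timespec *timeout)
{
    sigset_t set;
    struct timespec start, end;
    long long elapsed_ns, timeout_ns;
    int ret, saved_errno;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    /* Block the signal so that it can only be consumed by sigtimedwait(). */
    if (sigprocmask(SIG_BLOCK, &set, NULL) != 0)
        return -1;

    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1;
    ret = sigtimedwait(&set, NULL, timeout);
    saved_errno = errno;
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1;

    /* The timeout must be reported as -1 with errno set to EAGAIN. */
    if (ret != -1 || saved_errno != EAGAIN)
        return -1;

    elapsed_ns = (long long)(end.tv_sec - start.tv_sec) * 1000000000LL
                 + (end.tv_nsec - start.tv_nsec);
    timeout_ns = (long long)timeout->tv_sec * 1000000000LL
                 + timeout->tv_nsec;

    /* No upper bound is checked: only an early return is a failure. */
    return elapsed_ns >= timeout_ns ? 0 : -1;
}
```

There is deliberately no ERRORMARGIN in this version, so a late wakeup on a loaded system cannot fail the test.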
Re: [LTP] [patch V2 00/20] timer: Refactor the timer wheel
Hi!
> > rtbox:~ # /usr/local/ltp/conformance/interfaces/sigtimedwait/sigtimedwait_1-1.run-test
> > Test FAILED: sigtimedwait() did not return in the required time
> > time_elapsed: 1.197057
> > ...come on, you can do it...
> > rtbox:~ # /usr/local/ltp/conformance/interfaces/sigtimedwait/sigtimedwait_1-1.run-test
> > Test PASSED
> >
> > #define ERRORMARGIN 0.1
> > ...
> >     if ((time_elapsed > SIGTIMEDWAITSEC + ERRORMARGIN)
> >         || (time_elapsed < SIGTIMEDWAITSEC - ERRORMARGIN)) {
> >         printf("Test FAILED: sigtimedwait() did not return in "
> >                "the required time\n");
> >         printf("time_elapsed: %lf\n", time_elapsed);
> >         return PTS_FAIL;
> >     }
> >
> > Looks hohum to me, but gripe did arrive with patch set, so you get a note.
>
> hohum is a euphemism. That's completely bogus.
>
> The only guarantee a syscall with timers has is: timer does not fire early.

While this is true, checking with a reasonable error margin works just
fine 99% of the time. You cannot really test that a timer expires without
setting an arbitrary margin.

Looking into POSIX, the sigtimedwait() timer should run on
CLOCK_MONOTONIC, so we can call clock_getres(CLOCK_MONOTONIC, ...),
double or triple the value, and use it as the error margin. We should
also fix the test to measure with the CLOCK_MONOTONIC clock.

And of course the error margin must not be used when we check that the
elapsed time wasn't shorter than we expected.

Does that sound reasonable?

-- 
Cyril Hrubis
chru...@suse.cz
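The proposal could be sketched as below: the margin is derived from clock_getres(CLOCK_MONOTONIC, ...) and applied only to the upper side of the check. The names monotonic_margin_sec and elapsed_in_bounds, and the factor parameter, are hypothetical and exist only for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/*
 * Store factor times the CLOCK_MONOTONIC resolution, in seconds, in
 * *margin. Returns 0 on success, -1 if clock_getres() fails.
 */
static int monotonic_margin_sec(int factor, double *margin)
{
    struct timespec res;

    if (clock_getres(CLOCK_MONOTONIC, &res) != 0)
        return -1;
    *margin = factor * ((double)res.tv_sec + (double)res.tv_nsec / 1e9);
    return 0;
}

/*
 * Returns 1 if elapsed is acceptable, 0 otherwise. The margin widens
 * only the upper bound; any early return fails regardless of margin.
 */
static int elapsed_in_bounds(double elapsed, double requested, double margin)
{
    if (elapsed < requested)
        return 0;
    return elapsed <= requested + margin;
}
```

One caveat with this approach: on kernels with high-resolution timers enabled, clock_getres() typically reports 1ns for CLOCK_MONOTONIC, so a margin of two or three times that value may be far smaller than ordinary scheduling latency.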