On Thu, 26 Oct 2023 21:55:24 +0200 Thomas Monjalon <tho...@monjalon.net> wrote:
> > To be safe the sleep has to be longer than the system clock tick.
> > Most systems are built today with HZ=250 but really should be using HZ=1000
> > on modern CPU's.
>
> If it has to be more than 1 ms,
> we should mention it is a slow call
> which may be skipped if the thread is already blocking on something else.

A thread that is real time is not subject to timer slack.
Therefore even 100ns would probably work fine.

https://man7.org/linux/man-pages/man7/time.7.html

   High-resolution timers
       Before Linux 2.6.21, the accuracy of timer and sleep system calls
       (see below) was also limited by the size of the jiffy.

       Since Linux 2.6.21, Linux supports high-resolution timers (HRTs),
       optionally configurable via CONFIG_HIGH_RES_TIMERS. On a system
       that supports HRTs, the accuracy of sleep and timer system calls
       is no longer constrained by the jiffy, but instead can be as
       accurate as the hardware allows (microsecond accuracy is typical
       of modern hardware). You can determine whether high-resolution
       timers are supported by checking the resolution returned by a call
       to clock_getres(2) or looking at the "resolution" entries in
       /proc/timer_list.

       HRTs are not supported on all hardware architectures. (Support is
       provided on x86, ARM, and PowerPC, among others.)

...

   Timer slack
       Since Linux 2.6.28, it is possible to control the "timer slack"
       value for a thread. The timer slack is the length of time by which
       the kernel may delay the wake-up of certain system calls that block
       with a timeout. Permitting this delay allows the kernel to coalesce
       wake-up events, thus possibly reducing the number of system
       wake-ups and saving power. For more details, see the description
       of PR_SET_TIMERSLACK in prctl(2).

...

https://man7.org/linux/man-pages/man2/prctl.2.html

       PR_SET_TIMERSLACK (since Linux 2.6.28)
              Each thread has two associated timer slack values: a
              "default" value, and a "current" value. This operation sets
              the "current" timer slack value for the calling thread.
              arg2 is an unsigned long value, then maximum "current" value
              is ULONG_MAX and the minimum "current" value is 1. If the
              nanosecond value supplied in arg2 is greater than zero, then
              the "current" value is set to this value. If arg2 is equal
              to zero, the "current" timer slack is reset to the thread's
              "default" timer slack value.

              The "current" timer slack is used by the kernel to group
              timer expirations for the calling thread that are close to
              one another; as a consequence, timer expirations for the
              thread may be up to the specified number of nanoseconds late
              (but will never expire early). Grouping timer expirations
              can help reduce system power consumption by minimizing CPU
              wake-ups.

              The timer expirations affected by timer slack are those set
              by select(2), pselect(2), poll(2), ppoll(2), epoll_wait(2),
              epoll_pwait(2), clock_nanosleep(2), nanosleep(2), and
              futex(2) (and thus the library functions implemented via
              futexes, including pthread_cond_timedwait(3),
              pthread_mutex_timedlock(3), pthread_rwlock_timedrdlock(3),
              pthread_rwlock_timedwrlock(3), and sem_timedwait(3)).

              Timer slack is not applied to threads that are scheduled
              under a real-time scheduling policy (see
              sched_setscheduler(2)).

              When a new thread is created, the two timer slack values are
              made the same as the "current" value of the creating thread.
              Thereafter, a thread can adjust its "current" timer slack
              value via PR_SET_TIMERSLACK. The "default" value can't be
              changed. The timer slack values of init (PID 1), the
              ancestor of all processes, are 50,000 nanoseconds (50
              microseconds). The timer slack value is inherited by a child
              created via fork(2), and is preserved across execve(2).

              Since Linux 4.6, the "current" timer slack value of any
              process can be examined and changed via the file
              /proc/pid/timerslack_ns. See proc(5).
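
For anyone who wants to check this on their own system, here is a minimal
sketch in C of the calls the man pages above describe. It is not taken from
any patch in this thread. The helper names (clock_resolution_ns,
get_timer_slack_ns, set_timer_slack_ns, measure_sleep_ns) are made up for
illustration; only clock_getres(2), clock_gettime(2), clock_nanosleep(2) and
prctl(2) with PR_GET_TIMERSLACK / PR_SET_TIMERSLACK are real interfaces. It
assumes Linux with glibc and a thread that is not under a real-time policy.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <time.h>
#include <sys/prctl.h>

#define NSEC_PER_SEC 1000000000L

/* Resolution of CLOCK_MONOTONIC in nanoseconds, or -1 on error.
 * With high-resolution timers this is typically 1 ns; without them
 * it is one jiffy (for example 4000000 ns with HZ=250). */
static long clock_resolution_ns(void)
{
    struct timespec res;

    if (clock_getres(CLOCK_MONOTONIC, &res) != 0)
        return -1;
    return (long)res.tv_sec * NSEC_PER_SEC + res.tv_nsec;
}

/* "Current" timer slack of the calling thread in ns, or -1 on error. */
static long get_timer_slack_ns(void)
{
    return (long)prctl(PR_GET_TIMERSLACK, 0, 0, 0, 0);
}

/* Set the "current" timer slack of the calling thread.
 * Passing 0 resets it to the thread's "default" value.
 * Returns 0 on success, -1 on error. */
static int set_timer_slack_ns(unsigned long ns)
{
    return prctl(PR_SET_TIMERSLACK, ns, 0, 0, 0);
}

/* Sleep for req_ns (less than one second) and return the elapsed time
 * in ns as seen by CLOCK_MONOTONIC. The overshoot is roughly the
 * timer slack plus scheduling latency. */
static long measure_sleep_ns(long req_ns)
{
    struct timespec req, t0, t1;

    assert(req_ns >= 0 && req_ns < NSEC_PER_SEC);
    req.tv_sec = 0;
    req.tv_nsec = req_ns;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    /* Relative sleep; a signal could end it early (EINTR),
     * which this sketch does not handle. */
    clock_nanosleep(CLOCK_MONOTONIC, 0, &req, NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    return (long)(t1.tv_sec - t0.tv_sec) * NSEC_PER_SEC
           + (t1.tv_nsec - t0.tv_nsec);
}
```

Calling measure_sleep_ns(100) once with the inherited slack (usually 50000 ns)
and once after set_timer_slack_ns(1) should show how much of the overshoot
comes from timer slack rather than from the clock tick. The numbers depend on
the kernel configuration and load, so treat them as indicative only.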