> Date: Thu, 30 Jan 2025 15:37:37 +0100 > From: Stephan <stephan...@googlemail.com> > > I am implementing a semaphore system for applications. To wait on a > semaphore to become ready, it does basically this: > > do > { > khs->khs_waiters++; > error = cv_timedwait_sig(&khs->khs_cv, > &khs->khs_interlock, mstohz((flags & SEM_RELATIVE_TIMEOUT) ? timeout : > 0)); > khs->khs_waiters--; > > [...] > > } while (khs->khs_count - count < 0);
I suggest, for now, that you structure it like this: const unsigned deadline = getticks() + mstohz(...); unsgined delta; for (;;) { if (<condition>) { error = 0; break; } if (error) /* if previous cv_timedwait_sig failed */ break; if ((delta = deadline - getticks()) > INT_MAX) { error = EWOULDBLOCK; break; } error = cv_timedwait_sig(cv, lock, delta); } This structure might look a little weird, because it's not a great API, but we haven't made anything better that resolves all the relevant issues. In particular, this approach has the following critical property just like a naive cv_wait loop with no error conditions: If <condition> becomes true, then we will notice that and never accidentally exit the loop for any other reason. This is important if you are doing something like this: mutex_enter(); register_current_thread_as_recipient_of_next_resource(); ... cv_timedwait loop ... if (error) goto out; accept_next_resource(); out: deregister_current_thread_as_recipient_of_next_resource(); mutex_exit(); return error; If another thread has assigned ownership a resource to this thread, it is critical for this thread to accept ownership (and later free the resource when done), even if it wakes up because of a signal or a timeout. We have had driver bugs in this class where a thread issues a command and waits for it to complete, the command completes at the same time as a signal or timeout, and the thread -- thinking the command has failed -- aborts the command (possibly some unrelated command that was issued after this thread's command!) and ignores the result it never realized it received. I would even suggest ignoring the return value of cv_timedwait_sig altogether and testing for signals explicitly -- except that you need the return value in order to correctly distinguish ERESTART from EINTR without a bunch of other work. By the way: Does khs->khs_waiters exist only as a micro-optimization for the wakeup path? if (__predict_false(khs->khs_waiters)) cv_signal/broadcast(cv); If so, you can ditch it -- cv_signal and cv_broadcast already have that micro-optimization internally. > I have seen that cv_signal() on a CV unblocks all threads waiting on > it, instead of just one. That this could happen is briefly mentioned > in condvar(9). You have this backwards: cv_signal generally wakes a single thread waiting on a condvar, while cv_broadcast always wakes all threads. (cv_signal is for when you're trying to hand a resource off to a single thread at a time, to avoid a thundering herd.) However, any thread sleeping in cv_*wait* may spuriously wake up at any time, so the callers have to be prepared for this. This is independent of whether you use cv_signal or cv_broadcast -- you always have to be prepared for spurious wakeup. > I think the problem is that when cv_timedwait_sig() is called > repeatedly, the original timeout is used again and again, adding up to > an unpredictable amount of time. That's right. Each call to cv_timedwait_sig sleeps for at most the time you pass it (plus scheduling slop), but multiple calls to cv_timedwait_sig in a loop are not coordinated. > If that is the case, how can it be solved best? Is > cv_timedwaitbt_sig() suitable? Or should one track sleep and wakeup > times and calculate a new timeout each time? I suggest for now that you calculate an absolute deadline in ticks, using getticks() + mstohz(...), and write your loop based on when getticks() has passed that deadline. (And just use cv_wait_sig instead for an indefinite timeout.) I added cv_timedwaitbt_sig some years ago in an attempt to address this and other problems, but it turns out that it doesn't work very well (fails to address the key issue above about accepting transfer of resource ownership), and I intend to revisit it at some point with a different approach. I briefly committed another approach I called `timedwaitclock', but it wasn't fully baked either and the name was atrocious so I backed it out: https://mail-index.netbsd.org/source-changes/2020/05/03/msg116972.html After that, I drafted another approach that I think will work out better (and has a less awful name!) but I didn't finish it -- the patch series is still languishing in my hg tree. The loop would look something like this instead: struct clockwait cw; error = clockwait_init_ticks/bt/timespec/...(&cw, timeout, clock_id /* CLOCK_REALTIME, CLOCK_MONOTONIC, &c. */, flags /* relative, absolute, &c. */); if (error) goto out; while (!<condition>) { error = cv_clockwait_sig(cv, lock, &cw); if (error) goto out; } Internally, cv_clockwait_sig would: (a) test for timeout or signals and fail _without releasing the lock_ (so the condition can't have changed if it fails), (b) wait interruptibly with timeout (and return zero in those cases, to be picked up at step (a) in the next iteration), and (c) update cw with the time slept. But this API is not ready yet. Sorry this is such a mess -- I meant to clean it up a while ago but I blew my round tuit supply on other problems.