On Tue, 3 Jun 2025 at 09:46, Jonathan Wakely wrote: > > This adds a new implementation of std::counting_semaphore<Max> for the > case where Max == 1, i.e. the std::binary_semaphore typedef. When the > maximum counter value is 1 we don't need to load the current counter > value before doing a compare-exchange to acquire the semaphore. We can > just optimisitcally assume it's currently 1, and if that's true then the > compare_exchange will succeed. This simplifies _M_try_acquire so that we > don't need the separate _M_do_try_acquire function used by the general > __semaphore_base implementation for _Max > 1 cases. > > We can also use the simpler forms of atomic waiting that just take a > value instead of a value accessor and predicate, because we know that > the _M_counter is always a __platform_wait_t. This change adds a > bare_wait flag to __atomic_wait_address_v because we don't need to track > waiters for semaphores, we only need to notify when a semaphore with a > count of zero is released. > > I'm not sure if this makes the code any faster in real scenarios, but > the generated code for std::binary_semaphore is slightly smaller now. > > libstdc++-v3/ChangeLog: > > * include/bits/semaphore_base.h (__binary_semaphore_impl): New > class with optimized implementation for std::binary_semaphore. > (__semaphore_impl) <_max == 1>: Modify alias template to use > __binary_semaphore_impl. > * include/bits/atomic_wait.h (__atomic_wait_address_v): Add > parameter for bare waits. > --- > > Tested x86_64-linux. > > libstdc++-v3/include/bits/atomic_wait.h | 5 +- > libstdc++-v3/include/bits/semaphore_base.h | 73 +++++++++++++++++++++- > 2 files changed, 73 insertions(+), 5 deletions(-) > > diff --git a/libstdc++-v3/include/bits/atomic_wait.h > b/libstdc++-v3/include/bits/atomic_wait.h > index 815726c16ccb..9ae11191d9ab 100644 > --- a/libstdc++-v3/include/bits/atomic_wait.h > +++ b/libstdc++-v3/include/bits/atomic_wait.h > @@ -249,12 +249,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > // C++26 will return __val > } > > + // Wait on __addr while *__addr == __old is true. > inline void > __atomic_wait_address_v(const __detail::__platform_wait_t* __addr, > __detail::__platform_wait_t __old, > - int __order) > + int __order, bool __bare_wait = false) > { > - __detail::__wait_args __args{ __addr, __old, __order }; > + __detail::__wait_args __args{ __addr, __old, __order, __bare_wait }; > // C++26 will not ignore the return value here > __detail::__wait_impl(__addr, __args);
This crashes here on Solaris, this function doesn't work for proxy waits (i.e. targets without futex). It needs to use __args._M_setup_wait to get the proxy wait state, or needs the additional fix pasted below. Patches 3/4 and 4/4 depend on that fix. Patches 1/4 (which fixes the deadlock bug) and 2/4 (which is not essential but IMHO makes the code simpler) are still OK and pass all tests on Solaris without this (because only 3/4 introduces uses of the __atomic_wait_address_v overload that doesn't support proxy waits). --- a/libstdc++-v3/include/bits/atomic_wait.h +++ b/libstdc++-v3/include/bits/atomic_wait.h @@ -249,14 +249,27 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // C++26 will return __val } + // Wait on __addr while *__addr == __old is true. inline void __atomic_wait_address_v(const __detail::__platform_wait_t* __addr, __detail::__platform_wait_t __old, - int __order) + int __order, bool __bare_wait = false) { - __detail::__wait_args __args{ __addr, __old, __order }; + // If we have a platform wait function (e.g. futex wait) that atomically + // loads *__addr and compares it to __args._M_old then we can use that. + // Otherwise we need to use an accessor function to do that load and a + // predicate function to do the comparison. +#if _GLIBCXX_HAVE_PLATFORM_WAIT + __detail::__wait_args __args{ __addr, __old, __order, __bare_wait }; // C++26 will not ignore the return value here __detail::__wait_impl(__addr, __args); +#else + auto __vfn = [&__addr] { return __atomic_load_n(__addr, __order); }; + auto __pfn = [&__old](const __detail::__platform_wait_t& __val) { + return !__detail::__atomic_eq(__old, __val); + }; + return std::__atomic_wait_address(__addr, __pfn, __vfn, __bare_wait); +#endif } // Wait on __addr while __vfn() == __old is true.