On Monday, 1 March 2021 10:12:35 PST Ville Voutilainen wrote:
> I do have a question about the intent/concern here, regardless of what
> your patch technically
> does. The ABI break _is_ your concern, and the "heisenbugs" you were
> worried about would
> in fact be caused by the ABI break? So if you embed these things in
> your QAtomicThing, the ABI
> break may mess it up(*)? Is that a correct understanding of the concern
> here?

Let me clarify. I am stating that the current implementation is inefficient 
because Linux currently only implements 32-bit futexes. With that in mind, I 
am arguing we need to change the implementation in a way that will break ABI 
in a new release.

The concern is that such a change, despite being in experimental code for 
which ABI stability has never been promised, is still troublesome. The problem 
there is that even people who have read the documentation and waited until 
2024 to write code using the feature may still be affected. This isn't about 
Qt because I have no plans to use wait and notify in inline API.

But you can see someone doing:

#if __cplusplus >= 202002L && __has_include(<latch>)
#  include <latch>
#else
#  error "Please upgrade your compiler & standard library"
#endif

and using <latch> in their inline code. And as you say, if they then mix DSOs 
compiled with different versions of GCC, one of them might be the old, 
experimental and binary-incompatible code. Remember that before April 2024, 
the most recent Ubuntu LTS will be 22.04, which will be released before GCC 
12, which means it will contain GCC 11.

So, wholly aside from how we fix the inefficiencies, we must decide how to 
deal with the ABI break. We can:

a) not have an ABI break in the first place, by having the above code not 
   compile with GCC until we promise ABI compatibility
b) cause a non-silent ABI break
c) accept the silent ABI break

I advocate for (a) and vehemently disagree with (c). Meanwhile, (b) can be 
something like (untested!):

 _GLIBCXX_BEGIN_NAMESPACE_VERSION
   namespace __detail
   {
+    struct __waiters;
+    inline namespace __experimental
+    {
+      // from atomic_wait.cc
+      __waiters &__get_waiter(const void *);
+    }
     using __platform_wait_t = int;
 
     constexpr auto __atomic_spin_count_1 = 16;
@@ -187,11 +193,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
       static __waiters&
       _S_for(const void* __t)
       {
-       const unsigned char __mask = 0xf;
-       static __waiters __w[__mask + 1];
-
-       auto __key = _Hash_impl::hash(__t) & __mask;
-       return __w[__key];
+       return __get_waiter(__t);
       }
     };

That way, any DSO compiled using GCC 11 will fail to load when using GCC 12's 
libstdc++. And probably put "std::__detail::__experimental" in the 
GLIBCXX_EXPERIMENTAL ELF version so it's even clearer that it's experimental, 
thus helping Linux distributions and other binary artefact distributors 
realise they've made a mistake.

I still don't like (b) because it pushes the problem to the wrong people: the 
packagers. But it's far better than (c).

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel DPG Cloud Engineering



Reply via email to