On Tuesday, 26 May 2026 09:29:29 Pacific Daylight Time Thiago Macieira wrote:
> BTW, my prompt to Claude was:
> 
>  I have a test bug that looks like a compiler problem. It could be a source
> problem, but I don't think so.
>  When running @tests/auto/corelib/tools/qarraydata/tst_qarraydata.cpp test
> arrayDataOps, I get the following output:
> ...
>  The compiled assembly is @tst_qarraydata.cpp.s and the object file is
> @tst_qarraydata.cpp.o
> 
> So I did tell it that the problem could be a compiler bug. That's probably
> confirmation bias, because I had concluded the same.
> 
> But in any case, it found the exact same piece of assembly I had and
> concluded the exact same thing: the cached size wasn't updated.
> 
> What we both got wrong is the attribution: the issue wasn't a compiler bug,
> but a source code bug that caused UB.

I've resumed the session and told it to look at operator* and operator->

This is what it replied after thinking a bit:
====
● Read qarraydatapointer.h
  │ src/corelib/tools/qarraydatapointer.h
  └ L103:121 (19 lines read)

● This changes the conclusion significantly. The real bug is here.

  operator->() does static_cast<DataOps *>(this) — a downcast from 
QArrayDataPointer<T>* to the derived DataOps* (QArrayDataOps<T>). This is only 
valid if the object actually is a DataOps. In SimpleVector, d is stored as a 
plain QArrayDataPointer<T> — not DataOps. So the downcast is undefined 
behaviour.

  GCC 15.2.1 exploits this UB through TBAA (type-based alias analysis):

   - The callee copyAppend receives this typed as DataOps* (derived)
   - The caller's stack variable is typed as QArrayDataPointer<T> (base)
   - GCC's TBAA concludes: writes through a DataOps* member cannot alias the 
base-typed stack variable's size field
   - Therefore: no reload needed after the call — stale 0 is reused

  Revised conclusion

  This is a source bug, not a compiler bug. GCC 15.2.1 is correct to exploit 
the UB; it just does so more aggressively than earlier versions.

  The fix is in SimpleVector — it should store a DataOps (i.e. 
QArrayDataOps<T>) rather than a bare QArrayDataPointer<T>, matching how 
production Qt containers use this pattern. In QList and friends the stored 
member IS of the derived type, so operator->() is valid there. SimpleVector is 
the odd one out.
====
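
For readers who don't have qarraydatapointer.h open, here is a minimal sketch 
of the pattern being discussed. ArrayPointer and ArrayOps are hypothetical 
stand-ins for QArrayDataPointer<T> and QArrayDataOps<T>, not the actual Qt 
declarations, and the capacity member is only there to keep the example safe. 
It shows the one case where the downcast is well-defined (the object really is 
of the derived type) and leaves the invalid case as a comment.

```cpp
#include <cassert>
#include <cstddef>

template <typename T> struct ArrayOps;  // hypothetical stand-in for QArrayDataOps<T>

// Hypothetical stand-in for QArrayDataPointer<T>.
template <typename T>
struct ArrayPointer {
    T *ptr = nullptr;
    std::size_t size = 0;
    std::size_t capacity = 0;

    // Base-to-derived downcast. Well-defined only if *this is the base
    // subobject of a real ArrayOps<T>; otherwise the cast itself is UB.
    ArrayOps<T> *operator->() { return static_cast<ArrayOps<T> *>(this); }
};

template <typename T>
struct ArrayOps : ArrayPointer<T> {
    // Copies [b, e) to the end of the buffer and updates size.
    void copyAppend(const T *b, const T *e) {
        for (; b != e; ++b) {
            assert(this->size < this->capacity);
            this->ptr[this->size++] = *b;
        }
    }
};

// Valid use: the object's type is ArrayOps<int>, so the downcast done by
// operator->() refers to an object that actually exists.
inline std::size_t appendThroughRealDerived() {
    int storage[4] = {};
    const int src[2] = {1, 2};
    ArrayOps<int> ops;
    ops.ptr = storage;
    ops.capacity = 4;
    ArrayPointer<int> &base = ops;
    base->copyAppend(src, src + 2);
    return base.size;
}

// Invalid use, deliberately not compiled:
//   ArrayPointer<int> d;           // the object is only the base type
//   d->copyAppend(src, src + 2);   // UB: there is no ArrayOps<int> object
//   read(d.size);                  // an optimizer may reuse a stale value
```

The commented-out shape is the one the test hits: a stack variable of the base 
type, a call through the derived type, then a read of size in the caller.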

The proposed fix would have worked, but would be undesirable. Its understanding 
of the production containers is wrong. So when I asked it where it was seeing 
the derived type, it changed its mind.

====

● QList stores DataPointer d — i.e. QArrayDataPointer<T>, exactly the same as 
SimpleVector. There is no QArrayDataOps<T> object anywhere.

  My previous revision was wrong. QList::d is QArrayDataPointer<T>, not 
DataOps. The same downcast UB exists in QList too — it's a pervasive design 
pattern throughout Qt.

  This means the "source bug" framing is also incorrect. The operator*/
operator-> downcast is intentional and universal. GCC has tolerated it for 
years.

  Conclusion: back to compiler bug

  The downcast pattern is deliberate and identical in both QList and 
SimpleVector. If it were truly invalid UB that GCC was correctly exploiting, 
QList would be broken too. The fact that only this specific inlined call site 
in the test misfires — where GCC can see the allocation of detached.d on the 
stack and caches size = 0 across the copyAppend call — points to a GCC 15 
regression in load elimination / alias analysis for this particular code 
shape, not a principled TBAA exploit.

====

This shows it does *not* fully understand the consequences of UB. I expect 
it's the same for most human developers. So when I asked it about UB 
manifesting differently from one call site to another, it came to the correct 
conclusion:

====

Revised conclusion (final)

The root cause is UB in the source: operator->() performs an invalid downcast 
from QArrayDataPointer<T>* to DataOps* when the object is not actually a 
DataOps. GCC 15.2.1's optimizer exploits this via TBAA in the specific inlined 
call shape present in the test, but not (yet) in QList's call sites. The 
behaviour difference between the two is entirely consistent with how UB works — 
it can be latent and context-sensitive.

The fix belongs in qarraydatapointer.h: operator->() should not perform a 
downcast at all. The DataOps methods should be accessible without pretending 
the object is of a derived type it isn't.

====
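
To make "accessible without pretending the object is of a derived type" 
concrete, here is one possible shape, as a sketch only. The thread does not 
say what the eventual Qt change looks like, so this is not it. DataPointerSketch 
and DataOpsSketch are hypothetical names: the operations become static 
functions that take the pointer object by reference, so no inheritance and no 
cast are involved.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for QArrayDataPointer<T>; holds the state only.
template <typename T>
struct DataPointerSketch {
    T *ptr = nullptr;
    std::size_t size = 0;
    std::size_t capacity = 0;
};

// Hypothetical operations class. It is not derived from the pointer type;
// every operation receives the object it modifies as an explicit argument.
template <typename T>
struct DataOpsSketch {
    static void copyAppend(DataPointerSketch<T> &d, const T *b, const T *e) {
        for (; b != e; ++b) {
            assert(d.size < d.capacity);
            d.ptr[d.size++] = *b;
        }
    }
};

// The write to d.size goes through an lvalue of the object's real type,
// so the caller's later read of d.size is of the same type as the write.
inline std::size_t appendWithoutDowncast() {
    int storage[4] = {};
    const int src[3] = {7, 8, 9};
    DataPointerSketch<int> d;
    d.ptr = storage;
    d.capacity = 4;
    DataOpsSketch<int>::copyAppend(d, src, src + 3);
    return d.size;
}
```

The cost of this shape is at the call sites, which lose the d->copyAppend(...) 
spelling. Whether that trade-off is acceptable for Qt is a separate question.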

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Principal Engineer - Intel DCG - Platform & Sys. Eng.
