Hi,

On 2025-06-10 17:28:11 +0300, Konstantin Knizhnik wrote:
> On 09/06/2025 2:05 am, Thomas Munro wrote:
> > On Sat, Jun 7, 2025 at 6:47 AM Andres Freund <and...@anarazel.de> wrote:
> > > On 2025-06-06 14:03:12 +0300, Konstantin Knizhnik wrote:
> > > > There is really essential difference in code generated by clang 15
> > > > (working) and 16 (not working).
> > > There also are code gen differences between upstream clang 17 and apple's
> > > clang, which is based on llvm 17 as well (I've updated the toolchain, it
> > > repros with that as well).
> > Just for the record, Apple clang 17 (self-reported clobbered version)
> > is said to be based on LLVM 19[1].  For a long time it was off by one
> > but tada now it's apparently two.  Might be relevant if people are
> > comparing generated code up that close....

You've got to be kidding me. Because the world otherwise would be too easy, I
guess.

> > . o O (I wonder if one could corroborate that by running "strings" on
> > upstream clang binaries (as compiled by MacPorts/whatever) for each
> > major version and finding new strings, ie strings that don't appear in
> > earlier major versions, and then seeing which ones are present in
> > Apple's clang binaries...  What a silly problem.)
> > 
> > [1] https://en.wikipedia.org/wiki/Xcode#Xcode_15.0_-_16.x_(since_visionOS_support)
> 
> 
> Some updates: I was able to reproduce the problem at my Mac with old clang
> (15.0) but only with disabled optimization (CFLAGS=-O0).
> So very unlikely it is bug in compiler.

I was able to reproduce it with gcc, too.


> Why it is better reproduced in debug build? May be because of timing.

Code-gen wise the biggest change I see is that there is more stack spilling
due to assertion related code...


> Or may be because without optimization compiler is doing stupid things:
> loads all three bitfields from memory to register (one half word+one byte),
> then does some manipulations with this register and writes it back to
> memory. Can register somehow be clobbered between read and write (for
> example by signal handler)? Very unlikely...
> So still do not have any good hypothesis.
> 
> But with bitfields replaced with uint8 the bug is not reproduced any more.
> May be just do this change (which seems to be good thing in any case)?

I've reproduced it without that bitfield, unfortunately :(.
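
For anyone following along, the load / modify / store sequence described
above for adjacent bitfields can be sketched roughly as below. This is only
an illustration: the three 8-bit fields, their positions in a 32-bit word and
all function names are made up for the example, not taken from the PostgreSQL
source or from actual compiler output. It also is not a claim about the cause
here, given that the problem reproduces without the bitfield.

```c
#include <stdint.h>

/* Hypothetical layout: three 8-bit fields packed into one 32-bit word.
 * Field index 0 occupies the low byte, index 2 the third byte. */
#define FIELD_BITS 8u
#define FIELD_MASK 0xFFu

/* Roughly what unoptimized code does for "s.field = value" when the field
 * shares a storage unit with its neighbours: load the whole word, clear the
 * target bits, merge the new value, write the whole word back. */
static uint32_t
set_field_rmw(uint32_t word, unsigned idx, uint8_t value)
{
    unsigned shift = idx * FIELD_BITS;
    uint32_t loaded = word;                        /* load whole word */

    loaded &= ~((uint32_t) FIELD_MASK << shift);   /* clear target field */
    loaded |= (uint32_t) value << shift;           /* merge new value */
    return loaded;                                 /* caller stores it back */
}

/* Extract one field from the packed word. */
static uint8_t
get_field(uint32_t word, unsigned idx)
{
    return (uint8_t) ((word >> (idx * FIELD_BITS)) & FIELD_MASK);
}

/* Single-threaded simulation of the hazard: writer A loads the word, writer
 * B updates field 0, then writer A stores its result computed from the stale
 * copy. B's update to field 0 is lost even though A only meant to touch
 * field 1. */
static uint32_t
lost_update_demo(void)
{
    uint32_t shared = 0;
    uint32_t stale = shared;                /* writer A: load */

    shared = set_field_rmw(shared, 0, 7);   /* writer B: field 0 = 7 */
    shared = set_field_rmw(stale, 1, 9);    /* writer A: store from stale copy */
    return shared;
}
```

With separate uint8 members a store to one member does not have to rewrite
its neighbours, which is why swapping out the bitfield looked like a
plausible experiment.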


Unfortunately my current set of debugging output seems to have prevented the
issue from re-occurring. Need to pare it down to re-trigger. But for me it
only reproduces relatively rarely, so paring down the debug output is a rather
slow process :(


This is really a peculiar issue. I've now run tens of thousands of non-macOS
iterations without triggering this or a related issue even once. The one piece
of good news is that the regression tests currently are remarkably stable; I
think in the past I could hardly have run that many iterations without
(independent) failures.

Greetings,

Andres Freund

