https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125436

--- Comment #11 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Kevin Puetz from comment #9)
> Created attachment 64546 [details]
> g++ -g -shared -fPIC -O1 foo.cpp -o libfoo.so (demonstrate corruption in gcc
> 16.1.0)
> 
> Ok, that was the easy way but I was afraid you'd say the asm-reg stuff was
> sketchy. I did not have hard registers assigned like that in the original
> case, I just had a lot of inlining and enough register complexity that it
> eventually ended up using esi for something. But I lost that pretty early in
> the attempts to minimize it, because esi is a long way down the list of
> preferred registers. I assume that's since it's callee-preserved, so the
> allocator would rather use something volatile it can just clobber. But after
> enough tinkering I came up with a structure to synthetically create a lot of
> simultaneously-live registers without it actually being complicated, and
> where it mostly doesn't matter which variable(s) get clobbered.
> 
> This one works with the same main.c, but its output is the opposite. I load
> many copies of magic_number, xor-ing them all together so the result depends
> on all the variables (making them all live). Since they are loaded from a
> volatile source, the optimizer can't actually assume they are all the same,
> and must store each one, creating enough pressure that it eventually
> allocates one of them (`h`) into esi.
> 
> But we know they are all the same, so the correct answer is "0". An even
> number of copies of the same value, xor'ed together, should cancel out. But
> if/when esi gets clobbered with nullptr (the initial value of dtv in the
> first call to __tls_get_addr_slow), that incorrectly zeros `h`, leaving an
> odd number of intact registers, so the "incorrect" result prints
> `ms_abi = 123456768`: the value shows through because clobbering `h` leaves
> the cancellation incomplete.
> 
> I also added a pair of `double` values, which end up in xmm6 and 7, and
> cancel those out using subtraction. I don't actually see those get
> corrupted, but AFAIK there's no reason __tls_get_addr wouldn't be allowed
> to; xmm6/7 are volatile in sysv_abi (but nonvolatile in ms_abi).
> 
> The generated asm shows that neither ms_tls_access nor ms_foo have done
> anything to preserve them before calling the sysv_abi function
> __tls_get_addr, so the risk is still there, unless there's a special promise
> made by glibc somewhere that __tls_get_addr will never use xmm* (it could,
> e.g., call a malloc that uses an SSE memset).
> 
> If anything in __tls_get_addr were to touch xmm6-15, that would make
> ms_tls_access break its claimed calling convention by trashing registers
> that are supposed to be callee-preserved.

Please try users/hjl/pr124798/master branch at

https://gitlab.com/x86-gcc/gcc/-/commits/users/hjl/pr124798/master

commit 0ab019696bcb8503e290172c95ed8049e90a25e7
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Apr 14 18:37:20 2026 +0800

    x86: Implement TARGET_FNTYPE_ABI

fixes your test.
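
For readers following the thread, the xor-cancellation structure described in the quoted comment can be sketched roughly as below. This is not the attachment: the function names, the number of copies (eight), and the value of magic_number are assumptions made for illustration, and the call to __tls_get_addr is only indicated by a comment. It shows why the correct result is 0 and why zeroing exactly one copy lets the value show through.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the volatile source in the report; the real
// value lives in the attachment and is not reproduced here.
volatile std::uint32_t magic_number = 0x12345678u;

// Load eight copies through the volatile object so the compiler cannot
// assume they are equal, then xor them so every copy stays live to the end.
std::uint32_t xor_all_copies() {
    std::uint32_t a = magic_number, b = magic_number;
    std::uint32_t c = magic_number, d = magic_number;
    std::uint32_t e = magic_number, f = magic_number;
    std::uint32_t g = magic_number, h = magic_number;
    // In the real test, a thread_local access (and so a call to
    // __tls_get_addr) would sit here, while a..h are all still live.
    return a ^ b ^ c ^ d ^ e ^ f ^ g ^ h;
}

// Pure model of the failure: one of the eight copies has been zeroed,
// as if the register holding it had been clobbered.
std::uint32_t xor_with_one_clobbered(std::uint32_t v) {
    std::uint32_t acc = 0;
    for (int i = 0; i < 8; ++i) {
        acc ^= (i == 7) ? 0u : v;  // the last copy plays the role of `h`
    }
    return acc;
}

// An even number of equal copies cancels; seven intact copies do not.
void self_check() {
    assert(xor_all_copies() == 0u);
    assert(xor_with_one_clobbered(0x12345678u) == 0x12345678u);
}
```

Calling self_check() from a small main() is enough to exercise both cases. The sketch only models the arithmetic; it does not by itself force any copy into esi or reproduce the miscompilation.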
