nonlocal_goto_* tests fail remat with PIE

Alexandre Oliva Sat, 13 Sep 2025 00:30:34 -0700

gcc.target/aarch64/sme/nonlocal_goto_[123].c fail on aarch64 targets
configured with --enable-default-pie.


The tests set up a trampoline, call the instruction cache sync function
with its address, and then pass that same stack slot address to a
function.  They expect 'add' insns to compute the address for that last
call, but when PIE is enabled we get an 'ldr' instead.
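
For readers unfamiliar with the construct, here is a minimal GNU C
sketch of the kind of code involved.  It is not copied from the
testsuite: the names foo, bar and run are illustrative assumptions, and
the real tests additionally check the generated assembly.  Taking the
address of the nested function makes GCC build a trampoline on the
enclosing function's stack, and the goto out of the nested function is
the nonlocal goto.

```c
/* GNU C sketch: nested functions and __label__ are GCC extensions,
   not ISO C.  All names are hypothetical, chosen for illustration.  */

/* Stands in for an external function that receives the callback.
   noinline keeps the indirect call, so the trampoline is needed.  */
static void __attribute__ ((noinline))
run (void (*callback) (void))
{
  callback ();
}

int
foo (int *ptr)
{
  __label__ failure;

  /* Nested function: it uses foo's 'ptr' and jumps to foo's label.
     Passing its address to run() requires a trampoline that lives
     in foo's stack frame.  */
  void bar (void)
  {
    *ptr += 1;
    goto failure;	/* Nonlocal goto back into foo.  */
  }

  run (bar);
  return 1;		/* Not reached: bar always jumps away.  */

failure:
  return 0;
}
```

Calling foo on an int increments it once and returns 0, because bar
leaves through the label instead of returning to run.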

That's because we fail to rematerialize the address of the stack slot,
so we allocate another pseudo to hold the address across the cache sync
call, and then restore it, as it happens, from memory.  I suspect
that's a register allocation bug.  At least part of it seems to be
related to setup_reg_equiv dropping function_invariant equivalences
when flag_pic is nonzero.  The upcoming patch for that is not enough
for rematerialization to succeed in lra, but it is enough to avoid the
allocation of yet another pseudo in ira.

Anyhow, I am going to propose two different versions of workarounds for
the testcases, one (1v1) that tolerates the missed optimization, and
another (1v2) that forces PIE off to avoid running into the problem.
The second patch makes ira retain the equivalence but, as mentioned
above, it is *not* enough for us to output the expected code, for
reasons I have yet to investigate.

-- 
Alexandre Oliva, happy hacker            https://blog.lx.oliva.nom.br/
Free Software Activist     FSFLA co-founder     GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity.
Excluding neuro-others for not behaving "normal" is *not* inclusive!

[PATCH 0/2] [aarch64] sme/nonlocal_goto_* tests fail remat with PIE
