Re: [RFC] LRA: Carrying hard registers around

Michael Matz via Gcc Wed, 24 Jun 2026 09:37:50 -0700

Hello,

On Wed, 24 Jun 2026, Stefan Schulze Frielinghaus wrote:


> > > Live range splitting for hard regs which are assigned to some pseudo is
> > > already implemented.  However, live ranges of hard regs which are live
> > > due to hard register assignments
> > 
> > Define "hard register assignments".  Because ...
> 
> Basically any set insn where the left-hand side is a hard register.
> Such insns may stem from register asm (assuming that
> -fstrict-extended-asm [1] isn't used) during expand but could also be
> directly emitted by a target.
> 
> We already forbid certain forms of optimizations in e.g. cselib/combine 
> which could lead to those scenarios.  So instead of fighting 
> optimization passes I was rather hoping for a general solution.

The general solution is to not let values stay in hardregs before register 
allocation.  As I was trying to allude to here:

> > This example is similar but supported:
> > 
> >   int x;
> >   asm("" : "=r5" (x));         // insn 1 'r5 = ...something'
> >   code;
> >   asm("use %0" : : "r5" (x));  // insn 9 'use r5'
> > 
> > here the output/input are r5, as in your situation, but the actual value 
> > from 1 to 9 will be carried through the local variable 'x': a pseudo or 
> > memory:
> > 
> >   1:  r5=...
> >   1': r102=r5                  // from inline-asm output setup
> >   2:  r101=exp(r100)
> >   ...
> >   9': r5=r102                  // from inline-asm input setup
> >   9:  // some user of r5
> > 
> > The value doesn't (or shouldn't) flow through r5, but through some 
> > pseudo/memory.  It's the reg-allocator's duty then to not assign it to r5 
> > itself, on the grounds that it would conflict with r100, which is 
> > constrained to r5 itself, or if it does, to generate the appropriate 
> > compensation code.

Making 'x' a register var shouldn't influence this: that's the whole 
reason why we explicitly document the valid uses of register vars: so that 
they can live in pseudos (or eventually, after reg-alloc, in hardregs 
other than the specified ones) outside of the inline asm instructions 
referring to them.

> > So, make an example to show us?
> 
> See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125780 where we are
> given
> 
> uint32_t xdiv (uint32_t a, uint32_t b)
> {
>     register int r18 __asm("18") = 0x1122;
>     __asm volatile ("seh ; %0" : "+r" (r18), "+r" (a));
>     uint32_t c = a / b;    // r18 is live here
>     __asm volatile ("clh ; %0" : "+r" (r18), "+r" (c));
>     return c;
> }

I see, thanks.

> Due to register asm, hard register r18 is live across the division.

And that's the bogosity.  The register variable r18 can live in a pseudo 
during that division, and _should_ live in a pseudo outside of the 
specific setup/teardown sequences of the two inline asms.
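
To make the difference concrete, here is a toy model in plain C.  It is 
only a sketch and has nothing to do with GCC internals: the register file 
'hard[]', the helper 'div_using_r18' and the two 'carry_in_*' functions 
are made-up names, and the division is simply assumed to be pinned to r18 
and to overwrite it.

```c
#include <assert.h>
#include <stdint.h>

/* Toy register file; purely illustrative, not how GCC represents regs. */
enum { NUM_HARD_REGS = 32, R18 = 18 };
static uint32_t hard[NUM_HARD_REGS];

/* Hypothetical division insn: its operand is constrained to r18, so it
   loads the dividend into r18 and leaves the quotient there. */
static uint32_t div_using_r18(uint32_t a, uint32_t b)
{
    assert(b != 0);
    hard[R18] = a;
    hard[R18] = hard[R18] / b;
    return hard[R18];
}

/* The value stays in hard register r18 across the division: the
   division overwrites it, so the second asm sees the wrong value. */
static uint32_t carry_in_hardreg(uint32_t a, uint32_t b, uint32_t *c)
{
    hard[R18] = 0x1122;            /* first asm leaves its value in r18 */
    *c = div_using_r18(a, b);      /* clobbers r18 */
    return hard[R18];              /* what the second asm would read */
}

/* The value is copied to a pseudo right after the first asm and copied
   back right before the second one, so the division cannot disturb it. */
static uint32_t carry_in_pseudo(uint32_t a, uint32_t b, uint32_t *c)
{
    hard[R18] = 0x1122;            /* first asm leaves its value in r18 */
    uint32_t pseudo = hard[R18];   /* output setup copy */
    *c = div_using_r18(a, b);      /* free to use r18 */
    hard[R18] = pseudo;            /* input setup copy */
    return hard[R18];              /* what the second asm would read */
}
```

With a = 100 and b = 7, 'carry_in_pseudo' still hands 0x1122 to the second 
asm, while 'carry_in_hardreg' hands it the quotient.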

In normal situations the register allocator will then do its best to 
allocate that pseudo to r18 again (and hence remove the copy insns).  In 
_this_ situation it won't (the pseudo conflicts with r18), so the copies 
remain and everything works out automatically, leading to exactly the 
situation that you'd like to create after the fact by extending live 
range splitting.
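
The decision described above can be sketched as a simple overlap test.  
Again this is a toy in plain C and not the actual IRA/LRA logic: 'struct 
range', 'ranges_overlap' and 'can_coalesce' are hypothetical names, and 
live ranges are reduced to inclusive intervals of insn numbers.

```c
#include <stdbool.h>

/* Hypothetical live range, as an inclusive interval of insn numbers. */
struct range { int first, last; };

static bool ranges_overlap(struct range a, struct range b)
{
    return a.first <= b.last && b.first <= a.last;
}

/* Returns true if the pseudo may be given the hard register, in which
   case the copies to and from it become no-ops and can be deleted.
   Returns false if any other use of the hard register falls inside the
   pseudo's live range; then the pseudo gets a different location and
   the copies have to stay. */
static bool can_coalesce(struct range pseudo,
                         const struct range *hardreg_uses, int n)
{
    for (int i = 0; i < n; i++)
        if (ranges_overlap(pseudo, hardreg_uses[i]))
            return false;
    return true;
}
```

For a pseudo live over insns 2..8, a division at insn 5 that needs the 
same hard register makes 'can_coalesce' return false (the copies remain), 
whereas with no such use in between it returns true (the copies go away).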

> The insn implementing the division makes use of a single-register 
> constraint referring to r18 which means we finally ICE during RA which 
> is pretty annoying.  I'm not willing to declare this a user error 
> especially in light of the clear documentation of local register asm and 
> where they should have an effect.

Yes, that's not a user error, it's exactly a valid use of register vars.  
But something during codegen goes wrong when the variable r18 lives in 
hardreg 18 between the two inline asms before register allocation.  That 
wrongness makes life unnecessarily hard for regalloc.  You're trying to 
rectify the situation; I say it'd be better if the situation didn't 
occur to begin with.

> Again, I would like to stress that this is not restricted to register 
> asm only but we may also run into similar scenarios if targets emit hard 
> register assignments directly.

That would always be a bit suspect outside very controlled situations.  If 
pattern A emits a specific hardreg setup, pattern B uses that specific 
hardreg (assuming it still contains the value that pattern A wrote; why 
would one think that's a valid assumption?), and pattern C also 
uses/clobbers that specific hardreg, then there are no guarantees that 
this will work at all.


Ciao,
Michael.
