On 25.04.25 at 16:37, Vladimir Makarov wrote:

On 4/19/25 3:29 PM, Denis Chertykov wrote:
Bugfix for PR118591

[...]

It is difficult for me to understand AVR code but I think the reason for the bug is in something else.  And the fix should be different.

Hi Vladimir,

let me try to explain the bug.  It occurs with avr-gcc -Os -mlra for
the following C test case:

__attribute__((noipa))
void func2 (long long a1, long long a2, long b)
{
  static unsigned char count = 0;
  if (b != count++)
    __builtin_abort ();
}

int main (void)
{
  for (long b = 0; b < 5; ++b)
    {
      __asm ("/* some reg pressure */" ::: "r5", "r9");
      func2 (0, 0, b);
    }

  return 0;
}

The bug is in main.  Due to the high register pressure, b lives in the
frame (it is spilled to a frame location).  Since the stack pointer (SP)
cannot be used as a base register for memory accesses (the stack is only
reachable through PUSH / POP), a frame pointer has to be set up.  FP is
reg Y = r29:r28, which is initialized as FP = SP in the prologue:
        in r28,__SP_L__  ;  FP = Y = r29:r28 := SP  *movhi/7
        in r29,__SP_H__

According to the ABI, b has to be passed on the stack, so the code must
read from the frame and push the 4 bytes of b.  The generated code to
read and push b is this (-mlra -Os):
        ldd r24,Y+4      ;  63  [c=4 l=1]  movqi_insn/3
        push r24                 ;  9   [c=4 l=1]  pushqi1/0
        ldd r24,Y+4      ;  64  [c=4 l=1]  movqi_insn/3
        push r24                 ;  11  [c=4 l=1]  pushqi1/0
        ldd r24,Y+4      ;  65  [c=4 l=1]  movqi_insn/3
        push r24                 ;  13  [c=4 l=1]  pushqi1/0
        ldd r24,Y+4      ;  66  [c=4 l=1]  movqi_insn/3
        push r24                 ;  15  [c=4 l=1]  pushqi1/0

So the code is reading 4 times from the *same* location.

LRA misses that PUSH changes SP but not FP.  They are
different registers, and changing SP does not magically
change FP.  Hence the elimination offset between FP and SP
is no longer 0; it grows by one with each pushed byte.
b lives in the frame at Y+1...Y+4.

For reference, here is the code from Reload (-mno-lra -Os):
        ldd r24,Y+4      ;  62  [c=4 l=1]  movqi_insn/3
        push r24                 ;  9   [c=4 l=1]  pushqi1/0
        ldd r25,Y+3      ;  63  [c=4 l=1]  movqi_insn/3
        push r25                 ;  11  [c=4 l=1]  pushqi1/0
        ldd r26,Y+2      ;  64  [c=4 l=1]  movqi_insn/3
        push r26                 ;  13  [c=4 l=1]  pushqi1/0
        ldd r27,Y+1      ;  65  [c=4 l=1]  movqi_insn/3
        push r27                 ;  15  [c=4 l=1]  pushqi1/0
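
To make the difference between the two sequences concrete, here is a
small host-side C sketch.  It is only a toy model, not GCC or AVR code:
the Machine struct, machine_init, push and push_long are hypothetical
names made up for illustration, and the memory size and frame base are
arbitrary.  It assumes b is stored little-endian at FP+1...FP+4 and that
PUSH stores at SP and then decrements SP.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { MEM_SIZE = 64, FRAME_BASE = 32 };   /* arbitrary toy values */

/* Toy machine: byte memory plus SP and FP (hypothetical model). */
typedef struct {
    uint8_t mem[MEM_SIZE];
    int sp;   /* points at the next free byte */
    int fp;   /* models Y; set to SP once, in the prologue */
} Machine;

/* Prologue state: FP = SP, and b stored little-endian at FP+1 .. FP+4. */
void machine_init(Machine *m, uint32_t b)
{
    memset(m->mem, 0, sizeof m->mem);
    m->sp = m->fp = FRAME_BASE;
    for (int i = 0; i < 4; i++)
        m->mem[m->fp + 1 + i] = (uint8_t)(b >> (8 * i));
}

/* PUSH: store at SP, then decrement SP.  FP is not touched. */
void push(Machine *m, uint8_t value)
{
    assert(m->sp >= 0 && m->sp < MEM_SIZE);
    m->mem[m->sp] = value;
    m->sp--;
}

/* Run four "ldd rN,Y+off ; push rN" pairs and return the 32-bit value
   left on the stack at SP+1 .. SP+4 afterwards. */
uint32_t push_long(Machine *m, const int fp_offsets[4])
{
    for (int i = 0; i < 4; i++)
        push(m, m->mem[m->fp + fp_offsets[i]]);

    uint32_t seen = 0;
    for (int i = 3; i >= 0; i--)
        seen = (seen << 8) | m->mem[m->sp + 1 + i];
    return seen;
}
```

In this model, push_long with the Reload offsets {4, 3, 2, 1} leaves b
itself on the stack, whereas the LRA offsets {4, 4, 4, 4} leave four
copies of the high byte of b.  For b = 1 that is 0, which would be
consistent with the test case aborting in the second loop iteration.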

It seems that lra-eliminations.cc is missing some
setup_can_eliminate (*, false), or does some incorrect
setup_can_eliminate (*, true).

As far as I know, this is the last bug that occurs with AVR+LRA.
When it is fixed, I think we can pull the LRA switch for AVR.

Johann
