Re: LRA for avr: help with FP and elimination
On 8/10/23 07:33, senthilkumar.selva...@microchip.com wrote:
> Hi Vlad,
>
> I can confirm your commit
> (https://gcc.gnu.org/git?p=gcc.git;a=commit;h=2971ff7b1d564ac04b537d907c70e6093af70832)
> fixes the above problem, thank you. However, I see execution failures
> if a pseudo assigned to FP has to be spilled because of stack slot
> creation.
>
> To reproduce, build the compiler just like above, and then do
>
> $ avr-gcc -mmcu=avr51 /gcc/testsuite/gcc.c-torture/execute/20050224-1.c -O2 -S -fdump-rtl-all
>
> The execution failure occurs at this point
>
>     movw r24,r2
>     sbiw r24,36
>     brne .L8
>
> r2 is never set anywhere at all in the assembly. The relevant insns
> (in the IRA dump) are
>
> (insn 3 15 4 3 (set (reg/v:HI 51 [ j ])
>         (const_int 0 [0])) "gcc/gcc/testsuite/gcc.c-torture/execute/20050224-1.c":19:21 101 {*movhi_split}
>      (expr_list:REG_EQUAL (const_int 0 [0])
>         (nil)))
> ...
> (insn 28 27 67 8 (parallel [
>             (set (reg/v:HI 51 [ j ])
>                 (plus:HI (reg/v:HI 51 [ j ])
>                     (const_int 1 [0x1])))
>             (clobber (scratch:QI))
>         ]) "/home/i41766/code/personal/gcc/gcc/testsuite/gcc.c-torture/execute/20050224-1.c":28:8 175 {addhi3_clobber}
>      (nil))
> ...
> (jump_insn 44 43 45 13 (parallel [
>             (set (pc)
>                 (if_then_else (ne (reg/v:HI 51 [ j ])
>                         (const_int 36 [0x24]))
>                     (label_ref:HI 103)
>                     (pc)))
>             (clobber (scratch:QI))
>         ]) "/home/i41766/code/personal/gcc/gcc/testsuite/gcc.c-torture/execute/20050224-1.c":11:16 discrim 1 713 {cbranchhi4_insn}
>      (expr_list:REG_DEAD (reg/v:HI 51 [ j ])
>         (int_list:REG_BR_PROB 7 (nil)))
>  -> 103)
>
> LRA deletes insns 3 and 28, and uses r2 in the jump_insn.
>
> In the reload dump, for pseudo r51, I'm seeing this
>
> subreg regs:
>   Frame pointer can not be eliminated anymore
>   Spilling non-eliminable hard regs: 28 29
> Spilling r51(28)
>   Slot 0 regnos (width = 0):46
>   Slot 1 regnos (width = 0):45
>
> lra_update_fp2sp_elimination calls spill_pseudos with
> HARD_FRAME_POINTER_REGNUM, and that sets reg_renumber[51] to -1.
>
> Later down the line, process_bb_lives is called with dead_insn_p=true
> from lra_create_lives_ranges_1 on the relevant BB (#8), and
> df_get_live_out on that BB does not contain 51 (even though previous
> calls to the same BB did).
>
> Breakpoint 8, process_bb_lives (bb=0x7fffea570240, curr_point=@0x7fffd838: 25, dead_insn_p=true) at gcc/gcc/lra-lives.cc:664
> 664   function_abi last_call_abi = default_function_abi;
> (gdb) n
> 666   reg_live_out = df_get_live_out (bb);
> (gdb)
> 667   sparseset_clear (pseudos_live);
> (gdb) p debug_bitmap(reg_live_out)
>
> first = 0x321c128 current = 0x321c128 indx = 0
>     0x321c128 next = (nil) prev = (nil) indx = 0
>         bits = { 28 32 34 43 44 47 48 49 50 }
>
> process_bb_lives then considers the insn setting 51 (and the reload
> insns LRA created) as dead, and removes them.
>
> BB 8
>    Insn 67: point = 31, n_alt = -1
>    Insn 114: point = 31, n_alt = 3
>    Deleting dead insn 114
> deleting insn with uid = 114.
>    Insn 28: point = 31, n_alt = 1
>    Deleting dead insn 28
> deleting insn with uid = 28.
>    Insn 113: point = 31, n_alt = 2
>    Deleting dead insn 113
>
> Same for insn 3 as well
>
> BB 3
>    Insn 92: point = 40, n_alt = -1
>    Insn 5: point = 40, n_alt = 1
>    Insn 4: point = 41, n_alt = 3
>    Insn 3: point = 42, n_alt = 3
>    Deleting dead insn 3
> deleting insn with uid = 3.
>
> Yet when it prints "Global pseudo live data has been updated" after
> all this, r51 is live again :(
>
> BB 8:
>     livein: 8: 43 44 47 48 49 50 51
>     liveout: 8: 28 32 34 43 44 47 48 49 50 51
>
> Eventually, it assigns 2 to r51, resulting in just the compare and
> branch instruction remaining in the assembly.
>
> Is this an LRA bug or is the target doing something wrong?

I've reproduced this. It is probably a bug in the live info update when fp->sp elimination becomes invalid. I'll start working on this problem next week and hope to have a fix soon after that.
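The failure mode described above can be modeled in a few lines of C. This is only a minimal sketch under stated assumptions, not the code in lra-lives.cc: `toy_insn`, `toy_process_bb`, `REG_BIT` and the 64-bit register mask are hypothetical names invented for illustration. The function walks a block backwards from its live-out set and, in the way the thread describes process_bb_lives behaving when dead_insn_p is true, drops an insn whose destination register is not live at that point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model: one bit per register number (0..63).  */
#define REG_BIT(r) (UINT64_C (1) << (r))

/* Hypothetical toy instruction, not an RTL insn.  */
struct toy_insn
{
  int def;              /* register written, or -1 if none */
  uint64_t uses;        /* mask of registers read */
  int has_side_effect;  /* e.g. a jump; never deleted */
  int deleted;          /* set to 1 when the walk removes the insn */
};

/* Walk the block backwards starting from LIVE_OUT.  When DEAD_INSN_P is
   nonzero, mark as deleted every side-effect-free insn whose destination
   is not live.  Returns the computed live-in mask.  */
uint64_t
toy_process_bb (struct toy_insn *insns, size_t n, uint64_t live_out,
                int dead_insn_p)
{
  assert (insns != NULL || n == 0);
  uint64_t live = live_out;

  for (size_t i = n; i-- > 0; )
    {
      struct toy_insn *in = &insns[i];
      int def_live = in->def >= 0 && (live & REG_BIT (in->def)) != 0;

      if (dead_insn_p && in->def >= 0 && !def_live && !in->has_side_effect)
        {
          /* A deleted insn contributes neither its def nor its uses.  */
          in->deleted = 1;
          continue;
        }
      if (in->def >= 0)
        live &= ~REG_BIT (in->def);
      live |= in->uses;
    }
  return live;
}
```

With a one-insn block holding the equivalent of `r51 = r51 + 1`, passing a live-out mask that contains bit 51 keeps the insn, while a mask without bit 51 marks it deleted. That mirrors the "Deleting dead insn 28" line in the dump, and shows why a stale live-out set is enough to lose every definition of r51.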
Re: LRA for avr: help with FP and elimination
On Tue, 2023-07-18 at 11:04 -0400, Vladimir Makarov wrote:
> On 7/17/23 03:17, senthilkumar.selva...@microchip.com wrote:
> > On Fri, 2023-07-14 at 09:29 -0400, Vladimir Makarov wrote:
> > > If you send me the preprocessed test, I could start to work on it to fix
> > > the problems. I think it is hard to fix them right for a person having
> > > little experience with LRA.
> >
> > Ok, this is a reduced test case that reproduces the failure.
> >
> > $ cat case.c
> > typedef int HItype __attribute__ ((mode (HI)));
> > HItype
> > __mulvhi3 (HItype a, HItype b)
> > {
> >   HItype w;
> >
> >   if (__builtin_mul_overflow (a, b, &w))
> >     __builtin_trap ();
> >
> >   return w;
> > }
> >
> > On latest master, this trivial patch turns on LRA for avr
> > --- gcc/config/avr/avr.cc
> > +++ gcc/config/avr/avr.cc
> > @@ -15244,9 +15244,6 @@ avr_float_lib_compare_returns_bool (machine_mode mode, enum rtx_code)
> >  #undef TARGET_CONVERT_TO_TYPE
> >  #define TARGET_CONVERT_TO_TYPE avr_convert_to_type
> >
> > -#undef TARGET_LRA_P
> > -#define TARGET_LRA_P hook_bool_void_false
> > -
> >  #undef TARGET_ADDR_SPACE_SUBSET_P
> >  #define TARGET_ADDR_SPACE_SUBSET_P avr_addr_space_subset_p
> >
> > Then configuring and building for avr without attempting to build libgcc
> >
> > $ configure --target=avr --prefix= --enable-languages=c && make all-host && make install-host
> >
> > And finally to reproduce the failure
> > $ /bin/avr-gcc -mmcu=avr25 case.c -Os
>
> Thank you. I've reproduced the bug and started to work on it
> yesterday. The problem is a bit trickier than I initially thought, but I
> believe I'll fix it this week.

Hi Vlad,

I can confirm your commit (https://gcc.gnu.org/git?p=gcc.git;a=commit;h=2971ff7b1d564ac04b537d907c70e6093af70832) fixes the above problem, thank you. However, I see execution failures if a pseudo assigned to FP has to be spilled because of stack slot creation.
To reproduce, build the compiler just like above, and then do $ avr-gcc -mmcu=avr51 /gcc/testsuite/gcc.c-torture/execute/20050224-1.c -O2 -S -fdump-rtl-all The execution failure occurs at this point movw r24,r2 sbiw r24,36 brne .L8 r2 is never set anywhere at all in the assembly. The relevant insns (in the IRA dump) are (insn 3 15 4 3 (set (reg/v:HI 51 [ j ]) (const_int 0 [0])) "gcc/gcc/testsuite/gcc.c-torture/execute/20050224-1.c":19:21 101 {*movhi_split} (expr_list:REG_EQUAL (const_int 0 [0]) (nil))) ... (insn 28 27 67 8 (parallel [ (set (reg/v:HI 51 [ j ]) (plus:HI (reg/v:HI 51 [ j ]) (const_int 1 [0x1]))) (clobber (scratch:QI)) ]) "/home/i41766/code/personal/gcc/gcc/testsuite/gcc.c-torture/execute/20050224-1.c":28:8 175 {addhi3_clobber} (nil)) ... (jump_insn 44 43 45 13 (parallel [ (set (pc) (if_then_else (ne (reg/v:HI 51 [ j ]) (const_int 36 [0x24])) (label_ref:HI 103) (pc))) (clobber (scratch:QI)) ]) "/home/i41766/code/personal/gcc/gcc/testsuite/gcc.c-torture/execute/20050224-1.c":11:16 discrim 1 713 {cbranchhi4_insn} (expr_list:REG_DEAD (reg/v:HI 51 [ j ]) (int_list:REG_BR_PROB 7 (nil))) -> 103) LRA deletes insns 3 and 28, and uses r2 in the jump_insn. In the reload dump, for pseudo r51, I'm seeing this subreg regs: Frame pointer can not be eliminated anymore Spilling non-eliminable hard regs: 28 29 Spilling r51(28) Slot 0 regnos (width = 0): 46 Slot 1 regnos (width = 0): 45 lra_update_fp2sp_elimination calls spill_pseudos with HARD_FRAME_POINTER_REGNUM, and that sets reg_renumber[51] to -1. Later down the line, process_bb_lives is called with dead_insn_p=true from lra_create_lives_ranges_1 on the relevant BB (#8), and df_get_live_out on that BB does not contain 51 (even though previous calls to the same BB did). 
Breakpoint 8, process_bb_lives (bb=0x7fffea570240, curr_point=@0x7fffd838: 25, dead_insn_p=true) at gcc/gcc/lra-lives.cc:664 664 function_abi last_call_abi = default_function_abi; (gdb) n 666 reg_live_out = df_get_live_out (bb); (gdb) 667 sparseset_clear (pseudos_live); (gdb) p debug_bitmap(reg_live_out) first = 0x321c128 current = 0x321c128 indx = 0 0x321c128 next = (nil) prev = (nil) indx = 0 bits = { 28 32 34 43 44 47 48 49 50 } process_bb_lives then considers the insn setting 51 (and the reload insns LRA created) as dead, and removes them. BB 8 Insn 67: point = 31, n_alt = -1 Insn 114: point = 31, n_alt = 3 Deleting dead insn 114 deleting insn with uid = 114. Insn 28: point = 31, n_alt = 1 Deleting dead insn 28 deleting insn with uid = 28. Insn 113: point = 31, n_alt
Re: LRA for avr: help with FP and elimination
> On Jul 27, 2023, at 7:50 AM, Maciej W. Rozycki wrote:
>
> On Fri, 14 Jul 2023, Vladimir Makarov via Gcc wrote:
>
>>> On the avr, the stack pointer (SP)
>>> is not used to access stack slots
>> It is very uncommon target then.
>
> Same with the VAX target. SP is used for outgoing function arguments,
> function calls, alloca only. AP is used for incoming function arguments
> and is set automatically by hardware at function entry. FP is used for
> local variables and is likewise set by hardware at function entry.

While most other targets maintain FP in software, doesn't the same description apply to any target that can have a frame pointer? The frame pointer may be used only some of the time (PDP-11) or always (VAX), but when it's used, local variable references and argument references would go through FP, not SP, right?

paul
Re: LRA for avr: help with FP and elimination
On Fri, 14 Jul 2023, Vladimir Makarov via Gcc wrote:

> > On the avr, the stack pointer (SP)
> > is not used to access stack slots
> It is very uncommon target then.

Same with the VAX target. SP is used for outgoing function arguments, function calls, alloca only. AP is used for incoming function arguments and is set automatically by hardware at function entry. FP is used for local variables and is likewise set by hardware at function entry. The RET instruction sets SP from FP automatically as the first step of the function return sequence.

I guess it'll affect LRA conversion of the target too.

Maciej
Re: LRA for avr: help with FP and elimination
On 7/17/23 03:17, senthilkumar.selva...@microchip.com wrote:
> On Fri, 2023-07-14 at 09:29 -0400, Vladimir Makarov wrote:
> > If you send me the preprocessed test, I could start to work on it to fix
> > the problems. I think it is hard to fix them right for a person having
> > little experience with LRA.
>
> Ok, this is a reduced test case that reproduces the failure.
>
> $ cat case.c
> typedef int HItype __attribute__ ((mode (HI)));
> HItype
> __mulvhi3 (HItype a, HItype b)
> {
>   HItype w;
>
>   if (__builtin_mul_overflow (a, b, &w))
>     __builtin_trap ();
>
>   return w;
> }
>
> On latest master, this trivial patch turns on LRA for avr
> --- gcc/config/avr/avr.cc
> +++ gcc/config/avr/avr.cc
> @@ -15244,9 +15244,6 @@ avr_float_lib_compare_returns_bool (machine_mode mode, enum rtx_code)
>  #undef TARGET_CONVERT_TO_TYPE
>  #define TARGET_CONVERT_TO_TYPE avr_convert_to_type
>
> -#undef TARGET_LRA_P
> -#define TARGET_LRA_P hook_bool_void_false
> -
>  #undef TARGET_ADDR_SPACE_SUBSET_P
>  #define TARGET_ADDR_SPACE_SUBSET_P avr_addr_space_subset_p
>
> Then configuring and building for avr without attempting to build libgcc
>
> $ configure --target=avr --prefix= --enable-languages=c && make all-host && make install-host
>
> And finally to reproduce the failure
> $ /bin/avr-gcc -mmcu=avr25 case.c -Os

Thank you. I've reproduced the bug and started to work on it yesterday. The problem is a bit trickier than I initially thought, but I believe I'll fix it this week.
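For readers without an avr toolchain, the arithmetic the reduced test case exercises can be tried on a host compiler. The following is a sketch under stated assumptions, not the libgcc routine: `checked_mul16` is a hypothetical name, `int16_t` stands in for `HItype`, and GCC or Clang is assumed because `__builtin_mul_overflow` is a compiler builtin. The builtin takes a pointer to the result as its third argument and reports whether the exact product fits in the pointed-to type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side stand-in for the reduced test case: int16_t
   plays the role of HItype.  Returns 1 when a * b does not fit in 16
   bits and 0 otherwise; *w always receives the wrapped product.  */
int
checked_mul16 (int16_t a, int16_t b, int16_t *w)
{
  assert (w != NULL);
  return __builtin_mul_overflow (a, b, w) ? 1 : 0;
}
```

The test case in the thread calls `__builtin_trap ()` where this sketch returns 1. Returning a flag instead keeps the example easy to call from a test harness.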
Re: LRA for avr: help with FP and elimination
On Fri, 2023-07-14 at 09:29 -0400, Vladimir Makarov wrote: > EXTERNAL EMAIL: Do not click links or open attachments unless you know the > content is safe > > On 7/13/23 05:27, SenthilKumar.Selvaraj--- via Gcc wrote: > > Hi, > > > >I've been spending some (spare) time checking what it would take to > >make LRA work for the avr target. > > > >Right after I removed the TARGET_LRA_P hook disabling LRA, building > >libgcc failed with a weird ICE. > > On the avr, the stack pointer (SP) > >is not used to access stack slots > It is very uncommon target then. > > - TARGET_CAN_ELIMINATE returns false > >if frame_pointer_needed, and TARGET_FRAME_POINTER_REQUIRED returns true > >if get_frame_size() > 0. > > > >With LRA, however, reload generates > > > > (insn 159 239 240 7 (set (mem/c:QI (plus:HI (reg/f:HI 32 __SP_L__) > > (const_int 1 [0x1])) [2 %sfp+1 S1 A8]) > > (reg:QI 24 r24 [orig:86 a ] [86])) "case.c":7:7 86 > > {movqi_insn_split} > > (nil)) > > > >and the backend code errors out when it finds SP is being used as a > >pointer register. > > > >Digging through the RTL dumps, I found the following. For the > >following insn sequence in *.ira > > > > (insn 189 128 159 7 (set (reg:HI 58 [ b ]) > > (const_int 0 [0])) "case.c":7:7 101 {*movhi_split} > > (nil)) > > (insn 159 189 160 7 (set (subreg:QI (reg:HI 58 [ b ]) 0) > > (reg:QI 86 [ a ])) "case.c":7:7 86 {movqi_insn_split} > > (nil)) > > (insn 160 159 32 7 (set (subreg:QI (reg:HI 58 [ b ]) 1) > > (reg:QI 87 [ a+1 ])) "case.c":7:7 86 {movqi_insn_split} > > (nil)) > > > >1. For r58, IRA picks R28:R29, which is the frame pointer for avr. > > > >Popping a13(r58,l0) -- assign reg 28 > > > >2. LRA sees the subreg in insn 159 and generates a reload reg > >(r125). simplify_subreg_regno (lra-constraints.cc:1810) however > >bails (returns -1) if the reg involved is FRAME_POINTER_REGNUM and > >reload isn't completed yet. LRA therefore decides rclass for the > >pseudo reg is NO_REGS. 
> > > > > > Creating newreg=125 from oldreg=58, assigning class NO_REGS to subreg reg > > r125 > >159: r125:HI#0=r86:QI > > > >4. As rclass is NO_REGS, LRA picks an insn alternative that involves > > memory. > >That is my understanding, please correct me if I'm wrong. > > > > 0 Small class reload: reject+=3 > > 0 Non input pseudo reload: reject++ > > Cycle danger: overall += LRA_MAX_REJECT > >alt=0,overall=610,losers=1,rld_nregs=1 > > 0 Small class reload: reject+=3 > > 0 Non input pseudo reload: reject++ > > alt=1: Bad operand -- refuse > > 0 Non pseudo reload: reject++ > >alt=2,overall=1,losers=0,rld_nregs=0 > >Choosing alt 2 in insn 159: (0) Qm (1) rY00 {movqi_insn_split} > > > >5. LRA creates stack slots, and then uses the FP register to access > >the slots. This is despite r58 already being assigned R28:R29. > > > >6. TARGET_FRAME_POINTER_REQUIRED is never called, and therefore > > frame_pointer_needed is not set, despite the creation of stack > > slots. TARGET_CAN_ELIMINATE therefore okays elimination of FP to SP, > > and this eventually causes the ICE when the avr backend sees SP being > > used as a pointer register. 
> > > >This is the relevant sequence after reload > > > > (insn 189 128 239 7 (set (reg:HI 28 r28 [orig:58 b ] [58]) > > (const_int 0 [0])) "case.c":7:7 101 {*movhi_split} > > (nil)) > > (insn 239 189 159 7 (set (mem/c:HI (plus:HI (reg/f:HI 32 __SP_L__) > > (const_int 1 [0x1])) [2 %sfp+1 S2 A8]) > > (reg:HI 28 r28 [orig:58 b ] [58])) "case.c":7:7 101 {*movhi_split} > > (nil)) > > (insn 159 239 240 7 (set (mem/c:QI (plus:HI (reg/f:HI 32 __SP_L__) > > (const_int 1 [0x1])) [2 %sfp+1 S1 A8]) > > (reg:QI 24 r24 [orig:86 a ] [86])) "case.c":7:7 86 > > {movqi_insn_split} > > (nil)) > > (insn 240 159 241 7 (set (reg:HI 28 r28 [orig:58 b ] [58]) > > (mem/c:HI (plus:HI (reg/f:HI 32 __SP_L__) > > (const_int 1 [0x1])) [2 %sfp+1 S2 A8])) "case.c":7:7 101 > > {*movhi_split} > > (nil)) > > (insn 241 240 160 7 (set (mem/c:HI (plus:HI (reg/f:HI 32 __SP_L__) > > (const_int 1 [0x1])) [2 %sfp+1 S2 A8]) > > (reg:HI 28 r28 [orig:58 b ] [58])) "case.c":7:7 101 {*movhi_split} > > (nil)) > > (insn 160 241 242 7 (set (mem/c:QI (plus:HI (reg/f:HI 32 __SP_L__) > > (const_int 2 [0x2])) [2 %sfp+2 S1 A8]) > > (reg:QI 18 r18 [orig:87 a+1 ] [87])) "case.c":7:7 86 > > {movqi_insn_split} > > (nil)) > > (insn 242 160 33 7 (set (reg:HI 28 r28 [orig:58 b ] [58]) > > (mem/c:HI (plus:HI (reg/f:HI 32 __SP_L__) > > (const_int 1 [0x1])) [2 %sfp+1 S2
Re: LRA for avr: help with FP and elimination
On 7/13/23 05:27, SenthilKumar.Selvaraj--- via Gcc wrote: Hi, I've been spending some (spare) time checking what it would take to make LRA work for the avr target. Right after I removed the TARGET_LRA_P hook disabling LRA, building libgcc failed with a weird ICE. On the avr, the stack pointer (SP) is not used to access stack slots It is very uncommon target then. - TARGET_CAN_ELIMINATE returns false if frame_pointer_needed, and TARGET_FRAME_POINTER_REQUIRED returns true if get_frame_size() > 0. With LRA, however, reload generates (insn 159 239 240 7 (set (mem/c:QI (plus:HI (reg/f:HI 32 __SP_L__) (const_int 1 [0x1])) [2 %sfp+1 S1 A8]) (reg:QI 24 r24 [orig:86 a ] [86])) "case.c":7:7 86 {movqi_insn_split} (nil)) and the backend code errors out when it finds SP is being used as a pointer register. Digging through the RTL dumps, I found the following. For the following insn sequence in *.ira (insn 189 128 159 7 (set (reg:HI 58 [ b ]) (const_int 0 [0])) "case.c":7:7 101 {*movhi_split} (nil)) (insn 159 189 160 7 (set (subreg:QI (reg:HI 58 [ b ]) 0) (reg:QI 86 [ a ])) "case.c":7:7 86 {movqi_insn_split} (nil)) (insn 160 159 32 7 (set (subreg:QI (reg:HI 58 [ b ]) 1) (reg:QI 87 [ a+1 ])) "case.c":7:7 86 {movqi_insn_split} (nil)) 1. For r58, IRA picks R28:R29, which is the frame pointer for avr. Popping a13(r58,l0) -- assign reg 28 2. LRA sees the subreg in insn 159 and generates a reload reg (r125). simplify_subreg_regno (lra-constraints.cc:1810) however bails (returns -1) if the reg involved is FRAME_POINTER_REGNUM and reload isn't completed yet. LRA therefore decides rclass for the pseudo reg is NO_REGS. Creating newreg=125 from oldreg=58, assigning class NO_REGS to subreg reg r125 159: r125:HI#0=r86:QI 4. As rclass is NO_REGS, LRA picks an insn alternative that involves memory. That is my understanding, please correct me if I'm wrong. 
0 Small class reload: reject+=3 0 Non input pseudo reload: reject++ Cycle danger: overall += LRA_MAX_REJECT alt=0,overall=610,losers=1,rld_nregs=1 0 Small class reload: reject+=3 0 Non input pseudo reload: reject++ alt=1: Bad operand -- refuse 0 Non pseudo reload: reject++ alt=2,overall=1,losers=0,rld_nregs=0 Choosing alt 2 in insn 159: (0) Qm (1) rY00 {movqi_insn_split} 5. LRA creates stack slots, and then uses the FP register to access the slots. This is despite r58 already being assigned R28:R29. 6. TARGET_FRAME_POINTER_REQUIRED is never called, and therefore frame_pointer_needed is not set, despite the creation of stack slots. TARGET_CAN_ELIMINATE therefore okays elimination of FP to SP, and this eventually causes the ICE when the avr backend sees SP being used as a pointer register. This is the relevant sequence after reload (insn 189 128 239 7 (set (reg:HI 28 r28 [orig:58 b ] [58]) (const_int 0 [0])) "case.c":7:7 101 {*movhi_split} (nil)) (insn 239 189 159 7 (set (mem/c:HI (plus:HI (reg/f:HI 32 __SP_L__) (const_int 1 [0x1])) [2 %sfp+1 S2 A8]) (reg:HI 28 r28 [orig:58 b ] [58])) "case.c":7:7 101 {*movhi_split} (nil)) (insn 159 239 240 7 (set (mem/c:QI (plus:HI (reg/f:HI 32 __SP_L__) (const_int 1 [0x1])) [2 %sfp+1 S1 A8]) (reg:QI 24 r24 [orig:86 a ] [86])) "case.c":7:7 86 {movqi_insn_split} (nil)) (insn 240 159 241 7 (set (reg:HI 28 r28 [orig:58 b ] [58]) (mem/c:HI (plus:HI (reg/f:HI 32 __SP_L__) (const_int 1 [0x1])) [2 %sfp+1 S2 A8])) "case.c":7:7 101 {*movhi_split} (nil)) (insn 241 240 160 7 (set (mem/c:HI (plus:HI (reg/f:HI 32 __SP_L__) (const_int 1 [0x1])) [2 %sfp+1 S2 A8]) (reg:HI 28 r28 [orig:58 b ] [58])) "case.c":7:7 101 {*movhi_split} (nil)) (insn 160 241 242 7 (set (mem/c:QI (plus:HI (reg/f:HI 32 __SP_L__) (const_int 2 [0x2])) [2 %sfp+2 S1 A8]) (reg:QI 18 r18 [orig:87 a+1 ] [87])) "case.c":7:7 86 {movqi_insn_split} (nil)) (insn 242 160 33 7 (set (reg:HI 28 r28 [orig:58 b ] [58]) (mem/c:HI (plus:HI (reg/f:HI 32 __SP_L__) (const_int 1 [0x1])) [2 %sfp+1 
S2 A8])) "case.c":7:7 101 {*movhi_split} (nil)) For choices other than FP, simplify_subreg_regno returns the correct part of the wider HImode reg, so rclass is not NO_REGS, and things workout fine. I checked what classic reload does in the same situation - it picks a different register (R25) instead of spilling to a stack slot. (insn 189 128 159 7 (set (reg:HI 28 r28 [orig:58 b ] [58]) (const_int 0 [0])) "case.c":7:7 101 {*movhi_split} (nil)) (insn 159 189 226 7 (set (reg:QI 25 r25) (reg:QI 24 r24 [orig:86 a ] [86])) "case.c":7:7 86
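The difference between FP and any other hard register, as described in the message above, can be sketched as a toy model. This is an assumption-laden illustration, not GCC's simplify_subreg_regno, which performs many more checks: `toy_subreg_regno`, `toy_subreg_reload_class` and the `TOY_` constants are hypothetical, register 28 is taken from the "assign reg 28" line in the IRA dump, and the toy target is assumed to hold one byte per hard register, so byte N of a wider value lives in register regno + N.

```c
#include <assert.h>

/* Hypothetical constants for a toy target; 28 is r28 (low half of
   R28:R29), the frame pointer named in the thread.  */
enum { TOY_FRAME_POINTER_REGNUM = 28 };
enum toy_reg_class { TOY_NO_REGS, TOY_GENERAL_REGS };

/* Return the hard register holding byte BYTE_OFFSET of the value that
   starts in XREGNO, or -1 if the subreg must not be simplified.  Only
   the frame-pointer guard described in the thread is modeled.  */
int
toy_subreg_regno (int xregno, int byte_offset, int reload_completed)
{
  assert (xregno >= 0 && byte_offset >= 0);
  if (!reload_completed && xregno == TOY_FRAME_POINTER_REGNUM)
    return -1;
  return xregno + byte_offset;
}

/* Class given to a reload pseudo created for such a subreg: NO_REGS
   when the subreg could not be resolved to a hard register.  */
enum toy_reg_class
toy_subreg_reload_class (int hard_regno, int byte_offset,
                         int reload_completed)
{
  return toy_subreg_regno (hard_regno, byte_offset, reload_completed) < 0
         ? TOY_NO_REGS : TOY_GENERAL_REGS;
}
```

In this model a pseudo living in r24 yields r25 for byte 1 and gets a register class, while a pseudo living in r28 yields -1 before reload completes and ends up in NO_REGS, which is the situation that pushes LRA toward the memory alternative.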
LRA for avr: help with FP and elimination
Hi,

I've been spending some (spare) time checking what it would take to make LRA work for the avr target.

Right after I removed the TARGET_LRA_P hook disabling LRA, building libgcc failed with a weird ICE.

On the avr, the stack pointer (SP) is not used to access stack slots - TARGET_CAN_ELIMINATE returns false if frame_pointer_needed, and TARGET_FRAME_POINTER_REQUIRED returns true if get_frame_size() > 0.

With LRA, however, reload generates

(insn 159 239 240 7 (set (mem/c:QI (plus:HI (reg/f:HI 32 __SP_L__)
                (const_int 1 [0x1])) [2 %sfp+1 S1 A8])
        (reg:QI 24 r24 [orig:86 a ] [86])) "case.c":7:7 86 {movqi_insn_split}
     (nil))

and the backend code errors out when it finds SP is being used as a pointer register.

Digging through the RTL dumps, I found the following. For the following insn sequence in *.ira

(insn 189 128 159 7 (set (reg:HI 58 [ b ])
        (const_int 0 [0])) "case.c":7:7 101 {*movhi_split}
     (nil))
(insn 159 189 160 7 (set (subreg:QI (reg:HI 58 [ b ]) 0)
        (reg:QI 86 [ a ])) "case.c":7:7 86 {movqi_insn_split}
     (nil))
(insn 160 159 32 7 (set (subreg:QI (reg:HI 58 [ b ]) 1)
        (reg:QI 87 [ a+1 ])) "case.c":7:7 86 {movqi_insn_split}
     (nil))

1. For r58, IRA picks R28:R29, which is the frame pointer for avr.

      Popping a13(r58,l0)  -- assign reg 28

2. LRA sees the subreg in insn 159 and generates a reload reg (r125). simplify_subreg_regno (lra-constraints.cc:1810) however bails (returns -1) if the reg involved is FRAME_POINTER_REGNUM and reload isn't completed yet. LRA therefore decides rclass for the pseudo reg is NO_REGS.

      Creating newreg=125 from oldreg=58, assigning class NO_REGS to subreg reg r125
  159: r125:HI#0=r86:QI

4. As rclass is NO_REGS, LRA picks an insn alternative that involves memory. That is my understanding, please correct me if I'm wrong.

            0 Small class reload: reject+=3
            0 Non input pseudo reload: reject++
            Cycle danger: overall += LRA_MAX_REJECT
          alt=0,overall=610,losers=1,rld_nregs=1
            0 Small class reload: reject+=3
            0 Non input pseudo reload: reject++
            alt=1: Bad operand -- refuse
            0 Non pseudo reload: reject++
          alt=2,overall=1,losers=0,rld_nregs=0
         Choosing alt 2 in insn 159: (0) Qm (1) rY00 {movqi_insn_split}

5. LRA creates stack slots, and then uses the FP register to access the slots. This is despite r58 already being assigned R28:R29.

6. TARGET_FRAME_POINTER_REQUIRED is never called, and therefore frame_pointer_needed is not set, despite the creation of stack slots. TARGET_CAN_ELIMINATE therefore okays elimination of FP to SP, and this eventually causes the ICE when the avr backend sees SP being used as a pointer register.

This is the relevant sequence after reload

(insn 189 128 239 7 (set (reg:HI 28 r28 [orig:58 b ] [58])
        (const_int 0 [0])) "case.c":7:7 101 {*movhi_split}
     (nil))
(insn 239 189 159 7 (set (mem/c:HI (plus:HI (reg/f:HI 32 __SP_L__)
                (const_int 1 [0x1])) [2 %sfp+1 S2 A8])
        (reg:HI 28 r28 [orig:58 b ] [58])) "case.c":7:7 101 {*movhi_split}
     (nil))
(insn 159 239 240 7 (set (mem/c:QI (plus:HI (reg/f:HI 32 __SP_L__)
                (const_int 1 [0x1])) [2 %sfp+1 S1 A8])
        (reg:QI 24 r24 [orig:86 a ] [86])) "case.c":7:7 86 {movqi_insn_split}
     (nil))
(insn 240 159 241 7 (set (reg:HI 28 r28 [orig:58 b ] [58])
        (mem/c:HI (plus:HI (reg/f:HI 32 __SP_L__)
                (const_int 1 [0x1])) [2 %sfp+1 S2 A8])) "case.c":7:7 101 {*movhi_split}
     (nil))
(insn 241 240 160 7 (set (mem/c:HI (plus:HI (reg/f:HI 32 __SP_L__)
                (const_int 1 [0x1])) [2 %sfp+1 S2 A8])
        (reg:HI 28 r28 [orig:58 b ] [58])) "case.c":7:7 101 {*movhi_split}
     (nil))
(insn 160 241 242 7 (set (mem/c:QI (plus:HI (reg/f:HI 32 __SP_L__)
                (const_int 2 [0x2])) [2 %sfp+2 S1 A8])
        (reg:QI 18 r18 [orig:87 a+1 ] [87])) "case.c":7:7 86 {movqi_insn_split}
     (nil))
(insn 242 160 33 7 (set (reg:HI 28 r28 [orig:58 b ] [58])
        (mem/c:HI (plus:HI (reg/f:HI 32 __SP_L__)
                (const_int 1 [0x1])) [2 %sfp+1 S2 A8])) "case.c":7:7 101 {*movhi_split}
     (nil))

For choices other than FP, simplify_subreg_regno returns the correct part of the wider HImode reg, so rclass is not NO_REGS, and things work out fine.

I checked what classic reload does in the same situation - it picks a different register (R25) instead of spilling to a stack slot.

(insn 189 128 159 7 (set (reg:HI 28 r28 [orig:58 b ] [58])
        (const_int 0 [0])) "case.c":7:7 101 {*movhi_split}
     (nil))
(insn 159 189 226 7 (set (reg:QI 25 r25)
        (reg:QI 24 r24 [orig:86 a ] [86])) "case.c":7:7 86 {movqi_insn_split}
     (nil))
(insn 226 159 160 7 (set (reg:QI 28 r28)
        (reg:QI 25 r25)) "case.c":7:7 86 {movqi_insn_split}
     (nil))
(insn 160 226 227 7 (set (reg:QI 25
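The interaction between the two hooks in point 6 can be made concrete with a small model. This is a sketch under stated assumptions rather than the avr backend's code: `toy_function`, `toy_frame_pointer_required`, `toy_can_eliminate_fp_to_sp` and `toy_add_stack_slot` are hypothetical names, and the hook bodies encode only the behavior stated in this message (CAN_ELIMINATE is false when frame_pointer_needed, FRAME_POINTER_REQUIRED is true when the frame size is positive).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-function state for the toy model.  */
struct toy_function
{
  int frame_size;            /* bytes of stack slots allocated so far */
  int frame_pointer_needed;  /* cached decision, as in the real flag */
};

/* Models TARGET_FRAME_POINTER_REQUIRED as described in the thread.  */
int
toy_frame_pointer_required (const struct toy_function *f)
{
  assert (f != NULL);
  return f->frame_size > 0;
}

/* Models TARGET_CAN_ELIMINATE for the FP -> SP elimination as described
   in the thread: refused once frame_pointer_needed is set.  */
int
toy_can_eliminate_fp_to_sp (const struct toy_function *f)
{
  assert (f != NULL);
  return !f->frame_pointer_needed;
}

/* Allocate a spill slot.  When RECHECK is nonzero the cached flag is
   refreshed from the hook; when zero it is left stale, which is the
   situation point 6 describes.  */
void
toy_add_stack_slot (struct toy_function *f, int bytes, int recheck)
{
  assert (f != NULL && bytes > 0);
  f->frame_size += bytes;
  if (recheck)
    f->frame_pointer_needed = toy_frame_pointer_required (f);
}
```

Starting from an empty frame, adding a slot without the recheck leaves `toy_can_eliminate_fp_to_sp` returning 1 even though the frame is no longer empty, so SP-relative slot addresses are allowed through. With the recheck, the flag flips and the elimination is refused, which is the outcome the avr backend relies on.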