[Bug rtl-optimization/106419] ICE in lra_assign, at lra-assigns.cc:1649

2022-07-27 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106419

--- Comment #10 from Segher Boessenkool  ---
(In reply to Kewen Lin from comment #9)
> (In reply to Segher Boessenkool from comment #8)
> > So for which pseudo and which hard register did this ICE, and what did the
> > code look like at that point?
> 
> The culprit pseudo is r133, the values of those related expressions in the
> check:
> 
> lra_reg_info[i].nrefs  -> 4
> 
> reg_renumber[i] -> 97
> 
> overlaps_hard_reg_set_p(lra_reg_info[i].conflict_hard_regs, E_SImode, 97) ->
> true
> 
> Before IRA, the code looks like:

> (insn 34 33 35 4 (set (reg:SI 97 ctr)
> (reg/v/f:SI 133 [ foo ])) "test.f":17:72 562 {*movsi_internal1}
>  (nil))  

> in IRA, the hard reg assignment is:

> choosing r3 for r133.

Doing ctr (reg 97) instead (as LRA seems to change it to?) is
counterproductive.

We have that

> (insn 33 32 34 4 (set (reg:SI 3 3)
> (reg/v/f:SI 137 [ g ])) "test.f":17:72 562 {*movsi_internal1}
>  (nil))

right before 34, so if we want to use hard reg 3 for pseudo 133 we could
swap insns 33 and 34 (both of which are trivial assignments), which is much
nicer than the current dance via memory.

But all of this is a distraction from the actual bug here, sorry.

2022-07-26 Thread linkw at gcc dot gnu.org via Gcc-bugs

--- Comment #9 from Kewen Lin  ---
(In reply to Segher Boessenkool from comment #8)
> So for which pseudo and which hard register did this ICE, and what did the
> code look like at that point?

The culprit pseudo is r133, the values of those related expressions in the
check:

lra_reg_info[i].nrefs  -> 4

reg_renumber[i] -> 97

overlaps_hard_reg_set_p(lra_reg_info[i].conflict_hard_regs, E_SImode, 97) ->
true

Before IRA, the code looks like:

(note 21 1 166 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(insn 166 21 2 2 (set (reg:SI 229)
(reg:SI 3 3 [ foo ])) "test.f":1:23 562 {*movsi_internal1}
 (expr_list:REG_DEAD (reg:SI 3 3 [ foo ])
(nil)))
(insn 2 166 167 2 (set (reg/v/f:SI 133 [ foo ])
(reg:SI 229)) "test.f":1:23 562 {*movsi_internal1}
 (expr_list:REG_DEAD (reg:SI 229)
(nil)))


(insn 33 32 34 4 (set (reg:SI 3 3)
(reg/v/f:SI 137 [ g ])) "test.f":17:72 562 {*movsi_internal1}
 (nil))
(insn 34 33 35 4 (set (reg:SI 97 ctr)
(reg/v/f:SI 133 [ foo ])) "test.f":17:72 562 {*movsi_internal1}
 (nil))  
(call_insn 35 34 38 4 (parallel [
(call (mem:SI (reg:SI 97 ctr) [0 *foo_21(D) S4 A8])
(const_int 0 [0]))
(use (const_int 4 [0x4]))
(clobber (reg:SI 96 lr))
]) "test.f":17:72 814 {*call_indirect_nonlocal_sysvsi}
 (expr_list:REG_DEAD (reg:SI 97 ctr)
(expr_list:REG_DEAD (reg:SI 3 3)
(nil)))
(expr_list:SI (use (reg:SI 3 3))
(nil)))

...

in IRA, the hard reg assignment is:

  125:r120 l0 9   86:r125 l010   13:r126 l029   87:r127 l0 8
   73:r128 l044   35:r133 l0 3   33:r134 l0 4   31:r135 l0 0
  ...

choosing r3 for r133.  As shown above, r3 is redefined at insn 33 while r133 is
still used in insn 34, so LRA has to try to split the live range of r133.

I didn't dig into the LRA code, so I don't really understand what the condition
means. :( The reason I commented out gcc_unreachable() was to see what
unexpected consequences appear without this check. As shown in comment #6, the
resulting assembly looks inefficient, but it should be correct. So I guess the
condition may be too strict, or we fail to consider some special case and
reject valid allocations.

2022-07-26 Thread segher at gcc dot gnu.org via Gcc-bugs

--- Comment #8 from Segher Boessenkool  ---
So for which pseudo and which hard register did this ICE, and what did the
code look like at that point?

2022-07-26 Thread segher at gcc dot gnu.org via Gcc-bugs

--- Comment #7 from Segher Boessenkool  ---
That mfctr;mtctr is extremely slow of course, and that mtctr is superfluous
completely (this is true for all registers, not just CTR, nothing special to
PowerPC even).  I know this is just -Og, but still :-)

2022-07-26 Thread linkw at gcc dot gnu.org via Gcc-bugs

Kewen Lin  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bergner at gcc dot gnu.org,
                   |                            |segher at gcc dot gnu.org,
                   |                            |vmakarov at gcc dot gnu.org

--- Comment #6 from Kewen Lin  ---
From the ICE point:

  if (! lra_hard_reg_split_p && ! lra_asm_error_p && flag_checking)
    /* Check correctness of allocation but only when there are no hard reg
       splits and asm errors as in the case of errors explicit insns involving
       hard regs are added or the asm is removed and this can result in
       incorrect allocation.  */
    for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++)
      if (lra_reg_info[i].nrefs != 0
          && reg_renumber[i] >= 0
          && overlaps_hard_reg_set_p (lra_reg_info[i].conflict_hard_regs,
                                      PSEUDO_REGNO_MODE (i), reg_renumber[i]))
        gcc_unreachable ();

this is similar to PR97978. With the gcc_unreachable() line commented out, the
resulting assembly looks fine to me. It is able to spill the value into a stack
slot, and the value is reloaded after the second indirect call (the code is
very poor, but this is at -Og).

mr 3,14  // ctr doesn't conflict with r3
mfctr 9
mtctr 9
stw 9,36(1)  // saved
crxor 6,6,6
bctrl
mr 5,18
lwz 7,160(1)
mr 4,7
lwz 6,152(1)
mr 3,6
lwz 12,28(1)
mtctr 12
crxor 6,6,6
bctrl
lfs 0,.LC2@l(31)
lfs 12,0(24)
fcmpu 0,12,0
lwz 12,28(1)
lwz 6,152(1)
lwz 7,160(1)
lwz 9,36(1)  // reload
mtctr 9  // make it live in ctr

IMHO, this is very likely an LRA issue; maybe the assert condition is too
strict.

2022-07-26 Thread linkw at gcc dot gnu.org via Gcc-bugs

Kewen Lin  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |NEW

--- Comment #5 from Kewen Lin  ---
Thanks for the configure command and the dump. Comparing them with mine, I
found the difference is "--enable-secureplt"; after rebuilding the cross
binutils with --enable-secureplt, I am able to reproduce this now.

2022-07-25 Thread asolokha at gmx dot com via Gcc-bugs

--- Comment #4 from Arseny Solokha  ---
Created attachment 53345
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53345&action=edit
-mdebug=target dump

--host=x86_64-pc-linux-gnu --target=powerpc-e300c3-linux-gnu
--build=x86_64-pc-linux-gnu --prefix=/usr
--bindir=/usr/x86_64-pc-linux-gnu/powerpc-e300c3-linux-gnu/gcc-bin/13.0.0
--includedir=/usr/lib/gcc/powerpc-e300c3-linux-gnu/13.0.0/include
--datadir=/usr/share/gcc-data/powerpc-e300c3-linux-gnu/13.0.0
--mandir=/usr/share/gcc-data/powerpc-e300c3-linux-gnu/13.0.0/man
--infodir=/usr/share/gcc-data/powerpc-e300c3-linux-gnu/13.0.0/info
--with-gxx-include-dir=/usr/lib/gcc/powerpc-e300c3-linux-gnu/13.0.0/include/g++-v13
--with-python-dir=/share/gcc-data/powerpc-e300c3-linux-gnu/13.0.0/python
--enable-languages=c,c++,fortran --enable-obsolete --enable-secureplt
--disable-werror --with-system-zlib --disable-nls
--disable-libunwind-exceptions --enable-checking=yes --disable-esp
--enable-libstdcxx-time --disable-libstdcxx-pch
--enable-poison-system-directories --with-sysroot=/usr/powerpc-e300c3-linux-gnu
--disable-bootstrap --enable-__cxa_atexit --enable-clocale=gnu
--disable-multilib --disable-fixed-point --enable-targets=all --enable-libgomp
--disable-libssp --disable-libada --disable-cet --disable-systemtap
--enable-valgrind-annotations --disable-vtable-verify --disable-libvtv
--without-zstd --enable-lto --with-isl --disable-isl-version-check
--disable-libsanitizer --enable-default-ssp

2022-07-24 Thread linkw at gcc dot gnu.org via Gcc-bugs

--- Comment #3 from Kewen Lin  ---
(In reply to Arseny Solokha from comment #2)
> I don't set --enable-default-pie anymore when configuring gcc, so here's the
> difference. Therefore, it stops ICEing if I add -fPIE or -fPIC when
> compiling the testcase. Conversely, could you please try adding -fno-PIE?

Thanks for the reply.  Sigh, I am still unable to reproduce it. I tried adding
the extra options "-fno-PIE", "-fno-PIE -fno-PIC", and "-fno-pie -fno-pic", and
it still passed. The behavior is the same even if I remove
"--enable-default-pie" from the configuration or explicitly add
"--disable-default-pie" to it.

2022-07-24 Thread asolokha at gmx dot com via Gcc-bugs

--- Comment #2 from Arseny Solokha  ---
I don't set --enable-default-pie anymore when configuring gcc, so here's the
difference. Therefore, it stops ICEing if I add -fPIE or -fPIC when compiling
the testcase. Conversely, could you please try adding -fno-PIE?

2022-07-24 Thread linkw at gcc dot gnu.org via Gcc-bugs

Kewen Lin  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2022-07-25
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |WAITING
                 CC|                            |linkw at gcc dot gnu.org

--- Comment #1 from Kewen Lin  ---
I wasn't able to reproduce this with a cross-built compiler from either the
latest trunk or the mentioned snapshot.  I guess there might be some
differences between our configurations.

My configuration as reported by "-v": ... --target=powerpc-e300c3-linux-gnu
--enable-languages=c,c++,fortran --enable-obsolete --disable-werror
--with-system-zlib --disable-nls --disable-libunwind-exceptions
--enable-checking=yes --disable-esp --enable-libstdcxx-time
--disable-libstdcxx-pch --enable-poison-system-directories
--with-sysroot=/usr/powerpc-e300c3-linux-gnu --disable-bootstrap
--enable-__cxa_atexit --enable-clocale=gnu --disable-multilib
--disable-fixed-point --enable-targets=all --enable-libgomp --disable-libssp
--disable-libada --disable-cet --disable-systemtap
--enable-valgrind-annotations --disable-vtable-verify --disable-libvtv
--without-zstd --enable-lto --with-isl --disable-isl-version-check
--disable-libsanitizer --enable-default-pie --enable-default-ssp

Maybe one dump file from -mdebug=target (or =all) can help.