[Bug target/109276] [11/12/13 Regression] ICE: in assign_stack_local_1, at function.cc:429 with -mpreferred-stack-boundary=2

2023-03-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109276

--- Comment #18 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:4b5ef857f5faf09f274c0a95c67faaa80d198124

commit r13-6894-g4b5ef857f5faf09f274c0a95c67faaa80d198124
Author: Jakub Jelinek 
Date:   Tue Mar 28 10:43:08 2023 +0200

i386: Require just 32-bit alignment for SLOT_FLOATxFDI_387 -m32
-mpreferred-stack-boundary=2 DImode temporaries [PR109276]

The following testcase ICEs since r11-2259 because assign_386_stack_local
-> assign_stack_local -> ix86_local_alignment now uses 64-bit alignment
for DImode temporaries rather than 32-bit as before.
Most of the spots in the backend which ask for DImode temporaries are
during
expansion and those apparently are handled fine with -m32
-mpreferred-stack-boundary=2, we dynamically realign the stack in that case
(most of the spots actually don't need that alignment but at least one
does), then 2 spots are in STV which I assume also work correctly.
But during splitting we can create a DImode slot which doesn't need to be
64-bit alignment (it is nicer for performance though), when we apparently
aren't able to detect it for dynamic stack realignment purposes.

The following patch just makes the slot 32-bit aligned in that rare case.

2023-03-28  Jakub Jelinek  

PR target/109276
* config/i386/i386.cc (assign_386_stack_local): For DImode
with SLOT_FLOATxFDI_387 and -m32 -mpreferred-stack-boundary=2 pass
align 32 rather than 0 to assign_stack_local.

* gcc.target/i386/pr109276.c: New test.

[Bug target/109276] [11/12/13 Regression] ICE: in assign_stack_local_1, at function.cc:429 with -mpreferred-stack-boundary=2

2023-03-27 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109276

--- Comment #17 from Uroš Bizjak  ---
(In reply to Jakub Jelinek from comment #16)
> (In reply to Uroš Bizjak from comment #15)
> > (In reply to Jakub Jelinek from comment #13)
> > > asks for a DImode stack slot, ix86_local_alignment newly doesn't lower the
> > > alignment
> > > which isn't good for -mpreferred-stack-boundary=2.
> > 
> > IIRC, DImode FILD/FIST(P) instructions execute faster when memory is aligned
> > to 64 bits. *If* it is possible, 64 bit alignment is preferred.
> 
> Well, user who uses -mpreferred-stack-boundary=2 probably doesn't care about
> performance that much.  I guess I could also check whether we've already
> decided to dynamically reallocate the stack for some other reasons and only
> use 32-bit alignment if we haven't.
> But there are too many such flags, so not sure what to check.

No need to complicate things too much, I agree that by using
-mpreferred-stack-boundary=2 user made a compromise w.r.t. performance.

[Bug target/109276] [11/12/13 Regression] ICE: in assign_stack_local_1, at function.cc:429 with -mpreferred-stack-boundary=2

2023-03-27 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109276

--- Comment #16 from Jakub Jelinek  ---
(In reply to Uroš Bizjak from comment #15)
> (In reply to Jakub Jelinek from comment #13)
> > asks for a DImode stack slot, ix86_local_alignment newly doesn't lower the
> > alignment
> > which isn't good for -mpreferred-stack-boundary=2.
> 
> IIRC, DImode FILD/FIST(P) instructions execute faster when memory is aligned
> to 64 bits. *If* it is possible, 64 bit alignment is preferred.

Well, user who uses -mpreferred-stack-boundary=2 probably doesn't care about
performance that much.  I guess I could also check whether we've already
decided to dynamically reallocate the stack for some other reasons and only use
32-bit alignment if we haven't.
But there are too many such flags, so not sure what to check.

[Bug target/109276] [11/12/13 Regression] ICE: in assign_stack_local_1, at function.cc:429 with -mpreferred-stack-boundary=2

2023-03-27 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109276

--- Comment #15 from Uroš Bizjak  ---
(In reply to Jakub Jelinek from comment #13)
> asks for a DImode stack slot, ix86_local_alignment newly doesn't lower the
> alignment
> which isn't good for -mpreferred-stack-boundary=2.

IIRC, DImode FILD/FIST(P) instructions execute faster when memory is aligned to
64 bits. *If* it is possible, 64 bit alignment is preferred.

[Bug target/109276] [11/12/13 Regression] ICE: in assign_stack_local_1, at function.cc:429 with -mpreferred-stack-boundary=2

2023-03-27 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109276

Jakub Jelinek  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #14 from Jakub Jelinek  ---
Created attachment 54769
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54769&action=edit
gcc13-pr109276.patch

So, I wonder if for the local DImode temporaries we just couldn't ask for
32-bit alignment with -m32 -mpreferred-stack-alignment=2.
I went through all the assign_386_stack_local calls which do or can pass DImode
to it:
- 2 spots in sync.md, both for a MEM scratch which I don't see a reason why it 
  couldn't be just 32-bit aligned, they are during expansion, SLOT_TEMP
- the SLOT_FLOATxFDI_387 use in the splitter used in this PR, that one also
  looks ok to be 32-bit aligned, during splitting
- floatunssi2 also during expansion, SLOT_TEMP, this one I'm afraid
really
  wants the slot to be 64-bit aligned
- in ix86_expand_divmod_libfunc SLOT_TEMP, also during expansion, I think
  that one would be ok to use 32-bit aligned
- scalar_chain::make_vector_copies and scalar_chain::convert_reg, during STV,
  SLOT_STV_TEMP, I think these want 64-bit alignment

[Bug target/109276] [11/12/13 Regression] ICE: in assign_stack_local_1, at function.cc:429 with -mpreferred-stack-boundary=2

2023-03-27 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109276

--- Comment #13 from Jakub Jelinek  ---
Neither the new ix86_lower_local_decl_alignment function is called, nor
SET_DECL_ALIGN in pass_adjust_alignment::execute.
What happens is that
#0  ix86_local_alignment (exp=,
mode=E_DImode, align=64, may_lower=false) at
../../gcc/config/i386/i386.cc:17499
#1  0x00883aab in get_stack_local_alignment (type=, mode=E_DImode) at ../../gcc/function.cc:290
#2  0x00883d8c in assign_stack_local_1 (mode=E_DImode, size=...,
align=0, kind=2) at ../../gcc/function.cc:388
#3  0x0088457b in assign_stack_local (mode=E_DImode, size=..., align=0)
at ../../gcc/function.cc:547
#4  0x0120a3e7 in assign_386_stack_local (mode=E_DImode,
n=SLOT_FLOATxFDI_387) at ../../gcc/config/i386/i386.cc:16688
#5  0x01b3c96f in gen_split_72 (curr_insn=0x7fffea1452c0,
operands=0x35183e0 ) at ../../gcc/config/i386/i386.md:5723
#6  0x021b7b7c in split_9 (x1=0x7fffea2fb6d8, insn=0x7fffea1452c0) at
../../gcc/config/i386/i386.md:5722
#7  0x0229f190 in split_insns (x1=0x7fffea2fb6d8, insn=0x7fffea1452c0)
at ../../gcc/config/i386/i386.md:16795
#8  0x0079c159 in try_split (pat=0x7fffea2fb6d8, trial=0x7fffea1452c0,
last=1) at ../../gcc/emit-rtl.cc:3799
asks for a DImode stack slot, ix86_local_alignment newly doesn't lower the
alignment
which isn't good for -mpreferred-stack-boundary=2.

[Bug target/109276] [11/12/13 Regression] ICE: in assign_stack_local_1, at function.cc:429 with -mpreferred-stack-boundary=2

2023-03-27 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109276

--- Comment #12 from Jakub Jelinek  ---
The above testcase only started ICEing with r11-2259-g0a9d711df36b42b6494b73 ,
so I think the IPA pass is innocent here.

[Bug target/109276] [11/12/13 Regression] ICE: in assign_stack_local_1, at function.cc:429 with -mpreferred-stack-boundary=2

2023-03-27 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109276

--- Comment #11 from Jakub Jelinek  ---
Reduced testcase:
/* PR target/109276 */
/* { dg-do compile } */
/* { dg-options "-march=x86-64" } */
/* { dg-additional-options "-mpreferred-stack-boundary=2" } */

long long a;
long double b;

void
foo (void)
{
  b += a;
}

[Bug target/109276] [11/12/13 Regression] ICE: in assign_stack_local_1, at function.cc:429 with -mpreferred-stack-boundary=2

2023-03-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109276

Richard Biener  changed:

   What|Removed |Added

   Priority|P3  |P2

[Bug target/109276] [11/12/13 Regression] ICE: in assign_stack_local_1, at function.cc:429 with -mpreferred-stack-boundary=2

2023-03-24 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109276

--- Comment #10 from H.J. Lu  ---
(In reply to Jakub Jelinek from comment #7)
> This started to ICE with r11-508-gdfa4fcdba374ed44 in the newly added pass
> there,
> and since r11-2259-g0a9d711df36b42b6494b73 it ICEs similarly like on the
> trunk.

The new pass should just stack alignment requirements when alignment of
stack variable is increased.

[Bug target/109276] [11/12/13 Regression] ICE: in assign_stack_local_1, at function.cc:429 with -mpreferred-stack-boundary=2

2023-03-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109276

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2023-03-24

--- Comment #9 from Andrew Pinski  ---
Confirmed.

[Bug target/109276] [11/12/13 Regression] ICE: in assign_stack_local_1, at function.cc:429 with -mpreferred-stack-boundary=2

2023-03-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109276

--- Comment #8 from Andrew Pinski  ---
A workaround is to do:
long long int digit __attribute__((aligned(8)));

Note manually setting the digit alignment to 4 does not cause an ICE for GCC
10.4 either.

[Bug target/109276] [11/12/13 Regression] ICE: in assign_stack_local_1, at function.cc:429 with -mpreferred-stack-boundary=2

2023-03-24 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109276

Jakub Jelinek  changed:

   What|Removed |Added

 CC||hjl.tools at gmail dot com,
   ||jakub at gcc dot gnu.org

--- Comment #7 from Jakub Jelinek  ---
This started to ICE with r11-508-gdfa4fcdba374ed44 in the newly added pass
there,
and since r11-2259-g0a9d711df36b42b6494b73 it ICEs similarly like on the trunk.

[Bug target/109276] [11/12/13 Regression] ICE: in assign_stack_local_1, at function.cc:429 with -mpreferred-stack-boundary=2

2023-03-24 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109276

Andrew Pinski  changed:

   What|Removed |Added

  Known to fail||11.1.0, 13.0
Summary|ICE: in |[11/12/13 Regression] ICE:
   |assign_stack_local_1, at|in assign_stack_local_1, at
   |function.cc:429 with|function.cc:429 with
   |-mpreferred-stack-boundary= |-mpreferred-stack-boundary=
   |2   |2
   Target Milestone|--- |11.4
   Keywords||ice-on-valid-code
  Known to work||10.1.0, 10.4.0, 9.5.0