[Bug middle-end/105032] Compiling inline ASM x86 causing GCC stuck in an endless loop with 100% CPU usage

2022-08-01 Thread andrey.vihrov at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105032

--- Comment #15 from Andrey Vihrov  ---
*** Bug 101438 has been marked as a duplicate of this bug. ***

[Bug middle-end/105032] Compiling inline ASM x86 causing GCC stuck in an endless loop with 100% CPU usage

2022-07-28 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105032

H.J. Lu  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
   Target Milestone|---                         |11.3
             Status|NEW                         |RESOLVED

--- Comment #14 from H.J. Lu  ---
Fixed for GCC 12 and GCC 11.3.

[Bug middle-end/105032] Compiling inline ASM x86 causing GCC stuck in an endless loop with 100% CPU usage

2022-04-01 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105032

--- Comment #13 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Vladimir Makarov:

https://gcc.gnu.org/g:5f587c81bc558942d2988f5e2965a72471f5c202

commit r11-9754-g5f587c81bc558942d2988f5e2965a72471f5c202
Author: Vladimir N. Makarov 
Date:   Fri Apr 1 09:48:57 2022 -0400

[PR105032] LRA: modify loop condition to find reload insns for hard reg
splitting

When trying to split a hard reg live range to assign a hard reg to a reload
pseudo, LRA searches for the reload insns of the reload pseudo, assuming a
specific order of the reload insns.  This order is violated if the reload
pseudo is involved in an inheritance transformation.  In such a case, the
loop used to search for reload insns can become infinite.  The patch fixes
this.

gcc/ChangeLog:

PR middle-end/105032
* lra-assigns.c (find_reload_regno_insns): Modify loop condition.

gcc/testsuite/ChangeLog:

PR middle-end/105032
* gcc.target/i386/pr105032.c: New.

[Bug middle-end/105032] Compiling inline ASM x86 causing GCC stuck in an endless loop with 100% CPU usage

2022-03-30 Thread vmakarov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105032

--- Comment #12 from Vladimir Makarov  ---
The GCC 11 branch needs a slightly different patch.  I'll commit a modified
patch to the gcc-11 branch on Friday.

[Bug middle-end/105032] Compiling inline ASM x86 causing GCC stuck in an endless loop with 100% CPU usage

2022-03-30 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105032

--- Comment #11 from CVS Commits  ---
The master branch has been updated by Vladimir Makarov:

https://gcc.gnu.org/g:22b0476a814a4759bb68f38b9415624a0fe52a7d

commit r12-7924-g22b0476a814a4759bb68f38b9415624a0fe52a7d
Author: Vladimir N. Makarov 
Date:   Wed Mar 30 13:03:44 2022 -0400

[PR105032] LRA: modify loop condition to find reload insns for hard reg
splitting

When trying to split a hard reg live range to assign a hard reg to a reload
pseudo, LRA searches for the reload insns of the reload pseudo, assuming a
specific order of the reload insns.  This order is violated if the reload
pseudo is involved in an inheritance transformation.  In such a case, the
loop used to search for reload insns can become infinite.  The patch fixes
this.

gcc/ChangeLog:

PR middle-end/105032
* lra-assigns.cc (find_reload_regno_insns): Modify loop condition.

gcc/testsuite/ChangeLog:

PR middle-end/105032
* gcc.target/i386/pr105032.c: New.

[Bug middle-end/105032] Compiling inline ASM x86 causing GCC stuck in an endless loop with 100% CPU usage

2022-03-30 Thread vmakarov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105032

--- Comment #10 from Vladimir Makarov  ---
I've also reproduced the bug on the trunk.  The loop in question assumes a
specific order of the reload insns.  In this case the order of the insns
involving the reload pseudo is violated because the pseudo is also used for
inheritance.

We can change the loop condition to guarantee that the loop finishes
regardless of the reload insn order.  This might result in a failure of hard
reg live range splitting for the pseudo.  Permitting hard reg splitting for a
reload pseudo involved in inheritance is questionable with respect to both
correct LRA behavior and generated code efficiency, so it makes no sense to
me to support it.

The patch will be pushed to trunk right after testing finishes.
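
The effect of changing the loop condition can be seen in a small stand-alone
model of the loop listed in comment #2.  This is only a sketch, not the GCC
source: struct insn, uses_regno, search(), run_model() and the iteration cap
are hypothetical stand-ins for RTL insns, the insn_bitmap test and
find_reload_regno_insns.  The "modified" condition used here (stop looking
forward once second_insn is set) is an assumption chosen for illustration; it
is one way to make the loop finish regardless of insn order and is not
necessarily the committed change.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an RTL insn. */
struct insn {
    struct insn *prev, *next;
    bool uses_regno;   /* models the lra_reg_info[regno].insn_bitmap test */
};

/* Model of the search loop.  Returns the number of iterations executed, or
   -1 if the cap max_iters was hit, i.e. the loop would not have finished. */
static long search(const struct insn *start, int insns_num,
                   bool modified_condition, long max_iters)
{
    const struct insn *prev_insn = start->prev;
    const struct insn *next_insn = start->next;
    const struct insn *second_insn = NULL;
    long iters = 0;

    while (insns_num != 1) {
        /* Original condition: keep going while next_insn != NULL.
           Modified condition (assumption): also require second_insn == NULL. */
        bool forward_ok = next_insn != NULL
                          && (!modified_condition || second_insn == NULL);
        if (prev_insn == NULL && !forward_ok)
            break;                      /* loop condition is false */
        if (iters++ >= max_iters)
            return -1;                  /* no progress: would spin forever */

        if (prev_insn != NULL) {
            if (prev_insn->uses_regno)
                insns_num--;
            prev_insn = prev_insn->prev;
        }
        if (next_insn != NULL && second_insn == NULL) {
            if (!next_insn->uses_regno)
                next_insn = next_insn->next;
            else {
                second_insn = next_insn;
                insns_num--;
            }
        }
    }
    return iters;
}

/* Reaches the state printed in comment #2: prev_insn == NULL,
   next_insn == second_insn, insns_num == 2. */
static long run_model(bool modified_condition)
{
    struct insn before = { NULL, NULL, false };
    struct insn start  = { &before, NULL, true };
    struct insn after  = { &start, NULL, true };
    before.next = &start;
    start.next = &after;
    /* The pseudo is counted in 3 insns, but only 2 of them are found where
       the loop expects them (in this model, inheritance moved the third). */
    return search(&start, 3, modified_condition, 1000);
}
```

In this model run_model(false) hits the iteration cap, while run_model(true)
leaves the loop after a single iteration with insns_num still at 2, which
corresponds to giving up on hard reg splitting for that pseudo.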

[Bug middle-end/105032] Compiling inline ASM x86 causing GCC stuck in an endless loop with 100% CPU usage

2022-03-29 Thread vmakarov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105032

--- Comment #9 from Vladimir Makarov  ---
Cycling is the worst thing that can happen to a compiler (even a crash is
better).  This is the highest priority PR for me right now.  I cannot yet say
why the cycle does not finish.  It should, as it works only on reload
pseudos.  I'll investigate it further.

In any case, I hope to fix it this week.  Sorry for the inconvenience.

[Bug middle-end/105032] Compiling inline ASM x86 causing GCC stuck in an endless loop with 100% CPU usage

2022-03-23 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105032

H.J. Lu  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hjl.tools at gmail dot com

--- Comment #8 from H.J. Lu  ---
(In reply to Jakub Jelinek from comment #7)
> Anyway, with the "b" etc. constraints (which is a good idea to use on x86
> when it has single register constraints for those but can't be used on other
> arches which do not have such constraints) you just trigger slightly
> different path in the RA, but the problem remains roughly the same, you
> force use of 6 registers as input plus one memory address and esp is a stack
> pointer and ebp could be a frame pointer and it is a question if you don't
> need another register for the address of the memory input.

Since it is very unreliable, the inlined syscall with asm statement
has been removed from glibc.

[Bug middle-end/105032] Compiling inline ASM x86 causing GCC stuck in an endless loop with 100% CPU usage

2022-03-23 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105032

--- Comment #7 from Jakub Jelinek  ---
Consider e.g.
void bar (int *);
int
foo (int *a, int *b, int *c, int *d)
{
  for (int i = 0; i < 1024; i++)
    a[i] = a[i] * b[i] + (c[i] - d[i]);
  bar (a);
  return 42;
}
with -m32 -O3 -mavx -mstackrealign.
It needs to dynamically realign the stack because user asked for it (so that it
can use 256-bit aligned stack slots and callers don't guarantee that
alignment), so it needs a frame pointer (%ebp), stack pointer (%esp) and DRAP
(%ecx in this case).  Especially if you e.g. also use VLAs or alloca in the
function.
%ebp based addressing is for the automatic vars in the function in its stack
frame, stack pointer can be variable offset from it used for outgoing arguments
to function calls and push/pop or for alloca/VLAs and DRAP is used to access
function arguments which aren't at fixed offset from the frame pointer either.
Anyway, with the "b" etc. constraints (which is a good idea to use on x86 when
it has single register constraints for those but can't be used on other arches
which do not have such constraints) you just trigger slightly different path in
the RA, but the problem remains roughly the same, you force use of 6 registers
as input plus one memory address and esp is a stack pointer and ebp could be a
frame pointer and it is a question if you don't need another register for the
address of the memory input.

A way to free one input would be to store 2 arguments into an array and use the
whole array as one memory input and only inside of the inline asm load it into
the right registers.
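
One possible shape of that suggestion is sketched below.  It is an
illustration only, not code from this PR: the macro name my_syscall6_packed
and the locals _pair and _edi are hypothetical, and the sketch passes the
array by address in %edi rather than as an "m" operand, so that no operand
address depends on %esp or %ebp.  It assumes 32-bit x86 Linux and the
int $0x80 convention already used in comment #5.

```c
#if defined(__i386__) && defined(__linux__)
/* Sketch: arg5 and arg6 travel through a two-element array.  Only its
   address is a register operand; %edi and %ebp are loaded inside the asm. */
#define my_syscall6_packed(num, arg1, arg2, arg3, arg4, arg5, arg6)     \
({                                                                      \
        long _eax = (long)(num);                                        \
        long _pair[2] = { (long)(arg5), (long)(arg6) };                 \
        long _edi = (long)_pair;      /* address in, arg5 value out */  \
        __asm__ volatile (                                              \
                "pushl  %%ebp\n\t"           /* save frame pointer */   \
                "movl   4(%%edi),%%ebp\n\t"  /* arg6 -> %ebp */         \
                "movl   (%%edi),%%edi\n\t"   /* arg5 -> %edi */         \
                "int    $0x80\n\t"                                      \
                "popl   %%ebp\n\t"           /* restore %ebp */         \
                : "+a"(_eax), "+D"(_edi)                                \
                : "b"((long)(arg1)), "c"((long)(arg2)),                 \
                  "d"((long)(arg3)), "S"((long)(arg4))                  \
                : "memory", "cc"                                        \
        );                                                              \
        _eax;                                                           \
})
#endif
```

The "memory" clobber is what tells the compiler that the asm reads _pair
through the pointer in %edi.  The asm still pins six registers (%eax, %ebx,
%ecx, %edx, %esi, %edi), but it no longer has a separate memory input whose
address might need one more register.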

[Bug middle-end/105032] Compiling inline ASM x86 causing GCC stuck in an endless loop with 100% CPU usage

2022-03-23 Thread ammarfaizi2 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105032

--- Comment #6 from Ammar Faizi  ---
(In reply to Jakub Jelinek from comment #4)
> If this is a macro that users should use in arbitrary user code, there is
> another problem, if something is vectorized in the function, either using
> AVX or later or -mstackrealign is used, another register is needed for the
> stack realignment (DRAP register).


I don't really understand the stack realignment part, so I have a question:
what is the other register here?  Is it %ebp?

If we have %ebp as the frame pointer, can't the compiler just use it for
the realignment?

I am not sure what the DRAP register really means.  I searched for it, but
did not find anything relevant.

[Bug middle-end/105032] Compiling inline ASM x86 causing GCC stuck in an endless loop with 100% CPU usage

2022-03-23 Thread ammarfaizi2 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105032

--- Comment #5 from Ammar Faizi  ---
(In reply to Jakub Jelinek from comment #3)
> This has been hanging or ICEing on and off since forever.
> E.g. even r105000 ICEs, r20 works, r21 ICEs, r10-5912 works, r11-1
> hangs, so does current trunk.
> The first revision after r10-5912 to start hanging was
> r10-6326-gbcf3fa7cf5a3d024b507.
> Note, without optimizations, the inline asm is on or beyond the border what
> can be handled, it uses 6 of the 8 GPRs the arch has, the further two are
> the stack pointer and when not optimizing or if frame pointer is for
> whatever reason needed frame pointer.  The asm also has a memory input.  So,
> it fully depends on optimization (which isn't done with -O0 generally) that
> the address of the
> _arg6 variable can be expressed as offset(%esp) or offset(%ebp).  If it is
> not (and -O0 asks for no optimizations), then there are no registers left
> how to describe the input.

Interestingly, changing the my_syscall6() macro to this one works nicely.

#define my_syscall6(num, arg1, arg2, arg3, arg4, arg5, arg6)    \
({                                                              \
        long _eax  = (long)(num);                               \
        long _arg6 = (long)(arg6); /* Always in memory */       \
        __asm__ volatile (                                      \
                "pushl  %[_arg6]\n\t"                           \
                "pushl  %%ebp\n\t"                              \
                "movl   4(%%esp),%%ebp\n\t"                     \
                "int    $0x80\n\t"                              \
                "popl   %%ebp\n\t"                              \
                "addl   $4,%%esp\n\t"                           \
                : "+a"(_eax)            /* %eax */              \
                : "b"(arg1),            /* %ebx */              \
                  "c"(arg2),            /* %ecx */              \
                  "d"(arg3),            /* %edx */              \
                  "S"(arg4),            /* %esi */              \
                  "D"(arg5),            /* %edi */              \
                  [_arg6]"m"(_arg6)     /* memory */            \
                : "memory", "cc"                                \
        );                                                      \
        _eax;                                                   \
})

Link: https://godbolt.org/z/hdsffvr1d

What could possibly be wrong here?
I am not sure what the behavioral difference is between this macro and the
one posted previously.

[Bug middle-end/105032] Compiling inline ASM x86 causing GCC stuck in an endless loop with 100% CPU usage

2022-03-23 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105032

--- Comment #4 from Jakub Jelinek  ---
If this is a macro that users should use in arbitrary user code, there is
another problem, if something is vectorized in the function, either using AVX
or later or -mstackrealign is used, another register is needed for the stack
realignment (DRAP register).

[Bug middle-end/105032] Compiling inline ASM x86 causing GCC stuck in an endless loop with 100% CPU usage

2022-03-23 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105032

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2022-03-23
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |vmakarov at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
This has been hanging or ICEing on and off since forever.
E.g. even r105000 ICEs, r20 works, r21 ICEs, r10-5912 works, r11-1
hangs, so does current trunk.
The first revision after r10-5912 to start hanging was
r10-6326-gbcf3fa7cf5a3d024b507.
Note, without optimizations, the inline asm is on or beyond the border what can
be handled, it uses 6 of the 8 GPRs the arch has, the further two are the stack
pointer and when not optimizing or if frame pointer is for whatever reason
needed frame pointer.  The asm also has a memory input.  So, it fully depends
on optimization (which isn't done with -O0 generally) that the address of the
_arg6 variable can be expressed as offset(%esp) or offset(%ebp).  If it is not
(and -O0 asks for no optimizations), then there are no registers left how to
describe the input.
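
The register budget described above can be written down as a few lines of
arithmetic.  This is a deliberately simplified model, not how the register
allocator actually decides anything; the names I386_GPRS, allocatable_gprs
and asm_operands_fit are made up for this sketch.

```c
#include <stdbool.h>

enum { I386_GPRS = 8 };   /* eax ebx ecx edx esi edi ebp esp */

/* GPRs left for asm operands once %esp and, if a frame pointer is
   needed, %ebp are set aside. */
static int allocatable_gprs(bool frame_pointer_needed)
{
    return I386_GPRS - 1 /* %esp */ - (frame_pointer_needed ? 1 : 0);
}

/* pinned_regs: operands tied to specific registers by the asm.
   address_needs_reg: true when the memory input cannot be written as
   offset(%esp) or offset(%ebp) and needs a register for its address. */
static bool asm_operands_fit(int pinned_regs, bool address_needs_reg,
                             bool frame_pointer_needed)
{
    int needed = pinned_regs + (address_needs_reg ? 1 : 0);
    return needed <= allocatable_gprs(frame_pointer_needed);
}
```

With six pinned registers and a frame pointer, asm_operands_fit(6, false,
true) holds, but asm_operands_fit(6, true, true) does not: the asm only fits
when the address of _arg6 costs no extra register.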

[Bug middle-end/105032] Compiling inline ASM x86 causing GCC stuck in an endless loop with 100% CPU usage

2022-03-22 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105032

Hongtao.liu  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |crazylht at gmail dot com

--- Comment #2 from Hongtao.liu  ---
stuck in this loop

1731      for (prev_insn = PREV_INSN (start_insn),
1732             next_insn = NEXT_INSN (start_insn);
1733           insns_num != 1 && (prev_insn != NULL || next_insn != NULL); )
1734        {
1735          if (prev_insn != NULL)
1736            {
1737              if (bitmap_bit_p (&lra_reg_info[regno].insn_bitmap,
1738                                INSN_UID (prev_insn)))
1739                {
1740                  first_insn = prev_insn;
1741                  insns_num--;
1742                }
1743              prev_insn = PREV_INSN (prev_insn);
1744            }
1745          if (next_insn != NULL && second_insn == NULL)
1746            {
1747              if (! bitmap_bit_p (&lra_reg_info[regno].insn_bitmap,
1748                                  INSN_UID (next_insn)))
1749                next_insn = NEXT_INSN (next_insn);
1750              else
1751                {
1752                  second_insn = next_insn;
1753                  insns_num--;
1754                }
1755            }
1756        }

(gdb) p second_insn
$5 = (rtx_insn *) 0x7fffea2f9980
(gdb) p prev_insn
$6 = (rtx_insn *) 0x0
(gdb) p next_insn
$7 = (rtx_insn *) 0x7fffea2f9980
(gdb) p second_insn
$8 = (rtx_insn *) 0x7fffea2f9980
(gdb) p insns_num
$9 = 2
(gdb) f
#0  find_reload_regno_insns (regno=91, start=@0x7fffd308: 0xcc2968
::release()+68>, finish=@0x7fffd300:
0x7fffd320) at gcc/lra-assigns.cc:1733

[Bug middle-end/105032] Compiling inline ASM x86 causing GCC stuck in an endless loop with 100% CPU usage

2022-03-22 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105032

--- Comment #1 from Andrew Pinski  ---
GCC 6.4 used to ICE:
: In function 'void* __sys_mmap(void*, size_t, int, int, int, off_t)':
:39:1: error: unable to find a register to spill
 }
 ^
:39:1: error: this is the insn:
(insn 21 20 37 2 (set (reg:SI 102 [orig:96 offset.10_18 ] [96])
(mem/c:SI (plus:SI (reg/f:SI 16 argp)
(const_int 28 [0x1c])) [1 offset+0 S4 A32])) :37 86
{*movsi_internal}
 (expr_list:REG_DEAD (reg:SI 16 argp)
(nil)))
:39: confused by earlier errors, bailing out