[Bug target/92904] varargs for __int128 is placed at an unaligned location and uses movdqa for the load

2020-02-14 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92904

Jakub Jelinek  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |8.4

--- Comment #8 from Jakub Jelinek  ---
Fixed for 8.4+ and 9.3+ too.

[Bug target/92904] varargs for __int128 is placed at an unaligned location and uses movdqa for the load

2020-02-14 Thread cvs-commit at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92904

--- Comment #7 from CVS Commits  ---
The releases/gcc-8 branch has been updated by Jakub Jelinek
:

https://gcc.gnu.org/g:ffb5cc9a5599b1936c5ebea153ca52a0aa2c785d

commit r8-9995-gffb5cc9a5599b1936c5ebea153ca52a0aa2c785d
Author: Jakub Jelinek 
Date:   Fri Feb 14 15:24:48 2020 +0100

backport: re PR target/92904 (varargs for __int128 is placed at an
unaligned location and uses movdqa for the load)

Backported from mainline
2019-12-12  Jakub Jelinek  

PR target/92904
* config/i386/i386.c (ix86_gimplify_va_arg): If need_intregs and
not need_temp, decrease alignment of the read because the GPR save
area only guarantees 8-byte alignment.

* gcc.c-torture/execute/pr92904.c: New test.

[Bug target/92904] varargs for __int128 is placed at an unaligned location and uses movdqa for the load

2019-12-20 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92904

--- Comment #6 from Jakub Jelinek  ---
Author: jakub
Date: Fri Dec 20 17:43:23 2019
New Revision: 279673

URL: https://gcc.gnu.org/viewcvs?rev=279673=gcc=rev
Log:
Backported from mainline
2019-12-12  Jakub Jelinek  

PR target/92904
* config/i386/i386.c (ix86_gimplify_va_arg): If need_intregs and
not need_temp, decrease alignment of the read because the GPR save
area only guarantees 8-byte alignment.

* gcc.c-torture/execute/pr92904.c: New test.

Added:
branches/gcc-9-branch/gcc/testsuite/gcc.c-torture/execute/pr92904.c
Modified:
branches/gcc-9-branch/gcc/ChangeLog
branches/gcc-9-branch/gcc/config/i386/i386.c
branches/gcc-9-branch/gcc/testsuite/ChangeLog

[Bug target/92904] varargs for __int128 is placed at an unaligned location and uses movdqa for the load

2019-12-13 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92904

--- Comment #5 from Jakub Jelinek  ---
Fixed on the trunk so far.

[Bug target/92904] varargs for __int128 is placed at an unaligned location and uses movdqa for the load

2019-12-12 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92904

--- Comment #4 from Jakub Jelinek  ---
Author: jakub
Date: Fri Dec 13 00:09:34 2019
New Revision: 279327

URL: https://gcc.gnu.org/viewcvs?rev=279327=gcc=rev
Log:
PR target/92904
* config/i386/i386.c (ix86_gimplify_va_arg): If need_intregs and
not need_temp, decrease alignment of the read because the GPR save
area only guarantees 8-byte alignment.

* gcc.c-torture/execute/pr92904.c: New test.

Added:
trunk/gcc/testsuite/gcc.c-torture/execute/pr92904.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.c
trunk/gcc/testsuite/ChangeLog

[Bug target/92904] varargs for __int128 is placed at an unaligned location and uses movdqa for the load

2019-12-12 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92904

Jakub Jelinek  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2019-12-12
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #3 from Jakub Jelinek  ---
Created attachment 47481
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47481=edit
gcc10-pr92904.patch

Untested fix.

[Bug target/92904] varargs for __int128 is placed at an unaligned location and uses movdqa for the load

2019-12-12 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92904

Jakub Jelinek  changed:

   What|Removed |Added

 CC||hubicka at gcc dot gnu.org,
   ||jakub at gcc dot gnu.org,
   ||matz at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
I've started by checking the argument passing against the psABI:
struct S { long long a, b; };
struct __attribute__((aligned (16))) T { long long a, b; };
void f1 (__int128);
void f2 (int, __int128);
void f3 (int, int, __int128);
void f4 (int, int, int, __int128);
void f5 (int, int, int, int, __int128);
void f6 (int, int, int, int, int, __int128);
void f7 (int, int, int, int, int, int, __int128);
void f8 (int, int, int, int, int, int, int, __int128);
void f9 (int, ...);
void f10 (struct S);
void f11 (int, struct S);
void f12 (int, int, struct S);
void f13 (int, int, int, struct S);
void f14 (int, int, int, int, struct S);
void f15 (int, int, int, int, int, struct S);
void f16 (int, int, int, int, int, int, struct S);
void f17 (int, int, int, int, int, int, int, struct S);
void f18 (struct T);
void f19 (int, struct T);
void f20 (int, int, struct T);
void f21 (int, int, int, struct T);
void f22 (int, int, int, int, struct T);
void f23 (int, int, int, int, int, struct T);
void f24 (int, int, int, int, int, int, struct T);
void f25 (int, int, int, int, int, int, int, struct T);

void
foo ()
{
  __int128 i = -1;
  struct S s = { -1, -1 };
  struct T t = { -1, -1 };
  f1 (-1);
  f2 (0, -1);
  f3 (0, 0, -1);
  f4 (0, 0, 0, -1);
  f5 (0, 0, 0, 0, -1);
  f6 (0, 0, 0, 0, 0, -1);
  f7 (0, 0, 0, 0, 0, 0, -1);
  f8 (0, 0, 0, 0, 0, 0, 0, -1);
  f9 (0, i);
  f9 (0, 0, i);
  f9 (0, 0, 0, i);
  f9 (0, 0, 0, 0, i);
  f9 (0, 0, 0, 0, 0, i);
  f9 (0, 0, 0, 0, 0, 0, i);
  f9 (0, 0, 0, 0, 0, 0, 0, i);
  f10 (s);
  f11 (0, s);
  f12 (0, 0, s);
  f13 (0, 0, 0, s);
  f14 (0, 0, 0, 0, s);
  f15 (0, 0, 0, 0, 0, s);
  f16 (0, 0, 0, 0, 0, 0, s);
  f17 (0, 0, 0, 0, 0, 0, 0, s);
  f9 (0, s);
  f9 (0, 0, s);
  f9 (0, 0, 0, s);
  f9 (0, 0, 0, 0, s);
  f9 (0, 0, 0, 0, 0, s);
  f9 (0, 0, 0, 0, 0, 0, s);
  f9 (0, 0, 0, 0, 0, 0, 0, s);
  f18 (t);
  f19 (0, t);
  f20 (0, 0, t);
  f21 (0, 0, 0, t);
  f22 (0, 0, 0, 0, t);
  f23 (0, 0, 0, 0, 0, t);
  f24 (0, 0, 0, 0, 0, 0, t);
  f25 (0, 0, 0, 0, 0, 0, 0, t);
  f9 (0, t);
  f9 (0, 0, t);
  f9 (0, 0, 0, t);
  f9 (0, 0, 0, 0, t);
  f9 (0, 0, 0, 0, 0, t);
  f9 (0, 0, 0, 0, 0, 0, t);
  f9 (0, 0, 0, 0, 0, 0, 0, t);
}

I believe GCC follows the psABI here:
"For classification purposes __int128 is treated as if it were implemented
as:
typedef struct {
long low, high;
} __int128;
with the exception that arguments of type __int128 that are stored in
memory must be aligned on a 16-byte boundary."
and __int128 is passed the same as the aligned(16) structure containing two
long/long long fields, in particular, if both halves fit in general purpose
registers, they are passed there, without any gaps, if only half fits into gpr
and the other would need to go onto stack, it is passed on stack, and when
passed on stack, it is passed aligned on 16-byte boundary.
So, I'd say the bug is in the x86_64 va_arg expansion that when reading the
__int128 (and maybe > 8 bytes aligned struct too) from the area of spilled gprs
it should use 8 byte alignment (i.e. lower than the type has), will keep
looking into that.
Also checked with clang and that one seems to ignore the psABI completely,
hapilly passing the __int128 partially in gpr (%r9) and on the stack (that is
the f6 call and f9 (0, 0, 0, 0, 0, i);), which violates the psABI:
"If there are no registers available for any eightbyte of an argument, the
whole
argument is passed on the stack."
or that the __int128 should be passed 16-byte aligned on the stack (that is the
f8 call and f9 (0, 0, 0, 0, 0, 0, 0, -1);), which violates the psABI:
"with the exception that arguments of type __int128 that are stored in
memory must be aligned on a 16-byte boundary."
Note, the passing of struct S and struct T seems to be correct in both
compilers.