[Bug target/84786] [miscompilation] vunpcklpd accessing xmm16-22 targeting KNL

2018-06-25 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84786

Jakub Jelinek  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #22 from Jakub Jelinek  ---
Fixed for 6.5 and 7.4+ too.

[Bug target/84786] [miscompilation] vunpcklpd accessing xmm16-22 targeting KNL

2018-06-25 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84786

--- Comment #21 from Jakub Jelinek  ---
Author: jakub
Date: Mon Jun 25 17:55:15 2018
New Revision: 262103

URL: https://gcc.gnu.org/viewcvs?rev=262103=gcc=rev
Log:
PR target/84786
* config/i386/sse.md (vshift_count): New mode attr.
(3): Use N instead of vN
as last operand's constraint for VI2_AVX2_AVX512BW shifts.  Use YvN
instead of vN as last operand's constraint for VI48_AVX2 shifts.

* gcc.target/i386/avx512f-pr84786-3.c: New test.

Added:
branches/gcc-6-branch/gcc/testsuite/gcc.target/i386/avx512f-pr84786-3.c
Modified:
branches/gcc-6-branch/gcc/ChangeLog
branches/gcc-6-branch/gcc/config/i386/sse.md
branches/gcc-6-branch/gcc/testsuite/ChangeLog

[Bug target/84786] [miscompilation] vunpcklpd accessing xmm16-22 targeting KNL

2018-06-25 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84786

--- Comment #20 from Jakub Jelinek  ---
Author: jakub
Date: Mon Jun 25 12:58:03 2018
New Revision: 262016

URL: https://gcc.gnu.org/viewcvs?rev=262016=gcc=rev
Log:
2018-06-25  Jakub Jelinek  

PR target/84786
* gcc.target/i386/avx512f-pr84786-3.c: New test.

Added:
branches/gcc-8-branch/gcc/testsuite/gcc.target/i386/avx512f-pr84786-3.c
Modified:
branches/gcc-8-branch/gcc/testsuite/ChangeLog

[Bug target/84786] [miscompilation] vunpcklpd accessing xmm16-22 targeting KNL

2018-06-25 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84786

--- Comment #19 from Jakub Jelinek  ---
Author: jakub
Date: Mon Jun 25 12:57:04 2018
New Revision: 262015

URL: https://gcc.gnu.org/viewcvs?rev=262015=gcc=rev
Log:
PR target/84786
* config/i386/sse.md (vshift_count): New mode attr.
(3): Use N instead of vN
as last operand's constraint for VI2_AVX2_AVX512BW shifts.  Use YvN
instead of vN as last operand's constraint for VI48_AVX2 shifts.

* gcc.target/i386/avx512f-pr84786-3.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/i386/avx512f-pr84786-3.c
Modified:
trunk/gcc/testsuite/ChangeLog

[Bug target/84786] [miscompilation] vunpcklpd accessing xmm16-22 targeting KNL

2018-06-25 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84786

--- Comment #18 from Jakub Jelinek  ---
Author: jakub
Date: Mon Jun 25 12:48:29 2018
New Revision: 262014

URL: https://gcc.gnu.org/viewcvs?rev=262014=gcc=rev
Log:
PR target/84786
* config/i386/sse.md (vshift_count): New mode attr.
(3): Use N instead of vN
as last operand's constraint for VI2_AVX2_AVX512BW shifts.  Use YvN
instead of vN as last operand's constraint for VI48_AVX2 shifts.

* gcc.target/i386/avx512f-pr84786-3.c: New test.

Added:
branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/avx512f-pr84786-3.c
Modified:
branches/gcc-7-branch/gcc/ChangeLog
branches/gcc-7-branch/gcc/config/i386/sse.md
branches/gcc-7-branch/gcc/testsuite/ChangeLog

[Bug target/84786] [miscompilation] vunpcklpd accessing xmm16-22 targeting KNL

2018-06-22 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84786

--- Comment #17 from Jakub Jelinek  ---
Author: jakub
Date: Fri Jun 22 20:36:31 2018
New Revision: 261919

URL: https://gcc.gnu.org/viewcvs?rev=261919=gcc=rev
Log:
Backported from mainline
2018-03-13  Jakub Jelinek  

PR target/84786
* config/i386/sse.md (sse2_loadhpd): Use Yv constraint rather than v
on the last operand.

* gcc.target/i386/avx512f-pr84786-1.c: New test.
* gcc.target/i386/avx512f-pr84786-2.c: New test.

Added:
branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/avx512f-pr84786-1.c
branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/avx512f-pr84786-2.c
Modified:
branches/gcc-7-branch/gcc/ChangeLog
branches/gcc-7-branch/gcc/config/i386/sse.md
branches/gcc-7-branch/gcc/testsuite/ChangeLog

[Bug target/84786] [miscompilation] vunpcklpd accessing xmm16-22 targeting KNL

2018-06-22 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84786

--- Comment #16 from Jakub Jelinek  ---
Created attachment 44312
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44312=edit
gcc7-pr84786.patch

Patch I'm going to bootstrap/regtest on 7.x branch now.
On the trunk/in 8.x this got fixed with r253924 aka one of the PR82370 fixes,
but that is something that IMHO shouldn't be backported.

[Bug target/84786] [miscompilation] vunpcklpd accessing xmm16-22 targeting KNL

2018-03-27 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84786

--- Comment #15 from Matthias Kretz  ---
Here's an idea for a test case (https://godbolt.org/g/SjM2HE: it appears fixed
on GCC 8):

typedef unsigned short V __attribute__((vector_size (16)));

V foo (V x, int y)
{
  x <<= y;
  asm volatile (""::"x"(x):"xmm1", "xmm2", "xmm3",
 "xmm4", "xmm5", "xmm6", "xmm7", "xmm8", "xmm9",
 "xmm10", "xmm11", "xmm12", "xmm13", "xmm14", "xmm15");
  return x >> y;
}

[Bug target/84786] [miscompilation] vunpcklpd accessing xmm16-22 targeting KNL

2018-03-27 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84786

--- Comment #14 from Matthias Kretz  ---
I applied both patches to my GCC 7.2 installation and as a result my complete
testsuite passes now. Anything else I can help with?

[Bug target/84786] [miscompilation] vunpcklpd accessing xmm16-22 targeting KNL

2018-03-27 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84786

--- Comment #13 from Matthias Kretz  ---
I'll try to apply it locally and will report my findings.

[Bug target/84786] [miscompilation] vunpcklpd accessing xmm16-22 targeting KNL

2018-03-26 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84786

--- Comment #12 from Jakub Jelinek  ---
For 7.x I think we need:
--- gcc/config/i386/sse.md.jj   2018-03-05 17:04:45.820743323 +0100
+++ gcc/config/i386/sse.md  2018-03-26 17:29:00.967880855 +0200
@@ -10687,7 +10687,7 @@
   [(set (match_operand:VI2_AVX2_AVX512BW 0 "register_operand" "=x,v")
(any_lshift:VI2_AVX2_AVX512BW
  (match_operand:VI2_AVX2_AVX512BW 1 "register_operand" "0,v")
- (match_operand:DI 2 "nonmemory_operand" "xN,vN")))]
+ (match_operand:DI 2 "nonmemory_operand" "xN,YvN")))]
   "TARGET_SSE2 &&  && "
   "@
p\t{%2, %0|%0, %2}
@@ -10706,7 +10706,7 @@
   [(set (match_operand:VI48_AVX2 0 "register_operand" "=x,x,v")
(any_lshift:VI48_AVX2
  (match_operand:VI48_AVX2 1 "register_operand" "0,x,v")
- (match_operand:DI 2 "nonmemory_operand" "xN,xN,vN")))]
+ (match_operand:DI 2 "nonmemory_operand" "xN,xN,YvN")))]
   "TARGET_SSE2 && "
   "@
p\t{%2, %0|%0, %2}

but don't have time to test it right now, nor create short self-contained
testcases for it.

[Bug target/84786] [miscompilation] vunpcklpd accessing xmm16-22 targeting KNL

2018-03-26 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84786

--- Comment #11 from Matthias Kretz  ---
Created attachment 43762
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43762=edit
test case that produces incorrect vpsrlw

Compiled with `g++-7 -std=c++17 -O0 -fabi-version=0 -fabi-compat-version=0
-march=knl -o fail fail.cpp`

g++-7 (Ubuntu 7.2.0-1ubuntu1~16.04) 7.2.0

from objdump -d | grep vpsrlw I get:
  56e364:   62 b1 7d 08 d1 c0   vpsrlw %xmm16,%xmm0,%xmm0
  56eaf6:   62 b1 7d 08 d1 c1   vpsrlw %xmm17,%xmm0,%xmm0
  56f174:   62 b1 7d 08 d1 c2   vpsrlw %xmm18,%xmm0,%xmm0
  56f68c:   62 b1 7d 08 d1 c3   vpsrlw %xmm19,%xmm0,%xmm0
  58ef6f:   62 b1 7d 08 d1 c0   vpsrlw %xmm16,%xmm0,%xmm0
  58f6f5:   62 b1 7d 08 d1 c1   vpsrlw %xmm17,%xmm0,%xmm0
  58fd67:   62 b1 7d 08 d1 c2   vpsrlw %xmm18,%xmm0,%xmm0
  590273:   62 b1 7d 08 d1 c3   vpsrlw %xmm19,%xmm0,%xmm0
  59585d:   62 b1 7d 08 d1 c0   vpsrlw %xmm16,%xmm0,%xmm0
  595fef:   62 b1 7d 08 d1 c1   vpsrlw %xmm17,%xmm0,%xmm0
  596664:   62 b1 7d 08 d1 c2   vpsrlw %xmm18,%xmm0,%xmm0
  596b6a:   62 b1 7d 08 d1 c3   vpsrlw %xmm19,%xmm0,%xmm0
  59cb7a:   62 b1 7d 28 d1 c0   vpsrlw %xmm16,%ymm0,%ymm0
  59d39f:   62 b1 7d 28 d1 c1   vpsrlw %xmm17,%ymm0,%ymm0
  59d9fe:   62 b1 7d 28 d1 c2   vpsrlw %xmm18,%ymm0,%ymm0
  59dfc6:   62 b1 7d 28 d1 c3   vpsrlw %xmm19,%ymm0,%ymm0
  5a407c:   62 b1 7d 28 d1 c0   vpsrlw %xmm16,%ymm0,%ymm0
  5a4895:   62 b1 7d 28 d1 c1   vpsrlw %xmm17,%ymm0,%ymm0
  5a4eeb:   62 b1 7d 28 d1 c2   vpsrlw %xmm18,%ymm0,%ymm0
  5a54ad:   62 b1 7d 28 d1 c3   vpsrlw %xmm19,%ymm0,%ymm0
  5be392:   62 b1 7d 08 d1 c0   vpsrlw %xmm16,%xmm0,%xmm0
  5bea0b:   62 b1 7d 08 d1 c1   vpsrlw %xmm17,%xmm0,%xmm0
  5bef85:   62 b1 7d 08 d1 c2   vpsrlw %xmm18,%xmm0,%xmm0
  5bf3be:   62 b1 7d 08 d1 c3   vpsrlw %xmm19,%xmm0,%xmm0
  5d8ae0:   62 b1 7d 08 d1 c0   vpsrlw %xmm16,%xmm0,%xmm0
  5d9149:   62 b1 7d 08 d1 c1   vpsrlw %xmm17,%xmm0,%xmm0
  5d96b3:   62 b1 7d 08 d1 c2   vpsrlw %xmm18,%xmm0,%xmm0
  5d9adc:   62 b1 7d 08 d1 c3   vpsrlw %xmm19,%xmm0,%xmm0
  5de3e7:   62 b1 7d 08 d1 c0   vpsrlw %xmm16,%xmm0,%xmm0
  5dea62:   62 b1 7d 08 d1 c1   vpsrlw %xmm17,%xmm0,%xmm0
  5defd5:   62 b1 7d 08 d1 c2   vpsrlw %xmm18,%xmm0,%xmm0
  5df3fe:   62 b1 7d 08 d1 c3   vpsrlw %xmm19,%xmm0,%xmm0
  5e3cd2:   62 b1 7d 08 d1 c0   vpsrlw %xmm16,%xmm0,%xmm0
  5e431b:   62 b1 7d 08 d1 c1   vpsrlw %xmm17,%xmm0,%xmm0
  5e4865:   62 b1 7d 08 d1 c2   vpsrlw %xmm18,%xmm0,%xmm0
  5e4c6e:   62 b1 7d 08 d1 c3   vpsrlw %xmm19,%xmm0,%xmm0
  5e94bd:   62 b1 7d 08 d1 c0   vpsrlw %xmm16,%xmm0,%xmm0
  5e9b18:   62 b1 7d 08 d1 c1   vpsrlw %xmm17,%xmm0,%xmm0
  5ea06b:   62 b1 7d 08 d1 c2   vpsrlw %xmm18,%xmm0,%xmm0
  5ea474:   62 b1 7d 08 d1 c3   vpsrlw %xmm19,%xmm0,%xmm0
  799710:   c5 f9 d1 c5 vpsrlw %xmm5,%xmm0,%xmm0
  7999a9:   c5 f9 d1 c6 vpsrlw %xmm6,%xmm0,%xmm0
  799e3b:   c5 f9 d1 c7 vpsrlw %xmm7,%xmm0,%xmm0
  79a101:   62 b1 7d 08 d1 c0   vpsrlw %xmm16,%xmm0,%xmm0
  79a3c2:   62 b1 7d 08 d1 c1   vpsrlw %xmm17,%xmm0,%xmm0
  79a68c:   62 b1 7d 08 d1 c2   vpsrlw %xmm18,%xmm0,%xmm0
  7a1e51:   c5 f9 d1 c5 vpsrlw %xmm5,%xmm0,%xmm0
  7a20ea:   c5 f9 d1 c6 vpsrlw %xmm6,%xmm0,%xmm0
  7a2579:   c5 f9 d1 c7 vpsrlw %xmm7,%xmm0,%xmm0
  7a283f:   62 b1 7d 08 d1 c0   vpsrlw %xmm16,%xmm0,%xmm0
  7a2b00:   62 b1 7d 08 d1 c1   vpsrlw %xmm17,%xmm0,%xmm0
  7a2dca:   62 b1 7d 08 d1 c2   vpsrlw %xmm18,%xmm0,%xmm0

[Bug target/84786] [miscompilation] vunpcklpd accessing xmm16-22 targeting KNL

2018-03-26 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84786

--- Comment #10 from Matthias Kretz  ---
This is all I have right now:
TID 0 SDE-ERROR: Executed instruction not valid for specified chip (KNL):
0x70d281: vpsrlw xmm0, xmm0, xmm16
Image:
/home/travis/build/VcDevel/Vc/build-Experimental/c2dd920concentrateGCC7.2.0Relivy-bridgeknl/tests/mask_knl_vectorbuiltin+0x30d281
(in multi-region image, region# 0)
Function:
_ZN5Tests11load_store_IN2Vc2v29simd_maskIfNS2_6detail7avx_abiILi32EE3runEv
Instruction bytes are: 62 b1 7d 08 d1 c0

See bottom of: http://lxwww53.gsi.de/testDetails.php?test=2016375=14519

[Bug target/84786] [miscompilation] vunpcklpd accessing xmm16-22 targeting KNL

2018-03-26 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84786

--- Comment #9 from Jakub Jelinek  ---
(In reply to Matthias Kretz from comment #8)
> There seems to be a similar bug for vpsrlw and vpsllw. Do you need a
> testcase? (It's hard to hit the bug... just had one occur on a Travis CI
> build)

I don't see how that is possible.

Do you have at least objdump with the instructions (e.g. which exact operand it
is), and what gcc options were used?

[Bug target/84786] [miscompilation] vunpcklpd accessing xmm16-22 targeting KNL

2018-03-26 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84786

--- Comment #8 from Matthias Kretz  ---
There seems to be a similar bug for vpsrlw and vpsllw. Do you need a testcase?
(It's hard to hit the bug... just had one occur on a Travis CI build)

[Bug target/84786] [miscompilation] vunpcklpd accessing xmm16-22 targeting KNL

2018-03-13 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84786

--- Comment #7 from Jakub Jelinek  ---
Fixed for 8.1+ so far.

[Bug target/84786] [miscompilation] vunpcklpd accessing xmm16-22 targeting KNL

2018-03-13 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84786

--- Comment #6 from Jakub Jelinek  ---
Author: jakub
Date: Tue Mar 13 08:03:28 2018
New Revision: 258475

URL: https://gcc.gnu.org/viewcvs?rev=258475=gcc=rev
Log:
PR target/84786
* config/i386/sse.md (sse2_loadhpd): Use Yv constraint rather than v
on the last operand.

* gcc.target/i386/avx512f-pr84786-1.c: New test.
* gcc.target/i386/avx512f-pr84786-2.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/i386/avx512f-pr84786-1.c
trunk/gcc/testsuite/gcc.target/i386/avx512f-pr84786-2.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/sse.md
trunk/gcc/testsuite/ChangeLog

[Bug target/84786] [miscompilation] vunpcklpd accessing xmm16-22 targeting KNL

2018-03-12 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84786

--- Comment #5 from Jakub Jelinek  ---
Created attachment 43628
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43628=edit
gcc8-pr84786.patch

Untested fix.

[Bug target/84786] [miscompilation] vunpcklpd accessing xmm16-22 targeting KNL

2018-03-12 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84786

--- Comment #4 from Jakub Jelinek  ---
Simplified testcase for the testsuite:
/* PR target/84786 */
/* { dg-do run } */
/* { dg-options "-mavx512f -mno-avx512vl -O2" } */
/* { dg-require-effective-target avx512f } */

#include "avx512f-check.h"

typedef double V __attribute__((vector_size (16)));

__attribute__((noipa)) V
foo (V x, double y)
{
  register double z __asm ("xmm18");
  asm volatile ("" : "=v" (z) : "0" (y));
  x[1] = z;
  return x;
}

static void
avx512f_test (void)
{
  V a = foo ((V) { 1.0, 2.0 }, 3.0);
  if (a[0] != 1.0 || a[1] != 3.0)
abort ();
}

Now, the hard part is look around for similar issues (I've done it already at
some point, but this case apparently went unnoticed).

[Bug target/84786] [miscompilation] vunpcklpd accessing xmm16-22 targeting KNL

2018-03-12 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84786

Jakub Jelinek  changed:

   What|Removed |Added

   Keywords||wrong-code
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2018-03-12
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #3 from Jakub Jelinek  ---
Ah, I see the bug:
(define_insn "sse2_loadhpd"
  [(set (match_operand:V2DF 0 "nonimmediate_operand"
  "=x,v,x,v,o,o ,o")
(vec_concat:V2DF
  (vec_select:DF
(match_operand:V2DF 1 "nonimmediate_operand"
  " 0,v,0,v,0,0 ,0")
(parallel [(const_int 0)]))
  (match_operand:DF 2 "nonimmediate_operand"
  " m,m,x,v,x,*f,r")))]
While V2DFmode won't be put into %xmm16-%xmm31 without TARGET_AVX512VL, DFmode
can be, so we need an avx512vl alternative for the v for the last operand and
another one with x for avx/avx2/avx512f etc. without avx512vl.

[Bug target/84786] [miscompilation] vunpcklpd accessing xmm16-22 targeting KNL

2018-03-10 Thread kretz at kde dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84786

--- Comment #2 from Matthias Kretz  ---
Created attachment 43618
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43618=edit
unreduced testcase

Compile with `g++ -std=c++17 -O2 -march=knl -o knl-fail knl-fail.cpp`.
The function `Tests::operators_ > >::run()` countains the invalid instructions.

% g++-7 --version
g++-7 (Ubuntu 7.2.0-1ubuntu1~16.04) 7.2.0

[Bug target/84786] [miscompilation] vunpcklpd accessing xmm16-22 targeting KNL

2018-03-09 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84786

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek  ---
Without -mavx512vl the 128/256-bit pseudos should never be allocated in
xmm16-31 registers, so I really wonder how this happens.  If you don't mind
attaching unreduced code (compressed), I can do the reduction myself.