https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86896

Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEW
                 CC|                            |hjl.tools at gmail dot com,
                   |                            |hubicka at gcc dot gnu.org,
                   |                            |jakub at gcc dot gnu.org,
                   |                            |uros at gcc dot gnu.org
           Assignee|marxin at gcc dot gnu.org   |unassigned at gcc dot gnu.org

--- Comment #3 from Martin Liška <marxin at gcc dot gnu.org> ---
So it's hard to isolate a self-contained test case, but we really do generate:
        vmovdqa64       %xmm16, %xmm4
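
For illustration only, here is a minimal sketch (not the PR's reproducer, which
is hard to isolate) of the kind of construct that pins a value to an EVEX-only
register and then needs a reg->reg move out of it; the explicit register
variable, the function name, and the build flags are assumptions:

/* Hypothetical sketch, assumed flags: gcc -O2 -mavx512f -mno-avx512vl -S.
   Pin a value to xmm16 (an EVEX-only register) so that returning it needs a
   register-to-register move back into xmm0.  */
#include <immintrin.h>

__m128i
copy_through_xmm16 (__m128i x)
{
  register __m128i t __asm__ ("xmm16") = x;  /* xmm16-31 have no VEX encoding */
  __asm__ ("" : "+v" (t));                   /* keep the value live in xmm16 */
  return t;                                  /* reg->reg move xmm16 -> xmm0 */
}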

I'm not an i386 expert, but according to this:
https://hjlebbink.github.io/x86doc/html/MOVDQA,VMOVDQA32_64.html

  Opcode/Instruction     : EVEX.128.66.0F.W1 6F /r
                           VMOVDQA64 xmm1 {k1}{z}, xmm2/m128
  Op/En                  : FVM-RM
  64/32 bit Mode Support : V/V
  CPUID Feature Flag     : AVX512VL AVX512F
  Description            : Move aligned quadword integer values from
                           xmm2/m128 to xmm1 using writemask k1.

The instruction requires the AVX512VL flag, but we don't check for it:

(define_insn "mov<mode>_internal"
  [(set (match_operand:VMOVE 0 "nonimmediate_operand"
         "=v,v ,v ,m")
        (match_operand:VMOVE 1 "nonimmediate_or_sse_const_operand"
         " C,BC,vm,v"))]
  "TARGET_SSE
   && (register_operand (operands[0], <MODE>mode)
       || register_operand (operands[1], <MODE>mode))"
{
  switch (get_attr_type (insn))
    {
    case TYPE_SSELOG1:
      return standard_sse_constant_opcode (insn, operands);

    case TYPE_SSEMOV:
      /* There is no evex-encoded vmov* for sizes smaller than 64-bytes
         in avx512f, so we need to use workarounds, to access sse registers
         16-31, which are evex-only. In avx512vl we don't need workarounds.  */
      if (TARGET_AVX512F && <MODE_SIZE> < 64 && !TARGET_AVX512VL // <-------------
          && (EXT_REX_SSE_REG_P (operands[0])
              || EXT_REX_SSE_REG_P (operands[1])))
        {
          if (memory_operand (operands[0], <MODE>mode))
            {
              if (<MODE_SIZE> == 32)
                return "vextract<shuffletype>64x4\t{$0x0, %g1, %0|%0, %g1, 0x0}";
              else if (<MODE_SIZE> == 16)
                return "vextract<shuffletype>32x4\t{$0x0, %g1, %0|%0, %g1, 0x0}";
              else
                gcc_unreachable ();
            }
          else if (memory_operand (operands[1], <MODE>mode))
            {
              if (<MODE_SIZE> == 32)
                return "vbroadcast<shuffletype>64x4\t{%1, %g0|%g0, %1}";
              else if (<MODE_SIZE> == 16)
                return "vbroadcast<shuffletype>32x4\t{%1, %g0|%g0, %1}";
              else
                gcc_unreachable ();
            }
          else
            /* Reg -> reg move is always aligned.  Just use wider move.  */
            switch (get_attr_mode (insn))
              {
              case MODE_V8SF:
              case MODE_V4SF:
                return "vmovaps\t{%g1, %g0|%g0, %g1}";
              case MODE_V4DF:
              case MODE_V2DF:
                return "vmovapd\t{%g1, %g0|%g0, %g1}";
              case MODE_OI:
              case MODE_TI:
                return "vmovdqa64\t{%g1, %g0|%g0, %g1}";
              default:
                gcc_unreachable ();
              }

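To spell out why the "wider move" fallback is legal with plain AVX512F: the %g
operand modifier prints the zmm names, and the 512-bit EVEX form of vmovdqa64
only needs AVX512F, whereas the 128-bit form seen in the reported output also
needs AVX512VL. A hypothetical side-by-side (illustration only; the function
name is made up, gas assembles both forms, but executing each needs the
respective extension):

void
contrast_evex_moves (void)
{
  /* 128-bit EVEX form, as in the reported output: needs AVX512F + AVX512VL.  */
  __asm__ volatile ("vmovdqa64 %%xmm16, %%xmm4" ::: "xmm4");
  /* Full-width form that the wider-move fallback emits via %g0/%g1:
     needs only AVX512F.  */
  __asm__ volatile ("vmovdqa64 %%zmm16, %%zmm4" ::: "xmm4");
}
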
Adding port maintainers to CC.

--- Comment #4 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
Yep, it seems that we are missing a TARGET_AVX512VL check here. I am also not
very familiar with the AVX512 ISA extension. H.J., would it be possible for you
to check whether we have more missing tests here?
