Hi!

Looking at the output of builtin-convertvector-1.c (f4), this patch changes
the generated code:
 	vcvttpd2dqy	(%rdi), %xmm0
-	vmovdqa	%xmm0, %xmm0
 	vmovaps	%xmm0, (%rsi)
-	vzeroupper
 	ret

The problem is that without vec_extract patterns to extract 128-bit vectors
from 256-bit ones, the expander creates TImode extraction and combine +
simplify-rtx.c isn't able to optimize it out properly due to
vector -> non-vector -> vector mode subregs in there.
We already have vec_extract patterns to extract 256-bit vectors from 512-bit
ones and we have all the vec_extract_{lo,hi}_* named insns even for the
128-bit out of 256-bit vectors, so this patch just makes those available to
the expander.
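
For reference, a minimal sketch of the kind of function that would produce
the code above. This is an assumption, not a copy of f4 from the testcase:
the vcvttpd2dqy instruction suggests a conversion of four doubles to four
ints, and the type and function names below are hypothetical.

```c
/* Hypothetical vector types; the actual testcase may name them differently.  */
typedef double v4df __attribute__ ((vector_size (32)));
typedef int v4si __attribute__ ((vector_size (16)));

/* Convert four doubles to four ints, truncating toward zero.
   The source is loaded from memory and the result stored to memory,
   matching the (%rdi) load and (%rsi) store in the assembly above.  */
void
convert_v4df_to_v4si (const v4df *src, v4si *dst)
{
  *dst = __builtin_convertvector (*src, v4si);
}
```

Compiling something like this with -O2 -mavx should be enough to compare the
generated code before and after the patch; __builtin_convertvector needs
GCC 9 or later.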

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-01-06  Jakub Jelinek  <ja...@redhat.com>

	* config/i386/sse.md (vec_extract<mode><ssehalfvecmodelower>): Use
	V_256_512 iterator instead of V_512 and TARGET_AVX instead of
	TARGET_AVX512F as condition.

--- gcc/config/i386/sse.md.jj	2019-01-04 09:56:08.548495229 +0100
+++ gcc/config/i386/sse.md	2019-01-05 21:33:34.057288059 +0100
@@ -8362,9 +8362,9 @@ (define_expand "vec_extract<mode><ssesca
 
 (define_expand "vec_extract<mode><ssehalfvecmodelower>"
   [(match_operand:<ssehalfvecmode> 0 "nonimmediate_operand")
-   (match_operand:V_512 1 "register_operand")
+   (match_operand:V_256_512 1 "register_operand")
    (match_operand 2 "const_0_to_1_operand")]
-  "TARGET_AVX512F"
+  "TARGET_AVX"
 {
   if (INTVAL (operands[2]))
     emit_insn (gen_vec_extract_hi_<mode> (operands[0], operands[1]));

	Jakub