Re: [PATCH] Switch vec_init and vec_extract optabs to 2 mode optab to allow extraction of vector from vector or initialization of vector from smaller vectors (PR target/80846)

Richard Earnshaw (lists) Tue, 01 Aug 2017 01:10:27 -0700

On 25/07/17 10:14, Jakub Jelinek wrote:
> Hi!
> 
> The following patch adjusts the vec_init and vec_extract optabs, so that
> they don't have in the expander names just the vector mode, but also another
> mode, for vec_extract the mode of the result and for vec_init the mode of
> the elts of the vector passed as second operand.
> 
> Without this patch, the second mode has been implicit, GET_MODE_INNER of
> the vector mode, so one could only extract a single element from a vector
> or construct a vector from elements.  While that is the most common case,
> we allow in GIMPLE e.g. construction of V8DImode from 4 V2DImode elements
> etc., and the vectorizer uses them.  Having the second mode in the name
> allows the generic code (vectorizer, expansion) to query whether the
> backend supports such vector from vector expansions or inits from vector
> elts and use them if available.
> 
> For vec_extract, if we want to extract, say, the high V2SImode from
> V4SImode, the fallback is to try to expand it as DImode extraction from
> V2DImode.  This works well in many cases, but doesn't really work for very
> large vectors: if we want to extract the high V8SImode from V16SImode on
> x86, we'd need OImode extraction from V2OImode, which is something the
> backend doesn't have any support for.
> For vec_init, the fallback is usually to go through memory, which is slow in
> many cases.
> 
> This patch only adds new vector from vector extract and init patterns to
> the i386 backend, but I had to change many other targets too, because
> it needs to have the element mode in the vec_extract/vec_init expander
> names.  Seems most of the backends didn't really have a mode attribute
> usable for this or had it only in uppercase, while for the names we need
> lowercase.  Some backends had a convention on how to name lower case
> vs. upper case modes, others didn't have any.  So I'm CCing maintainers
> of affected backends to seek advice on what mode attributes they want to
> use.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, where it improves
> e.g. the code generation for slp-43.c and slp-45.c testcases.
> make cc1 tested in cross-compilers to the remaining targets.
> 
> Ok for trunk?
> 
> 2017-07-25  Jakub Jelinek  <[email protected]>
> 
>       PR target/80846
>       * optabs.def (vec_extract_optab, vec_init_optab): Change from
>       a direct optab to conversion optab.
>       * optabs.c (expand_vector_broadcast): Use convert_optab_handler
>       with GET_MODE_INNER as last argument instead of optab_handler.
>       * expmed.c (extract_bit_field_1): Likewise.  Use vector from
>       vector extraction if possible and optab is available.
>       * expr.c (store_constructor): Use convert_optab_handler instead
>       of optab_handler.  Use vector initialization from smaller
>       vectors if possible and optab is available.
>       * tree-vect-stmts.c (vectorizable_load): Likewise.
>       * doc/md.texi (vec_extract, vec_init): Document that the optabs
>       now have two modes.
>       * config/i386/i386.c (ix86_expand_vector_init): Handle expansion
>       of vec_init from half-sized vectors with the same element mode.
>       * config/i386/sse.md (ssehalfvecmode): Add V4TI case.
>       (ssehalfvecmodelower, ssescalarmodelower): New mode attributes.
>       (reduc_plus_scal_v8df, reduc_plus_scal_v4df, reduc_plus_scal_v2df,
>       reduc_plus_scal_v16sf, reduc_plus_scal_v8sf, reduc_plus_scal_v4sf,
>       reduc_<code>_scal_<mode>, reduc_umin_scal_v8hi): Add element mode
>       after mode in gen_vec_extract* calls.
>       (vec_extract<mode>): Renamed to ...
>       (vec_extract<mode><ssescalarmodelower>): ... this.
>       (vec_extract<mode><ssehalfvecmodelower>): New expander.
>       (rotl<mode>3, rotr<mode>3, <shift_insn><mode>3, ashrv2di3): Add
>       element mode after mode in gen_vec_init* calls.
>       (VEC_INIT_HALF_MODE): New mode iterator.
>       (vec_init<mode>): Renamed to ...
>       (vec_init<mode><ssescalarmodelower>): ... this.
>       (vec_init<mode><ssehalfvecmodelower>): New expander.
>       * config/i386/mmx.md (vec_extractv2sf): Renamed to ...
>       (vec_extractv2sfsf): ... this.
>       (vec_initv2sf): Renamed to ...
>       (vec_initv2sfsf): ... this.
>       (vec_extractv2si): Renamed to ...
>       (vec_extractv2sisi): ... this.
>       (vec_initv2si): Renamed to ...
>       (vec_initv2sisi): ... this.
>       (vec_extractv4hi): Renamed to ...
>       (vec_extractv4hihi): ... this.
>       (vec_initv4hi): Renamed to ...
>       (vec_initv4hihi): ... this.
>       (vec_extractv8qi): Renamed to ...
>       (vec_extractv8qiqi): ... this.
>       (vec_initv8qi): Renamed to ...
>       (vec_initv8qiqi): ... this.
>       * config/rs6000/vector.md (VEC_base_l): New mode attribute.
>       (vec_init<mode>): Renamed to ...
>       (vec_init<mode><VEC_base_l>): ... this.
>       (vec_extract<mode>): Renamed to ...
>       (vec_extract<mode><VEC_base_l>): ... this.
>       * config/rs6000/paired.md (vec_initv2sf): Renamed to ...
>       (vec_initv2sfsf): ... this.
>       * config/rs6000/altivec.md (splitter, altivec_copysign_v4sf3,
>       vec_unpacku_hi_v16qi, vec_unpacku_hi_v8hi, vec_unpacku_lo_v16qi,
>       vec_unpacku_lo_v8hi, mulv16qi3, altivec_vreve<mode>2): Add
>       element mode after mode in gen_vec_init* calls.
>       * config/aarch64/aarch64-simd.md (vec_init<mode>): Renamed to ...
>       (vec_init<mode><Vel>): ... this.
>       (vec_extract<mode>): Renamed to ...
>       (vec_extract<mode><Vel>): ... this.
>       * config/aarch64/iterators.md (Vel): New mode attribute.
>       * config/s390/s390.c (s390_expand_vec_strlen, s390_expand_vec_movstr):
>       Add element mode after mode in gen_vec_extract* calls.
>       * config/s390/vector.md (non_vec_l): New mode attribute.
>       (vec_extract<mode>): Renamed to ...
>       (vec_extract<mode><non_vec_l>): ... this.
>       (vec_init<mode>): Renamed to ...
>       (vec_init<mode><non_vec_l>): ... this.
>       * config/s390/s390-builtins.def (s390_vlgvb, s390_vlgvh, s390_vlgvf,
>       s390_vlgvf_flt, s390_vlgvg, s390_vlgvg_dbl): Add element mode after
>       vec_extract mode.
>       * config/arm/iterators.md (V_elem_l): New mode attribute.
>       * config/arm/neon.md (vec_extract<mode>): Renamed to ...
>       (vec_extract<mode><V_elem_l>): ... this.
>       (vec_extractv2di): Renamed to ...
>       (vec_extractv2didi): ... this.
>       (vec_init<mode>): Renamed to ...
>       (vec_init<mode><V_elem_l>): ... this.
>       (reduc_plus_scal_<mode>, reduc_plus_scal_v2di, reduc_smin_scal_<mode>,
>       reduc_smax_scal_<mode>, reduc_umin_scal_<mode>,
>       reduc_umax_scal_<mode>, neon_vget_lane<mode>, neon_vget_laneu<mode>):
>       Add element mode after mode in gen_vec_extract* calls.
>       * config/mips/mips-msa.md (vec_init<mode>): Renamed to ...
>       (vec_init<mode><unitmode>): ... this.
>       (vec_extract<mode>): Renamed to ...
>       (vec_extract<mode><unitmode>): ... this.
>       * config/mips/loongson.md (vec_init<mode>): Renamed to ...
>       (vec_init<mode><unitmode>): ... this.
>       * config/mips/mips-ps-3d.md (vec_initv2sf): Renamed to ...
>       (vec_initv2sfsf): ... this.
>       (vec_extractv2sf): Renamed to ...
>       (vec_extractv2sfsf): ... this.
>       (reduc_plus_scal_v2sf, reduc_smin_scal_v2sf, reduc_smax_scal_v2sf):
>       Add element mode after mode in gen_vec_extract* calls.
>       * config/mips/mips.md (unitmode): New mode iterator.
>       * config/spu/spu.c (spu_expand_prologue, spu_allocate_stack,
>       spu_builtin_extract): Add element mode after mode in
>       gen_vec_extract* calls.
>       * config/spu/spu.md (inner_l): New mode attribute.
>       (vec_init<mode>): Renamed to ...
>       (vec_init<mode><inner_l>): ... this.
>       (vec_extract<mode>): Renamed to ...
>       (vec_extract<mode><inner_l>): ... this.
>       * config/sparc/sparc.md (veltmode): New mode iterator.
>       (vec_init<VMALL:mode>): Renamed to ...
>       (vec_init<VMALL:mode><VMALL:veltmode>): ... this.
>       * config/ia64/vect.md (vec_initv2si): Renamed to ...
>       (vec_initv2sisi): ... this.
>       (vec_initv2sf): Renamed to ...
>       (vec_initv2sfsf): ... this.
>       (vec_extractv2sf): Renamed to ...
>       (vec_extractv2sfsf): ... this.
>       * config/powerpcspe/vector.md (VEC_base_l): New mode attribute.
>       (vec_init<mode>): Renamed to ...
>       (vec_init<mode><VEC_base_l>): ... this.
>       (vec_extract<mode>): Renamed to ...
>       (vec_extract<mode><VEC_base_l>): ... this.
>       * config/powerpcspe/paired.md (vec_initv2sf): Renamed to ...
>       (vec_initv2sfsf): ... this.
>       * config/powerpcspe/altivec.md (splitter, altivec_copysign_v4sf3,
>       vec_unpacku_hi_v16qi, vec_unpacku_hi_v8hi, vec_unpacku_lo_v16qi,
>       vec_unpacku_lo_v8hi, mulv16qi3): Add element mode after mode in
>       gen_vec_init* calls.
>


Arm & AArch64 bits are OK.

R.

> --- gcc/optabs.def.jj 2017-07-24 10:57:45.944815535 +0200
> +++ gcc/optabs.def    2017-07-24 16:11:23.066229910 +0200
> @@ -89,6 +89,8 @@ OPTAB_CD(vec_cmpu_optab, "vec_cmpu$a$b")
>  OPTAB_CD(vec_cmpeq_optab, "vec_cmpeq$a$b")
>  OPTAB_CD(maskload_optab, "maskload$a$b")
>  OPTAB_CD(maskstore_optab, "maskstore$a$b")
> +OPTAB_CD(vec_extract_optab, "vec_extract$a$b")
> +OPTAB_CD(vec_init_optab, "vec_init$a$b")
>  
>  OPTAB_NL(add_optab, "add$P$a3", PLUS, "add", '3', gen_int_fp_fixed_libfunc)
>  OPTAB_NX(add_optab, "add$F$a3")
> @@ -294,8 +296,6 @@ OPTAB_D (udot_prod_optab, "udot_prod$I$a
>  OPTAB_D (usum_widen_optab, "widen_usum$I$a3")
>  OPTAB_D (usad_optab, "usad$I$a")
>  OPTAB_D (ssad_optab, "ssad$I$a")
> -OPTAB_D (vec_extract_optab, "vec_extract$a")
> -OPTAB_D (vec_init_optab, "vec_init$a")
>  OPTAB_D (vec_pack_sfix_trunc_optab, "vec_pack_sfix_trunc_$a")
>  OPTAB_D (vec_pack_ssat_optab, "vec_pack_ssat_$a")
>  OPTAB_D (vec_pack_trunc_optab, "vec_pack_trunc_$a")
> --- gcc/optabs.c.jj   2017-07-24 10:57:46.216812275 +0200
> +++ gcc/optabs.c      2017-07-24 16:11:23.067229898 +0200
> @@ -386,7 +386,8 @@ expand_vector_broadcast (machine_mode vm
>    /* ??? If the target doesn't have a vec_init, then we have no easy way
>       of performing this operation.  Most of this sort of generic support
>       is hidden away in the vector lowering support in gimple.  */
> -  icode = optab_handler (vec_init_optab, vmode);
> +  icode = convert_optab_handler (vec_init_optab, vmode,
> +                              GET_MODE_INNER (vmode));
>    if (icode == CODE_FOR_nothing)
>      return NULL;
>  
> --- gcc/expmed.c.jj   2017-07-24 10:57:45.914815894 +0200
> +++ gcc/expmed.c      2017-07-24 16:11:23.071229850 +0200
> @@ -1566,6 +1566,55 @@ extract_bit_field_1 (rtx str_rtx, unsign
>        return op0;
>      }
>  
> +  /* First try to check for vector from vector extractions.  */
> +  if (VECTOR_MODE_P (GET_MODE (op0))
> +      && !MEM_P (op0)
> +      && VECTOR_MODE_P (tmode)
> +      && GET_MODE_SIZE (GET_MODE (op0)) > GET_MODE_SIZE (tmode))
> +    {
> +      machine_mode new_mode = GET_MODE (op0);
> +      if (GET_MODE_INNER (new_mode) != GET_MODE_INNER (tmode))
> +     {
> +       new_mode = mode_for_vector (GET_MODE_INNER (tmode),
> +                                   GET_MODE_BITSIZE (GET_MODE (op0))
> +                                   / GET_MODE_UNIT_BITSIZE (tmode));
> +       if (!VECTOR_MODE_P (new_mode)
> +           || GET_MODE_SIZE (new_mode) != GET_MODE_SIZE (GET_MODE (op0))
> +           || GET_MODE_INNER (new_mode) != GET_MODE_INNER (tmode)
> +           || !targetm.vector_mode_supported_p (new_mode))
> +         new_mode = VOIDmode;
> +     }
> +      if (new_mode != VOIDmode
> +       && (convert_optab_handler (vec_extract_optab, new_mode, tmode)
> +           != CODE_FOR_nothing)
> +       && ((bitnum + bitsize - 1) / GET_MODE_BITSIZE (tmode)
> +           == bitnum / GET_MODE_BITSIZE (tmode)))
> +     {
> +       struct expand_operand ops[3];
> +       machine_mode outermode = new_mode;
> +       machine_mode innermode = tmode;
> +       enum insn_code icode
> +         = convert_optab_handler (vec_extract_optab, outermode, innermode);
> +       unsigned HOST_WIDE_INT pos = bitnum / GET_MODE_BITSIZE (innermode);
> +
> +       if (new_mode != GET_MODE (op0))
> +         op0 = gen_lowpart (new_mode, op0);
> +       create_output_operand (&ops[0], target, innermode);
> +       ops[0].target = 1;
> +       create_input_operand (&ops[1], op0, outermode);
> +       create_integer_operand (&ops[2], pos);
> +       if (maybe_expand_insn (icode, 3, ops))
> +         {
> +           if (alt_rtl && ops[0].target)
> +             *alt_rtl = target;
> +           target = ops[0].value;
> +           if (GET_MODE (target) != mode)
> +             return gen_lowpart (tmode, target);
> +           return target;
> +         }
> +     }
> +    }
> +
>    /* See if we can get a better vector mode before extracting.  */
>    if (VECTOR_MODE_P (GET_MODE (op0))
>        && !MEM_P (op0)
> @@ -1599,14 +1648,17 @@ extract_bit_field_1 (rtx str_rtx, unsign
>       available.  */
>    if (VECTOR_MODE_P (GET_MODE (op0))
>        && !MEM_P (op0)
> -      && optab_handler (vec_extract_optab, GET_MODE (op0)) != CODE_FOR_nothing
> +      && (convert_optab_handler (vec_extract_optab, GET_MODE (op0),
> +                              GET_MODE_INNER (GET_MODE (op0)))
> +       != CODE_FOR_nothing)
>        && ((bitnum + bitsize - 1) / GET_MODE_UNIT_BITSIZE (GET_MODE (op0))
>         == bitnum / GET_MODE_UNIT_BITSIZE (GET_MODE (op0))))
>      {
>        struct expand_operand ops[3];
>        machine_mode outermode = GET_MODE (op0);
>        machine_mode innermode = GET_MODE_INNER (outermode);
> -      enum insn_code icode = optab_handler (vec_extract_optab, outermode);
> +      enum insn_code icode
> +     = convert_optab_handler (vec_extract_optab, outermode, innermode);
>        unsigned HOST_WIDE_INT pos = bitnum / GET_MODE_BITSIZE (innermode);
>  
>        create_output_operand (&ops[0], target, innermode);
> --- gcc/expr.c.jj     2017-07-24 10:57:45.963815307 +0200
> +++ gcc/expr.c        2017-07-24 16:11:23.073229826 +0200
> @@ -6589,6 +6589,7 @@ store_constructor (tree exp, rtx target,
>       rtvec vector = NULL;
>       unsigned n_elts;
>       alias_set_type alias;
> +     bool vec_vec_init_p = false;
>  
>       gcc_assert (eltmode != BLKmode);
>  
> @@ -6596,27 +6597,30 @@ store_constructor (tree exp, rtx target,
>       if (REG_P (target) && VECTOR_MODE_P (GET_MODE (target)))
>         {
>           machine_mode mode = GET_MODE (target);
> +         machine_mode emode = eltmode;
>  
> -         icode = (int) optab_handler (vec_init_optab, mode);
> -         /* Don't use vec_init<mode> if some elements have VECTOR_TYPE.  */
> -         if (icode != CODE_FOR_nothing)
> +         if (CONSTRUCTOR_NELTS (exp)
> +             && (TREE_CODE (TREE_TYPE (CONSTRUCTOR_ELT (exp, 0)->value))
> +                 == VECTOR_TYPE))
>             {
> -             tree value;
> -
> -             FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (exp), idx, value)
> -               if (TREE_CODE (TREE_TYPE (value)) == VECTOR_TYPE)
> -                 {
> -                   icode = CODE_FOR_nothing;
> -                   break;
> -                 }
> +             tree etype = TREE_TYPE (CONSTRUCTOR_ELT (exp, 0)->value);
> +             gcc_assert (CONSTRUCTOR_NELTS (exp) * TYPE_VECTOR_SUBPARTS (etype)
> +                         == n_elts);
> +             emode = TYPE_MODE (etype);
>             }
> +         icode = (int) convert_optab_handler (vec_init_optab, mode, emode);
>           if (icode != CODE_FOR_nothing)
>             {
> -             unsigned int i;
> +             unsigned int i, n = n_elts;
>  
> -             vector = rtvec_alloc (n_elts);
> -             for (i = 0; i < n_elts; i++)
> -               RTVEC_ELT (vector, i) = CONST0_RTX (GET_MODE_INNER (mode));
> +             if (emode != eltmode)
> +               {
> +                 n = CONSTRUCTOR_NELTS (exp);
> +                 vec_vec_init_p = true;
> +               }
> +             vector = rtvec_alloc (n);
> +             for (i = 0; i < n; i++)
> +               RTVEC_ELT (vector, i) = CONST0_RTX (emode);
>             }
>         }
>  
> @@ -6634,10 +6638,10 @@ store_constructor (tree exp, rtx target,
>  
>           FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (exp), idx, value)
>             {
> -             int n_elts_here = tree_to_uhwi
> -               (int_const_binop (TRUNC_DIV_EXPR,
> -                                 TYPE_SIZE (TREE_TYPE (value)),
> -                                 TYPE_SIZE (elttype)));
> +             tree sz = TYPE_SIZE (TREE_TYPE (value));
> +             int n_elts_here
> +               = tree_to_uhwi (int_const_binop (TRUNC_DIV_EXPR, sz,
> +                                                TYPE_SIZE (elttype)));
>  
>               count += n_elts_here;
>               if (mostly_zeros_p (value))
> @@ -6687,18 +6691,21 @@ store_constructor (tree exp, rtx target,
>  
>           if (vector)
>             {
> -             /* vec_init<mode> should not be used if there are VECTOR_TYPE
> -                elements.  */
> -             gcc_assert (TREE_CODE (TREE_TYPE (value)) != VECTOR_TYPE);
> -             RTVEC_ELT (vector, eltpos)
> -               = expand_normal (value);
> +             if (vec_vec_init_p)
> +               {
> +                 gcc_assert (ce->index == NULL_TREE);
> +                 gcc_assert (TREE_CODE (TREE_TYPE (value)) == VECTOR_TYPE);
> +                 eltpos = idx;
> +               }
> +             else
> +               gcc_assert (TREE_CODE (TREE_TYPE (value)) != VECTOR_TYPE);
> +             RTVEC_ELT (vector, eltpos) = expand_normal (value);
>             }
>           else
>             {
> -             machine_mode value_mode =
> -               TREE_CODE (TREE_TYPE (value)) == VECTOR_TYPE
> -               ? TYPE_MODE (TREE_TYPE (value))
> -               : eltmode;
> +             machine_mode value_mode
> +               = (TREE_CODE (TREE_TYPE (value)) == VECTOR_TYPE
> +                  ? TYPE_MODE (TREE_TYPE (value)) : eltmode);
>               bitpos = eltpos * elt_size;
>               store_constructor_field (target, bitsize, bitpos, 0,
>                                        bitregion_end, value_mode,
> @@ -6707,9 +6714,9 @@ store_constructor (tree exp, rtx target,
>         }
>  
>       if (vector)
> -       emit_insn (GEN_FCN (icode)
> -                  (target,
> -                   gen_rtx_PARALLEL (GET_MODE (target), vector)));
> +       emit_insn (GEN_FCN (icode) (target,
> +                                   gen_rtx_PARALLEL (GET_MODE (target),
> +                                                     vector)));
>       break;
>        }
>  
> --- gcc/tree-vect-stmts.c.jj  2017-07-24 10:57:46.004814816 +0200
> +++ gcc/tree-vect-stmts.c     2017-07-24 16:11:23.049230114 +0200
> @@ -6996,29 +6996,43 @@ vectorizable_load (gimple *stmt, gimple_
>       {
>         if (group_size < nunits)
>           {
> -           /* Avoid emitting a constructor of vector elements by performing
> -              the loads using an integer type of the same size,
> -              constructing a vector of those and then re-interpreting it
> -              as the original vector type.  This works around the fact
> -              that the vec_init optab was only designed for scalar
> -              element modes and thus expansion goes through memory.
> -              This avoids a huge runtime penalty due to the general
> -              inability to perform store forwarding from smaller stores
> -              to a larger load.  */
> -           unsigned lsize
> -             = group_size * TYPE_PRECISION (TREE_TYPE (vectype));
> -           machine_mode elmode = mode_for_size (lsize, MODE_INT, 0);
> -           machine_mode vmode = mode_for_vector (elmode,
> -                                                 nunits / group_size);
> -           /* If we can't construct such a vector fall back to
> -              element loads of the original vector type.  */
> +           /* First check if vec_init optab supports construction from
> +              vector elts directly.  */
> +           machine_mode elmode = TYPE_MODE (TREE_TYPE (vectype));
> +           machine_mode vmode = mode_for_vector (elmode, group_size);
>             if (VECTOR_MODE_P (vmode)
> -               && optab_handler (vec_init_optab, vmode) != CODE_FOR_nothing)
> +               && (convert_optab_handler (vec_init_optab,
> +                                          TYPE_MODE (vectype), vmode)
> +                   != CODE_FOR_nothing))
>               {
>                 nloads = nunits / group_size;
>                 lnel = group_size;
> -               ltype = build_nonstandard_integer_type (lsize, 1);
> -               lvectype = build_vector_type (ltype, nloads);
> +               ltype = build_vector_type (TREE_TYPE (vectype), group_size);
> +             }
> +           else
> +             {
> +               /* Otherwise avoid emitting a constructor of vector elements
> +                  by performing the loads using an integer type of the same
> +                  size, constructing a vector of those and then
> +                  re-interpreting it as the original vector type.
> +                  This avoids a huge runtime penalty due to the general
> +                  inability to perform store forwarding from smaller stores
> +                  to a larger load.  */
> +               unsigned lsize
> +                 = group_size * TYPE_PRECISION (TREE_TYPE (vectype));
> +               elmode = mode_for_size (lsize, MODE_INT, 0);
> +               vmode = mode_for_vector (elmode, nunits / group_size);
> +               /* If we can't construct such a vector fall back to
> +                  element loads of the original vector type.  */
> +               if (VECTOR_MODE_P (vmode)
> +                   && (convert_optab_handler (vec_init_optab, vmode, elmode)
> +                       != CODE_FOR_nothing))
> +                 {
> +                   nloads = nunits / group_size;
> +                   lnel = group_size;
> +                   ltype = build_nonstandard_integer_type (lsize, 1);
> +                   lvectype = build_vector_type (ltype, nloads);
> +                 }
>               }
>           }
>         else
> --- gcc/doc/md.texi.jj        2017-07-24 10:57:45.989814996 +0200
> +++ gcc/doc/md.texi   2017-07-24 17:09:55.536882382 +0200
> @@ -4871,15 +4871,22 @@ This pattern is not allowed to @code{FAI
>  Set given field in the vector value.  Operand 0 is the vector to modify,
>  operand 1 is new value of field and operand 2 specify the field index.
>  
> -@cindex @code{vec_extract@var{m}} instruction pattern
> -@item @samp{vec_extract@var{m}}
> +@cindex @code{vec_extract@var{m}@var{n}} instruction pattern
> +@item @samp{vec_extract@var{m}@var{n}}
>  Extract given field from the vector value.  Operand 1 is the vector, operand 2
> -specify field index and operand 0 place to store value into.
> +specify field index and operand 0 place to store value into.  The
> +@var{n} mode is the mode of the field or vector of fields that should be
> +extracted, should be either element mode of the vector mode @var{m}, or
> +a vector mode with the same element mode and smaller number of elements.
> +If @var{n} is a vector mode, the index is counted in units of that mode.
>  
> -@cindex @code{vec_init@var{m}} instruction pattern
> -@item @samp{vec_init@var{m}}
> +@cindex @code{vec_init@var{m}@var{n}} instruction pattern
> +@item @samp{vec_init@var{m}@var{n}}
>  Initialize the vector to given values.  Operand 0 is the vector to initialize
> -and operand 1 is parallel containing values for individual fields.
> +and operand 1 is parallel containing values for individual fields.  The
> +@var{n} mode is the mode of the elements, should be either element mode of
> +the vector mode @var{m}, or a vector mode with the same element mode and
> +smaller number of elements.
>  
>  @cindex @code{vec_cmp@var{m}@var{n}} instruction pattern
>  @item @samp{vec_cmp@var{m}@var{n}}
> --- gcc/config/i386/i386.c.jj 2017-07-24 10:58:11.831505333 +0200
> +++ gcc/config/i386/i386.c    2017-07-24 16:11:23.060229982 +0200
> @@ -44297,6 +44297,34 @@ ix86_expand_vector_init (bool mmx_ok, rt
>    int i;
>    rtx x;
>  
> +  /* Handle first initialization from vector elts.  */
> +  if (n_elts != XVECLEN (vals, 0))
> +    {
> +      rtx subtarget = target;
> +      x = XVECEXP (vals, 0, 0);
> +      gcc_assert (GET_MODE_INNER (GET_MODE (x)) == inner_mode);
> +      if (GET_MODE_NUNITS (GET_MODE (x)) * 2 == n_elts)
> +     {
> +       rtx ops[2] = { XVECEXP (vals, 0, 0), XVECEXP (vals, 0, 1) };
> +       if (inner_mode == QImode || inner_mode == HImode)
> +         {
> +           mode = mode_for_vector (SImode,
> +                                   n_elts * GET_MODE_SIZE (inner_mode) / 4);
> +           inner_mode
> +             = mode_for_vector (SImode,
> +                                n_elts * GET_MODE_SIZE (inner_mode) / 8);
> +           ops[0] = gen_lowpart (inner_mode, ops[0]);
> +           ops[1] = gen_lowpart (inner_mode, ops[1]);
> +           subtarget = gen_reg_rtx (mode);
> +         }
> +       ix86_expand_vector_init_concat (mode, subtarget, ops, 2);
> +       if (subtarget != target)
> +         emit_move_insn (target, gen_lowpart (GET_MODE (target), subtarget));
> +       return;
> +     }
> +      gcc_unreachable ();
> +    }
> +
>    for (i = 0; i < n_elts; ++i)
>      {
>        x = XVECEXP (vals, 0, i);
> --- gcc/config/i386/sse.md.jj 2017-07-24 10:57:45.807817176 +0200
> +++ gcc/config/i386/sse.md    2017-07-24 16:54:35.658088768 +0200
> @@ -658,13 +658,21 @@ (define_mode_attr ssedoublevecmode
>  
>  ;; Mapping of vector modes to a vector mode of half size
>  (define_mode_attr ssehalfvecmode
> -  [(V64QI "V32QI") (V32HI "V16HI") (V16SI "V8SI") (V8DI "V4DI")
> +  [(V64QI "V32QI") (V32HI "V16HI") (V16SI "V8SI") (V8DI "V4DI") (V4TI "V2TI")
>     (V32QI "V16QI") (V16HI  "V8HI") (V8SI  "V4SI") (V4DI "V2DI")
>     (V16QI  "V8QI") (V8HI   "V4HI") (V4SI  "V2SI")
>     (V16SF "V8SF") (V8DF "V4DF")
>     (V8SF  "V4SF") (V4DF "V2DF")
>     (V4SF  "V2SF")])
>  
> +(define_mode_attr ssehalfvecmodelower
> +  [(V64QI "v32qi") (V32HI "v16hi") (V16SI "v8si") (V8DI "v4di") (V4TI "v2ti")
> +   (V32QI "v16qi") (V16HI  "v8hi") (V8SI  "v4si") (V4DI "v2di")
> +   (V16QI  "v8qi") (V8HI   "v4hi") (V4SI  "v2si")
> +   (V16SF "v8sf") (V8DF "v4df")
> +   (V8SF  "v4sf") (V4DF "v2df")
> +   (V4SF  "v2sf")])
> +
>  ;; Mapping of vector modes ti packed single mode of the same size
>  (define_mode_attr ssePSmode
>    [(V16SI "V16SF") (V8DF "V16SF")
> @@ -690,6 +698,16 @@ (define_mode_attr ssescalarmode
>     (V8DF "DF")  (V4DF "DF")  (V2DF "DF")
>     (V4TI "TI")  (V2TI "TI")])
>  
> +;; Mapping of vector modes back to the scalar modes
> +(define_mode_attr ssescalarmodelower
> +  [(V64QI "qi") (V32QI "qi") (V16QI "qi")
> +   (V32HI "hi") (V16HI "hi") (V8HI "hi")
> +   (V16SI "si") (V8SI "si")  (V4SI "si")
> +   (V8DI "di")  (V4DI "di")  (V2DI "di")
> +   (V16SF "sf") (V8SF "sf")  (V4SF "sf")
> +   (V8DF "df")  (V4DF "df")  (V2DF "df")
> +   (V4TI "ti")  (V2TI "ti")])
> +
>  ;; Mapping of vector modes to the 128bit modes
>  (define_mode_attr ssexmmmode
>    [(V64QI "V16QI") (V32QI "V16QI") (V16QI "V16QI")
> @@ -2356,7 +2374,7 @@ (define_expand "reduc_plus_scal_v8df"
>  {
>    rtx tmp = gen_reg_rtx (V8DFmode);
>    ix86_expand_reduc (gen_addv8df3, tmp, operands[1]);
> -  emit_insn (gen_vec_extractv8df (operands[0], tmp, const0_rtx));
> +  emit_insn (gen_vec_extractv8dfdf (operands[0], tmp, const0_rtx));
>    DONE;
>  })
>  
> @@ -2371,7 +2389,7 @@ (define_expand "reduc_plus_scal_v4df"
>    emit_insn (gen_avx_haddv4df3 (tmp, operands[1], operands[1]));
>    emit_insn (gen_avx_vperm2f128v4df3 (tmp2, tmp, tmp, GEN_INT (1)));
>    emit_insn (gen_addv4df3 (vec_res, tmp, tmp2));
> -  emit_insn (gen_vec_extractv4df (operands[0], vec_res, const0_rtx));
> +  emit_insn (gen_vec_extractv4dfdf (operands[0], vec_res, const0_rtx));
>    DONE;
>  })
>  
> @@ -2382,7 +2400,7 @@ (define_expand "reduc_plus_scal_v2df"
>  {
>    rtx tmp = gen_reg_rtx (V2DFmode);
>    emit_insn (gen_sse3_haddv2df3 (tmp, operands[1], operands[1]));
> -  emit_insn (gen_vec_extractv2df (operands[0], tmp, const0_rtx));
> +  emit_insn (gen_vec_extractv2dfdf (operands[0], tmp, const0_rtx));
>    DONE;
>  })
>  
> @@ -2393,7 +2411,7 @@ (define_expand "reduc_plus_scal_v16sf"
>  {
>    rtx tmp = gen_reg_rtx (V16SFmode);
>    ix86_expand_reduc (gen_addv16sf3, tmp, operands[1]);
> -  emit_insn (gen_vec_extractv16sf (operands[0], tmp, const0_rtx));
> +  emit_insn (gen_vec_extractv16sfsf (operands[0], tmp, const0_rtx));
>    DONE;
>  })
>  
> @@ -2409,7 +2427,7 @@ (define_expand "reduc_plus_scal_v8sf"
>    emit_insn (gen_avx_haddv8sf3 (tmp2, tmp, tmp));
>    emit_insn (gen_avx_vperm2f128v8sf3 (tmp, tmp2, tmp2, GEN_INT (1)));
>    emit_insn (gen_addv8sf3 (vec_res, tmp, tmp2));
> -  emit_insn (gen_vec_extractv8sf (operands[0], vec_res, const0_rtx));
> +  emit_insn (gen_vec_extractv8sfsf (operands[0], vec_res, const0_rtx));
>    DONE;
>  })
>  
> @@ -2427,7 +2445,7 @@ (define_expand "reduc_plus_scal_v4sf"
>      }
>    else
>      ix86_expand_reduc (gen_addv4sf3, vec_res, operands[1]);
> -  emit_insn (gen_vec_extractv4sf (operands[0], vec_res, const0_rtx));
> +  emit_insn (gen_vec_extractv4sfsf (operands[0], vec_res, const0_rtx));
>    DONE;
>  })
>  
> @@ -2449,7 +2467,8 @@ (define_expand "reduc_<code>_scal_<mode>
>  {
>    rtx tmp = gen_reg_rtx (<MODE>mode);
>    ix86_expand_reduc (gen_<code><mode>3, tmp, operands[1]);
> -  emit_insn (gen_vec_extract<mode> (operands[0], tmp, const0_rtx));
> +  emit_insn (gen_vec_extract<mode><ssescalarmodelower> (operands[0], tmp,
> +                                                     const0_rtx));
>    DONE;
>  })
>  
> @@ -2461,7 +2480,8 @@ (define_expand "reduc_<code>_scal_<mode>
>  {
>    rtx tmp = gen_reg_rtx (<MODE>mode);
>    ix86_expand_reduc (gen_<code><mode>3, tmp, operands[1]);
> -  emit_insn (gen_vec_extract<mode> (operands[0], tmp, const0_rtx));
> +  emit_insn (gen_vec_extract<mode><ssescalarmodelower> (operands[0], tmp,
> +                                                     const0_rtx));
>    DONE;
>  })
>  
> @@ -2473,7 +2493,8 @@ (define_expand "reduc_<code>_scal_<mode>
>  {
>    rtx tmp = gen_reg_rtx (<MODE>mode);
>    ix86_expand_reduc (gen_<code><mode>3, tmp, operands[1]);
> -  emit_insn (gen_vec_extract<mode> (operands[0], tmp, const0_rtx));
> +  emit_insn (gen_vec_extract<mode><ssescalarmodelower> (operands[0], tmp,
> +                                                     const0_rtx));
>    DONE;
>  })
>  
> @@ -2485,7 +2506,7 @@ (define_expand "reduc_umin_scal_v8hi"
>  {
>    rtx tmp = gen_reg_rtx (V8HImode);
>    ix86_expand_reduc (gen_uminv8hi3, tmp, operands[1]);
> -  emit_insn (gen_vec_extractv8hi (operands[0], tmp, const0_rtx));
> +  emit_insn (gen_vec_extractv8hihi (operands[0], tmp, const0_rtx));
>    DONE;
>  })
>  
> @@ -7881,7 +7902,7 @@ (define_mode_iterator VEC_EXTRACT_MODE
>     (V8DF "TARGET_AVX512F") (V4DF "TARGET_AVX") V2DF
>     (V4TI "TARGET_AVX512F") (V2TI "TARGET_AVX")])
>  
> -(define_expand "vec_extract<mode>"
> +(define_expand "vec_extract<mode><ssescalarmodelower>"
>    [(match_operand:<ssescalarmode> 0 "register_operand")
>     (match_operand:VEC_EXTRACT_MODE 1 "register_operand")
>     (match_operand 2 "const_int_operand")]
> @@ -7892,6 +7913,19 @@ (define_expand "vec_extract<mode>"
>    DONE;
>  })
>  
> +(define_expand "vec_extract<mode><ssehalfvecmodelower>"
> +  [(match_operand:<ssehalfvecmode> 0 "nonimmediate_operand")
> +   (match_operand:V_512 1 "register_operand")
> +   (match_operand 2 "const_0_to_1_operand")]
> +  "TARGET_AVX512F"
> +{
> +  if (INTVAL (operands[2]))
> +    emit_insn (gen_vec_extract_hi_<mode> (operands[0], operands[1]));
> +  else
> +    emit_insn (gen_vec_extract_lo_<mode> (operands[0], operands[1]));
> +  DONE;
> +})
> +
>  ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>  ;;
>  ;; Parallel double-precision floating point element swizzling
> @@ -16693,7 +16727,7 @@ (define_expand "rotl<mode>3"
>        for (i = 0; i < <ssescalarnum>; i++)
>       RTVEC_ELT (vs, i) = op2;
>  
> -      emit_insn (gen_vec_init<mode> (reg, par));
> +      emit_insn (gen_vec_init<mode><ssescalarmodelower> (reg, par));
>        emit_insn (gen_xop_vrotl<mode>3 (operands[0], operands[1], reg));
>        DONE;
>      }
> @@ -16725,7 +16759,7 @@ (define_expand "rotr<mode>3"
>        for (i = 0; i < <ssescalarnum>; i++)
>       RTVEC_ELT (vs, i) = op2;
>  
> -      emit_insn (gen_vec_init<mode> (reg, par));
> +      emit_insn (gen_vec_init<mode><ssescalarmodelower> (reg, par));
>        emit_insn (gen_neg<mode>2 (neg, reg));
>        emit_insn (gen_xop_vrotl<mode>3 (operands[0], operands[1], neg));
>        DONE;
> @@ -17019,7 +17053,7 @@ (define_expand "<shift_insn><mode>3"
>          XVECEXP (par, 0, i) = operands[2];
>  
>        tmp = gen_reg_rtx (V16QImode);
> -      emit_insn (gen_vec_initv16qi (tmp, par));
> +      emit_insn (gen_vec_initv16qiqi (tmp, par));
>  
>        if (negate)
>       emit_insn (gen_negv16qi2 (tmp, tmp));
> @@ -17055,7 +17089,7 @@ (define_expand "ashrv2di3"
>        for (i = 0; i < 2; i++)
>       XVECEXP (par, 0, i) = operands[2];
>  
> -      emit_insn (gen_vec_initv2di (reg, par));
> +      emit_insn (gen_vec_initv2didi (reg, par));
>  
>        if (negate)
>       emit_insn (gen_negv2di2 (reg, reg));
> @@ -18775,7 +18809,7 @@ (define_insn_and_split "avx_<castmode><a
>                                 <ssehalfvecmode>mode);
>  })
>  
> -;; Modes handled by vec_init patterns.
> +;; Modes handled by vec_init expanders.
>  (define_mode_iterator VEC_INIT_MODE
>    [(V64QI "TARGET_AVX512F") (V32QI "TARGET_AVX") V16QI
>     (V32HI "TARGET_AVX512F") (V16HI "TARGET_AVX") V8HI
> @@ -18785,11 +18819,31 @@ (define_mode_iterator VEC_INIT_MODE
>     (V8DF "TARGET_AVX512F") (V4DF "TARGET_AVX") (V2DF "TARGET_SSE2")
>     (V4TI "TARGET_AVX512F") (V2TI "TARGET_AVX")])
>  
> -(define_expand "vec_init<mode>"
> +;; Likewise, but for initialization from half sized vectors.
> +;; Thus, these are all VEC_INIT_MODE modes except V2??.
> +(define_mode_iterator VEC_INIT_HALF_MODE
> +  [(V64QI "TARGET_AVX512F") (V32QI "TARGET_AVX") V16QI
> +   (V32HI "TARGET_AVX512F") (V16HI "TARGET_AVX") V8HI
> +   (V16SI "TARGET_AVX512F") (V8SI "TARGET_AVX") V4SI
> +   (V8DI "TARGET_AVX512F") (V4DI "TARGET_AVX")
> +   (V16SF "TARGET_AVX512F") (V8SF "TARGET_AVX") V4SF
> +   (V8DF "TARGET_AVX512F") (V4DF "TARGET_AVX")
> +   (V4TI "TARGET_AVX512F")])
> +
> +(define_expand "vec_init<mode><ssescalarmodelower>"
>    [(match_operand:VEC_INIT_MODE 0 "register_operand")
>     (match_operand 1)]
>    "TARGET_SSE"
>  {
> +  ix86_expand_vector_init (false, operands[0], operands[1]);
> +  DONE;
> +})
> +
> +(define_expand "vec_init<mode><ssehalfvecmodelower>"
> +  [(match_operand:VEC_INIT_HALF_MODE 0 "register_operand")
> +   (match_operand 1)]
> +  "TARGET_SSE"
> +{
>    ix86_expand_vector_init (false, operands[0], operands[1]);
>    DONE;
>  })
> --- gcc/config/i386/mmx.md.jj 2017-07-24 10:57:45.869816434 +0200
> +++ gcc/config/i386/mmx.md    2017-07-24 16:11:23.065229922 +0200
> @@ -641,7 +641,7 @@ (define_split
>    [(set (match_dup 0) (match_dup 1))]
>    "operands[1] = adjust_address (operands[1], SFmode, 4);")
>  
> -(define_expand "vec_extractv2sf"
> +(define_expand "vec_extractv2sfsf"
>    [(match_operand:SF 0 "register_operand")
>     (match_operand:V2SF 1 "register_operand")
>     (match_operand 2 "const_int_operand")]
> @@ -652,7 +652,7 @@ (define_expand "vec_extractv2sf"
>    DONE;
>  })
>  
> -(define_expand "vec_initv2sf"
> +(define_expand "vec_initv2sfsf"
>    [(match_operand:V2SF 0 "register_operand")
>     (match_operand 1)]
>    "TARGET_SSE"
> @@ -1344,7 +1344,7 @@ (define_insn_and_split "*vec_extractv2si
>    operands[1] = adjust_address (operands[1], SImode, INTVAL (operands[2]) * 4);
>  })
>  
> -(define_expand "vec_extractv2si"
> +(define_expand "vec_extractv2sisi"
>    [(match_operand:SI 0 "register_operand")
>     (match_operand:V2SI 1 "register_operand")
>     (match_operand 2 "const_int_operand")]
> @@ -1355,7 +1355,7 @@ (define_expand "vec_extractv2si"
>    DONE;
>  })
>  
> -(define_expand "vec_initv2si"
> +(define_expand "vec_initv2sisi"
>    [(match_operand:V2SI 0 "register_operand")
>     (match_operand 1)]
>    "TARGET_SSE"
> @@ -1375,7 +1375,7 @@ (define_expand "vec_setv4hi"
>    DONE;
>  })
>  
> -(define_expand "vec_extractv4hi"
> +(define_expand "vec_extractv4hihi"
>    [(match_operand:HI 0 "register_operand")
>     (match_operand:V4HI 1 "register_operand")
>     (match_operand 2 "const_int_operand")]
> @@ -1386,7 +1386,7 @@ (define_expand "vec_extractv4hi"
>    DONE;
>  })
>  
> -(define_expand "vec_initv4hi"
> +(define_expand "vec_initv4hihi"
>    [(match_operand:V4HI 0 "register_operand")
>     (match_operand 1)]
>    "TARGET_SSE"
> @@ -1406,7 +1406,7 @@ (define_expand "vec_setv8qi"
>    DONE;
>  })
>  
> -(define_expand "vec_extractv8qi"
> +(define_expand "vec_extractv8qiqi"
>    [(match_operand:QI 0 "register_operand")
>     (match_operand:V8QI 1 "register_operand")
>     (match_operand 2 "const_int_operand")]
> @@ -1417,7 +1417,7 @@ (define_expand "vec_extractv8qi"
>    DONE;
>  })
>  
> -(define_expand "vec_initv8qi"
> +(define_expand "vec_initv8qiqi"
>    [(match_operand:V8QI 0 "register_operand")
>     (match_operand 1)]
>    "TARGET_SSE"
> --- gcc/config/rs6000/vector.md.jj    2017-06-08 20:50:49.000000000 +0200
> +++ gcc/config/rs6000/vector.md       2017-07-24 17:44:44.699580927 +0200
> @@ -74,6 +74,16 @@ (define_mode_attr VEC_base [(V16QI "QI")
>                           (V1TI  "TI")
>                           (TI    "TI")])
>  
> +;; As above, but in lower case
> +(define_mode_attr VEC_base_l [(V16QI "qi")
> +                           (V8HI  "hi")
> +                           (V4SI  "si")
> +                           (V2DI  "di")
> +                           (V4SF  "sf")
> +                           (V2DF  "df")
> +                           (V1TI  "ti")
> +                           (TI    "ti")])
> +
>  ;; Same size integer type for floating point data
>  (define_mode_attr VEC_int [(V4SF  "v4si")
>                          (V2DF  "v2di")])
> @@ -1016,7 +1026,7 @@ (define_expand "fixuns_trunc<mode><VEC_i
>  
>  
>  ;; Vector initialization, set, extract
> -(define_expand "vec_init<mode>"
> +(define_expand "vec_init<mode><VEC_base_l>"
>    [(match_operand:VEC_E 0 "vlogical_operand" "")
>     (match_operand:VEC_E 1 "" "")]
>    "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
> @@ -1035,7 +1045,7 @@ (define_expand "vec_set<mode>"
>    DONE;
>  })
>  
> -(define_expand "vec_extract<mode>"
> +(define_expand "vec_extract<mode><VEC_base_l>"
>    [(match_operand:<VEC_base> 0 "register_operand" "")
>     (match_operand:VEC_E 1 "vlogical_operand" "")
>     (match_operand 2 "const_int_operand" "")]
> --- gcc/config/rs6000/paired.md.jj    2017-06-08 20:50:49.000000000 +0200
> +++ gcc/config/rs6000/paired.md       2017-07-24 17:48:20.324985029 +0200
> @@ -377,7 +377,7 @@ (define_insn "paired_muls1"
>    "ps_muls1 %0, %1, %2"
>    [(set_attr "type" "fp")])
>  
> -(define_expand "vec_initv2sf"
> +(define_expand "vec_initv2sfsf"
>    [(match_operand:V2SF 0 "gpc_reg_operand" "=f")
>     (match_operand 1 "" "")]
>    "TARGET_PAIRED_FLOAT"
> --- gcc/config/rs6000/altivec.md.jj   2017-07-24 10:58:12.000000000 +0200
> +++ gcc/config/rs6000/altivec.md      2017-07-24 17:48:49.573633038 +0200
> @@ -311,7 +311,7 @@ (define_split
>    for (i = 0; i < num_elements; i++)
>      RTVEC_ELT (v, i) = constm1_rtx;
>  
> -  emit_insn (gen_vec_initv4si (dest, gen_rtx_PARALLEL (mode, v)));
> +  emit_insn (gen_vec_initv4sisi (dest, gen_rtx_PARALLEL (mode, v)));
>    emit_insn (gen_rtx_SET (dest, gen_rtx_ASHIFT (mode, dest, dest)));
>    DONE;
>  })
> @@ -2267,7 +2267,7 @@ (define_expand "altivec_copysign_v4sf3"
>    RTVEC_ELT (v, 2) = GEN_INT (mask_val);
>    RTVEC_ELT (v, 3) = GEN_INT (mask_val);
>  
> -  emit_insn (gen_vec_initv4si (mask, gen_rtx_PARALLEL (V4SImode, v)));
> +  emit_insn (gen_vec_initv4sisi (mask, gen_rtx_PARALLEL (V4SImode, v)));
>    emit_insn (gen_vector_select_v4sf (operands[0], operands[1], operands[2],
>                                    gen_lowpart (V4SFmode, mask)));
>    DONE;
> @@ -3409,7 +3409,7 @@ (define_expand "vec_unpacku_hi_v16qi"
>    RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ? 16 :  0);
>    RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ?  7 : 16);
>  
> -  emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> +  emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
>    emit_insn (gen_vperm_v16qiv8hi (operands[0], operands[1], vzero, mask));
>    DONE;
>  }")
> @@ -3445,7 +3445,7 @@ (define_expand "vec_unpacku_hi_v8hi"
>    RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ?  6 : 17);
>    RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ?  7 : 16);
>  
> -  emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> +  emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
>    emit_insn (gen_vperm_v8hiv4si (operands[0], operands[1], vzero, mask));
>    DONE;
>  }")
> @@ -3481,7 +3481,7 @@ (define_expand "vec_unpacku_lo_v16qi"
>    RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ? 16 :  8);
>    RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ? 15 : 16);
>  
> -  emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> +  emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
>    emit_insn (gen_vperm_v16qiv8hi (operands[0], operands[1], vzero, mask));
>    DONE;
>  }")
> @@ -3517,7 +3517,7 @@ (define_expand "vec_unpacku_lo_v8hi"
>    RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ? 14 : 17);
>    RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ? 15 : 16);
>  
> -  emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> +  emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
>    emit_insn (gen_vperm_v8hiv4si (operands[0], operands[1], vzero, mask));
>    DONE;
>  }")
> @@ -3758,7 +3758,7 @@ (define_expand "mulv16qi3"
>       = gen_rtx_CONST_INT (QImode, BYTES_BIG_ENDIAN ? 2 * i + 17 : 15 - 2 * i);
>    }
>  
> -  emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> +  emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
>    emit_insn (gen_altivec_vmulesb (even, operands[1], operands[2]));
>    emit_insn (gen_altivec_vmulosb (odd, operands[1], operands[2]));
>    emit_insn (gen_altivec_vperm_v8hiv16qi (operands[0], even, odd, mask));
> @@ -3804,7 +3804,7 @@ (define_expand "altivec_vreve<mode>2"
>        RTVEC_ELT (v, i + j * size)
>       = GEN_INT (i + (num_elements - 1 - j) * size);
>  
> -  emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> +  emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
>    emit_insn (gen_altivec_vperm_<mode> (operands[0], operands[1],
>            operands[1], mask));
>    DONE;
> --- gcc/config/aarch64/aarch64-simd.md.jj     2017-07-24 15:01:21.000000000 +0200
> +++ gcc/config/aarch64/aarch64-simd.md        2017-07-24 17:19:05.660170375 +0200
> @@ -5617,9 +5617,9 @@ (define_expand "aarch64_set_qreg<VSTRUCT
>    DONE;
>  })
>  
> -;; Standard pattern name vec_init<mode>.
> +;; Standard pattern name vec_init<mode><Vel>.
>  
> -(define_expand "vec_init<mode>"
> +(define_expand "vec_init<mode><Vel>"
>    [(match_operand:VALL_F16 0 "register_operand" "")
>     (match_operand 1 "" "")]
>    "TARGET_SIMD"
> @@ -5674,9 +5674,9 @@ (define_insn "aarch64_urecpe<mode>"
>   "urecpe\\t%0.<Vtype>, %1.<Vtype>"
>    [(set_attr "type" "neon_fp_recpe_<Vetype><q>")])
>  
> -;; Standard pattern name vec_extract<mode>.
> +;; Standard pattern name vec_extract<mode><Vel>.
>  
> -(define_expand "vec_extract<mode>"
> +(define_expand "vec_extract<mode><Vel>"
>    [(match_operand:<VEL> 0 "aarch64_simd_nonimmediate_operand" "")
>     (match_operand:VALL_F16 1 "register_operand" "")
>     (match_operand:SI 2 "immediate_operand" "")]
> --- gcc/config/aarch64/iterators.md.jj        2017-03-19 11:57:22.000000000 +0100
> +++ gcc/config/aarch64/iterators.md   2017-07-24 17:17:50.318091273 +0200
> @@ -520,6 +520,17 @@ (define_mode_attr VEL [(V8QI "QI") (V16Q
>                       (SI   "SI") (HI   "HI")
>                       (QI   "QI")])
>  
> +;; Define element mode for each vector mode (lower case).
> +(define_mode_attr Vel [(V8QI "qi") (V16QI "qi")
> +                     (V4HI "hi") (V8HI "hi")
> +                     (V2SI "si") (V4SI "si")
> +                     (DI "di")   (V2DI "di")
> +                     (V4HF "hf") (V8HF "hf")
> +                     (V2SF "sf") (V4SF "sf")
> +                     (V2DF "df") (DF "df")
> +                     (SI   "si") (HI   "hi")
> +                     (QI   "qi")])
> +
>  ;; 64-bit container modes the inner or scalar source mode.
>  (define_mode_attr VCOND [(HI "V4HI") (SI "V2SI")
>                        (V4HI "V4HI") (V8HI "V4HI")
> --- gcc/config/s390/s390.c.jj 2017-07-17 10:08:39.000000000 +0200
> +++ gcc/config/s390/s390.c    2017-07-24 17:58:24.416715142 +0200
> @@ -5792,7 +5792,7 @@ s390_expand_vec_strlen (rtx target, rtx
>    add_int_reg_note (s390_emit_ccraw_jump (8, NE, loop_start_label),
>                   REG_BR_PROB,
>                   profile_probability::very_likely ().to_reg_br_prob_note ());
> -  emit_insn (gen_vec_extractv16qi (len, result_reg, GEN_INT (7)));
> +  emit_insn (gen_vec_extractv16qiqi (len, result_reg, GEN_INT (7)));
>  
>    /* If the string pointer wasn't aligned we have loaded less then 16
>       bytes and the remaining bytes got filled with zeros (by vll).
> @@ -5850,7 +5850,7 @@ s390_expand_vec_movstr (rtx result, rtx
>    emit_insn (gen_vlbb (vsrc, src, GEN_INT (6)));
>    emit_insn (gen_lcbb (loadlen, src_addr, GEN_INT (6)));
>    emit_insn (gen_vfenezv16qi (vpos, vsrc, vsrc));
> -  emit_insn (gen_vec_extractv16qi (gpos_qi, vpos, GEN_INT (7)));
> +  emit_insn (gen_vec_extractv16qiqi (gpos_qi, vpos, GEN_INT (7)));
>    emit_move_insn (gpos, gen_rtx_SUBREG (SImode, gpos_qi, 0));
>    /* gpos is the byte index if a zero was found and 16 otherwise.
>       So if it is lower than the loaded bytes we have a hit.  */
> @@ -5928,7 +5928,7 @@ s390_expand_vec_movstr (rtx result, rtx
>    force_expand_binop (Pmode, add_optab, dst_addr_reg, offset, dst_addr_reg,
>                     1, OPTAB_DIRECT);
>  
> -  emit_insn (gen_vec_extractv16qi (gpos_qi, vpos, GEN_INT (7)));
> +  emit_insn (gen_vec_extractv16qiqi (gpos_qi, vpos, GEN_INT (7)));
>    emit_move_insn (gpos, gen_rtx_SUBREG (SImode, gpos_qi, 0));
>  
>    emit_insn (gen_vstlv16qi (vsrc, gpos, gen_rtx_MEM (BLKmode, dst_addr_reg)));
> --- gcc/config/s390/vector.md.jj      2017-04-25 15:51:31.000000000 +0200
> +++ gcc/config/s390/vector.md 2017-07-24 17:57:37.665277768 +0200
> @@ -90,6 +90,17 @@ (define_mode_attr non_vec[(V1QI "QI") (V
>                         (V1DF "DF") (V2DF "DF")
>                         (V1TF "TF") (TF "TF")])
>  
> +; Like above, but in lower case.
> +(define_mode_attr non_vec_l[(V1QI "qi") (V2QI "qi") (V4QI "qi") (V8QI "qi")
> +                         (V16QI "qi")
> +                         (V1HI "hi") (V2HI "hi") (V4HI "hi") (V8HI "hi")
> +                         (V1SI "si") (V2SI "si") (V4SI "si")
> +                         (V1DI "di") (V2DI "di")
> +                         (V1TI "ti") (TI "ti")
> +                         (V1SF "sf") (V2SF "sf") (V4SF "sf")
> +                         (V1DF "df") (V2DF "df")
> +                         (V1TF "tf") (TF "tf")])
> +
>  ; The instruction suffix for integer instructions and instructions
>  ; which do not care about whether it is floating point or integer.
>  (define_mode_attr bhfgq[(V1QI "b") (V2QI "b") (V4QI "b") (V8QI "b") (V16QI "b")
> @@ -453,7 +464,7 @@ (define_insn "*vec_set<mode>_plus"
>  ; FIXME: Support also vector mode operands for 0
>  ; FIXME: This should be (vec_select ..) or something but it does only allow constant selectors :(
>  ; This is used via RTL standard name as well as for expanding the builtin
> -(define_expand "vec_extract<mode>"
> +(define_expand "vec_extract<mode><non_vec_l>"
>    [(set (match_operand:<non_vec> 0 "nonimmediate_operand" "")
>       (unspec:<non_vec> [(match_operand:V  1 "register_operand" "")
>                          (match_operand:SI 2 "nonmemory_operand" "")]
> @@ -485,7 +496,7 @@ (define_insn "*vec_extract<mode>_plus"
>    "vlgv<bhfgq>\t%0,%v1,%Y3(%2)"
>    [(set_attr "op_type" "VRS")])
>  
> -(define_expand "vec_init<mode>"
> +(define_expand "vec_init<mode><non_vec_l>"
>    [(match_operand:V_128 0 "register_operand" "")
>     (match_operand:V_128 1 "nonmemory_operand" "")]
>    "TARGET_VX"
> --- gcc/config/s390/s390-builtins.def.jj      2017-03-24 15:08:56.000000000 +0100
> +++ gcc/config/s390/s390-builtins.def 2017-07-24 18:02:22.571849086 +0200
> @@ -450,12 +450,12 @@ OB_DEF_VAR (s390_vec_extract_u64,
>  OB_DEF_VAR (s390_vec_extract_b64,       s390_vlgvg,         0,                  O2_ELEM,            BT_OV_ULONGLONG_BV2DI_INT)
>  OB_DEF_VAR (s390_vec_extract_dbl,       s390_vlgvg_dbl,     0,                  O2_ELEM,            BT_OV_DBL_V2DF_INT)                      /* vlgvg */
>  
> -B_DEF      (s390_vlgvb,                 vec_extractv16qi,   0,                  B_VX,               O2_ELEM,            BT_FN_UCHAR_UV16QI_INT)
> -B_DEF      (s390_vlgvh,                 vec_extractv8hi,    0,                  B_VX,               O2_ELEM,            BT_FN_USHORT_UV8HI_INT)
> -B_DEF      (s390_vlgvf,                 vec_extractv4si,    0,                  B_VX,               O2_ELEM,            BT_FN_UINT_UV4SI_INT)
> -B_DEF      (s390_vlgvf_flt,             vec_extractv4sf,    0,                  B_INT | B_VXE,      O2_ELEM,            BT_FN_FLT_V4SF_INT)
> -B_DEF      (s390_vlgvg,                 vec_extractv2di,    0,                  B_VX,               O2_ELEM,            BT_FN_ULONGLONG_UV2DI_INT)
> -B_DEF      (s390_vlgvg_dbl,             vec_extractv2df,    0,                  B_INT | B_VX,       O2_ELEM,            BT_FN_DBL_V2DF_INT)
> +B_DEF      (s390_vlgvb,                 vec_extractv16qiqi, 0,                  B_VX,               O2_ELEM,            BT_FN_UCHAR_UV16QI_INT)
> +B_DEF      (s390_vlgvh,                 vec_extractv8hihi,  0,                  B_VX,               O2_ELEM,            BT_FN_USHORT_UV8HI_INT)
> +B_DEF      (s390_vlgvf,                 vec_extractv4sisi,  0,                  B_VX,               O2_ELEM,            BT_FN_UINT_UV4SI_INT)
> +B_DEF      (s390_vlgvf_flt,             vec_extractv4sfsf,  0,                  B_INT | B_VXE,      O2_ELEM,            BT_FN_FLT_V4SF_INT)
> +B_DEF      (s390_vlgvg,                 vec_extractv2didi,  0,                  B_VX,               O2_ELEM,            BT_FN_ULONGLONG_UV2DI_INT)
> +B_DEF      (s390_vlgvg_dbl,             vec_extractv2dfdf,  0,                  B_INT | B_VX,       O2_ELEM,            BT_FN_DBL_V2DF_INT)
>  
>  OB_DEF     (s390_vec_insert_and_zero,   s390_vec_insert_and_zero_s8,s390_vec_insert_and_zero_dbl,B_VX,BT_FN_OV4SI_INTCONSTPTR)
>  OB_DEF_VAR (s390_vec_insert_and_zero_s8,s390_vllezb,        0,                  0,                  BT_OV_V16QI_SCHARCONSTPTR)
> --- gcc/config/arm/iterators.md.jj    2017-05-05 09:20:02.000000000 +0200
> +++ gcc/config/arm/iterators.md       2017-07-24 17:25:15.665681575 +0200
> @@ -444,6 +444,14 @@ (define_mode_attr V_elem [(V8QI "QI") (V
>                            (V2SF "SF") (V4SF "SF")
>                            (DI "DI")   (V2DI "DI")])
>  
> +;; As above but in lower case.
> +(define_mode_attr V_elem_l [(V8QI "qi") (V16QI "qi")
> +                         (V4HI "hi") (V8HI "hi")
> +                         (V4HF "hf") (V8HF "hf")
> +                         (V2SI "si") (V4SI "si")
> +                         (V2SF "sf") (V4SF "sf")
> +                         (DI "di")   (V2DI "di")])
> +
>  ;; Element modes for vector extraction, padded up to register size.
>  
>  (define_mode_attr V_ext [(V8QI "SI") (V16QI "SI")
> --- gcc/config/arm/neon.md.jj 2017-07-17 10:08:41.000000000 +0200
> +++ gcc/config/arm/neon.md    2017-07-24 17:27:42.173917259 +0200
> @@ -412,7 +412,7 @@ (define_expand "vec_set<mode>"
>    DONE;
>  })
>  
> -(define_insn "vec_extract<mode>"
> +(define_insn "vec_extract<mode><V_elem_l>"
>    [(set (match_operand:<V_elem> 0 "nonimmediate_operand" "=Um,r")
>          (vec_select:<V_elem>
>            (match_operand:VD_LANE 1 "s_register_operand" "w,w")
> @@ -434,7 +434,7 @@ (define_insn "vec_extract<mode>"
>    [(set_attr "type" "neon_store1_one_lane<q>,neon_to_gp<q>")]
>  )
>  
> -(define_insn "vec_extract<mode>"
> +(define_insn "vec_extract<mode><V_elem_l>"
>    [(set (match_operand:<V_elem> 0 "nonimmediate_operand" "=Um,r")
>       (vec_select:<V_elem>
>            (match_operand:VQ2 1 "s_register_operand" "w,w")
> @@ -460,7 +460,7 @@ (define_insn "vec_extract<mode>"
>    [(set_attr "type" "neon_store1_one_lane<q>,neon_to_gp<q>")]
>  )
>  
> -(define_insn "vec_extractv2di"
> +(define_insn "vec_extractv2didi"
>    [(set (match_operand:DI 0 "nonimmediate_operand" "=Um,r")
>       (vec_select:DI
>            (match_operand:V2DI 1 "s_register_operand" "w,w")
> @@ -479,7 +479,7 @@ (define_insn "vec_extractv2di"
>    [(set_attr "type" "neon_store1_one_lane_q,neon_to_gp_q")]
>  )
>  
> -(define_expand "vec_init<mode>"
> +(define_expand "vec_init<mode><V_elem_l>"
>    [(match_operand:VDQ 0 "s_register_operand" "")
>     (match_operand 1 "" "")]
>    "TARGET_NEON"
> @@ -1581,7 +1581,7 @@ (define_expand "reduc_plus_scal_<mode>"
>    neon_pairwise_reduce (vec, operands[1], <MODE>mode,
>                       &gen_neon_vpadd_internal<mode>);
>    /* The same result is actually computed into every element.  */
> -  emit_insn (gen_vec_extract<mode> (operands[0], vec, const0_rtx));
> +  emit_insn (gen_vec_extract<mode><V_elem_l> (operands[0], vec, const0_rtx));
>    DONE;
>  })
>  
> @@ -1607,7 +1607,7 @@ (define_expand "reduc_plus_scal_v2di"
>    rtx vec = gen_reg_rtx (V2DImode);
>  
>    emit_insn (gen_arm_reduc_plus_internal_v2di (vec, operands[1]));
> -  emit_insn (gen_vec_extractv2di (operands[0], vec, const0_rtx));
> +  emit_insn (gen_vec_extractv2didi (operands[0], vec, const0_rtx));
>  
>    DONE;
>  })
> @@ -1631,7 +1631,7 @@ (define_expand "reduc_smin_scal_<mode>"
>    neon_pairwise_reduce (vec, operands[1], <MODE>mode,
>                       &gen_neon_vpsmin<mode>);
>    /* The result is computed into every element of the vector.  */
> -  emit_insn (gen_vec_extract<mode> (operands[0], vec, const0_rtx));
> +  emit_insn (gen_vec_extract<mode><V_elem_l> (operands[0], vec, const0_rtx));
>    DONE;
>  })
>  
> @@ -1658,7 +1658,7 @@ (define_expand "reduc_smax_scal_<mode>"
>    neon_pairwise_reduce (vec, operands[1], <MODE>mode,
>                       &gen_neon_vpsmax<mode>);
>    /* The result is computed into every element of the vector.  */
> -  emit_insn (gen_vec_extract<mode> (operands[0], vec, const0_rtx));
> +  emit_insn (gen_vec_extract<mode><V_elem_l> (operands[0], vec, const0_rtx));
>    DONE;
>  })
>  
> @@ -1685,7 +1685,7 @@ (define_expand "reduc_umin_scal_<mode>"
>    neon_pairwise_reduce (vec, operands[1], <MODE>mode,
>                       &gen_neon_vpumin<mode>);
>    /* The result is computed into every element of the vector.  */
> -  emit_insn (gen_vec_extract<mode> (operands[0], vec, const0_rtx));
> +  emit_insn (gen_vec_extract<mode><V_elem_l> (operands[0], vec, const0_rtx));
>    DONE;
>  })
>  
> @@ -1711,7 +1711,7 @@ (define_expand "reduc_umax_scal_<mode>"
>    neon_pairwise_reduce (vec, operands[1], <MODE>mode,
>                       &gen_neon_vpumax<mode>);
>    /* The result is computed into every element of the vector.  */
> -  emit_insn (gen_vec_extract<mode> (operands[0], vec, const0_rtx));
> +  emit_insn (gen_vec_extract<mode><V_elem_l> (operands[0], vec, const0_rtx));
>    DONE;
>  })
>  
> @@ -3272,7 +3272,8 @@ (define_expand "neon_vget_lane<mode>"
>      }
>  
>    if (GET_MODE_UNIT_BITSIZE (<MODE>mode) == 32)
> -    emit_insn (gen_vec_extract<mode> (operands[0], operands[1], operands[2]));
> +    emit_insn (gen_vec_extract<mode><V_elem_l> (operands[0], operands[1],
> +                                             operands[2]));
>    else
>      emit_insn (gen_neon_vget_lane<mode>_sext_internal (operands[0],
>                                                      operands[1],
> @@ -3301,7 +3302,8 @@ (define_expand "neon_vget_laneu<mode>"
>      }
>  
>    if (GET_MODE_UNIT_BITSIZE (<MODE>mode) == 32)
> -    emit_insn (gen_vec_extract<mode> (operands[0], operands[1], operands[2]));
> +    emit_insn (gen_vec_extract<mode><V_elem_l> (operands[0], operands[1],
> +                                             operands[2]));
>    else
>      emit_insn (gen_neon_vget_lane<mode>_zext_internal (operands[0],
>                                                      operands[1],
> --- gcc/config/mips/mips-msa.md.jj    2017-03-31 20:36:09.000000000 +0200
> +++ gcc/config/mips/mips-msa.md       2017-07-24 17:33:32.657689124 +0200
> @@ -231,7 +231,7 @@ (define_mode_attr bitimm
>     (V4SI  "uimm5")
>     (V2DI  "uimm6")])
>  
> -(define_expand "vec_init<mode>"
> +(define_expand "vec_init<mode><unitmode>"
>    [(match_operand:MSA 0 "register_operand")
>     (match_operand:MSA 1 "")]
>    "ISA_HAS_MSA"
> @@ -311,7 +311,7 @@ (define_expand "vec_unpacku_lo_<mode>"
>    DONE;
>  })
>  
> -(define_expand "vec_extract<mode>"
> +(define_expand "vec_extract<mode><unitmode>"
>    [(match_operand:<UNITMODE> 0 "register_operand")
>     (match_operand:IMSA 1 "register_operand")
>     (match_operand 2 "const_<indeximm>_operand")]
> @@ -329,7 +329,7 @@ (define_expand "vec_extract<mode>"
>    DONE;
>  })
>  
> -(define_expand "vec_extract<mode>"
> +(define_expand "vec_extract<mode><unitmode>"
>    [(match_operand:<UNITMODE> 0 "register_operand")
>     (match_operand:FMSA 1 "register_operand")
>     (match_operand 2 "const_<indeximm>_operand")]
> --- gcc/config/mips/loongson.md.jj    2017-01-01 12:45:40.000000000 +0100
> +++ gcc/config/mips/loongson.md       2017-07-24 18:08:29.736433972 +0200
> @@ -119,7 +119,7 @@ (define_insn "mov<mode>_internal"
>  
>  ;; Initialization of a vector.
>  
> -(define_expand "vec_init<mode>"
> +(define_expand "vec_init<mode><unitmode>"
>    [(set (match_operand:VWHB 0 "register_operand")
>       (match_operand 1 ""))]
>    "TARGET_HARD_FLOAT && TARGET_LOONGSON_VECTORS"
> --- gcc/config/mips/mips-ps-3d.md.jj  2017-01-01 12:45:40.000000000 +0100
> +++ gcc/config/mips/mips-ps-3d.md     2017-07-24 17:34:13.540195876 +0200
> @@ -254,7 +254,7 @@ (define_expand "mips_pll_ps"
>  })
>  
>  ; vec_init
> -(define_expand "vec_initv2sf"
> +(define_expand "vec_initv2sfsf"
>    [(match_operand:V2SF 0 "register_operand")
>     (match_operand:V2SF 1 "")]
>    "TARGET_HARD_FLOAT && TARGET_PAIRED_SINGLE_FLOAT"
> @@ -282,7 +282,7 @@ (define_insn "vec_concatv2sf"
>  ;; emulated.  There is no other way to get a vector mode bitfield extract
>  ;; currently.
>  
> -(define_insn "vec_extractv2sf"
> +(define_insn "vec_extractv2sfsf"
>    [(set (match_operand:SF 0 "register_operand" "=f")
>       (vec_select:SF (match_operand:V2SF 1 "register_operand" "f")
>                      (parallel
> @@ -379,7 +379,7 @@ (define_expand "reduc_plus_scal_v2sf"
>      rtx temp = gen_reg_rtx (V2SFmode);
>      emit_insn (gen_mips_addr_ps (temp, operands[1], operands[1]));
>      rtx lane = BYTES_BIG_ENDIAN ? const1_rtx : const0_rtx;
> -    emit_insn (gen_vec_extractv2sf (operands[0], temp, lane));
> +    emit_insn (gen_vec_extractv2sfsf (operands[0], temp, lane));
>      DONE;
>    })
>  
> @@ -757,7 +757,7 @@ (define_expand "reduc_smin_scal_v2sf"
>    rtx temp = gen_reg_rtx (V2SFmode);
>    mips_expand_vec_reduc (temp, operands[1], gen_sminv2sf3);
>    rtx lane = BYTES_BIG_ENDIAN ? const1_rtx : const0_rtx;
> -  emit_insn (gen_vec_extractv2sf (operands[0], temp, lane));
> +  emit_insn (gen_vec_extractv2sfsf (operands[0], temp, lane));
>    DONE;
>  })
>  
> @@ -769,6 +769,6 @@ (define_expand "reduc_smax_scal_v2sf"
>    rtx temp = gen_reg_rtx (V2SFmode);
>    mips_expand_vec_reduc (temp, operands[1], gen_smaxv2sf3);
>    rtx lane = BYTES_BIG_ENDIAN ? const1_rtx : const0_rtx;
> -  emit_insn (gen_vec_extractv2sf (operands[0], temp, lane));
> +  emit_insn (gen_vec_extractv2sfsf (operands[0], temp, lane));
>    DONE;
>  })
> --- gcc/config/mips/mips.md.jj        2017-06-15 11:03:32.000000000 +0200
> +++ gcc/config/mips/mips.md   2017-07-24 19:00:15.519582707 +0200
> @@ -917,6 +917,11 @@ (define_mode_attr UNITMODE [(SF "SF") (D
>                           (V16QI "QI") (V8HI "HI") (V4SI "SI") (V2DI "DI")
>                           (V2DF "DF")])
>  
> +;; As above, but in lower case.
> +(define_mode_attr unitmode [(SF "sf") (DF "df") (V2SF "sf") (V4SF "sf")
> +                         (V16QI "qi") (V8QI "qi") (V8HI "hi") (V4HI "hi")
> +                         (V4SI "si") (V2SI "si") (V2DI "di") (V2DF "df")])
> +
>  ;; This attribute gives the integer mode that has the same size as a
>  ;; fixed-point mode.
>  (define_mode_attr IMODE [(QQ "QI") (HQ "HI") (SQ "SI") (DQ "DI")
> --- gcc/config/spu/spu.c.jj   2017-07-17 10:08:39.000000000 +0200
> +++ gcc/config/spu/spu.c      2017-07-24 18:06:01.693214125 +0200
> @@ -1773,7 +1773,7 @@ spu_expand_prologue (void)
>             size_v4si = scratch_v4si;
>           }
>         emit_insn (gen_cgt_v4si (scratch_v4si, sp_v4si, size_v4si));
> -       emit_insn (gen_vec_extractv4si
> +       emit_insn (gen_vec_extractv4sisi
>                    (scratch_reg_0, scratch_v4si, GEN_INT (1)));
>         emit_insn (gen_spu_heq (scratch_reg_0, GEN_INT (0)));
>       }
> @@ -5368,7 +5368,7 @@ spu_allocate_stack (rtx op0, rtx op1)
>      {
>        rtx avail = gen_reg_rtx(SImode);
>        rtx result = gen_reg_rtx(SImode);
> -      emit_insn (gen_vec_extractv4si (avail, sp, GEN_INT (1)));
> +      emit_insn (gen_vec_extractv4sisi (avail, sp, GEN_INT (1)));
>        emit_insn (gen_cgt_si(result, avail, GEN_INT (-1)));
>        emit_insn (gen_spu_heq (result, GEN_INT(0) ));
>      }
> @@ -5684,22 +5684,22 @@ spu_builtin_extract (rtx ops[])
>        switch (mode)
>       {
>       case V16QImode:
> -       emit_insn (gen_vec_extractv16qi (ops[0], ops[1], ops[2]));
> +       emit_insn (gen_vec_extractv16qiqi (ops[0], ops[1], ops[2]));
>         break;
>       case V8HImode:
> -       emit_insn (gen_vec_extractv8hi (ops[0], ops[1], ops[2]));
> +       emit_insn (gen_vec_extractv8hihi (ops[0], ops[1], ops[2]));
>         break;
>       case V4SFmode:
> -       emit_insn (gen_vec_extractv4sf (ops[0], ops[1], ops[2]));
> +       emit_insn (gen_vec_extractv4sfsf (ops[0], ops[1], ops[2]));
>         break;
>       case V4SImode:
> -       emit_insn (gen_vec_extractv4si (ops[0], ops[1], ops[2]));
> +       emit_insn (gen_vec_extractv4sisi (ops[0], ops[1], ops[2]));
>         break;
>       case V2DImode:
> -       emit_insn (gen_vec_extractv2di (ops[0], ops[1], ops[2]));
> +       emit_insn (gen_vec_extractv2didi (ops[0], ops[1], ops[2]));
>         break;
>       case V2DFmode:
> -       emit_insn (gen_vec_extractv2df (ops[0], ops[1], ops[2]));
> +       emit_insn (gen_vec_extractv2dfdf (ops[0], ops[1], ops[2]));
>         break;
>       default:
>         abort ();
> --- gcc/config/spu/spu.md.jj  2017-01-01 12:45:40.000000000 +0100
> +++ gcc/config/spu/spu.md     2017-07-24 18:05:05.591888718 +0200
> @@ -256,6 +256,13 @@ (define_mode_attr inner  [(V16QI "QI")
>                         (V2DI  "DI")
>                         (V4SF  "SF")
>                         (V2DF  "DF")])
> +;; Like above, but in lower case
> +(define_mode_attr inner_l [(V16QI "qi")
> +                        (V8HI  "hi")
> +                        (V4SI  "si")
> +                        (V2DI  "di")
> +                        (V4SF  "sf")
> +                        (V2DF  "df")])
>  (define_mode_attr vmult  [(V16QI "1")
>                         (V8HI  "2")
>                         (V4SI  "4")
> @@ -4318,7 +4325,7 @@ (define_expand "restore_stack_nonlocal"
>  ;; vector patterns
>  
>  ;; Vector initialization
> -(define_expand "vec_init<mode>"
> +(define_expand "vec_init<mode><inner_l>"
>    [(match_operand:V 0 "register_operand" "")
>     (match_operand 1 "" "")]
>    ""
> @@ -4347,7 +4354,7 @@ (define_expand "vec_set<mode>"
>      operands[6] = GEN_INT (size);
>    })
>  
> -(define_expand "vec_extract<mode>"
> +(define_expand "vec_extract<mode><inner_l>"
>    [(set (match_operand:<inner> 0 "spu_reg_operand" "=r")
>       (vec_select:<inner> (match_operand:V 1 "spu_reg_operand" "r")
>                           (parallel [(match_operand 2 "const_int_operand" "i")])))]
> --- gcc/config/sparc/sparc.md.jj      2017-07-17 10:08:39.000000000 +0200
> +++ gcc/config/sparc/sparc.md 2017-07-24 18:11:52.396997069 +0200
> @@ -8621,6 +8621,8 @@ (define_mode_attr vconstr [(V1SI "f") (V
>  (define_mode_attr vfptype [(V1SI "single") (V2HI "single") (V4QI "single")
>                          (V1DI "double") (V2SI "double") (V4HI "double")
>                          (V8QI "double")])
> +(define_mode_attr veltmode [(V1SI "si") (V2HI "hi") (V4QI "qi") (V1DI "di")
> +                         (V2SI "si") (V4HI "hi") (V8QI "qi")])
>  
>  (define_expand "mov<VMALL:mode>"
>    [(set (match_operand:VMALL 0 "nonimmediate_operand" "")
> @@ -8762,7 +8764,7 @@ (define_split
>    DONE;
>  })
>  
> -(define_expand "vec_init<VMALL:mode>"
> +(define_expand "vec_init<VMALL:mode><VMALL:veltmode>"
>    [(match_operand:VMALL 0 "register_operand" "")
>     (match_operand:VMALL 1 "" "")]
>    "TARGET_VIS"
> --- gcc/config/ia64/vect.md.jj        2017-01-01 12:45:42.000000000 +0100
> +++ gcc/config/ia64/vect.md   2017-07-24 17:29:28.996628899 +0200
> @@ -1015,7 +1015,7 @@ (define_insn "*vec_interleave_highv2si"
>  }
>    [(set_attr "itanium_class" "mmshf")])
>  
> -(define_expand "vec_initv2si"
> +(define_expand "vec_initv2sisi"
>    [(match_operand:V2SI 0 "gr_register_operand" "")
>     (match_operand 1 "" "")]
>    ""
> @@ -1299,7 +1299,7 @@ (define_insn "*fselect"
>    "fselect %0 = %F2, %F3, %1"
>    [(set_attr "itanium_class" "fmisc")])
>  
> -(define_expand "vec_initv2sf"
> +(define_expand "vec_initv2sfsf"
>    [(match_operand:V2SF 0 "fr_register_operand" "")
>     (match_operand 1 "" "")]
>    ""
> @@ -1483,7 +1483,7 @@ (define_insn_and_split "*vec_extractv2sf
>    operands[1] = gen_rtx_REG (SFmode, REGNO (operands[1]));
>  })
>  
> -(define_expand "vec_extractv2sf"
> +(define_expand "vec_extractv2sfsf"
>    [(set (match_operand:SF 0 "register_operand" "")
>       (unspec:SF [(match_operand:V2SF 1 "register_operand" "")
>                   (match_operand:DI 2 "const_int_operand" "")]
> --- gcc/config/powerpcspe/vector.md.jj        2017-05-25 10:37:03.000000000 +0200
> +++ gcc/config/powerpcspe/vector.md   2017-07-24 17:41:21.897027743 +0200
> @@ -74,6 +74,16 @@ (define_mode_attr VEC_base [(V16QI "QI")
>                           (V1TI  "TI")
>                           (TI    "TI")])
>  
> +;; As above, but in lower case
> +(define_mode_attr VEC_base_l [(V16QI "qi")
> +                           (V8HI  "hi")
> +                           (V4SI  "si")
> +                           (V2DI  "di")
> +                           (V4SF  "sf")
> +                           (V2DF  "df")
> +                           (V1TI  "ti")
> +                           (TI    "ti")])
> +
>  ;; Same size integer type for floating point data
>  (define_mode_attr VEC_int [(V4SF  "v4si")
>                          (V2DF  "v2di")])
> @@ -1017,7 +1027,7 @@ (define_expand "fixuns_trunc<mode><VEC_i
>  
>  
>  ;; Vector initialization, set, extract
> -(define_expand "vec_init<mode>"
> +(define_expand "vec_init<mode><VEC_base_l>"
>    [(match_operand:VEC_E 0 "vlogical_operand" "")
>     (match_operand:VEC_E 1 "" "")]
>    "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
> @@ -1036,7 +1046,7 @@ (define_expand "vec_set<mode>"
>    DONE;
>  })
>  
> -(define_expand "vec_extract<mode>"
> +(define_expand "vec_extract<mode><VEC_base_l>"
>    [(match_operand:<VEC_base> 0 "register_operand" "")
>     (match_operand:VEC_E 1 "vlogical_operand" "")
>     (match_operand 2 "const_int_operand" "")]
> --- gcc/config/powerpcspe/paired.md.jj        2017-05-25 10:37:04.000000000 +0200
> +++ gcc/config/powerpcspe/paired.md   2017-07-24 17:42:17.980351097 +0200
> @@ -377,7 +377,7 @@ (define_insn "paired_muls1"
>    "ps_muls1 %0, %1, %2"
>    [(set_attr "type" "fp")])
>  
> -(define_expand "vec_initv2sf"
> +(define_expand "vec_initv2sfsf"
>    [(match_operand:V2SF 0 "gpc_reg_operand" "=f")
>     (match_operand 1 "" "")]
>    "TARGET_PAIRED_FLOAT"
> --- gcc/config/powerpcspe/altivec.md.jj       2017-05-25 10:37:05.000000000 +0200
> +++ gcc/config/powerpcspe/altivec.md  2017-07-24 17:42:49.897966010 +0200
> @@ -301,7 +301,7 @@ (define_split
>    for (i = 0; i < num_elements; i++)
>      RTVEC_ELT (v, i) = constm1_rtx;
>  
> -  emit_insn (gen_vec_initv4si (dest, gen_rtx_PARALLEL (mode, v)));
> +  emit_insn (gen_vec_initv4sisi (dest, gen_rtx_PARALLEL (mode, v)));
>    emit_insn (gen_rtx_SET (dest, gen_rtx_ASHIFT (mode, dest, dest)));
>    DONE;
>  })
> @@ -2222,7 +2222,7 @@ (define_expand "altivec_copysign_v4sf3"
>    RTVEC_ELT (v, 2) = GEN_INT (mask_val);
>    RTVEC_ELT (v, 3) = GEN_INT (mask_val);
>  
> -  emit_insn (gen_vec_initv4si (mask, gen_rtx_PARALLEL (V4SImode, v)));
> +  emit_insn (gen_vec_initv4sisi (mask, gen_rtx_PARALLEL (V4SImode, v)));
>    emit_insn (gen_vector_select_v4sf (operands[0], operands[1], operands[2],
>                                    gen_lowpart (V4SFmode, mask)));
>    DONE;
> @@ -3014,7 +3014,7 @@ (define_expand "vec_unpacku_hi_v16qi"
>    RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ? 16 :  0);
>    RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ?  7 : 16);
>  
> -  emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> +  emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
>    emit_insn (gen_vperm_v16qiv8hi (operands[0], operands[1], vzero, mask));
>    DONE;
>  }")
> @@ -3050,7 +3050,7 @@ (define_expand "vec_unpacku_hi_v8hi"
>    RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ?  6 : 17);
>    RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ?  7 : 16);
>  
> -  emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> +  emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
>    emit_insn (gen_vperm_v8hiv4si (operands[0], operands[1], vzero, mask));
>    DONE;
>  }")
> @@ -3086,7 +3086,7 @@ (define_expand "vec_unpacku_lo_v16qi"
>    RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ? 16 :  8);
>    RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ? 15 : 16);
>  
> -  emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> +  emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
>    emit_insn (gen_vperm_v16qiv8hi (operands[0], operands[1], vzero, mask));
>    DONE;
>  }")
> @@ -3122,7 +3122,7 @@ (define_expand "vec_unpacku_lo_v8hi"
>    RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ? 14 : 17);
>    RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ? 15 : 16);
>  
> -  emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> +  emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
>    emit_insn (gen_vperm_v8hiv4si (operands[0], operands[1], vzero, mask));
>    DONE;
>  }")
> @@ -3363,7 +3363,7 @@ (define_expand "mulv16qi3"
>       = gen_rtx_CONST_INT (QImode, BYTES_BIG_ENDIAN ? 2 * i + 17 : 15 - 2 * i);
>    }
>  
> -  emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
> +  emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
>    emit_insn (gen_altivec_vmulesb (even, operands[1], operands[2]));
>    emit_insn (gen_altivec_vmulosb (odd, operands[1], operands[2]));
>    emit_insn (gen_altivec_vperm_v8hiv16qi (operands[0], even, odd, mask));
> 
>       Jakub
>
