On Tue, Aug 30, 2011 at 2:03 PM, Richard Guenther
<richard.guent...@gmail.com> wrote:
> On Tue, Aug 30, 2011 at 4:31 AM, Artem Shinkarov
> <artyom.shinkar...@gmail.com> wrote:
>> Hi
>>
>> This is a patch for the explicit vector shuffling we have discussed a
>> long time ago here:
>> http://gcc.gnu.org/ml/gcc-patches/2010-08/msg01092.html
>>
>> The new patch introduces the new tree code, as we agreed, and expands
>> this code by checking the vshuffle pattern in the backend.
>>
>> The patch at the moment lacks of some examples, but mainly it works
>> fine for me. It would be nice if i386 gurus could look into the way I
>> am doing the expansion.
>>
>> Middle-end parts seems to be more or less fine, they have not changed
>> much from the previous time.
>
> +@code{__builtin_shuffle (vec, mask)} and
> +@code{__builtin_shuffle (vec0, vec1, mask)}. Both functions construct
>
> the latter would be __builtin_shuffle2.

Why??
That was the syntax we agreed on that elegantly handles both cases in one place.

> +bool
> +expand_vec_shuffle_expr_p (enum machine_mode mode, tree v0,
> +                          tree v1, tree mask)
> +{
> +#define inner_type_size(vec) \
> +  GET_MODE_BITSIZE (TYPE_MODE (TREE_TYPE (TREE_TYPE (vec))))
>
> missing comment.  No #defines like this please, just initialize
> two temporary variables.
>
> +
> +rtx
> +expand_vec_shuffle_expr (tree type, tree v0, tree v1, tree mask, rtx target)
> +{
>
> comment.
>
> +vshuffle:
> +  gcc_assert (v1 == v0);
> +
> +  icode = direct_optab_handler (vshuffle_optab, mode);
>
> hmm, so we don't have a vshuffle2 optab but always go via the
> builtin function, but only for constant masks there?  I wonder
> if we should arrange for targets to only support a vshuffle
> optab (thus, transition away from the builtin) and so
> unconditionally have a vshuffle2 optab only (with possibly
> equivalent v1 and v0?)

I have only implemented the case with non-constant mask that supports
only one argument. I think that it would be enough for the first
version. Later we can introduce vshuffle2 pattern and reuse the code
that expands vshuffle at the moment.

> I suppose Richard might remember what he had in mind back
> when we discussed this.
>
> Index: gcc/c-typeck.c
> ===================================================================
> --- gcc/c-typeck.c      (revision 177758)
> +++ gcc/c-typeck.c      (working copy)
> @@ -2815,6 +2815,68 @@ build_function_call_vec (location_t loc,
>       && !check_builtin_function_arguments (fundecl, nargs, argarray))
>     return error_mark_node;
>
> +  /* Typecheck a builtin function which is declared with variable
> +     argument list.  */
> +  if (fundecl && DECL_BUILT_IN (fundecl)
> +      && DECL_BUILT_IN_CLASS (fundecl) == BUILT_IN_NORMAL)
>
> just add to check_builtin_function_arguments which is called right
> in front of your added code.
>
> +          /* Here we change the return type of the builtin function
> +             from int f(...) --> t f(...) where t is a type of the
> +             first argument.  */
> +          fundecl = copy_node (fundecl);
> +          TREE_TYPE (fundecl) = build_function_type (TREE_TYPE (firstarg),
> +                                        TYPE_ARG_TYPES (TREE_TYPE 
> (fundecl)));
> +          function = build_fold_addr_expr (fundecl);
>
> oh, hum - now I remember ;)  Eventually the C frontend should handle
> this not via the function call mechanism but similar to how Joseph
> added __builtin_complex support with
>
> 2011-08-19  Joseph Myers  <jos...@codesourcery.com>
>
>        * c-parser.c (c_parser_postfix_expression): Handle RID_BUILTIN_COMPLEX.
>        * doc/extend.texi (__builtin_complex): Document.
>
> and then emit VEC_SHUFFLE_EXPRs directly from the frontend.  Joseph?
>
>          FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (inside_init), ix, 
> value)
> -           if (!CONSTANT_CLASS_P (value))
> +         if (!CONSTANT_CLASS_P (value))
>
> watch out for spurious whitespace changes.
>
> Index: gcc/gimplify.c
> ===================================================================
> --- gcc/gimplify.c      (revision 177758)
> +++ gcc/gimplify.c      (working copy)
> @@ -7050,6 +7050,7 @@ gimplify_expr (tree *expr_p, gimple_seq
>          break;
>
>        case BIT_FIELD_REF:
> +       case VEC_SHUFFLE_EXPR:
>
> I don't think that's quite the right place given the is_gimple_lvalue
> predicate on the first operand.  More like
>
>        case VEC_SHUFFLE_EXPR:
>           goto expr_3;
>
> +/* Vector shuffle expression. A = VEC_SHUFFLE_EXPR<v0, v1, maks>
>
> typo, mask
>
> +   means
> +
> +   freach i in length (mask):
> +     A = mask[i] < length (v0) ? v0[mask[i]] : v1[mask[i]]
> +*/
> +DEFTREECODE (VEC_SHUFFLE_EXPR, "vec_shuffle_expr", tcc_expression, 3)
>
> what is the (is there any?) constraint on the operand types, especially
> the mask type?
>
> Index: gcc/gimple.c
> ===================================================================
> --- gcc/gimple.c        (revision 177758)
> +++ gcc/gimple.c        (working copy)
> @@ -2623,6 +2623,7 @@ get_gimple_rhs_num_ops (enum tree_code c
>       || (SYM) == ADDR_EXPR                                                \
>       || (SYM) == WITH_SIZE_EXPR                                           \
>       || (SYM) == SSA_NAME                                                 \
> +      || (SYM) == VEC_SHUFFLE_EXPR                                         \
>       || (SYM) == VEC_COND_EXPR) ? GIMPLE_SINGLE_RHS                       \
>    : GIMPLE_INVALID_RHS),
>  #define END_OF_BASE_TREE_CODES (unsigned char) GIMPLE_INVALID_RHS,
>
> please make it GIMPLE_TERNARY_RHS instead.
>
> which requires adjustment at least here:
>
> Index: gcc/tree-ssa-operands.c
> ===================================================================
> --- gcc/tree-ssa-operands.c     (revision 177758)
> +++ gcc/tree-ssa-operands.c     (working copy)
> @@ -943,6 +943,7 @@ get_expr_operands (gimple stmt, tree *ex
>
>     case COND_EXPR:
>     case VEC_COND_EXPR:
> +    case VEC_SHUFFLE_EXPR:
>
>
> I think it would be nicer if the builtin would be handled by the frontend
> not as builtin but like __builtin_complex and we'd just deal with
> VEC_SHUFFLE_EXPR throughout the middle-end, eventually
> lowering it in tree-vect-generic.c.  So I didn't look at the lowering
> code in detail because that would obviously change then.
>
> Defering to Joseph for a decision here and to x86 maintainers for
> the target specific bits.
>
> Thanks,
> Richard.

I'll go and see how the __builtin_complex are treated, and try to
adjust the patch.


Thanks,
Artem.

>> ChangeLog:
>> 2011-08-30 Artjoms Sinkarovs <artyom.shinkar...@gmailc.com>
>>
>>        gcc/
>>        * optabs.c (expand_vec_shuffle_expr_p): New function. Checks
>>        if given expression can be expanded by the target.
>>        (expand_vec_shuffle_expr): New function. Expand VEC_SHUFFLE_EXPR
>>        using target vector instructions.
>>        * optabs.h: New optab vshuffle.
>>        (expand_vec_shuffle_expr_p): New prototype.
>>        (expand_vec_shuffle_expr): New prototype.
>>        * genopinit.c: Adjust to support vecshuffle.
>>        * builtins.def: New builtin __builtin_shuffle.
>>        * c-typeck.c (build_function_call_vec): Typecheck
>>        __builtin_shuffle, allowing only two or three arguments.
>>        Change the type of builtin depending on the arguments.
>>        (digest_init): Warn when constructor has less elements than
>>        vector type.
>>        * gimplify.c (gimplify_exp): Adjusted to support VEC_SHUFFLE_EXPR.
>>        * tree.def: New tree code VEC_SHUFFLE_EXPR.
>>        * tree-vect-generic.c (vector_element): New function. Returns an
>>        element of the vector at the given position.
>>        (lower_builtin_shuffle): Change builtin_shuffle with VEC_SHUFLLE_EXPR
>>        or expand an expression piecewise.
>>        (expand_vector_operations_1): Adjusted.
>>        (gate_expand_vector_operations_noop): New gate function.
>>        * gimple.c (get_gimple_rhs_num_ops): Adjust.
>>        * passes.c: Move veclower down.
>>        * tree-pretty-print.c (dump_generic_node): Recognize
>>        VEC_SHUFFLE_EXPR as valid expression.
>>        * tree-ssa-operands: Adjust.
>>
>>        gcc/config/i386
>>        * sse.md: (sseshuffint) New mode_attr. Correspondence between the
>>        vector and the type of the mask when shuffling.
>>        (vecshuffle<mode>): New expansion.
>>        * i386-protos.h (ix86_expand_vshuffle): New prototype.
>>        * i386.c (ix86_expand_vshuffle): Expand vshuffle using pshufb.
>>        (ix86_vectorize_builtin_vec_perm_ok): Adjust.
>>
>>        gcc/doc
>>        * extend.texi: Adjust.
>>
>>        gcc/testsuite
>>        * gcc.c-torture/execute/vect-shuffle-2.c: New test.
>>        * gcc.c-torture/execute/vect-shuffle-4.c: New test.
>>        * gcc.c-torture/execute/vect-shuffle-1.c: New test.
>>        * gcc.c-torture/execute/vect-shuffle-3.c: New test.
>>
>> bootstrapped on x86_64-unknown-linux-gnu. The AVX parts are not
>> tested, because I don't have actual hardware. It works with -mavx, the
>> assembler code looks fine to me. I'll test it on a real hardware in
>> couple of days.
>>
>>
>>
>> Thanks,
>> Artem Shinkarov.
>>
>

Reply via email to