Richard Biener <rguent...@suse.de> writes:
> The following makes SSA rewrite (update-address-taken) recognize
> sets of aligned sub-vectors in aligned position
> (v2qi into v16qi, but esp. v8qi into v16qi).  It uses the
> BIT_INSERT_EXPR support for this, enabling that for vector
> typed values.  This makes us turn for example
>
> typedef unsigned char v16qi __attribute__((vector_size(16)));
> v16qi load (const void *p)
> {
>   v16qi r;
>   __builtin_memcpy (&r, p, 8);
>   return r;
> }
>
> into the following
>
> load (const void * p)
> {
>   v16qi r;
>   long unsigned int _3;
>   v16qi _5;
>   vector(8) unsigned char _7;
>
>   <bb 2> :
>   _3 = MEM[(char * {ref-all})p_2(D)];
>   _7 = VIEW_CONVERT_EXPR<vector(8) unsigned char>(_3);
>   r_9 = BIT_INSERT_EXPR <r_8(D), _7, 0 (64 bits)>;
>   _5 = r_9;
>   return _5;
>
> this isn't yet nicely expanded since the BIT_INSERT_EXPR
> expansion simply goes through store_bit_field and there's
> no vector-mode vec_set.
>
> Similar as to the single-element insert SSA rewrite already
> handles the transform is conditional on the involved
> vector types having non-BLKmode.  This is somewhat bad
> since the transform is supposed to enable SSA optimizations
> by rewriting memory vectors into SSA form.  Since splitting
> of larger generic vectors happens very much later only
> this pessimizes their use.  But the BIT_INSERT_EXPR
> expansion doesn't cope with BLKmode entities (source or
> destination).
>
> Extending BIT_INSERT_EXPR this way seems natural given
> the support of CONSTRUCTORs with smaller vectors.
> BIT_FIELD_REF isn't particularly restricted so can be
> used to extract sub-vectors as well.
>
> Code generation is as bad as before (RTL expansion eventually
> spills) but SSA optimizations are enabled on less trivial
> testcases.
>
> Boostrap / regtest running on x86_64-unknown-linux-gnu.
>
> Comments?
>
> Richard.
>
> 2019-05-14  Richard Biener  <rguent...@suse.de>
>
>       PR tree-optimization/90424
>       * tree-ssa.c (non_rewritable_lvalue_p): Handle inserts from
>       aligned subvectors.
>       (execute_update_addresses_taken): Likewise.
>       * tree-cfg.c (verify_gimple_assign_ternary): Likewise.
>
>       * g++.target/i386/pr90424-1.C: New testcase.
>       * g++.target/i386/pr90424-2.C: Likewise.
>
> Index: gcc/tree-ssa.c
> ===================================================================
> --- gcc/tree-ssa.c    (revision 271155)
> +++ gcc/tree-ssa.c    (working copy)
> @@ -1521,14 +1521,28 @@ non_rewritable_lvalue_p (tree lhs)
>        if (DECL_P (decl)
>         && VECTOR_TYPE_P (TREE_TYPE (decl))
>         && TYPE_MODE (TREE_TYPE (decl)) != BLKmode
> -       && operand_equal_p (TYPE_SIZE_UNIT (TREE_TYPE (lhs)),
> -                           TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (decl))), 0)
> +       && multiple_of_p (sizetype,
> +                         TYPE_SIZE_UNIT (TREE_TYPE (decl)),
> +                         TYPE_SIZE_UNIT (TREE_TYPE (lhs)))
>         && known_ge (mem_ref_offset (lhs), 0)
>         && known_gt (wi::to_poly_offset (TYPE_SIZE_UNIT (TREE_TYPE (decl))),
>                      mem_ref_offset (lhs))
>         && multiple_of_p (sizetype, TREE_OPERAND (lhs, 1),
>                           TYPE_SIZE_UNIT (TREE_TYPE (lhs))))
> -     return false;
> +     {
> +       if (operand_equal_p (TYPE_SIZE_UNIT (TREE_TYPE (lhs)),
> +                            TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (decl))),
> +                            0))
> +         return false;
> +       /* For sub-vector inserts the insert vector mode has to be
> +          supported.  */
> +       tree vtype = build_vector_type
> +           (TREE_TYPE (TREE_TYPE (decl)),
> +            tree_to_uhwi (TYPE_SIZE (TREE_TYPE (lhs)))
> +            / tree_to_uhwi (TYPE_SIZE (TREE_TYPE (TREE_TYPE (decl)))));

AFAICT nothing guarantees tree_fits_uhwi_p for the lhs, so this isn't
poly-int clean.  Is there a guarantee that lhs is a multiple of the
element size even for integers?  Or are we just relying on a vector
of 0 elements being rejected?

Maybe something like:

          tree elt_type = TREE_TYPE (TREE_TYPE (decl));
          unsigned int elt_bits = tree_to_uhwi (TYPE_SIZE (elt_type));
          poly_uint64 lhs_bits, nelts;
          if (poly_int_tree_p (TYPE_SIZE (TREE_TYPE (lhs)), &lhs_bits)
              && multiple_p (lhs_bits, elt_bits, &nelts))
            {
              tree vtype = build_vector_type (elt_type, nelts);

?  

Thanks,
Richard

Reply via email to