Hi,

On Wed, Apr 12, 2017 at 01:55:01PM +0200, Richard Biener wrote:
> On Wed, 12 Apr 2017, Martin Jambor wrote:
> 
> > Hi,
> > 
> > the patch below is an attempt to deal with PR 80293 as non-invasively
> > as possible.  Basically, it switches off total SRA scalarization of
> > any local aggregates which contains an array of elements that have one
> > byte (or less).
> > 
> > The logic behind this is that accessing such arrays element-wise
> > usually results in poor code and that such char arrays are often used
> > for non-statically-typed content anyway, and we do not want to copy
> > that byte per byte.
> > 
> > Alan, do you think this could impact your constant pool scalarization
> > too severely?
> 
> Hmm, isn't one of the issues that we have
> 
>         if (VAR_P (var) && scalarizable_type_p (TREE_TYPE (var)))
>           {
>             if (tree_to_uhwi (TYPE_SIZE (TREE_TYPE (var)))
>                 <= max_scalarization_size)
>               {
>                 create_total_scalarization_access (var);
> 
> which limits the size of scalarizable vars but not the number
> of accesses we create for total scalarization?

Well, yes, but at least I understood from your comment #4 in the bug
that you did not want to limit the number of replacements for gcc 7?

> 
> Is scalarizable_type_p only used in contexts where we have no hint
> of the actual accesses?

There are should_scalarize_away_bitmap and
cannot_scalarize_away_bitmap hints.

Total scalarization is a hack process where we chop up small
aggregates according to their types - as opposed to actual accesses,
meaning it is an alien process to the rest of SRA - knowing that they
will go completely away afterwards (that is ensured by
cannot_scalarize_away_bitmap) and that this will facilitate copy
propagation (this is indicated by should_scalarize_away_bitmap, which
has a bit set if there is a non-scalar assignment _from_ (a part of)
the aggregate).

> That is, for the constant pool case we
> usually have
> 
>   x = .LC0;
>   .. = x[2];
> 
> so we have a "hint" that accesses on x are those we'd want to
> optimize to accesses to .LC0.

You don't need total scalarization for this, just the existence of
read from x[2] would be sufficient but thanks for pointing me to the
right direction.  For constant pool decl scalarization, it is not
important to introduce artificial accesses for x but for .LC0.
Therefore, I have adapted the patch to allow char arrays for const
decls only and verified that it scalarizes a const-pool array of chars
on Aarch64.  The (otherwise yet untested) result is below.

What do you think?

Martin


2017-04-13  Martin Jambor  <mjam...@suse.cz>

        * tree-sra.c (scalarizable_type_p): New parameter const_decl, make
        char arrays not totally scalarizable if it is false.
        (analyze_all_variable_accesses): Pass correct value in the new
        parameter.

testsuite/
        * g++.dg/tree-ssa/pr80293.C: New test.
---
 gcc/testsuite/g++.dg/tree-ssa/pr80293.C | 45 +++++++++++++++++++++++++++++++++
 gcc/tree-sra.c                          | 21 ++++++++++-----
 2 files changed, 60 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/tree-ssa/pr80293.C

diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr80293.C 
b/gcc/testsuite/g++.dg/tree-ssa/pr80293.C
new file mode 100644
index 00000000000..7faf35ae983
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr80293.C
@@ -0,0 +1,45 @@
+// { dg-do compile }
+// { dg-options "-O2 -std=gnu++11 -fdump-tree-optimized" } */
+
+#include <array>
+
+// Return a copy of the underlying memory of an arbitrary value.
+template <
+    typename T,
+    typename = typename 
std::enable_if<std::is_trivially_copyable<T>::value>::type
+>
+auto getMem(
+    T const & value
+) -> std::array<char, sizeof(T)> {
+    auto ret = std::array<char, sizeof(T)>{};
+    __builtin_memcpy(ret.data(), &value, sizeof(T));
+    return ret;
+}
+
+template <
+    typename T,
+    typename = typename 
std::enable_if<std::is_trivially_copyable<T>::value>::type
+>
+auto fromMem(
+    std::array<char, sizeof(T)> const & buf
+) -> T {
+    auto ret = T{};
+    __builtin_memcpy(&ret, buf.data(), sizeof(T));
+    return ret;
+}
+
+double foo1(std::uint64_t arg) {
+    return fromMem<double>(getMem(arg));
+}
+
+double foo2(std::uint64_t arg) {
+    return *reinterpret_cast<double*>(&arg);
+}
+
+double foo3(std::uint64_t arg) {
+    double ret;
+    __builtin_memcpy(&ret, &arg, sizeof(arg));
+    return ret;
+}
+
+// { dg-final { scan-tree-dump-not "BIT_FIELD_REF" "optimized" } }
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 02453d3ed9a..ab06be66131 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -949,10 +949,12 @@ create_access (tree expr, gimple *stmt, bool write)
 
 /* Return true iff TYPE is scalarizable - i.e. a RECORD_TYPE or fixed-length
    ARRAY_TYPE with fields that are either of gimple register types (excluding
-   bit-fields) or (recursively) scalarizable types.  */
+   bit-fields) or (recursively) scalarizable types.  CONST_DECL must be true if
+   we are considering a decl from constant pool.  If it is false, char arrays
+   will be refused.  */
 
 static bool
-scalarizable_type_p (tree type)
+scalarizable_type_p (tree type, bool const_decl)
 {
   gcc_assert (!is_gimple_reg_type (type));
   if (type_contains_placeholder_p (type))
@@ -970,7 +972,7 @@ scalarizable_type_p (tree type)
            return false;
 
          if (!is_gimple_reg_type (ft)
-             && !scalarizable_type_p (ft))
+             && !scalarizable_type_p (ft, const_decl))
            return false;
        }
 
@@ -978,10 +980,16 @@ scalarizable_type_p (tree type)
 
   case ARRAY_TYPE:
     {
+      HOST_WIDE_INT min_elem_size;
+      if (const_decl)
+       min_elem_size = 0;
+      else
+       min_elem_size = BITS_PER_UNIT;
+
       if (TYPE_DOMAIN (type) == NULL_TREE
          || !tree_fits_shwi_p (TYPE_SIZE (type))
          || !tree_fits_shwi_p (TYPE_SIZE (TREE_TYPE (type)))
-         || (tree_to_shwi (TYPE_SIZE (TREE_TYPE (type))) <= 0)
+         || (tree_to_shwi (TYPE_SIZE (TREE_TYPE (type))) <= min_elem_size)
          || !tree_fits_shwi_p (TYPE_MIN_VALUE (TYPE_DOMAIN (type))))
        return false;
       if (tree_to_shwi (TYPE_SIZE (type)) == 0
@@ -995,7 +1003,7 @@ scalarizable_type_p (tree type)
 
       tree elem = TREE_TYPE (type);
       if (!is_gimple_reg_type (elem)
-        && !scalarizable_type_p (elem))
+         && !scalarizable_type_p (elem, const_decl))
        return false;
       return true;
     }
@@ -2653,7 +2661,8 @@ analyze_all_variable_accesses (void)
       {
        tree var = candidate (i);
 
-       if (VAR_P (var) && scalarizable_type_p (TREE_TYPE (var)))
+       if (VAR_P (var) && scalarizable_type_p (TREE_TYPE (var),
+                                               constant_decl_p (var)))
          {
            if (tree_to_uhwi (TYPE_SIZE (TREE_TYPE (var)))
                <= max_scalarization_size)
-- 
2.12.0

Reply via email to