On Wed, Oct 29, 2014 at 6:34 AM, Zoran Jovanovic
<[email protected]> wrote:
> Hello,
> This is a new patch version in which the reported issue is fixed.
> The patch is also rebased to revision 216452, and some minor code clean-up
> is done.
FYI. This causes gfc_add_interface_mapping in fortran/trans-expr.c to
be miscompiled for aarch64-linux-gnu. I am still debugging it and
trying to get a smaller testcase.
Thanks,
Andrew
>
> --------------------------------------------------------------------------------------------------------------------------
>
> Lowering is now applied only to bit-field copy sequences that are merged.
> The data structure representing bit-field copy sequences is renamed and
> reduced in size.
> The optimization is turned on by default at -O2 and higher.
> Some comments are fixed.
>
> Benchmarking was performed on WebKit for Android.
> Code size reductions were observed in several files; the best examples are:
>
> core/rendering/style/StyleMultiColData (632->520 bytes)
> core/platform/graphics/FontDescription (1715->1475 bytes)
> core/rendering/style/FillLayer (5069->4513 bytes)
> core/rendering/style/StyleRareInheritedData (5618->5346 bytes)
> core/css/CSSSelectorList (4047->3887 bytes)
> core/platform/animation/CSSAnimationData (3844->3440 bytes)
> core/css/resolver/FontBuilder (13818->13350 bytes)
> core/platform/graphics/Font (16447->15975 bytes)
>
>
> Example:
>
> One of the motivating examples for this work was the copy constructor of a
> class that contains bit-fields.
>
> C++ code:
> class A
> {
> public:
> A(const A &x);
> unsigned a : 1;
> unsigned b : 2;
> unsigned c : 4;
> };
>
> A::A(const A&x)
> {
> a = x.a;
> b = x.b;
> c = x.c;
> }
>
> GIMPLE code without optimization:
>
> <bb 2>:
> _3 = x_2(D)->a;
> this_4(D)->a = _3;
> _6 = x_2(D)->b;
> this_4(D)->b = _6;
> _8 = x_2(D)->c;
> this_4(D)->c = _8;
> return;
>
> Optimized GIMPLE code:
> <bb 2>:
> _10 = x_2(D)->D.1867;
> _11 = BIT_FIELD_REF <_10, 7, 0>;
> _12 = this_4(D)->D.1867;
> _13 = _12 & 128;
> _14 = (unsigned char) _11;
> _15 = _13 | _14;
> this_4(D)->D.1867 = _15;
> return;
>
> Generated MIPS32r2 assembly code without optimization:
> lw $3,0($5)
> lbu $2,0($4)
> andi $3,$3,0x1
> andi $2,$2,0xfe
> or $2,$2,$3
> sb $2,0($4)
> lw $3,0($5)
> andi $2,$2,0xf9
> andi $3,$3,0x6
> or $2,$2,$3
> sb $2,0($4)
> lw $3,0($5)
> andi $2,$2,0x87
> andi $3,$3,0x78
> or $2,$2,$3
> j $31
> sb $2,0($4)
>
> Optimized MIPS32r2 assembly code:
> lw $3,0($5)
> lbu $2,0($4)
> andi $3,$3,0x7f
> andi $2,$2,0x80
> or $2,$3,$2
> j $31
> sb $2,0($4)
>
>
> The algorithm works at the basic-block level and consists of the following
> three major steps (a rough sketch of the step 2 condition follows the list):
> 1. Go through the basic block's statement list. For statement pairs that
> copy bit-field contents from one memory location to another, record the
> statement pointers and other necessary data in the corresponding data
> structure.
> 2. Identify records that represent adjacent bit-field accesses and mark them
> as merged.
> 3. Lower the bit-field accesses, using the new field size for those that can
> be merged.
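>
> As a rough sketch (and not the patch code itself), the step 2 adjacency
> check boils down to a predicate like the one below. The field names mirror
> the bfcopy structure defined in the patch; the real check in
> lower_bitfields additionally compares the source/destination addresses and
> word offsets:
>
>   /* Hypothetical helper: two recorded copies can merge when they share a
>      bit-field representative and both the source and the destination bit
>      ranges are contiguous.  */
>   static bool
>   adjacent_copies_p (const struct bfcopy *prev, const struct bfcopy *cur)
>   {
>     return (prev->bitfield_representative == cur->bitfield_representative
>             && prev->src_bit_offset + prev->src_bit_size
>                == cur->src_bit_offset
>             && prev->dst_bit_offset + prev->dst_bit_size
>                == cur->dst_bit_offset);
>   }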
>
>
> A new command-line option, "-fmerge-bitfields", is introduced.
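>
> A minimal way to exercise the option (mirroring the dg-options used by the
> new tests below; the exact dump-file suffix is an assumption based on the
> esra dump):
>
>   gcc -O2 -fmerge-bitfields -fdump-tree-esra -S foo.c
>   grep BIT_FIELD_REF foo.c.*esra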
>
>
> Tested - passed gcc regression tests for MIPS32r2.
>
>
> ChangeLog -
>
> gcc/ChangeLog:
> 2014-04-22 Zoran Jovanovic ([email protected])
> * common.opt (fmerge-bitfields): New option.
> * doc/invoke.texi: Add reference to "-fmerge-bitfields".
> * doc/invoke.texi: Add "-fmerge-bitfields" to the list of optimization
> flags turned on at -O2.
> * tree-sra.c (lower_bitfields): New function.
> Entry for (-fmerge-bitfields).
> (part_of_union_p): New function.
> (bf_access_candidate_p): New function.
> (lower_bitfield_read): New function.
> (lower_bitfield_write): New function.
> (bitfield_stmt_bfcopy_pair::hash): New function.
> (bitfield_stmt_bfcopy_pair::equal): New function.
> (bitfield_stmt_bfcopy_pair::remove): New function.
> (create_and_insert_bfcopy): New function.
> (get_bit_offset): New function.
> (add_stmt_bfcopy_pair): New function.
> (cmp_bfcopies): New function.
> (get_merged_bit_field_size): New function.
> * dwarf2out.c (simple_type_size_in_bits): Move to tree.c.
> (field_byte_offset): Move declaration to tree.h and make it extern.
> * testsuite/gcc.dg/tree-ssa/bitfldmrg1.c: New test.
> * testsuite/gcc.dg/tree-ssa/bitfldmrg2.c: New test.
> * tree-ssa-sccvn.c (expressions_equal_p): Move to tree.c.
> * tree-ssa-sccvn.h (expressions_equal_p): Move declaration to tree.h.
> * tree.c (expressions_equal_p): Move from tree-ssa-sccvn.c.
> (simple_type_size_in_bits): Move from dwarf2out.c.
> * tree.h (expressions_equal_p): Add declaration.
> (field_byte_offset): Add declaration.
>
> Patch -
>
>
> diff --git a/gcc/common.opt b/gcc/common.opt
> index 5db5e1e..cec145c 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -2270,6 +2270,10 @@ ftree-sra
> Common Report Var(flag_tree_sra) Optimization
> Perform scalar replacement of aggregates
>
> +fmerge-bitfields
> +Common Report Var(flag_tree_bitfield_merge) Optimization
> +Merge loads and stores of consecutive bitfields
> +
> ftree-ter
> Common Report Var(flag_tree_ter) Optimization
> Replace temporary expressions in the SSA->normal pass
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 23f272f..d4205e1 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -421,7 +421,7 @@ Objective-C and Objective-C++ Dialects}.
> -fsplit-ivs-in-unroller -fsplit-wide-types -fssa-phiopt -fstack-protector
> @gol
> -fstack-protector-all -fstack-protector-strong -fstrict-aliasing @gol
> -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
> --ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
> +-fmerge-bitfields -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
> -ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
> -ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
> -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
> @@ -7152,6 +7152,7 @@ also turns on the following optimization flags:
> -fipa-sra @gol
> -fipa-icf @gol
> -fisolate-erroneous-paths-dereference @gol
> +-fmerge-bitfields @gol
> -foptimize-sibling-calls @gol
> -foptimize-strlen @gol
> -fpartial-inlining @gol
> @@ -8133,6 +8134,12 @@ pointer alignment information.
> This pass only operates on local scalar variables and is enabled by default
> at @option{-O} and higher. It requires that @option{-ftree-ccp} is enabled.
>
> +@item -fmerge-bitfields
> +@opindex fmerge-bitfields
> +Combines several adjacent bit-field accesses that copy values
> +from one memory location to another into a single bit-field access.
> +This is enabled by default at @option{-O2} and higher.
> +
> @item -ftree-ccp
> @opindex ftree-ccp
> Perform sparse conditional constant propagation (CCP) on trees. This
> diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
> index 8c65176..88c29b1 100644
> --- a/gcc/dwarf2out.c
> +++ b/gcc/dwarf2out.c
> @@ -3205,8 +3205,6 @@ static HOST_WIDE_INT ceiling (HOST_WIDE_INT, unsigned int);
> static tree field_type (const_tree);
> static unsigned int simple_type_align_in_bits (const_tree);
> static unsigned int simple_decl_align_in_bits (const_tree);
> -static unsigned HOST_WIDE_INT simple_type_size_in_bits (const_tree);
> -static HOST_WIDE_INT field_byte_offset (const_tree);
> static void add_AT_location_description (dw_die_ref, enum dwarf_attribute,
> dw_loc_list_ref);
> static void add_data_member_location_attribute (dw_die_ref, tree);
> @@ -10413,25 +10411,6 @@ is_base_type (tree type)
> return 0;
> }
>
> -/* Given a pointer to a tree node, assumed to be some kind of a ..._TYPE
> - node, return the size in bits for the type if it is a constant, or else
> - return the alignment for the type if the type's size is not constant, or
> - else return BITS_PER_WORD if the type actually turns out to be an
> - ERROR_MARK node. */
> -
> -static inline unsigned HOST_WIDE_INT
> -simple_type_size_in_bits (const_tree type)
> -{
> - if (TREE_CODE (type) == ERROR_MARK)
> - return BITS_PER_WORD;
> - else if (TYPE_SIZE (type) == NULL_TREE)
> - return 0;
> - else if (tree_fits_uhwi_p (TYPE_SIZE (type)))
> - return tree_to_uhwi (TYPE_SIZE (type));
> - else
> - return TYPE_ALIGN (type);
> -}
> -
> /* Similarly, but return an offset_int instead of UHWI. */
>
> static inline offset_int
> @@ -14906,7 +14885,7 @@ round_up_to_align (const offset_int &t, unsigned int align)
> because the offset is actually variable. (We can't handle the latter case
> just yet). */
>
> -static HOST_WIDE_INT
> +HOST_WIDE_INT
> field_byte_offset (const_tree decl)
> {
> offset_int object_offset_in_bits;
> diff --git a/gcc/opts.c b/gcc/opts.c
> index 3054196..ef3007b 100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -500,6 +500,7 @@ static const struct default_options default_options_table[] =
> { OPT_LEVELS_2_PLUS, OPT_fipa_icf, NULL, 1 },
> { OPT_LEVELS_2_PLUS, OPT_fisolate_erroneous_paths_dereference, NULL, 1 },
> { OPT_LEVELS_2_PLUS, OPT_fuse_caller_save, NULL, 1 },
> + { OPT_LEVELS_2_PLUS, OPT_fmerge_bitfields, NULL, 1 },
>
> /* -O3 optimizations. */
> { OPT_LEVELS_3_PLUS, OPT_ftree_loop_distribute_patterns, NULL, 1 },
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c
> new file mode 100644
> index 0000000..638db58
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c
> @@ -0,0 +1,30 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fmerge-bitfields -fdump-tree-esra" } */
> +
> +struct S
> +{
> + unsigned f1:7;
> + unsigned f2:9;
> + unsigned f3:3;
> + unsigned f4:5;
> + unsigned f5:1;
> + unsigned f6:2;
> +};
> +
> +unsigned
> +foo (struct S *p1, struct S *p2, int *ptr)
> +{
> + p2->f1 = p1->f1;
> + p2->f2 = p1->f2;
> + p2->f3 = p1->f3;
> + *ptr = 7;
> + p2->f4 = p1->f4;
> + p2->f5 = p1->f5;
> + p2->f6 = p1->f6;
> + return 0;
> +}
> +
> +/* Check if bit-field access is lowered. */
> +/* { dg-final { scan-tree-dump "BIT_FIELD_REF" "esra" } } */
> +/* { dg-final { cleanup-tree-dump "esra" } } */
> +
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c
> new file mode 100644
> index 0000000..a7b906f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c
> @@ -0,0 +1,89 @@
> +/* Check whether use of -fmerge-bitfields in the presence of overlapping
> + unions results in incorrect code. */
> +/* { dg-options "-O2 -fmerge-bitfields" } */
> +/* { dg-do run } */
> +extern void abort (void);
> +
> +struct s1
> +{
> + unsigned f1:4;
> + unsigned f2:4;
> + unsigned f3:4;
> +};
> +
> +struct s2
> +{
> + unsigned char c;
> + unsigned f1:4;
> + unsigned f2:4;
> + unsigned f3:4;
> +};
> +
> +struct s3
> +{
> + unsigned f1:3;
> + unsigned f2:3;
> + unsigned f3:3;
> +};
> +
> +struct s4
> +{
> + unsigned f0:3;
> + unsigned f1:3;
> + unsigned f2:3;
> + unsigned f3:3;
> +};
> +
> +union un_1
> +{
> + struct s1 a;
> + struct s2 b;
> +};
> +
> +union un_2
> +{
> + struct s3 a;
> + struct s4 b;
> +};
> +
> +void f1 (union un_1 *p1, union un_1 *p2)
> +{
> + p2->a.f3 = p1->b.f3;
> + p2->a.f2 = p1->b.f2;
> + p2->a.f1 = p1->b.f1;
> +
> + if (p1->a.f1 != 3)
> + abort ();
> +}
> +
> +void f2 (union un_2 *p1, union un_2 *p2)
> +{
> + p2->b.f1 = p1->a.f1;
> + p2->b.f2 = p1->a.f2;
> + p2->b.f3 = p1->a.f3;
> +
> + if (p2->b.f1 != 0 || p2->b.f2 != 0 || p2->b.f3 != 0)
> + abort ();
> +}
> +
> +int main ()
> +{
> + union un_1 u1;
> + union un_2 u2;
> +
> + u1.b.f1 = 1;
> + u1.b.f2 = 2;
> + u1.b.f3 = 3;
> + u1.b.c = 0;
> +
> + f1 (&u1, &u1);
> +
> + u2.b.f0 = 0;
> + u2.b.f1 = 1;
> + u2.b.f2 = 2;
> + u2.b.f3 = 3;
> +
> + f2 (&u2, &u2);
> +
> + return 0;
> +}
> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
> index fb24114..60dd095 100644
> --- a/gcc/tree-sra.c
> +++ b/gcc/tree-sra.c
> @@ -2243,7 +2243,7 @@ analyze_access_subtree (struct access *root, struct access *parent,
> if (INTEGRAL_TYPE_P (root->type)
> && (TREE_CODE (root->type) != INTEGER_TYPE
> || TYPE_PRECISION (root->type) != root->size)
> - /* But leave bitfield accesses alone. */
> + /* But leave bit-field accesses alone. */
> && (TREE_CODE (root->expr) != COMPONENT_REF
> || !DECL_BIT_FIELD (TREE_OPERAND (root->expr, 1))))
> {
> @@ -3519,12 +3519,632 @@ perform_intra_sra (void)
> return ret;
> }
>
> +/* Bit-field copy sequences. */
> +
> +struct bfcopy
> +{
> + bfcopy ():merged (false), modified (false), is_barrier (false), next (0),
> + head_copy (0)
> + {
> + }
> +
> + gimple load_stmt; /* Bit-field load statement. */
> + gimple store_stmt; /* Bit-field store statement. */
> + unsigned src_offset_words; /* Bit-field offset at src in words. */
> + unsigned src_bit_offset; /* Bit-field offset inside source word. */
> + unsigned src_bit_size; /* Size of bit-field in source word. */
> + unsigned dst_offset_words; /* Bit-field offset at dst in words. */
> + unsigned dst_bit_offset; /* Bit-field offset inside destination
> + word. */
> + unsigned src_field_offset; /* Source field offset. */
> + unsigned dst_bit_size; /* Size of bit-field in destination word. */
> + tree src_addr; /* Address of source memory access. */
> + tree dst_addr; /* Address of destination memory access. */
> + unsigned merged:1; /* 1 if copy is merged with another one. */
> + unsigned modified:1; /* 1 if bit-field size is modified. */
> + unsigned is_barrier:1; /* 1 if copy is barrier (call or mem
> + access). */
> + struct bfcopy *next; /* Copy with which this one is merged. */
> + tree bitfield_representative; /* Bit-field representative of original
> + declaration. */
> + struct bfcopy *head_copy; /* Head of copies list where this one is
> + merged. */
> +};
> +
> +typedef struct bfcopy *bfcopy_p;
> +
> +/* Returns true if the given COMPONENT_REF is part of a union. */
> +
> +static bool part_of_union_p (tree component)
> +{
> + tree tmp = component;
> + bool res = false;
> + while (TREE_CODE (tmp) == COMPONENT_REF)
> + {
> + if (TREE_CODE (TREE_TYPE (tmp)) == UNION_TYPE)
> + {
> + res = true;
> + break;
> + }
> + tmp = TREE_OPERAND (tmp, 0);
> + }
> + if (tmp && (TREE_CODE (TREE_TYPE (tmp)) == UNION_TYPE))
> + res = true;
> + return res;
> +}
> +
> +/* Return TRUE if REF is a bit-field access. If OFF is not NULL it will
> + contain the offset of the bit-field within the representative in bits. */
> +
> +static bool
> +bf_access_candidate_p (tree ref, unsigned HOST_WIDE_INT * off)
> +{
> + if (TREE_CODE (ref) != COMPONENT_REF)
> + return false;
> + if (part_of_union_p (ref))
> + return false;
> + tree field = TREE_OPERAND (ref, 1);
> + if (!DECL_BIT_FIELD_TYPE (field))
> + return false;
> +
> + tree rep = DECL_BIT_FIELD_REPRESENTATIVE (field);
> + if (!rep)
> + return false;
> + /* Do not lower if representative is bigger than one word. */
> + if (TREE_CODE (TREE_TYPE (rep)) == ARRAY_TYPE)
> + return false;
> +
> + if (!off)
> + return true;
> +
> + if (tree_fits_uhwi_p (DECL_FIELD_OFFSET (field))
> + && tree_fits_uhwi_p (DECL_FIELD_OFFSET (rep)))
> + *off = (tree_to_uhwi (DECL_FIELD_OFFSET (field))
> + - tree_to_uhwi (DECL_FIELD_OFFSET (rep))) * BITS_PER_UNIT;
> + else
> + *off = 0;
> + *off += (tree_to_uhwi (DECL_FIELD_BIT_OFFSET (field))
> + - tree_to_uhwi (DECL_FIELD_BIT_OFFSET (rep)));
> +
> + return true;
> +}
> +
> +
> +/* Lower the bit-field read at *GSI. */
> +
> +static void
> +lower_bitfield_read (gimple_stmt_iterator * gsi, unsigned HOST_WIDE_INT off,
> + tree size, tree type)
> +{
> + gimple stmt = gsi_stmt (*gsi);
> + tree ref = gimple_assign_rhs1 (stmt);
> + tree field = TREE_OPERAND (ref, 1);
> + tree rep = DECL_BIT_FIELD_REPRESENTATIVE (field);
> + tree loadres = make_ssa_name (TREE_TYPE (rep), NULL);
> + gimple load = gimple_build_assign (loadres,
> + build3 (COMPONENT_REF, TREE_TYPE (rep),
> + TREE_OPERAND (ref, 0), rep,
> + NULL_TREE));
> + gimple_set_vuse (load, gimple_vuse (stmt));
> + gsi_insert_before (gsi, load, GSI_SAME_STMT);
> + if (!type)
> + type = TREE_TYPE (ref);
> + gimple_assign_set_rhs1 (stmt,
> + build3 (BIT_FIELD_REF, type,
> + loadres, size, bitsize_int (off)));
> + update_stmt (stmt);
> +}
> +
> +/* Lower the bit-field write at *GSI. */
> +
> +static void
> +lower_bitfield_write (gimple_stmt_iterator * gsi, unsigned HOST_WIDE_INT off,
> + tree size)
> +{
> + gimple stmt = gsi_stmt (*gsi);
> + tree ref = gimple_assign_lhs (stmt);
> + tree field = TREE_OPERAND (ref, 1);
> + tree rep = DECL_BIT_FIELD_REPRESENTATIVE (field);
> + tree loadres = make_ssa_name (TREE_TYPE (rep), NULL);
> + gimple load = gimple_build_assign (loadres,
> + build3 (COMPONENT_REF, TREE_TYPE (rep),
> + unshare_expr
> + (TREE_OPERAND (ref, 0)),
> + rep,
> + NULL_TREE));
> + gimple_set_vuse (load, gimple_vuse (stmt));
> + gsi_insert_before (gsi, load, GSI_SAME_STMT);
> + /* Mask out bits. */
> + tree masked = make_ssa_name (TREE_TYPE (rep), NULL);
> + tree mask = double_int_to_tree (TREE_TYPE (rep),
> + ~double_int::mask
> + (TREE_INT_CST_LOW (size)).lshift (off));
> + gimple tems = gimple_build_assign_with_ops (BIT_AND_EXPR,
> + masked, loadres, mask);
> + gsi_insert_before (gsi, tems, GSI_SAME_STMT);
> + /* Zero-extend the value to representative size. */
> + tree tem2;
> + if (!TYPE_UNSIGNED (TREE_TYPE (field)))
> + {
> + tem2 = make_ssa_name (unsigned_type_for (TREE_TYPE (field)), NULL);
> + tems = gimple_build_assign_with_ops (NOP_EXPR, tem2,
> + gimple_assign_rhs1 (stmt),
> + NULL_TREE);
> + gsi_insert_before (gsi, tems, GSI_SAME_STMT);
> + }
> + else
> + tem2 = gimple_assign_rhs1 (stmt);
> + tree tem = make_ssa_name (TREE_TYPE (rep), NULL);
> + tems = gimple_build_assign_with_ops (NOP_EXPR, tem, tem2, NULL_TREE);
> + gsi_insert_before (gsi, tems, GSI_SAME_STMT);
> + /* Shift the value into place. */
> + if (off != 0)
> + {
> + tem2 = make_ssa_name (TREE_TYPE (rep), NULL);
> + tems = gimple_build_assign_with_ops (LSHIFT_EXPR, tem2, tem,
> + size_int (off));
> + gsi_insert_before (gsi, tems, GSI_SAME_STMT);
> + }
> + else
> + tem2 = tem;
> + /* Merge masked loaded value and value. */
> + tree modres = make_ssa_name (TREE_TYPE (rep), NULL);
> + gimple mod = gimple_build_assign_with_ops (BIT_IOR_EXPR, modres,
> + masked, tem2);
> + gsi_insert_before (gsi, mod, GSI_SAME_STMT);
> + /* Finally adjust the store. */
> + gimple_assign_set_rhs1 (stmt, modres);
> + gimple_assign_set_lhs (stmt,
> + build3 (COMPONENT_REF, TREE_TYPE (rep),
> + TREE_OPERAND (ref, 0), rep, NULL_TREE));
> + update_stmt (stmt);
> +}
> +
> +/* Connects a statement with the bit-field copy sequence to which that
> + statement belongs. */
> +
> +struct bitfield_stmt_bfcopy_pair
> +{
> + gimple stmt;
> + bfcopy *copy;
> + bitfield_stmt_bfcopy_pair (gimple s, bfcopy * c):stmt (s), copy (c)
> + {
> + };
> + /* hash_table support. */
> + typedef bitfield_stmt_bfcopy_pair value_type;
> + typedef bitfield_stmt_bfcopy_pair compare_type;
> + static inline hashval_t hash (const bitfield_stmt_bfcopy_pair * const);
> + static inline int equal (const bitfield_stmt_bfcopy_pair * const,
> + const bitfield_stmt_bfcopy_pair * const);
> + static inline void remove (bitfield_stmt_bfcopy_pair *);
> +};
> +
> +hashval_t
> +bitfield_stmt_bfcopy_pair::hash (const bitfield_stmt_bfcopy_pair * const a)
> +{
> + return hashval_t (gimple_uid (a->stmt));
> +}
> +
> +int
> +bitfield_stmt_bfcopy_pair::equal (const bitfield_stmt_bfcopy_pair * const a,
> + const bitfield_stmt_bfcopy_pair * const b)
> +{
> + return a->stmt == b->stmt;
> +}
> +
> +void
> +bitfield_stmt_bfcopy_pair::remove (bitfield_stmt_bfcopy_pair * a)
> +{
> + delete a;
> +}
> +
> +/* Create a new bfcopy structure and add it to the given bitfield_copies
> + vector. */
> +
> +static bfcopy_p
> +create_and_insert_bfcopy (vec <bfcopy_p>*bitfield_copies)
> +{
> + bfcopy_p copy = new bfcopy;
> + bitfield_copies->safe_push (copy);
> + return copy;
> +}
> +
> +/* Get offset of the field in bits from the start of the record. */
> +
> +static inline HOST_WIDE_INT
> +get_bit_offset (tree decl)
> +{
> + tree type = DECL_BIT_FIELD_TYPE (decl);
> + HOST_WIDE_INT bitpos_int;
> +
> + /* Must be a field and a bit-field. */
> + gcc_assert (type && TREE_CODE (decl) == FIELD_DECL);
> + /* Bit position and decl size should be integer constants that can be
> + represented in a single HOST_WIDE_INT. */
> + if (!tree_fits_uhwi_p (bit_position (decl))
> + || !tree_fits_uhwi_p (DECL_SIZE (decl)))
> + return -1;
> +
> + bitpos_int = int_bit_position (decl);
> + return bitpos_int;
> +}
> +
> +/* Creates new bitfield_stmt_bfcopy_pair structure and adds it to given
> + htab. */
> +
> +static bool
> +add_stmt_bfcopy_pair (hash_table < bitfield_stmt_bfcopy_pair > &bf_stmnt_cpy,
> + bfcopy * copy, gimple stmt)
> +{
> + bitfield_stmt_bfcopy_pair p (stmt, copy);
> + bitfield_stmt_bfcopy_pair **slot = bf_stmnt_cpy.find_slot (&p, INSERT);
> + if (!*slot)
> + {
> + *slot = new bitfield_stmt_bfcopy_pair (stmt, copy);
> + return true;
> + }
> + return false;
> +}
> +
> +/* Passed to qsort. Compares two bfcopy records. */
> +
> +static int
> +cmp_bfcopies (const void *p1, const void *p2)
> +{
> + const bfcopy_p *ap1 = (const bfcopy_p*) p1;
> + const bfcopy_p *ap2 = (const bfcopy_p*) p2;
> + const bfcopy_p a1 = *ap1;
> + const bfcopy_p a2 = *ap2;
> +
> + if (DECL_UID (a1->bitfield_representative) -
> + DECL_UID (a2->bitfield_representative))
> + return DECL_UID (a1->bitfield_representative) -
> + DECL_UID (a2->bitfield_representative);
> +
> + if (!expressions_equal_p (a1->src_addr, a2->src_addr))
> + return a1->src_addr - a2->src_addr;
> +
> + if (!expressions_equal_p (a1->dst_addr, a2->dst_addr))
> + return a1->dst_addr - a2->dst_addr;
> +
> + if (a1->src_offset_words - a2->src_offset_words)
> + return a1->src_offset_words - a2->src_offset_words;
> +
> + return a1->src_bit_offset - a2->src_bit_offset;
> +}
> +
> +/* Returns size of combined bitfields. Size cannot be larger than size
> + of largest directly accessible memory unit. */
> +
> +static int
> +get_merged_bit_field_size (bfcopy * copy)
> +{
> + bfcopy *tmp_copy = copy;
> + int size = 0;
> +
> + while (tmp_copy)
> + {
> + size += tmp_copy->src_bit_size;
> + tmp_copy = tmp_copy->next;
> + }
> + return size;
> +}
> +
> +/* Lower bit-field accesses to accesses to their
> + DECL_BIT_FIELD_REPRESENTATIVE. */
> +
> +static void
> +lower_bitfields (void)
> +{
> + basic_block bb;
> + hash_table<bitfield_stmt_bfcopy_pair> *bf_stmnt_cpy;
> + vec <bfcopy_p>bitfield_copies;
> + bfcopy_p copy;
> + bf_stmnt_cpy = new hash_table<bitfield_stmt_bfcopy_pair> (1);
> +
> + FOR_EACH_BB_FN (bb, cfun)
> + {
> + bf_stmnt_cpy->empty ();
> + tree prev_representative = NULL_TREE;
> + bitfield_copies.create (0);
> +
> + /* There are two passes, the first one identifies interesting
> + bit-field accesses and the second one actually lowers them. */
> + for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
> + !gsi_end_p (gsi); gsi_next (&gsi))
> + {
> + gimple stmt = gsi_stmt (gsi);
> +
> + if (!gimple_assign_single_p (stmt) || gimple_has_volatile_ops (stmt))
> + continue;
> +
> + tree ref = gimple_assign_rhs1 (stmt);
> + if (bf_access_candidate_p (ref, NULL))
> + {
> + gimple use_stmt;
> + use_operand_p use;
> + tree op0 = TREE_OPERAND (ref, 0);
> + tree op1 = TREE_OPERAND (ref, 1);
> +
> + if (TREE_CODE (DECL_CONTEXT (op1)) == UNION_TYPE
> + || TREE_CODE (DECL_CONTEXT (op1)) == QUAL_UNION_TYPE)
> + continue;
> +
> + if (TREE_CODE (gimple_assign_lhs (stmt)) == SSA_NAME
> + && single_imm_use (gimple_assign_lhs (stmt), &use,
> + &use_stmt) && is_gimple_assign (use_stmt))
> + {
> + tree uses_stmt_lhs = gimple_assign_lhs (use_stmt);
> +
> + if (bf_access_candidate_p (uses_stmt_lhs, NULL))
> + {
> + tree use_op0 = TREE_OPERAND (uses_stmt_lhs, 0);
> + tree use_op1 = TREE_OPERAND (uses_stmt_lhs, 1);
> + tree use_repr = DECL_BIT_FIELD_REPRESENTATIVE (use_op1);
> + if (prev_representative
> + && (prev_representative != use_repr))
> + {
> + /* If the previous bfcopy has a different
> + representative then a barrier is needed
> + between it and the new access. */
> + copy = create_and_insert_bfcopy (&bitfield_copies);
> + copy->is_barrier = true;
> + }
> + prev_representative = use_repr;
> + /* Create new bit-field bfcopy structure. */
> + copy = create_and_insert_bfcopy (&bitfield_copies);
> + /* Collect bfcopy data - load instruction. */
> + copy->src_bit_size = tree_to_uhwi (DECL_SIZE (op1));
> + copy->src_bit_offset = get_bit_offset (op1);
> + copy->src_offset_words =
> + field_byte_offset (op1) / UNITS_PER_WORD;
> + copy->src_field_offset =
> + tree_to_uhwi (DECL_FIELD_OFFSET (op1));
> + copy->src_addr = op0;
> + copy->load_stmt = gsi_stmt (gsi);
> + /* Collect bfcopy data - store instruction. */
> + copy->dst_bit_size = tree_to_uhwi (DECL_SIZE (use_op1));
> + copy->dst_bit_offset = get_bit_offset (use_op1);
> + copy->dst_offset_words =
> + field_byte_offset (use_op1) / UNITS_PER_WORD;
> + copy->dst_addr = use_op0;
> + copy->store_stmt = use_stmt;
> + add_stmt_bfcopy_pair (*bf_stmnt_cpy, copy, stmt);
> + add_stmt_bfcopy_pair (*bf_stmnt_cpy, copy, use_stmt);
> + copy->bitfield_representative = use_repr;
> + }
> + }
> + }
> +
> + /* Insert a barrier for merging if the statement is a function call
> + or a memory access. */
> + bitfield_stmt_bfcopy_pair csdata (stmt, NULL);
> + if (!bf_stmnt_cpy->find (&csdata)
> + && ((gimple_code (stmt) == GIMPLE_CALL)
> + || (gimple_has_mem_ops (stmt))))
> + {
> + /* Create new bfcopy access structure. */
> + copy = create_and_insert_bfcopy (&bitfield_copies);
> + /* Mark it as barrier. */
> + copy->is_barrier = true;
> + }
> + }
> +
> + /* If there are not at least two sequences, go to the next basic block. */
> + if (bitfield_copies.length () <= 1)
> + {
> + bitfield_copies.release ();
> + continue;
> + }
> + vec <bfcopy_p>bitfield_copies_merge = vNULL;
> + /* Try to merge different bfcopy sequences. */
> + for (int ix = 0; bitfield_copies.iterate (ix, &copy); ix++)
> + {
> + bfcopy_p head_copy;
> + bfcopy_p mrg_copy;
> + bfcopy_p prev_copy;
> +
> + if (!bitfield_copies_merge.exists ())
> + bitfield_copies_merge.create (0);
> +
> + if (!copy->is_barrier)
> + bitfield_copies_merge.safe_push (copy);
> +
> + if (!copy->is_barrier
> + && !(copy == bitfield_copies.last ()
> + && !bitfield_copies_merge.is_empty ()))
> + continue;
> +
> + bitfield_copies_merge.qsort (cmp_bfcopies);
> +
> + head_copy = prev_copy = NULL;
> + int iy;
> + for (iy = 0; bitfield_copies_merge.iterate (iy, &mrg_copy); iy++)
> + {
> + if (head_copy
> + && expressions_equal_p (head_copy->src_addr,
> + mrg_copy->src_addr)
> + && expressions_equal_p (head_copy->dst_addr,
> + mrg_copy->dst_addr)
> + && prev_copy->src_offset_words
> + == mrg_copy->src_offset_words
> + && prev_copy->dst_offset_words
> + == mrg_copy->dst_offset_words
> + && prev_copy->src_bit_offset + prev_copy->src_bit_size
> + == mrg_copy->src_bit_offset
> + && prev_copy->dst_bit_offset + prev_copy->dst_bit_size
> + == mrg_copy->dst_bit_offset
> + && prev_copy->bitfield_representative
> + == mrg_copy->bitfield_representative)
> + {
> + /* Merge conditions are satisfied - merge copies. */
> + mrg_copy->merged = true;
> + prev_copy->next = mrg_copy;
> + head_copy->modified = true;
> + prev_copy = mrg_copy;
> + mrg_copy->head_copy = head_copy;
> + }
> + else
> + head_copy = prev_copy = mrg_copy;
> + }
> + bitfield_copies_merge.release ();
> + bitfield_copies_merge = vNULL;
> + }
> +
> + tree reaching_vuse = NULL_TREE;
> +
> + for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
> + {
> + gimple stmt = gsi_stmt (gsi);
> +
> + /* Fix vuse info if necessary. */
> + if (gimple_has_mem_ops (stmt) && reaching_vuse != NULL_TREE)
> + {
> + gimple_set_vuse (stmt, reaching_vuse);
> + update_stmt (stmt);
> + }
> +
> + if (!gimple_assign_single_p (stmt) || gimple_has_volatile_ops (stmt))
> + {
> + gsi_next (&gsi);
> + if (gimple_vdef (stmt))
> + reaching_vuse = gimple_vdef (stmt);
> + continue;
> + }
> +
> + tree ref;
> + unsigned HOST_WIDE_INT off;
> + tree size;
> + bool deleted = false;
> +
> + /* Lower a bit-field read. */
> + ref = gimple_assign_rhs1 (stmt);
> + if (bf_access_candidate_p (ref, &off))
> + {
> + bfcopy *cc = NULL;
> + bitfield_stmt_bfcopy_pair st_cpy (stmt, NULL);
> + bitfield_stmt_bfcopy_pair *p_st_cpy;
> + p_st_cpy = bf_stmnt_cpy->find (&st_cpy);
> + unsigned HOST_WIDE_INT mrg_off;
> + if (p_st_cpy)
> + cc = p_st_cpy->copy;
> +
> + if (cc && (cc->merged || cc->modified))
> + size =
> + build_int_cst (unsigned_type_node,
> + get_merged_bit_field_size (cc->head_copy ?
> + cc->head_copy :
> + cc));
> + else
> + size = DECL_SIZE (TREE_OPERAND (ref, 1));
> + if (cc && cc->merged
> + && bf_access_candidate_p (gimple_assign_rhs1
> + (cc->head_copy->load_stmt),
> + &mrg_off))
> + off = mrg_off;
> + if (cc && cc->merged)
> + {
> + tree head_rhs = gimple_assign_rhs1 (cc->head_copy->load_stmt);
> + switch (TREE_CODE (head_rhs))
> + {
> + case COMPONENT_REF:
> + if (bf_access_candidate_p (head_rhs, &mrg_off))
> + off = mrg_off;
> + break;
> + case BIT_FIELD_REF:
> + off = tree_to_uhwi (TREE_OPERAND (head_rhs, 2));
> + break;
> + default:
> + break;
> + }
> + }
> +
> + if (cc && (cc->modified))
> + {
> + tree tmp_ssa;
> + tree itype = make_node (INTEGER_TYPE);
> + TYPE_PRECISION (itype) = TREE_INT_CST_LOW (size);
> + fixup_unsigned_type (itype);
> + lower_bitfield_read (&gsi, off, size, itype);
> + tmp_ssa =
> + make_ssa_name (create_tmp_var (itype, NULL), cc->load_stmt);
> + gimple_assign_set_lhs (cc->load_stmt, tmp_ssa);
> + update_stmt (cc->load_stmt);
> + gimple_assign_set_rhs1 (cc->store_stmt, tmp_ssa);
> + update_stmt (cc->store_stmt);
> + }
> + else if (cc && cc->merged)
> + {
> + gsi_remove (&gsi, true);
> + deleted = true;
> + }
> + }
> + /* Lower a bit-field write. */
> + ref = gimple_assign_lhs (stmt);
> + if (bf_access_candidate_p (ref, &off))
> + {
> + bfcopy *cc = NULL;
> + bitfield_stmt_bfcopy_pair st_cpy (stmt, NULL);
> + bitfield_stmt_bfcopy_pair *p_st_cpy;
> + unsigned HOST_WIDE_INT mrg_off;
> + p_st_cpy = bf_stmnt_cpy->find (&st_cpy);
> + if (p_st_cpy)
> + cc = p_st_cpy->copy;
> +
> + if (cc && (cc->merged || cc->modified))
> + size =
> + build_int_cst (unsigned_type_node,
> + get_merged_bit_field_size (cc->head_copy ?
> + cc->head_copy :
> + cc));
> + else
> + size = DECL_SIZE (TREE_OPERAND (ref, 1));
> + if (cc && cc->merged
> + &&
> + bf_access_candidate_p (gimple_assign_lhs
> + (cc->head_copy->store_stmt), &mrg_off))
> + off = mrg_off;
> +
> + if (cc && (cc->modified) && !(cc && cc->merged))
> + lower_bitfield_write (&gsi, off, size);
> + else if (cc && cc->merged)
> + {
> + if (gimple_vdef (stmt))
> + {
> + imm_use_iterator imm_iter;
> + gimple use_stmt;
> + use_operand_p use_p;
> + FOR_EACH_IMM_USE_STMT (use_stmt, imm_iter,
> + gimple_vdef (stmt))
> + {
> + FOR_EACH_IMM_USE_ON_STMT (use_p, imm_iter)
> + SET_USE (use_p, reaching_vuse);
> + update_stmt (use_stmt);
> + }
> + }
> +
> + gsi_remove (&gsi, true);
> + deleted = true;
> + }
> + }
> + if (gimple_vdef (stmt) && !deleted)
> + reaching_vuse = gimple_vdef (stmt);
> + if (!deleted)
> + gsi_next (&gsi);
> + }
> + }
> + delete bf_stmnt_cpy;
> +}
> +
> /* Perform early intraprocedural SRA. */
> static unsigned int
> early_intra_sra (void)
> {
> sra_mode = SRA_MODE_EARLY_INTRA;
> - return perform_intra_sra ();
> + unsigned int res = perform_intra_sra ();
> + if (flag_tree_bitfield_merge)
> + lower_bitfields ();
> + return res;
> }
>
> /* Perform "late" intraprocedural SRA. */
> diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
> index 44656ea..cdb3850 100644
> --- a/gcc/tree-ssa-sccvn.c
> +++ b/gcc/tree-ssa-sccvn.c
> @@ -4410,29 +4410,6 @@ get_next_value_id (void)
> return next_value_id++;
> }
>
> -
> -/* Compare two expressions E1 and E2 and return true if they are equal. */
> -
> -bool
> -expressions_equal_p (tree e1, tree e2)
> -{
> - /* The obvious case. */
> - if (e1 == e2)
> - return true;
> -
> - /* If only one of them is null, they cannot be equal. */
> - if (!e1 || !e2)
> - return false;
> -
> - /* Now perform the actual comparison. */
> - if (TREE_CODE (e1) == TREE_CODE (e2)
> - && operand_equal_p (e1, e2, OEP_PURE_SAME))
> - return true;
> -
> - return false;
> -}
> -
> -
> /* Return true if the nary operation NARY may trap. This is a copy
> of stmt_could_throw_1_p adjusted to the SCCVN IL. */
>
> diff --git a/gcc/tree-ssa-sccvn.h b/gcc/tree-ssa-sccvn.h
> index ad99604..d5963b6 100644
> --- a/gcc/tree-ssa-sccvn.h
> +++ b/gcc/tree-ssa-sccvn.h
> @@ -21,10 +21,6 @@
> #ifndef TREE_SSA_SCCVN_H
> #define TREE_SSA_SCCVN_H
>
> -/* In tree-ssa-sccvn.c */
> -bool expressions_equal_p (tree, tree);
> -
> -
> /* TOP of the VN lattice. */
> extern tree VN_TOP;
>
> diff --git a/gcc/tree.c b/gcc/tree.c
> index 365e89c..8df9812 100644
> --- a/gcc/tree.c
> +++ b/gcc/tree.c
> @@ -12286,4 +12286,44 @@ get_base_address (tree t)
> return t;
> }
>
> +/* Compare two expressions E1 and E2 and return true if they are equal. */
> +
> +bool
> +expressions_equal_p (const_tree e1, const_tree e2)
> +{
> + /* The obvious case. */
> + if (e1 == e2)
> + return true;
> +
> + /* If only one of them is null, they cannot be equal. */
> + if (!e1 || !e2)
> + return false;
> +
> + /* Now perform the actual comparison. */
> + if (TREE_CODE (e1) == TREE_CODE (e2)
> + && operand_equal_p (e1, e2, OEP_PURE_SAME))
> + return true;
> +
> + return false;
> +}
> +
> +/* Given a pointer to a tree node, assumed to be some kind of a ..._TYPE
> + node, return the size in bits for the type if it is a constant, or else
> + return the alignment for the type if the type's size is not constant, or
> + else return BITS_PER_WORD if the type actually turns out to be an
> + ERROR_MARK node. */
> +
> +unsigned HOST_WIDE_INT
> +simple_type_size_in_bits (const_tree type)
> +{
> + if (TREE_CODE (type) == ERROR_MARK)
> + return BITS_PER_WORD;
> + else if (TYPE_SIZE (type) == NULL_TREE)
> + return 0;
> + else if (tree_fits_uhwi_p (TYPE_SIZE (type)))
> + return tree_to_uhwi (TYPE_SIZE (type));
> + else
> + return TYPE_ALIGN (type);
> +}
> +
> #include "gt-tree.h"
> diff --git a/gcc/tree.h b/gcc/tree.h
> index 45f127f..b903089 100644
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -4068,6 +4068,7 @@ extern tree substitute_placeholder_in_expr (tree, tree);
> ((EXP) == 0 || TREE_CONSTANT (EXP) ? (EXP) \
> : substitute_placeholder_in_expr (EXP, OBJ))
>
> +extern unsigned HOST_WIDE_INT simple_type_size_in_bits (const_tree type);
>
> /* stabilize_reference (EXP) returns a reference equivalent to EXP
> but it can be used multiple times
> @@ -4184,6 +4185,11 @@ inlined_function_outer_scope_p (const_tree block)
> (TREE = function_args_iter_cond (&(ITER))) != NULL_TREE; \
> function_args_iter_next (&(ITER)))
>
> +
> +/* In dwarf2out.c. */
> +HOST_WIDE_INT
> +field_byte_offset (const_tree decl);
> +
> /* In tree.c */
> extern unsigned crc32_string (unsigned, const char *);
> extern unsigned crc32_byte (unsigned, char);
> @@ -4340,6 +4346,7 @@ extern tree obj_type_ref_class (tree ref);
> extern bool types_same_for_odr (const_tree type1, const_tree type2);
> extern bool contains_bitfld_component_ref_p (const_tree);
> extern bool type_in_anonymous_namespace_p (const_tree);
> +extern bool expressions_equal_p (const_tree e1, const_tree e2);
> extern bool block_may_fallthru (const_tree);
> extern void using_eh_for_cleanups (void);
> extern bool using_eh_for_cleanups_p (void);
>
>
> Regards,
> Zoran
>
>
>
>> From: Andrew Pinski [[email protected]]
>> Sent: Thursday, October 16, 2014 4:09 AM
>> To: Zoran Jovanovic
>> Cc: [email protected]; Richard Biener; Bernhard Reutner-Fischer
>> Subject: Re: [PATCH] Add a new option "-fmerge-bitfields" (patch / doc
>> inside)
>>
>> On Tue, Apr 22, 2014 at 3:28 AM, Zoran Jovanovic
>> <[email protected]> wrote:
>> > Hello,
>> > Updated doc/invoke.texi to state that the new option is enabled by
>> > default at -O2 and higher.
>> > Also, -fmerge-bitfields was added to the list of optimization flags
>> > enabled by default at -O2 and higher.
>>
>>
>> With this patch applied, gcc from SPEC 2006 ICEs on aarch64-linux-gnu.
>> Here is a reduced testcase:
>> typedef struct rtx_def *rtx;
>> struct rtx_def
>> {
>> unsigned int unchanging : 1;
>> unsigned int volatil : 1;
>> unsigned int in_struct : 1;
>> unsigned integrated : 1;
>> unsigned frame_related : 1;
>> };
>> ix86_set_move_mem_attrs_1 (rtx x, rtx dstref)
>> {
>> ((x)->volatil) = ((dstref)->volatil);
>> ((x)->in_struct) = ((dstref)->in_struct);
>> ((x)->frame_related) = ((dstref)->frame_related);
>> ((x)->unchanging) = ((dstref)->unchanging);
>> }
>> --- CUT ---
>>
>> Thanks,
>> Andrew Pinski
>>