On Tue, Apr 22, 2014 at 3:28 AM, Zoran Jovanovic
<zoran.jovano...@imgtec.com> wrote:
> Hello,
> Updated doc/invoke.texi by stating that the new option is enabled by
> default at -O2 and higher.
> Also, -fmerge-bitfields was added to the list of optimization flags
> enabled by default at -O2 and higher.
With this patch applied, gcc from SPEC 2006 ICEs on aarch64-linux-gnu.
Here is a reduced testcase:

typedef struct rtx_def *rtx;
struct rtx_def
{
  unsigned int unchanging : 1;
  unsigned int volatil : 1;
  unsigned int in_struct : 1;
  unsigned integrated : 1;
  unsigned frame_related : 1;
};
ix86_set_move_mem_attrs_1 (rtx x, rtx dstref)
{
  ((x)->volatil) = ((dstref)->volatil);
  ((x)->in_struct) = ((dstref)->in_struct);
  ((x)->frame_related) = ((dstref)->frame_related);
  ((x)->unchanging) = ((dstref)->unchanging);
}

--- CUT ---

Thanks,
Andrew Pinski

>
> Regards,
> Zoran Jovanovic
>
> --------------------------------------------------------------------------------------------------------------------------
>
> Lowering is applied only for bit-field copy sequences that are merged.
> The data structure representing bit-field copy sequences has been renamed
> and reduced in size.
> The optimization is turned on by default at -O2 and higher.
> Some comments fixed.
>
> Benchmarking performed on WebKit for Android.
> Code size reduction noticed in several files; the best examples are:
>
> core/rendering/style/StyleMultiColData (632->520 bytes)
> core/platform/graphics/FontDescription (1715->1475 bytes)
> core/rendering/style/FillLayer (5069->4513 bytes)
> core/rendering/style/StyleRareInheritedData (5618->5346 bytes)
> core/css/CSSSelectorList (4047->3887 bytes)
> core/platform/animation/CSSAnimationData (3844->3440 bytes)
> core/css/resolver/FontBuilder (13818->13350 bytes)
> core/platform/graphics/Font (16447->15975 bytes)
>
> Example:
>
> One of the motivating examples for this work was the copy constructor of a
> class that contains bit-fields.
>
> C++ code:
> class A
> {
> public:
>   A(const A &x);
>   unsigned a : 1;
>   unsigned b : 2;
>   unsigned c : 4;
> };
>
> A::A(const A &x)
> {
>   a = x.a;
>   b = x.b;
>   c = x.c;
> }
>
> GIMPLE code without the optimization:
>
>   <bb 2>:
>   _3 = x_2(D)->a;
>   this_4(D)->a = _3;
>   _6 = x_2(D)->b;
>   this_4(D)->b = _6;
>   _8 = x_2(D)->c;
>   this_4(D)->c = _8;
>   return;
>
> Optimized GIMPLE code:
>   <bb 2>:
>   _10 = x_2(D)->D.1867;
>   _11 = BIT_FIELD_REF <_10, 7, 0>;
>   _12 = this_4(D)->D.1867;
>   _13 = _12 & 128;
>   _14 = (unsigned char) _11;
>   _15 = _13 | _14;
>   this_4(D)->D.1867 = _15;
>   return;
>
> Generated MIPS32r2 assembly code without the optimization:
>   lw      $3,0($5)
>   lbu     $2,0($4)
>   andi    $3,$3,0x1
>   andi    $2,$2,0xfe
>   or      $2,$2,$3
>   sb      $2,0($4)
>   lw      $3,0($5)
>   andi    $2,$2,0xf9
>   andi    $3,$3,0x6
>   or      $2,$2,$3
>   sb      $2,0($4)
>   lw      $3,0($5)
>   andi    $2,$2,0x87
>   andi    $3,$3,0x78
>   or      $2,$2,$3
>   j       $31
>   sb      $2,0($4)
>
> Optimized MIPS32r2 assembly code:
>   lw      $3,0($5)
>   lbu     $2,0($4)
>   andi    $3,$3,0x7f
>   andi    $2,$2,0x80
>   or      $2,$3,$2
>   j       $31
>   sb      $2,0($4)
>
> The algorithm works at the basic-block level and consists of the following
> three major steps:
> 1. Walk the basic block's statement list.  For every statement pair that
>    copies bit-field content from one memory location to another, record
>    the statement pointers and other necessary data in the corresponding
>    data structure.
> 2. Identify records that represent adjacent bit-field accesses and mark
>    them as merged.
> 3. Lower the bit-field accesses, using the new field size for those that
>    can be merged.
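[For concreteness, the merged GIMPLE above amounts to one load, one
mask-and-or, and one store of the one-byte bit-field representative.  The
plain-C sketch below shows the same effect; it is only an illustration,
assuming the three fields are packed into the low 7 bits of the first byte
(the layout is ABI-dependent), and the function name is made up rather than
taken from the patch.]

#include <string.h>

struct A { unsigned a : 1, b : 2, c : 4; };

/* Copy all three bit-fields at once: load each representative byte,
   combine them with a single mask-and-or, store once.  */
void
copy_bitfields_merged (struct A *dst, const struct A *src)
{
  unsigned char s, d;
  memcpy (&s, src, 1);  /* load the source representative */
  memcpy (&d, dst, 1);  /* load the destination representative */
  d = (unsigned char) ((d & 0x80) | (s & 0x7f));  /* _13 | _14 above */
  memcpy (dst, &d, 1);  /* single store */
}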
>
> A new command-line option, "-fmerge-bitfields", is introduced.
>
> Tested - passed gcc regression tests for MIPS32r2.
>
> Changelog -
>
> gcc/ChangeLog:
> 2014-04-22  Zoran Jovanovic  (zoran.jovano...@imgtec.com)
>   * common.opt (fmerge-bitfields): New option.
>   * doc/invoke.texi: Add reference to "-fmerge-bitfields".
>   * doc/invoke.texi: Add "-fmerge-bitfields" to the list of optimization
>   flags turned on at -O2.
>   * tree-sra.c (lower_bitfields): New function.
>   Entry for (-fmerge-bitfields).
>   (part_of_union_p): New function.
>   (bf_access_candidate_p): New function.
>   (lower_bitfield_read): New function.
>   (lower_bitfield_write): New function.
>   (bitfield_stmt_bfcopy_pair::hash): New function.
>   (bitfield_stmt_bfcopy_pair::equal): New function.
>   (bitfield_stmt_bfcopy_pair::remove): New function.
>   (create_and_insert_bfcopy): New function.
>   (get_bit_offset): New function.
>   (add_stmt_bfcopy_pair): New function.
>   (cmp_bfcopies): New function.
>   (get_merged_bit_field_size): New function.
>   * dwarf2out.c (simple_type_size_in_bits): Move to tree.c.
>   (field_byte_offset): Move declaration to tree.h and make it extern.
>   * testsuite/gcc.dg/tree-ssa/bitfldmrg1.c: New test.
>   * testsuite/gcc.dg/tree-ssa/bitfldmrg2.c: New test.
>   * tree-ssa-sccvn.c (expressions_equal_p): Move to tree.c.
>   * tree-ssa-sccvn.h (expressions_equal_p): Move declaration to tree.h.
>   * tree.c (expressions_equal_p): Move from tree-ssa-sccvn.c.
>   (simple_type_size_in_bits): Move from dwarf2out.c.
>   * tree.h (expressions_equal_p): Add declaration.
>   (field_byte_offset): Add declaration.
>
> Patch -
>
> diff --git a/gcc/common.opt b/gcc/common.opt
> index da275e5..52c7f58 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -2203,6 +2203,10 @@ ftree-sra
>  Common Report Var(flag_tree_sra) Optimization
>  Perform scalar replacement of aggregates
>
> +fmerge-bitfields
> +Common Report Var(flag_tree_bitfield_merge) Optimization
> +Merge loads and stores of consecutive bitfields
> +
>  ftree-ter
>  Common Report Var(flag_tree_ter) Optimization
>  Replace temporary expressions in the SSA->normal pass
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 8004da8..6942174 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -411,7 +411,7 @@ Objective-C and Objective-C++ Dialects}.
>  -fsplit-ivs-in-unroller -fsplit-wide-types -fstack-protector @gol
>  -fstack-protector-all -fstack-protector-strong -fstrict-aliasing @gol
>  -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
> --ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
> +-fmerge-bitfields -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
>  -ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
>  -ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
>  -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
> @@ -6860,6 +6860,7 @@ also turns on the following optimization flags:
>  -findirect-inlining @gol
>  -fipa-sra @gol
>  -fisolate-erroneous-paths-dereference @gol
> +-fmerge-bitfields @gol
>  -foptimize-sibling-calls @gol
>  -fpartial-inlining @gol
>  -fpeephole2 @gol
> @@ -7789,6 +7790,12 @@ pointer alignment information.
>  This pass only operates on local scalar variables and is enabled by default
>  at @option{-O} and higher.  It requires that @option{-ftree-ccp} is enabled.
>
> +@item -fmerge-bitfields
> +@opindex fmerge-bitfields
> +Combines several adjacent bit-field accesses that copy values
> +from one memory location to another into one single bit-field access.
> +This is enabled by default at @option{-O2} and higher.
> +
>  @item -ftree-ccp
>  @opindex ftree-ccp
>  Perform sparse conditional constant propagation (CCP) on trees.  This
> diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
> index 8caf940..71a3e6b 100644
> --- a/gcc/dwarf2out.c
> +++ b/gcc/dwarf2out.c
> @@ -3119,8 +3119,6 @@ static HOST_WIDE_INT ceiling (HOST_WIDE_INT, unsigned int);
>  static tree field_type (const_tree);
>  static unsigned int simple_type_align_in_bits (const_tree);
>  static unsigned int simple_decl_align_in_bits (const_tree);
> -static unsigned HOST_WIDE_INT simple_type_size_in_bits (const_tree);
> -static HOST_WIDE_INT field_byte_offset (const_tree);
>  static void add_AT_location_description (dw_die_ref, enum dwarf_attribute,
>                                           dw_loc_list_ref);
>  static void add_data_member_location_attribute (dw_die_ref, tree);
> @@ -10279,25 +10277,6 @@ is_base_type (tree type)
>    return 0;
>  }
>
> -/* Given a pointer to a tree node, assumed to be some kind of a ..._TYPE
> -   node, return the size in bits for the type if it is a constant, or else
> -   return the alignment for the type if the type's size is not constant, or
> -   else return BITS_PER_WORD if the type actually turns out to be an
> -   ERROR_MARK node.  */
> -
> -static inline unsigned HOST_WIDE_INT
> -simple_type_size_in_bits (const_tree type)
> -{
> -  if (TREE_CODE (type) == ERROR_MARK)
> -    return BITS_PER_WORD;
> -  else if (TYPE_SIZE (type) == NULL_TREE)
> -    return 0;
> -  else if (tree_fits_uhwi_p (TYPE_SIZE (type)))
> -    return tree_to_uhwi (TYPE_SIZE (type));
> -  else
> -    return TYPE_ALIGN (type);
> -}
> -
>  /* Similarly, but return a double_int instead of UHWI.  */
>
>  static inline double_int
> @@ -14672,7 +14651,7 @@ round_up_to_align (double_int t, unsigned int align)
>     because the offset is actually variable.  (We can't handle the latter case
>     just yet).  */
>
> -static HOST_WIDE_INT
> +HOST_WIDE_INT
>  field_byte_offset (const_tree decl)
>  {
>    double_int object_offset_in_bits;
> diff --git a/gcc/opts.c b/gcc/opts.c
> index 1873b96..b2692b0 100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -497,6 +497,7 @@ static const struct default_options default_options_table[] =
>      { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_foptimize_strlen, NULL, 1 },
>      { OPT_LEVELS_2_PLUS, OPT_fhoist_adjacent_loads, NULL, 1 },
>      { OPT_LEVELS_2_PLUS, OPT_fisolate_erroneous_paths_dereference, NULL, 1 },
> +    { OPT_LEVELS_2_PLUS, OPT_fmerge_bitfields, NULL, 1 },
>
>      /* -O3 optimizations.  */
>      { OPT_LEVELS_3_PLUS, OPT_ftree_loop_distribute_patterns, NULL, 1 },
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c
> new file mode 100644
> index 0000000..638db58
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c
> @@ -0,0 +1,30 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fmerge-bitfields -fdump-tree-esra" } */
> +
> +struct S
> +{
> +  unsigned f1:7;
> +  unsigned f2:9;
> +  unsigned f3:3;
> +  unsigned f4:5;
> +  unsigned f5:1;
> +  unsigned f6:2;
> +};
> +
> +unsigned
> +foo (struct S *p1, struct S *p2, int *ptr)
> +{
> +  p2->f1 = p1->f1;
> +  p2->f2 = p1->f2;
> +  p2->f3 = p1->f3;
> +  *ptr = 7;
> +  p2->f4 = p1->f4;
> +  p2->f5 = p1->f5;
> +  p2->f6 = p1->f6;
> +  return 0;
> +}
> +
> +/* Check if bit-field access is lowered.  */
> +/* { dg-final { scan-tree-dump "BIT_FIELD_REF" "esra" } } */
> +/* { dg-final { cleanup-tree-dump "esra" } } */
> +
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c
> new file mode 100644
> index 0000000..a7b906f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c
> @@ -0,0 +1,89 @@
> +/* Check whether use of -fmerge-bitfields in the presence of
> +   overlapping unions results in incorrect code.  */
> +/* { dg-options "-O2 -fmerge-bitfields" } */
> +/* { dg-do run } */
> +extern void abort (void);
> +
> +struct s1
> +{
> +  unsigned f1:4;
> +  unsigned f2:4;
> +  unsigned f3:4;
> +};
> +
> +struct s2
> +{
> +  unsigned char c;
> +  unsigned f1:4;
> +  unsigned f2:4;
> +  unsigned f3:4;
> +};
> +
> +struct s3
> +{
> +  unsigned f1:3;
> +  unsigned f2:3;
> +  unsigned f3:3;
> +};
> +
> +struct s4
> +{
> +  unsigned f0:3;
> +  unsigned f1:3;
> +  unsigned f2:3;
> +  unsigned f3:3;
> +};
> +
> +union un_1
> +{
> +  struct s1 a;
> +  struct s2 b;
> +};
> +
> +union un_2
> +{
> +  struct s3 a;
> +  struct s4 b;
> +};
> +
> +void f1 (union un_1 *p1, union un_1 *p2)
> +{
> +  p2->a.f3 = p1->b.f3;
> +  p2->a.f2 = p1->b.f2;
> +  p2->a.f1 = p1->b.f1;
> +
> +  if (p1->a.f1 != 3)
> +    abort ();
> +}
> +
> +void f2 (union un_2 *p1, union un_2 *p2)
> +{
> +  p2->b.f1 = p1->a.f1;
> +  p2->b.f2 = p1->a.f2;
> +  p2->b.f3 = p1->a.f3;
> +
> +  if (p2->b.f1 != 0 || p2->b.f2 != 0 || p2->b.f3 != 0)
> +    abort ();
> +}
> +
> +int main ()
> +{
> +  union un_1 u1;
> +  union un_2 u2;
> +
> +  u1.b.f1 = 1;
> +  u1.b.f2 = 2;
> +  u1.b.f3 = 3;
> +  u1.b.c = 0;
> +
> +  f1 (&u1, &u1);
> +
> +  u2.b.f0 = 0;
> +  u2.b.f1 = 1;
> +  u2.b.f2 = 2;
> +  u2.b.f3 = 3;
> +
> +  f2 (&u2, &u2);
> +
> +  return 0;
> +}
> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
> index 49bbee3..b4f7f26 100644
> --- a/gcc/tree-sra.c
> +++ b/gcc/tree-sra.c
> @@ -2266,7 +2266,7 @@ analyze_access_subtree (struct access *root, struct access *parent,
>    if (INTEGRAL_TYPE_P (root->type)
>        && (TREE_CODE (root->type) != INTEGER_TYPE
>            || TYPE_PRECISION (root->type) != root->size)
> -      /* But leave bitfield accesses alone.  */
> +      /* But leave bit-field accesses alone.  */
>        && (TREE_CODE (root->expr) != COMPONENT_REF
>            || !DECL_BIT_FIELD (TREE_OPERAND (root->expr, 1))))
>      {
> @@ -3517,12 +3517,629 @@ perform_intra_sra (void)
>    return ret;
>  }
>
> +/* Bit-field copy sequences.  */
> +
> +struct bfcopy
> +{
> +  bfcopy (): merged (false), modified (false), is_barrier (false), next (0),
> +    head_copy (0)
> +  {
> +  }
> +
> +  gimple load_stmt;             /* Bit-field load statement.  */
> +  gimple store_stmt;            /* Bit-field store statement.  */
> +  unsigned src_offset_words;    /* Bit-field offset at src in words.  */
> +  unsigned src_bit_offset;      /* Bit-field offset inside source word.  */
> +  unsigned src_bit_size;        /* Size of bit-field in source word.  */
> +  unsigned dst_offset_words;    /* Bit-field offset at dst in words.  */
> +  unsigned dst_bit_offset;      /* Bit-field offset inside destination
> +                                   word.  */
> +  unsigned src_field_offset;    /* Source field offset.  */
> +  unsigned dst_bit_size;        /* Size of bit-field in destination word.  */
> +  tree src_addr;                /* Address of source memory access.  */
> +  tree dst_addr;                /* Address of destination memory access.  */
> +  unsigned merged : 1;          /* 1 if copy is merged with another one.  */
> +  unsigned modified : 1;        /* 1 if bit-field size is modified.  */
> +  unsigned is_barrier : 1;      /* 1 if copy is barrier (call or mem
> +                                   access).  */
> +  struct bfcopy *next;          /* Copy with which this one is merged.  */
> +  tree bitfield_representative; /* Bit-field representative of original
> +                                   declaration.  */
> +  struct bfcopy *head_copy;     /* Head of copies list where this one is
> +                                   merged.  */
> +};
> +
> +/* Returns true if the given COMPONENT_REF is part of a union.  */
> +
> +static bool
> +part_of_union_p (tree component)
> +{
> +  tree tmp = component;
> +  bool res = false;
> +  while (TREE_CODE (tmp) == COMPONENT_REF)
> +    {
> +      if (TREE_CODE (TREE_TYPE (tmp)) == UNION_TYPE)
> +        {
> +          res = true;
> +          break;
> +        }
> +      tmp = TREE_OPERAND (tmp, 0);
> +    }
> +  if (tmp && (TREE_CODE (TREE_TYPE (tmp)) == UNION_TYPE))
> +    res = true;
> +  return res;
> +}
> +
> +/* Return TRUE if REF is a bit-field access.  If OFF is not NULL, *OFF will
> +   contain the offset of the bit-field within the representative, in
> +   bits.  */
> +
> +static bool
> +bf_access_candidate_p (tree ref, unsigned HOST_WIDE_INT *off)
> +{
> +  if (TREE_CODE (ref) != COMPONENT_REF)
> +    return false;
> +  if (part_of_union_p (ref))
> +    return false;
> +  tree field = TREE_OPERAND (ref, 1);
> +  if (!DECL_BIT_FIELD_TYPE (field))
> +    return false;
> +
> +  tree rep = DECL_BIT_FIELD_REPRESENTATIVE (field);
> +  if (!rep)
> +    return false;
> +  /* Do not lower if the representative is bigger than one word.  */
> +  if (TREE_CODE (TREE_TYPE (rep)) == ARRAY_TYPE)
> +    return false;
> +
> +  if (!off)
> +    return true;
> +
> +  if (tree_fits_uhwi_p (DECL_FIELD_OFFSET (field))
> +      && tree_fits_uhwi_p (DECL_FIELD_OFFSET (rep)))
> +    *off = (tree_to_uhwi (DECL_FIELD_OFFSET (field))
> +            - tree_to_uhwi (DECL_FIELD_OFFSET (rep))) * BITS_PER_UNIT;
> +  else
> +    *off = 0;
> +  *off += (tree_to_uhwi (DECL_FIELD_BIT_OFFSET (field))
> +           - tree_to_uhwi (DECL_FIELD_BIT_OFFSET (rep)));
> +
> +  return true;
> +}
> +
> +/* Lower the bit-field read at *GSI.  */
> +
> +static void
> +lower_bitfield_read (gimple_stmt_iterator *gsi, unsigned HOST_WIDE_INT off,
> +                     tree size, tree type)
> +{
> +  gimple stmt = gsi_stmt (*gsi);
> +  tree ref = gimple_assign_rhs1 (stmt);
> +  tree field = TREE_OPERAND (ref, 1);
> +  tree rep = DECL_BIT_FIELD_REPRESENTATIVE (field);
> +  tree loadres = make_ssa_name (TREE_TYPE (rep), NULL);
> +  gimple load = gimple_build_assign (loadres,
> +                                     build3 (COMPONENT_REF, TREE_TYPE (rep),
> +                                             TREE_OPERAND (ref, 0), rep,
> +                                             NULL_TREE));
> +  gimple_set_vuse (load, gimple_vuse (stmt));
> +  gsi_insert_before (gsi, load, GSI_SAME_STMT);
> +  if (!type)
> +    type = TREE_TYPE (ref);
> +  gimple_assign_set_rhs1 (stmt,
> +                          build3 (BIT_FIELD_REF, type,
> +                                  loadres, size, bitsize_int (off)));
> +  update_stmt (stmt);
> +}
> +
> +/* Lower the bit-field write at *GSI.  */
> +
> +static void
> +lower_bitfield_write (gimple_stmt_iterator *gsi, unsigned HOST_WIDE_INT off,
> +                      tree size)
> +{
> +  gimple stmt = gsi_stmt (*gsi);
> +  tree ref = gimple_assign_lhs (stmt);
> +  tree field = TREE_OPERAND (ref, 1);
> +  tree rep = DECL_BIT_FIELD_REPRESENTATIVE (field);
> +  tree loadres = make_ssa_name (TREE_TYPE (rep), NULL);
> +  gimple load = gimple_build_assign (loadres,
> +                                     build3 (COMPONENT_REF, TREE_TYPE (rep),
> +                                             unshare_expr
> +                                               (TREE_OPERAND (ref, 0)),
> +                                             rep, NULL_TREE));
> +  gimple_set_vuse (load, gimple_vuse (stmt));
> +  gsi_insert_before (gsi, load, GSI_SAME_STMT);
> +  /* Mask out the bits.  */
> +  tree masked = make_ssa_name (TREE_TYPE (rep), NULL);
> +  tree mask = double_int_to_tree (TREE_TYPE (rep),
> +                                  ~double_int::mask
> +                                    (TREE_INT_CST_LOW (size)).lshift (off));
> +  gimple tems = gimple_build_assign_with_ops (BIT_AND_EXPR,
> +                                              masked, loadres, mask);
> +  gsi_insert_before (gsi, tems, GSI_SAME_STMT);
> +  /* Zero-extend the value to the representative size.  */
> +  tree tem2;
> +  if (!TYPE_UNSIGNED (TREE_TYPE (field)))
> +    {
> +      tem2 = make_ssa_name (unsigned_type_for (TREE_TYPE (field)), NULL);
> +      tems = gimple_build_assign_with_ops (NOP_EXPR, tem2,
> +                                           gimple_assign_rhs1 (stmt),
> +                                           NULL_TREE);
> +      gsi_insert_before (gsi, tems, GSI_SAME_STMT);
> +    }
> +  else
> +    tem2 = gimple_assign_rhs1 (stmt);
> +  tree tem = make_ssa_name (TREE_TYPE (rep), NULL);
> +  tems = gimple_build_assign_with_ops (NOP_EXPR, tem, tem2, NULL_TREE);
> +  gsi_insert_before (gsi, tems, GSI_SAME_STMT);
> +  /* Shift the value into place.  */
> +  if (off != 0)
> +    {
> +      tem2 = make_ssa_name (TREE_TYPE (rep), NULL);
> +      tems = gimple_build_assign_with_ops (LSHIFT_EXPR, tem2, tem,
> +                                           size_int (off));
> +      gsi_insert_before (gsi, tems, GSI_SAME_STMT);
> +    }
> +  else
> +    tem2 = tem;
> +  /* Merge the masked loaded value and the new value.  */
> +  tree modres = make_ssa_name (TREE_TYPE (rep), NULL);
> +  gimple mod = gimple_build_assign_with_ops (BIT_IOR_EXPR, modres,
> +                                             masked, tem2);
> +  gsi_insert_before (gsi, mod, GSI_SAME_STMT);
> +  /* Finally adjust the store.  */
> +  gimple_assign_set_rhs1 (stmt, modres);
> +  gimple_assign_set_lhs (stmt,
> +                         build3 (COMPONENT_REF, TREE_TYPE (rep),
> +                                 TREE_OPERAND (ref, 0), rep, NULL_TREE));
> +  update_stmt (stmt);
> +}
> +
> +/* Connects a statement with the bit-field copy sequence to which that
> +   statement belongs.  */
> +
> +struct bitfield_stmt_bfcopy_pair
> +{
> +  gimple stmt;
> +  bfcopy *copy;
> +  bitfield_stmt_bfcopy_pair (gimple s, bfcopy *c): stmt (s), copy (c)
> +  {
> +  };
> +  /* hash_table support.  */
> +  typedef bitfield_stmt_bfcopy_pair value_type;
> +  typedef bitfield_stmt_bfcopy_pair compare_type;
> +  static inline hashval_t hash (const bitfield_stmt_bfcopy_pair *const);
> +  static inline int equal (const bitfield_stmt_bfcopy_pair *const,
> +                           const bitfield_stmt_bfcopy_pair *const);
> +  static inline void remove (bitfield_stmt_bfcopy_pair *);
> +};
> +
> +hashval_t
> +bitfield_stmt_bfcopy_pair::hash (const bitfield_stmt_bfcopy_pair *const a)
> +{
> +  return hashval_t (gimple_uid (a->stmt));
> +}
> +
> +int
> +bitfield_stmt_bfcopy_pair::equal (const bitfield_stmt_bfcopy_pair *const a,
> +                                  const bitfield_stmt_bfcopy_pair *const b)
> +{
> +  return a->stmt == b->stmt;
> +}
> +
> +void
> +bitfield_stmt_bfcopy_pair::remove (bitfield_stmt_bfcopy_pair *a)
> +{
> +  delete a;
> +}
> +
> +/* Create a new bfcopy structure and add it to the given bitfield_copies
> +   vector.  */
> +
> +static struct bfcopy *
> +create_and_insert_bfcopy (vec<struct bfcopy *> *bitfield_copies)
> +{
> +  bfcopy *copy = new bfcopy;
> +  bitfield_copies->safe_push (copy);
> +  return copy;
> +}
> +
> +/* Get the offset of the field in bits from the start of the record.  */
> +
> +static inline HOST_WIDE_INT
> +get_bit_offset (tree decl)
> +{
> +  tree type = DECL_BIT_FIELD_TYPE (decl);
> +  HOST_WIDE_INT bitpos_int;
> +
> +  /* Must be a field and a bit-field.  */
> +  gcc_assert (type && TREE_CODE (decl) == FIELD_DECL);
> +  /* The bit position and decl size should be integer constants that can be
> +     represented in a single HOST_WIDE_INT.  */
> +  if (!tree_fits_uhwi_p (bit_position (decl))
> +      || !tree_fits_uhwi_p (DECL_SIZE (decl)))
> +    return -1;
> +
> +  bitpos_int = int_bit_position (decl);
> +  return bitpos_int;
> +}
> +
> +/* Creates a new bitfield_stmt_bfcopy_pair structure and adds it to the
> +   given htab.  */
> +
> +static bool
> +add_stmt_bfcopy_pair (hash_table<bitfield_stmt_bfcopy_pair> &bf_stmnt_cpy,
> +                      bfcopy *copy, gimple stmt)
> +{
> +  bitfield_stmt_bfcopy_pair p (stmt, copy);
> +  bitfield_stmt_bfcopy_pair **slot = bf_stmnt_cpy.find_slot (&p, INSERT);
> +  if (!*slot)
> +    {
> +      *slot = new bitfield_stmt_bfcopy_pair (stmt, copy);
> +      return true;
> +    }
> +  return false;
> +}
> +
> +/* Passed to qsort.  Compares two bfcopy records.  */
> +
> +static int
> +cmp_bfcopies (const void *p1, const void *p2)
> +{
> +  const struct bfcopy *a1 = *((const struct bfcopy **) p1);
> +  const struct bfcopy *a2 = *((const struct bfcopy **) p2);
> +
> +  if (DECL_UID (a1->bitfield_representative)
> +      - DECL_UID (a2->bitfield_representative))
> +    return DECL_UID (a1->bitfield_representative)
> +           - DECL_UID (a2->bitfield_representative);
> +
> +  if (!expressions_equal_p (a1->src_addr, a2->src_addr))
> +    return a1->src_addr - a2->src_addr;
> +
> +  if (!expressions_equal_p (a1->dst_addr, a2->dst_addr))
> +    return a1->dst_addr - a2->dst_addr;
> +
> +  if (a1->src_offset_words - a2->src_offset_words)
> +    return a1->src_offset_words - a2->src_offset_words;
> +
> +  return a1->src_bit_offset - a2->src_bit_offset;
> +}
> +
> +/* Returns the size of the combined bit-fields.  The size cannot be larger
> +   than the size of the largest directly accessible memory unit.  */
> +
> +static int
> +get_merged_bit_field_size (bfcopy *copy)
> +{
> +  bfcopy *tmp_copy = copy;
> +  int size = 0;
> +
> +  while (tmp_copy)
> +    {
> +      size += tmp_copy->src_bit_size;
> +      tmp_copy = tmp_copy->next;
> +    }
> +  return size;
> +}
> +
> +/* Lower bit-field accesses to accesses to their
> +   DECL_BIT_FIELD_REPRESENTATIVE.  */
> +
> +static void
> +lower_bitfields (void)
> +{
> +  basic_block bb;
> +  hash_table<bitfield_stmt_bfcopy_pair> bf_stmnt_cpy;
> +  vec<struct bfcopy *> bitfield_copies;
> +  struct bfcopy *copy;
> +  bf_stmnt_cpy.create (1);
> +
> +  FOR_EACH_BB_FN (bb, cfun)
> +    {
> +      bf_stmnt_cpy.empty ();
> +      tree prev_representative = NULL_TREE;
> +      bitfield_copies.create (0);
> +
> +      /* There are two passes; the first one identifies interesting
> +         bit-field accesses and the second one actually lowers them.  */
> +      for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
> +           !gsi_end_p (gsi); gsi_next (&gsi))
> +        {
> +          gimple stmt = gsi_stmt (gsi);
> +
> +          if (!gimple_assign_single_p (stmt) || gimple_has_volatile_ops (stmt))
> +            continue;
> +
> +          tree ref = gimple_assign_rhs1 (stmt);
> +          if (bf_access_candidate_p (ref, NULL))
> +            {
> +              gimple use_stmt;
> +              use_operand_p use;
> +              tree op0 = TREE_OPERAND (ref, 0);
> +              tree op1 = TREE_OPERAND (ref, 1);
> +
> +              if (TREE_CODE (DECL_CONTEXT (op1)) == UNION_TYPE
> +                  || TREE_CODE (DECL_CONTEXT (op1)) == QUAL_UNION_TYPE)
> +                continue;
> +
> +              if (TREE_CODE (gimple_assign_lhs (stmt)) == SSA_NAME
> +                  && single_imm_use (gimple_assign_lhs (stmt), &use,
> +                                     &use_stmt)
> +                  && is_gimple_assign (use_stmt))
> +                {
> +                  tree uses_stmt_lhs = gimple_assign_lhs (use_stmt);
> +
> +                  if (bf_access_candidate_p (uses_stmt_lhs, NULL))
> +                    {
> +                      tree use_op0 = TREE_OPERAND (uses_stmt_lhs, 0);
> +                      tree use_op1 = TREE_OPERAND (uses_stmt_lhs, 1);
> +                      tree use_repr = DECL_BIT_FIELD_REPRESENTATIVE (use_op1);
> +                      if (prev_representative
> +                          && (prev_representative != use_repr))
> +                        {
> +                          /* If the previous bfcopy has a different
> +                             representative, then a barrier is needed
> +                             between it and the new access.  */
> +                          copy = create_and_insert_bfcopy (&bitfield_copies);
> +                          copy->is_barrier = true;
> +                        }
> +                      prev_representative = use_repr;
> +                      /* Create a new bit-field bfcopy structure.  */
> +                      copy = create_and_insert_bfcopy (&bitfield_copies);
> +                      /* Collect bfcopy data - load instruction.  */
> +                      copy->src_bit_size = tree_to_uhwi (DECL_SIZE (op1));
> +                      copy->src_bit_offset = get_bit_offset (op1);
> +                      copy->src_offset_words
> +                        = field_byte_offset (op1) / UNITS_PER_WORD;
> +                      copy->src_field_offset
> +                        = tree_to_uhwi (DECL_FIELD_OFFSET (op1));
> +                      copy->src_addr = op0;
> +                      copy->load_stmt = gsi_stmt (gsi);
> +                      /* Collect bfcopy data - store instruction.  */
> +                      copy->dst_bit_size = tree_to_uhwi (DECL_SIZE (use_op1));
> +                      copy->dst_bit_offset = get_bit_offset (use_op1);
> +                      copy->dst_offset_words
> +                        = field_byte_offset (use_op1) / UNITS_PER_WORD;
> +                      copy->dst_addr = use_op0;
> +                      copy->store_stmt = use_stmt;
> +                      add_stmt_bfcopy_pair (bf_stmnt_cpy, copy, stmt);
> +                      add_stmt_bfcopy_pair (bf_stmnt_cpy, copy, use_stmt);
> +                      copy->bitfield_representative = use_repr;
> +                    }
> +                }
> +            }
> +
> +          /* Insert a barrier for merging if the statement is a function
> +             call or a memory access.  */
> +          bitfield_stmt_bfcopy_pair csdata (stmt, NULL);
> +          if (!bf_stmnt_cpy.find (&csdata)
> +              && ((gimple_code (stmt) == GIMPLE_CALL)
> +                  || (gimple_has_mem_ops (stmt))))
> +            {
> +              /* Create a new bfcopy access structure.  */
> +              copy = create_and_insert_bfcopy (&bitfield_copies);
> +              /* Mark it as a barrier.  */
> +              copy->is_barrier = true;
> +            }
> +        }
> +
> +      /* If there are not at least two sequences, go to the next basic
> +         block.  */
> +      if (bitfield_copies.length () <= 1)
> +        {
> +          bitfield_copies.release ();
> +          continue;
> +        }
> +      vec<struct bfcopy *> bitfield_copies_merge = vNULL;
> +      /* Try to merge different bfcopy sequences.  */
> +      for (int ix = 0; bitfield_copies.iterate (ix, &copy); ix++)
> +        {
> +          struct bfcopy *head_copy;
> +          struct bfcopy *mrg_copy;
> +          struct bfcopy *prev_copy;
> +
> +          if (!bitfield_copies_merge.exists ())
> +            bitfield_copies_merge.create (0);
> +
> +          if (!copy->is_barrier)
> +            bitfield_copies_merge.safe_push (copy);
> +
> +          if (!copy->is_barrier
> +              && !(copy == bitfield_copies.last ()
> +                   && !bitfield_copies_merge.is_empty ()))
> +            continue;
> +
> +          bitfield_copies_merge.qsort (cmp_bfcopies);
> +
> +          head_copy = prev_copy = NULL;
> +          int iy;
> +          for (iy = 0; bitfield_copies_merge.iterate (iy, &mrg_copy); iy++)
> +            {
> +              if (head_copy
> +                  && expressions_equal_p (head_copy->src_addr,
> +                                          mrg_copy->src_addr)
> +                  && expressions_equal_p (head_copy->dst_addr,
> +                                          mrg_copy->dst_addr)
> +                  && prev_copy->src_offset_words
> +                     == mrg_copy->src_offset_words
> +                  && prev_copy->dst_offset_words
> +                     == mrg_copy->dst_offset_words
> +                  && prev_copy->src_bit_offset + prev_copy->src_bit_size
> +                     == mrg_copy->src_bit_offset
> +                  && prev_copy->dst_bit_offset + prev_copy->dst_bit_size
> +                     == mrg_copy->dst_bit_offset
> +                  && prev_copy->bitfield_representative
> +                     == mrg_copy->bitfield_representative)
> +                {
> +                  /* Merge conditions are satisfied - merge copies.  */
> +                  mrg_copy->merged = true;
> +                  prev_copy->next = mrg_copy;
> +                  head_copy->modified = true;
> +                  prev_copy = mrg_copy;
> +                  mrg_copy->head_copy = head_copy;
> +                }
> +              else
> +                head_copy = prev_copy = mrg_copy;
> +            }
> +          bitfield_copies_merge.release ();
> +          bitfield_copies_merge = vNULL;
> +        }
> +
> +      tree reaching_vuse = NULL_TREE;
> +
> +      for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
> +        {
> +          gimple stmt = gsi_stmt (gsi);
> +
> +          /* Fix vuse info if necessary.  */
> +          if (gimple_has_mem_ops (stmt) && reaching_vuse != NULL_TREE)
> +            {
> +              gimple_set_vuse (stmt, reaching_vuse);
> +              update_stmt (stmt);
> +            }
> +
> +          if (!gimple_assign_single_p (stmt) || gimple_has_volatile_ops (stmt))
> +            {
> +              gsi_next (&gsi);
> +              if (gimple_vdef (stmt))
> +                reaching_vuse = gimple_vdef (stmt);
> +              continue;
> +            }
> +
> +          tree ref;
> +          unsigned HOST_WIDE_INT off;
> +          tree size;
> +          bool deleted = false;
> +
> +          /* Lower a bit-field read.  */
> +          ref = gimple_assign_rhs1 (stmt);
> +          if (bf_access_candidate_p (ref, &off))
> +            {
> +              bfcopy *cc = NULL;
> +              bitfield_stmt_bfcopy_pair st_cpy (stmt, NULL);
> +              bitfield_stmt_bfcopy_pair *p_st_cpy;
> +              p_st_cpy = bf_stmnt_cpy.find (&st_cpy);
> +              unsigned HOST_WIDE_INT mrg_off;
> +              if (p_st_cpy)
> +                cc = p_st_cpy->copy;
> +
> +              if (cc && (cc->merged || cc->modified))
> +                size = build_int_cst (unsigned_type_node,
> +                                      get_merged_bit_field_size
> +                                        (cc->head_copy ? cc->head_copy : cc));
> +              else
> +                size = DECL_SIZE (TREE_OPERAND (ref, 1));
> +              if (cc && cc->merged
> +                  && bf_access_candidate_p (gimple_assign_rhs1
> +                                              (cc->head_copy->load_stmt),
> +                                            &mrg_off))
> +                off = mrg_off;
> +              if (cc && cc->merged)
> +                {
> +                  tree head_rhs
> +                    = gimple_assign_rhs1 (cc->head_copy->load_stmt);
> +                  switch (TREE_CODE (head_rhs))
> +                    {
> +                    case COMPONENT_REF:
> +                      if (bf_access_candidate_p (head_rhs, &mrg_off))
> +                        off = mrg_off;
> +                      break;
> +                    case BIT_FIELD_REF:
> +                      off = tree_to_uhwi (TREE_OPERAND (head_rhs, 2));
> +                      break;
> +                    default:
> +                      break;
> +                    }
> +                }
> +
> +              if (cc && (cc->modified))
> +                {
> +                  tree tmp_ssa;
> +                  tree itype = make_node (INTEGER_TYPE);
> +                  TYPE_PRECISION (itype) = TREE_INT_CST_LOW (size);
> +                  fixup_unsigned_type (itype);
> +                  lower_bitfield_read (&gsi, off, size, itype);
> +                  tmp_ssa = make_ssa_name (create_tmp_var (itype, NULL),
> +                                           cc->load_stmt);
> +                  gimple_assign_set_lhs (cc->load_stmt, tmp_ssa);
> +                  update_stmt (cc->load_stmt);
> +                  gimple_assign_set_rhs1 (cc->store_stmt, tmp_ssa);
> +                  update_stmt (cc->store_stmt);
> +                }
> +              else if (cc && cc->merged)
> +                {
> +                  gsi_remove (&gsi, true);
> +                  deleted = true;
> +                }
> +            }
> +          /* Lower a bit-field write.  */
> +          ref = gimple_assign_lhs (stmt);
> +          if (bf_access_candidate_p (ref, &off))
> +            {
> +              bfcopy *cc = NULL;
> +              bitfield_stmt_bfcopy_pair st_cpy (stmt, NULL);
> +              bitfield_stmt_bfcopy_pair *p_st_cpy;
> +              unsigned HOST_WIDE_INT mrg_off;
> +              p_st_cpy = bf_stmnt_cpy.find (&st_cpy);
> +              if (p_st_cpy)
> +                cc = p_st_cpy->copy;
> +
> +              if (cc && (cc->merged || cc->modified))
> +                size = build_int_cst (unsigned_type_node,
> +                                      get_merged_bit_field_size
> +                                        (cc->head_copy ? cc->head_copy : cc));
> +              else
> +                size = DECL_SIZE (TREE_OPERAND (ref, 1));
> +              if (cc && cc->merged
> +                  && bf_access_candidate_p (gimple_assign_lhs
> +                                              (cc->head_copy->store_stmt),
> +                                            &mrg_off))
> +                off = mrg_off;
> +
> +              if (cc && (cc->modified) && !(cc && cc->merged))
> +                lower_bitfield_write (&gsi, off, size);
> +              else if (cc && cc->merged)
> +                {
> +                  if (gimple_vdef (stmt))
> +                    {
> +                      imm_use_iterator imm_iter;
> +                      gimple use_stmt;
> +                      use_operand_p use_p;
> +                      FOR_EACH_IMM_USE_STMT (use_stmt, imm_iter,
> +                                             gimple_vdef (stmt))
> +                        {
> +                          FOR_EACH_IMM_USE_ON_STMT (use_p, imm_iter)
> +                            SET_USE (use_p, reaching_vuse);
> +                          update_stmt (stmt);
> +                        }
> +                    }
> +
> +                  gsi_remove (&gsi, true);
> +                  deleted = true;
> +                }
> +            }
> +          if (gimple_vdef (stmt) && !deleted)
> +            reaching_vuse = gimple_vdef (stmt);
> +          if (!deleted)
> +            gsi_next (&gsi);
> +        }
> +    }
> +
> +  bf_stmnt_cpy.dispose ();
> +}
> +
>  /* Perform early intraprocedural SRA.  */
>  static unsigned int
>  early_intra_sra (void)
>  {
>    sra_mode = SRA_MODE_EARLY_INTRA;
> -  return perform_intra_sra ();
> +  unsigned int res = perform_intra_sra ();
> +  if (flag_tree_bitfield_merge)
> +    lower_bitfields ();
> +  return res;
>  }
>
>  /* Perform "late" intraprocedural SRA.  */
> diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
> index f7ec8b6..cde6ce6 100644
> --- a/gcc/tree-ssa-sccvn.c
> +++ b/gcc/tree-ssa-sccvn.c
> @@ -4193,29 +4193,6 @@ get_next_value_id (void)
>    return next_value_id++;
>  }
>
> -
> -/* Compare two expressions E1 and E2 and return true if they are equal.  */
> -
> -bool
> -expressions_equal_p (tree e1, tree e2)
> -{
> -  /* The obvious case.  */
> -  if (e1 == e2)
> -    return true;
> -
> -  /* If only one of them is null, they cannot be equal.  */
> -  if (!e1 || !e2)
> -    return false;
> -
> -  /* Now perform the actual comparison.  */
> -  if (TREE_CODE (e1) == TREE_CODE (e2)
> -      && operand_equal_p (e1, e2, OEP_PURE_SAME))
> -    return true;
> -
> -  return false;
> -}
> -
> -
>  /* Return true if the nary operation NARY may trap.  This is a copy
>     of stmt_could_throw_1_p adjusted to the SCCVN IL.  */
>
> diff --git a/gcc/tree-ssa-sccvn.h b/gcc/tree-ssa-sccvn.h
> index f52783a..0aa5537 100644
> --- a/gcc/tree-ssa-sccvn.h
> +++ b/gcc/tree-ssa-sccvn.h
> @@ -21,10 +21,6 @@
>  #ifndef TREE_SSA_SCCVN_H
>  #define TREE_SSA_SCCVN_H
>
> -/* In tree-ssa-sccvn.c  */
> -bool expressions_equal_p (tree, tree);
> -
> -
>  /* TOP of the VN lattice.  */
>  extern tree VN_TOP;
>
> diff --git a/gcc/tree.c b/gcc/tree.c
> index 8b44ecc..0104c80 100644
> --- a/gcc/tree.c
> +++ b/gcc/tree.c
> @@ -12358,4 +12358,44 @@ get_base_address (tree t)
>    return t;
>  }
>
> +/* Compare two expressions E1 and E2 and return true if they are equal.  */
> +
> +bool
> +expressions_equal_p (tree e1, tree e2)
> +{
> +  /* The obvious case.  */
> +  if (e1 == e2)
> +    return true;
> +
> +  /* If only one of them is null, they cannot be equal.  */
> +  if (!e1 || !e2)
> +    return false;
> +
> +  /* Now perform the actual comparison.  */
> +  if (TREE_CODE (e1) == TREE_CODE (e2)
> +      && operand_equal_p (e1, e2, OEP_PURE_SAME))
> +    return true;
> +
> +  return false;
> +}
> +
> +/* Given a pointer to a tree node, assumed to be some kind of a ..._TYPE
> +   node, return the size in bits for the type if it is a constant, or else
> +   return the alignment for the type if the type's size is not constant, or
> +   else return BITS_PER_WORD if the type actually turns out to be an
> +   ERROR_MARK node.  */
> +
> +unsigned HOST_WIDE_INT
> +simple_type_size_in_bits (const_tree type)
> +{
> +  if (TREE_CODE (type) == ERROR_MARK)
> +    return BITS_PER_WORD;
> +  else if (TYPE_SIZE (type) == NULL_TREE)
> +    return 0;
> +  else if (tree_fits_uhwi_p (TYPE_SIZE (type)))
> +    return tree_to_uhwi (TYPE_SIZE (type));
> +  else
> +    return TYPE_ALIGN (type);
> +}
> +
>  #include "gt-tree.h"
> diff --git a/gcc/tree.h b/gcc/tree.h
> index 9ad0b6f..a855070 100644
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -3993,6 +3993,7 @@ extern tree substitute_placeholder_in_expr (tree, tree);
>    ((EXP) == 0 || TREE_CONSTANT (EXP) ? (EXP) \
>     : substitute_placeholder_in_expr (EXP, OBJ))
>
> +extern unsigned HOST_WIDE_INT simple_type_size_in_bits (const_tree type);
>
>  /* stabilize_reference (EXP) returns a reference equivalent to EXP
>     but it can be used multiple times
> @@ -4109,6 +4110,11 @@ inlined_function_outer_scope_p (const_tree block)
>         (TREE = function_args_iter_cond (&(ITER))) != NULL_TREE; \
>         function_args_iter_next (&(ITER)))
>
> +
> +/* In dwarf2out.c.  */
> +HOST_WIDE_INT field_byte_offset (const_tree decl);
> +
>  /* In tree.c  */
>  extern unsigned crc32_string (unsigned, const char *);
>  extern unsigned crc32_byte (unsigned, char);
> @@ -4253,6 +4259,7 @@ extern tree obj_type_ref_class (tree ref);
>  extern bool types_same_for_odr (tree type1, tree type2);
>  extern bool contains_bitfld_component_ref_p (const_tree);
>  extern bool type_in_anonymous_namespace_p (tree);
> +extern bool expressions_equal_p (tree e1, tree e2);
>  extern bool block_may_fallthru (const_tree);
>  extern void using_eh_for_cleanups (void);
>  extern bool using_eh_for_cleanups_p (void);
>
>
> Regards,
> Zoran Jovanovic