Alexvod tried pr31849-patch from
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31849. It reduced the size of the
resulting code from 72 to 68 bytes, but still did not eliminate the loop
counter (see the analysis below).

See the following code:

// compilation options: -march=armv5te -mthumb -Os
struct tree_block
{
 unsigned handler_block_flag:2;
 unsigned block_num:30;
};
static int next_block_index = 2;
void number_blocks (int n_blocks, struct tree_block **block_vector)
{
 int i;
 for (i = 0; i < n_blocks; ++i)
   ((block_vector[i])->block_num) = next_block_index++;
}

This code is compiled very inefficiently by gcc 4.4.0. gcc 4.2.1 compiles it
to 48 bytes, while gcc 4.4.0 produces 72 bytes (1.5 times bigger).

Analysis of assembly files shows the following problems:

1) Operations on bitfields are done inefficiently. gcc-4.2.1 sets block_num
by LSLing the value and ORRing it. gcc-4.4.0 loads an extra constant 0x3fffffff
from memory and does an AND in addition to the LSL and ORR.

2) gcc-4.4.0 doesn't eliminate the loop counter i. It increments both
block_vector and i. gcc-4.2.1 instead computes the end of the loop and
increments only block_vector.

3) Register allocation performs badly: it spills some registers to the stack,
which causes extra LDR and STR operations in the loop body.
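
For illustration only, points 1) and 2) can be approximated at the source
level. The sketch below is not compiler output. It assumes block_num occupies
bits 2..31 of a single 32-bit word (the usual layout on arm-eabi), and the
helper names set_block_num_short, set_block_num_masked and
number_blocks_by_pointer are hypothetical. The two setters return the same
value for every input, since the left shift by 2 already discards the top two
bits, which suggests why the extra AND with 0x3fffffff should be unnecessary.
The last function is a hand-written version of the loop shape described for
gcc-4.2.1 (end pointer computed once, no separate counter).

```c
#include <stdint.h>
#include <stddef.h>

struct tree_block
{
  unsigned handler_block_flag:2;
  unsigned block_num:30;
};

static int next_block_index = 2;

/* Assumed layout: bits 0-1 hold handler_block_flag, bits 2-31 hold
   block_num.  Keep the low two bits of the old word, then shift the
   new value into place and OR it in (the LSL + ORR shape).  */
static uint32_t set_block_num_short (uint32_t word, uint32_t value)
{
  return (word & 0x3u) | (value << 2);
}

/* Same store with an explicit 30-bit mask applied first (the extra
   AND with 0x3fffffff).  The mask is redundant: the shift by 2 drops
   the top two bits of a 32-bit value anyway.  */
static uint32_t set_block_num_masked (uint32_t word, uint32_t value)
{
  return (word & 0x3u) | ((value & 0x3fffffffu) << 2);
}

/* Hand-written loop without the counter i: compute the end pointer
   once and advance only block_vector.  The guard avoids forming an
   out-of-range pointer when n_blocks is zero or negative.  */
void number_blocks_by_pointer (int n_blocks, struct tree_block **block_vector)
{
  struct tree_block **end;

  if (n_blocks <= 0)
    return;

  end = block_vector + n_blocks;
  for (; block_vector != end; ++block_vector)
    (*block_vector)->block_num = next_block_index++;
}
```

Comparing the assembly generated for number_blocks and for
number_blocks_by_pointer with the options above is one way to check whether a
given compiler version still keeps the separate counter.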

The code was taken from the GCC SPEC benchmark.


-- 
           Summary: Code bloating on operations with bit fields
           Product: gcc
           Version: 4.4.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: sliao at google dot com
 GCC build triplet: i686-linux
  GCC host triplet: i686-linux
GCC target triplet: arm-eabi


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42501

