https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81719

            Bug ID: 81719
           Summary: Range-based for loop on short fixed size array
                    generates long unrolled loop
           Product: gcc
           Version: 7.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jzwinck at gmail dot com
  Target Milestone: ---

C++11 range-based for loops over arrays of size known at compile time result in
bloated, branchy, and unreachable code with -O3 optimization.  For example:

    typedef int Items[2];

    struct ItemArray
    {
        Items items;
        int sum_x2() const;
    };

    int ItemArray::sum_x2() const
    {
        int total = 0;
        for (int item : items)
        {
            total += item;
        }
        return total;
    }

Clang compiles the above to [mov, add, ret].  GCC with -O2 compiles it to a few
more than that, and with -O3, a whopping 81 instructions.  Add -march=haswell
and behold about 130 instructions to add two ints.

GCC (all versions, 4 to 7) generates code that handles a variable-sized array
of up to about 6 to 14 elements, depending on -march, even though the number of
elements is known at compile time to be 2 (other small values also elicit the
bug).  GCC should generate three instructions at both -O2 and -O3, and it does
if sum_x2() is a free function instead of a member function.  The problem also
goes away if you use a C-style loop.
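
For comparison, here is a minimal sketch of the three forms side by side.  The
names sum_range_for, sum_c_style and sum_free are hypothetical, and the
free-function signature (taking the array by const reference) is an assumption
about what "free function" means above; only the first form is taken from the
example in this report.

```cpp
typedef int Items[2];

struct ItemArray
{
    Items items;
    int sum_range_for() const;  // member function, range-based for (the reported case)
    int sum_c_style() const;    // member function, C-style loop (hypothetical name)
};

// Same loop as sum_x2() in the report.
int ItemArray::sum_range_for() const
{
    int total = 0;
    for (int item : items)
    {
        total += item;
    }
    return total;
}

// Index-based loop; the bound is computed from the array type at compile time.
int ItemArray::sum_c_style() const
{
    int total = 0;
    for (unsigned i = 0; i < sizeof(items) / sizeof(items[0]); ++i)
    {
        total += items[i];
    }
    return total;
}

// Free function over the same array type (signature is an assumption).
int sum_free(const Items& items)
{
    int total = 0;
    for (int item : items)
    {
        total += item;
    }
    return total;
}
```

All three return the same value, so any difference in the generated code
should come from how the loop is optimized, not from what it computes.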

There are lots of permutations of this, including using a range-based for loop
to assign a common value to every element of an array whose size is known at
compile time (120 instructions to assign a single int:
https://godbolt.org/g/BGYggD).
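
A rough sketch of what such an assignment loop might look like follows.  This
is not the code behind the godbolt link; ItemBuffer and set_all are
hypothetical names, and the array size of 2 is simply reused from the example
above.

```cpp
typedef int Items[2];

struct ItemBuffer  // hypothetical type, for illustration only
{
    Items items;
    void set_all(int value);
};

// Assign one common value to every element with a range-based for loop.
void ItemBuffer::set_all(int value)
{
    for (int& item : items)
    {
        item = value;
    }
}
```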

Discussion on Stack Overflow:
https://stackoverflow.com/questions/45496987/gcc-optimizes-fixed-range-based-for-loop-as-if-it-had-longer-variable-length
