https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78847

            Bug ID: 78847
           Summary: pointer arithmetic from c++ ranged-based for loop not
                    optimized
           Product: gcc
           Version: 7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: krister.walfridsson at gmail dot com
  Target Milestone: ---

GCC fails to eliminate some of the overhead of C++ range-based for loops.
Consider the program:

#include <stddef.h>
#include <cstring>
#include <experimental/string_view>
using string_view = std::experimental::string_view;

class Foo {
    constexpr static size_t Length = 9;
    char ascii_[Length];
public:
    Foo();
    string_view view() const {
        return string_view(ascii_, Length);
    }
};

void testWithLoopValue(const Foo foo, size_t ptr, char *buf_) {
  for (auto c : foo.view())
    buf_[ptr++] = c;
}

compiled as
  g++ -O3 -S -std=c++1z k.cpp


The loop distribution pass (ldist) recognizes the loop as a memcpy whose
length is the expression _14:

  _18 = (unsigned long) &MEM[(void *)&foo + 9B];
  _4 = &foo.ascii_ + 1;
  _3 = (unsigned long) _4;
  _16 = _18 + 1;
  _14 = _16 - _3;

and dom3 simplifies the definition of _3, giving

  _18 = (unsigned long) &MEM[(void *)&foo + 9B];
  _3 = (unsigned long) &MEM[(void *)&foo + 1B];
  _16 = _18 + 1;
  _14 = _16 - _3;
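
The &foo terms cancel, so the length does not depend on where foo lives:
(&foo + 9 + 1) - (&foo + 1) = 9. The sketch below mirrors that arithmetic
with plain unsigned integers. The function name lengthFromBase and the
variables v18, v3, v16 and v14 are hypothetical stand-ins for the SSA names
in the dump and are not part of GCC; C++14 or later is assumed.

```cpp
// foo stands for the address &foo converted to unsigned long.
// lengthFromBase is a hypothetical name used only for this illustration.
constexpr unsigned long lengthFromBase(unsigned long foo) {
  unsigned long v18 = foo + 9;     // _18 = (unsigned long) &MEM[(void *)&foo + 9B]
  unsigned long v3 = foo + 1;      // _3  = (unsigned long) &MEM[(void *)&foo + 1B]
  unsigned long v16 = v18 + 1;     // _16 = _18 + 1
  unsigned long v14 = v16 - v3;    // _14 = _16 - _3
  return v14;
}

// Unsigned arithmetic wraps, so the result is 9 for every base address.
static_assert(lengthFromBase(0) == 9, "length is independent of the base");
static_assert(lengthFromBase(0x1000) == 9, "length is independent of the base");
static_assert(lengthFromBase(~0UL) == 9, "holds even when foo + 9 wraps");
```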

However, _14 is not folded to the constant 9 until the RTL combine pass,
which is too late: by then a call to memcpy has already been generated
instead of the expected inline copy.
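
For comparison, here is a minimal sketch of the loop next to the fixed-size
copy it reduces to once the length is known to be 9. It uses a plain array
in place of Foo and string_view; kLength, copyWithLoop, copyWithMemcpy and
checkCopiesAgree are hypothetical names that do not appear in the report.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Assumed stand-in for Foo::Length.
constexpr std::size_t kLength = 9;

// Element-by-element copy, in the shape of the range-based for loop.
void copyWithLoop(const char (&src)[kLength], std::size_t ptr, char *buf) {
  for (char c : src)
    buf[ptr++] = c;
}

// The same copy with the length written as a compile-time constant.
void copyWithMemcpy(const char (&src)[kLength], std::size_t ptr, char *buf) {
  std::memcpy(buf + ptr, src, kLength);
}

// Checks that both versions write the same bytes at the same offset.
void checkCopiesAgree() {
  const char src[kLength] = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i'};
  char viaLoop[16] = {};
  char viaMemcpy[16] = {};
  copyWithLoop(src, 3, viaLoop);
  copyWithMemcpy(src, 3, viaMemcpy);
  assert(std::memcmp(viaLoop, viaMemcpy, sizeof viaLoop) == 0);
}
```

With a small constant length, a compiler is generally free to expand the
memcpy inline rather than emit a library call, which is the outcome the
report expects for the original loop.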
