https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85116

            Bug ID: 85116
           Summary: std::min_element does not optimize well with inlined
                    predicate
           Product: gcc
           Version: 7.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: christopher.schell at oculus dot com
  Target Milestone: ---

According to godbolt (https://godbolt.org/g/igzsnL), the following code:

#include <algorithm>
#include <array>
#include <cmath>
#include <iterator>

#define SIZE 1000
std::array<double, SIZE> testArray;

int getMinIdxCPPStyle(double offset)
{
    auto minElement = std::min_element(
        std::cbegin(testArray), std::cend(testArray),
        [offset](auto a, auto b) {
            return std::abs(a - offset) < std::abs(b - offset);
        });
    return std::distance(std::cbegin(testArray), minElement);
}

generates the following under -O3:

getMinIdxCPPStyle(double):
  movq xmm3, QWORD PTR .LC1[rip]
  mov eax, OFFSET FLAT:testArray
  mov edx, OFFSET FLAT:testArray+8
.L11:
  movsd xmm1, QWORD PTR [rdx]
  movsd xmm2, QWORD PTR [rax]
  subsd xmm1, xmm0
  subsd xmm2, xmm0
  andpd xmm1, xmm3
  andpd xmm2, xmm3
  ucomisd xmm2, xmm1
  cmova rax, rdx
  add rdx, 8
  cmp rdx, OFFSET FLAT:testArray+8000
  jne .L11
  sub rax, OFFSET FLAT:testArray
  sar rax, 3
  ret

The problem is that a typical C-style loop beats this easily, because it caches
the current minimum value instead of fetching it and recomputing it on every
iteration. In the loop above, the current minimum element is reloaded from
[rax] and passed through subsd/andpd again each time around. Is there a reason
the generated code should not cache the minimum value in a register, rather
than fetching it again (probably causing a cache miss) and then unnecessarily
rerunning the computation on it?
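
The C-style loop being compared against is not included in this report. A
minimal sketch of what such a loop might look like is given below; the function
name getMinIdxCStyle and the local variable names are assumptions made for
illustration, and the code has not been benchmarked here. It keeps the distance
of the best element so far in a local variable, so only the candidate element
is loaded and transformed on each iteration.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

#define SIZE 1000
std::array<double, SIZE> testArray;

// Hypothetical C-style counterpart of getMinIdxCPPStyle (not from the report).
int getMinIdxCStyle(double offset)
{
    int minIdx = 0;
    // Distance of the best element so far, held in a local so the compiler
    // can keep it in a register instead of reloading and recomputing it.
    double minDist = std::abs(testArray[0] - offset);
    for (std::size_t i = 1; i < SIZE; ++i) {
        const double dist = std::abs(testArray[i] - offset);
        // Strict '<' keeps the first of several equal minima,
        // matching what std::min_element returns.
        if (dist < minDist) {
            minDist = dist;
            minIdx = static_cast<int>(i);
        }
    }
    return minIdx;
}
```

For inputs without NaNs this should return the same index as
getMinIdxCPPStyle, since both use a strict less-than comparison on
std::abs(x - offset).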
