I don't quite know how to characterize this one, so I'll leave it up to those in the know to fix the summary/description. Primo, AFAIK, gcc has always struggled to get this right, but 4.1 is setting a new record, so I'll qualify that as a regression vs 3.3/3.4. Secundo, I haven't found any related bug reports except one about PPC. So here we go.
Consider this testcase:

    #include <algorithm>

    // by-value versions
    template<class T> static inline T max(const T a, const T b) { return a<b ? b : a; }
    template<class T> static inline T min(const T a, const T b) { return a<b ? a : b; }

    // by-reference versions, same shape as std::min/std::max
    template<class T> static inline const T &ref_max(const T &a, const T &b) { return a<b ? b : a; }
    template<class T> static inline const T &ref_min(const T &a, const T &b) { return a<b ? a : b; }

    template<class T> struct foo_t {
        T a0, a1;
        T bar(const T b, const T c) { return max(min(a0, c), min(max(a1, c), b)); }
        T bar_ref(const T b, const T c) { return ref_max(ref_min(a0, c), ref_min(ref_max(a1, c), b)); }
        T bar_stl(const T b, const T c) { return std::max(std::min(a0, c), std::min(std::max(a1, c), b)); }
    };

    template struct foo_t<int>;
    template struct foo_t<float>;

    int main() { return 0; }

With g++-4120050501 -O3 I get such creative & entertaining code as:

    0000000000400610 <foo_t<float>::bar_ref(float, float)>:
      400610: ucomiss 0x4(%rdi),%xmm1
      400614: lea     0x4(%rdi),%rax
      400618: lea     0xfffffffffffffff8(%rsp),%rdx
      40061d: movss   %xmm0,0xfffffffffffffffc(%rsp)
      400623: movss   %xmm1,0xfffffffffffffff8(%rsp)
      400629: movaps  %xmm1,%xmm2
      40062c: cmova   %rdx,%rax
      400630: movss   (%rax),%xmm1
      400634: ucomiss %xmm1,%xmm0
      400637: ja      400641 <foo_t<float>::bar_ref(float, float)+0x31>
      400639: lea     0xfffffffffffffffc(%rsp),%rax
      40063e: movaps  %xmm0,%xmm1
      400641: ucomiss (%rdi),%xmm2
      400644: cmova   %rdi,%rdx
      400648: movss   (%rdx),%xmm0
      40064c: ucomiss %xmm0,%xmm1
      40064f: jbe     400655 <foo_t<float>::bar_ref(float, float)+0x45>
      400651: movss   (%rax),%xmm0
      400655: repz retq

Compare that to what g++-3.4.4-20050314 gives (g++ 3.3.6 is similar):

    0000000000400610 <foo_t<float>::bar_ref(float, float)>:
      400610: ucomiss (%rdi),%xmm1
      400613: lea     0xfffffffffffffffc(%rsp),%rsi
      400618: mov     %rdi,%rcx
      40061b: lea     0x4(%rdi),%rdx
      40061f: movss   %xmm0,0xfffffffffffffff8(%rsp)
      400625: lea     0xfffffffffffffff8(%rsp),%rax
      40062a: movss   %xmm1,0xfffffffffffffffc(%rsp)
      400630: cmovbe  %rsi,%rcx
      400634: ucomiss 0x4(%rdi),%xmm1
      400638: cmova   %rsi,%rdx
      40063c: ucomiss (%rdx),%xmm0
      40063f: cmova   %rdx,%rax
      400643: movss   (%rax),%xmm0
      400647: ucomiss (%rcx),%xmm0
      40064a: cmovbe  %rcx,%rax
      40064e: movss   (%rax),%xmm0
      400652: retq

Certainly not optimal, and in fact quite ugly, but at least there's no branch. This happens as soon as references are used and enough min/max calls are piled up, which means that unless you use your own by-value min/max instead of the STL versions, you're doomed.

-- 
           Summary: min/max and references
           Product: gcc
           Version: 4.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: regression
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: tbptbp at gmail dot com
                CC: gcc-bugs at gcc dot gnu dot org
  GCC host triplet: x86*

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21463