https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106952
Bug ID: 106952
Summary: Missed optimization: x < y ? x : y not lowered to minss
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: tavianator at gmail dot com
Target Milestone: ---
Created attachment 53580
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53580&action=edit
Assembly from gcc -O3 -S bug.c
The following is an implementation of a ray/axis-aligned box intersection test:
struct ray {
    float origin[3];
    float dir_inv[3];
};
struct box {
    float min[3];
    float max[3];
};
static inline float min(float x, float y) {
    return x < y ? x : y;
}
static inline float max(float x, float y) {
    return x > y ? x : y;
}
_Bool intersection(const struct ray *ray, const struct box *box) {
    float tmin = 0.0, tmax = 1.0 / 0.0;
    for (int i = 0; i < 3; ++i) {
        float t1 = (box->min[i] - ray->origin[i]) * ray->dir_inv[i];
        float t2 = (box->max[i] - ray->origin[i]) * ray->dir_inv[i];
        tmin = min(max(t1, tmin), max(t2, tmin));
        tmax = max(min(t1, tmax), min(t2, tmax));
    }
    return tmin < tmax;
}
However, gcc -O3 doesn't use minss/maxss for every min()/max(). Instead, some
of them are lowered to conditional jumps, which regresses performance
significantly since the branches are unpredictable.
Simpler variants like
    tmin = max(tmin, min(t1, t2));
    tmax = min(tmax, max(t1, t2));
get the desired codegen, but they behave differently if t1 or t2 is NaN.
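To make the NaN difference concrete, here is a minimal sketch that isolates the two tmin updates. The helper names tmin_report() and tmin_simple() are hypothetical, introduced only for illustration; min() and max() are the comparison-based definitions from the test case, with max() using x > y.

```c
#include <math.h>   /* NAN, isnan */

/* Comparison-based min/max as in the test case. Any comparison with
 * NaN is false, so both return y whenever x or y is NaN. */
static inline float min(float x, float y) {
    return x < y ? x : y;
}
static inline float max(float x, float y) {
    return x > y ? x : y;
}

/* Hypothetical helper: the tmin update used in intersection(). */
static float tmin_report(float tmin, float t1, float t2) {
    return min(max(t1, tmin), max(t2, tmin));
}

/* Hypothetical helper: the simpler tmin update. */
static float tmin_simple(float tmin, float t1, float t2) {
    return max(tmin, min(t1, t2));
}
```

With tmin = 0, t1 = NAN, t2 = 2, tmin_report() returns 0 while tmin_simple() returns 2. With t1 = 2, t2 = NAN, tmin_report() still returns 0 but tmin_simple() returns NaN. When neither input is NaN (for example t1 = 1, t2 = 2), both return the same value.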
"Bisecting" with godbolt.org, it seems this is an old regression: 4.8.5 was
good, but 4.9.0 was bad.