https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95906
Bug ID: 95906
Summary: Failure to recognize max pattern with mask
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: gabravier at gmail dot com
Target Milestone: ---
#include <stdint.h>

typedef int8_t v16i8 __attribute__((__vector_size__ (16)));
v16i8 f(v16i8 a, v16i8 b)
{
v16i8 cmp = (a > b);
return (cmp & a) | (~cmp & b);
}
int f2(int a, int b)
{
int cmp = -(a > b);
return (cmp & a) | (~cmp & b);
}
`f` can be optimized to `__builtin_ia32_pmaxsb128` (on x86 with `-msse4`; the
other `pmax` instructions cover the same pattern for similar element types),
and `f2` can be optimized to a `MAX_EXPR`. The two cases are essentially the
same pattern; I've included the vector variant because I originally found it
in a pre-SSE4 function written for SSE. LLVM performs both transformations,
but GCC does not.