https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102860
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |jakub at gcc dot gnu.org
--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Short testcase:
function foo(a)
  integer(kind=4) :: a(1024)
  a(:) = modulo (a(:), 39)
end function
compiled with -O2 -mcpu=power10.
vect_recog_divmod_pattern only handles TRUNC_{DIV,MOD}_EXPR and EXACT_DIV_EXPR
(and isn't guaranteed to succeed anyway), but optab_for_tree_code returns the
same smod_optab or sdiv_optab for the other rounding variants as well (if
signed; FLOOR_* for unsigned is mapped to TRUNC_*).
I guess the quickest way would be to punt on {CEIL,FLOOR,ROUND}_{DIV,MOD}_EXPR
in the vectorizer and in tree-vect-generic.cc.
Further gradual improvements can be:
1) match.pd has:
/* For unsigned integral types, FLOOR_DIV_EXPR is the same as
   TRUNC_DIV_EXPR.  Rewrite into the latter in this case.  */
(simplify
 (floor_div @0 @1)
 (if ((INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
      && TYPE_UNSIGNED (type))
  (trunc_div @0 @1)))
but expmed.cc has:
  /* Promote floor rounding to trunc rounding for unsigned operations.  */
  if (unsignedp)
    {
      if (code == FLOOR_DIV_EXPR)
        code = TRUNC_DIV_EXPR;
      if (code == FLOOR_MOD_EXPR)
        code = TRUNC_MOD_EXPR;
      if (code == EXACT_DIV_EXPR && op1_is_pow2)
        code = TRUNC_DIV_EXPR;
    }
Shouldn't we make it
(for floor_divmod (floor_div floor_mod)
     trunc_divmod (trunc_div trunc_mod)
 (simplify
  (floor_divmod @0 @1)
  (if ((INTEGRAL_TYPE_P (type) || VECTOR_INTEGER_TYPE_P (type))
       && TYPE_UNSIGNED (type))
   (trunc_divmod @0 @1))))
?
2) as the RTL optabs really do just trunc div/mod, perhaps
tree-vect-patterns.cc could be changed to replace some or all of those
operations with the trunc operation followed by some arithmetic and cond_exprs,
so that the vectorizer knows the actual cost of those operations.
E.g. it seems expmed.cc expands
r = x %[fl] y;
as
r = x % y; if (r && (x ^ y) < 0) r += y;
and
d = x /[fl] y;
would be
r = x % y; d = x / y; if (r && (x ^ y) < 0) --d;
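The two signed floor expansions above can be checked in plain C, whose / and %
truncate toward zero. This is only a sketch for experimenting with the
arithmetic, not code from expmed.cc; the helper names floor_mod and floor_div
are made up for illustration, and it assumes y != 0 and that x / y does not
overflow.

```c
/* Signed floor division and modulo built from C's truncating / and %.
   Hypothetical helpers; assume y != 0 and no INT_MIN / -1 overflow.  */

/* x %[fl] y: the result is zero or has the sign of y.  */
static int floor_mod(int x, int y)
{
    int r = x % y;                /* truncating remainder, sign follows x */
    if (r != 0 && (x ^ y) < 0)    /* inexact and operand signs differ */
        r += y;
    return r;
}

/* x /[fl] y: the quotient rounded toward negative infinity.  */
static int floor_div(int x, int y)
{
    int r = x % y;
    int d = x / y;                /* truncating quotient */
    if (r != 0 && (x ^ y) < 0)    /* truncation rounded up, step back down */
        --d;
    return d;
}
```

For instance, floor_mod(-7, 39) gives 32, which is what the Fortran modulo in
the testcase is expected to produce for a negative element.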
Looking at wide-int.h, the ceiling variants could be expanded similarly:
r = x %[cl] y;
as
r = x % y; if (r && (x ^ y) >= 0) r -= y;
and
d = x /[cl] y;
as
r = x % y; d = x / y; if (r && (x ^ y) >= 0) ++d;
All of the above is for signed operands. As I said earlier, unsigned [fl] is
the same as trunc, and unsigned [cl] should replace (x ^ y) >= 0 with 1.
[rd] is even more complex.
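The ceiling expansions, including the unsigned case where the sign test
becomes a constant 1, can be sketched in plain C as follows. The helper names
(ceil_mod, ceil_div, ceil_mod_u, ceil_div_u) are hypothetical, and this is not
taken from wide-int.h or expmed.cc. It assumes y != 0, that the signed x / y
does not overflow, and that the unsigned quotient adjustment does not wrap.

```c
/* Ceiling division and modulo built from C's truncating / and %.
   Hypothetical helpers; assume y != 0 and no INT_MIN / -1 overflow.  */

/* Signed x %[cl] y: the result is zero or has the opposite sign of y.  */
static int ceil_mod(int x, int y)
{
    int r = x % y;
    if (r != 0 && (x ^ y) >= 0)   /* inexact and operand signs agree */
        r -= y;
    return r;
}

/* Signed x /[cl] y: the quotient rounded toward positive infinity.  */
static int ceil_div(int x, int y)
{
    int r = x % y;
    int d = x / y;
    if (r != 0 && (x ^ y) >= 0)   /* truncation rounded down, step up */
        ++d;
    return d;
}

/* Unsigned variants: the sign test (x ^ y) >= 0 is replaced with 1.  */
static unsigned ceil_mod_u(unsigned x, unsigned y)
{
    unsigned r = x % y;
    if (r != 0)
        r -= y;                   /* wraps around modulo 2^N */
    return r;
}

static unsigned ceil_div_u(unsigned x, unsigned y)
{
    unsigned d = x / y;
    if (x % y != 0)
        ++d;
    return d;
}
```

Note that the unsigned ceiling remainder is mathematically non-positive, so
any nonzero result wraps around, e.g. ceil_mod_u(7, 2) is the all-ones value.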