https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121230
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |crazylht at gmail dot com
--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
This is basically

  /* When the component is loaded from memory we can directly
     move it to a vector register, otherwise we have to go
     via a GPR or via vpinsr which involves similar cost.
     Likewise with a BIT_FIELD_REF extracting from a vector
     register we can hope to avoid using a GPR.  */
  if (!is_gimple_assign (def)
      || ((!gimple_assign_load_p (def)
           || (!TARGET_SSE4_1
               && GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (op))) == 1))
          && (gimple_assign_rhs_code (def) != BIT_FIELD_REF
              || !VECTOR_TYPE_P (TREE_TYPE
                                 (TREE_OPERAND (gimple_assign_rhs1 (def),
                                                0))))))
    {
      if (fp)
        m_num_sse_needed[where]++;
      else
        {
          m_num_gpr_needed[where]++;
          int cost = COSTS_N_INSNS (ix86_cost->integer_to_sse) / 2;

where we make a move from an FP stack reg to an FP XMM reg free, assuming
that FP is done in XMM regs.  The def stmt here is an add, but without
-mfpmath=sse we have to spill to the stack and re-load to XMM.  There's
no special cost for this like integer_to_sse.  Also I'm not sure of the
exact TARGET_* flag to check for -mfpmath=sse (I guess sse,x87 should be
handled conservatively).  Changing the above if (fp) to if (0), and thus
applying the integer_to_sse cost, disables vectorization without
-mfpmath=sse.

Somebody with more target knowledge around -mfpmath should put costing
into the if (fp) path accounting for FP REG to stack + XMM load from stack.
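
To make the suggestion concrete, here is a toy, standalone sketch of what
such costing could look like.  It is not GCC code: the FpMath enum, the
ToyCosts struct, its fields and fp_component_cost are all hypothetical
names invented for illustration, and the real fix would need whatever
target flag and cost-table entries the backend actually provides.  It only
models the idea that the move is free when FP math is done in XMM regs and
otherwise costs a store to the stack plus an XMM load, with sse,x87 treated
conservatively.

```cpp
// Toy model only: none of these names are GCC internals.
enum class FpMath { X87, SSE, Both };  // stands in for the -mfpmath= setting

struct ToyCosts {
  int fp_store;  // hypothetical cost: store an x87 stack reg to a stack slot
  int sse_load;  // hypothetical cost: load that slot into an XMM reg
};

// Cost of getting one scalar FP component, defined by a non-load stmt
// (e.g. an add), into a vector register.
constexpr int fp_component_cost(FpMath mode, const ToyCosts &c) {
  // With SSE math the value already lives in an XMM reg, so the move is free.
  if (mode == FpMath::SSE)
    return 0;
  // x87 or mixed sse,x87: conservatively assume a round trip via the stack.
  return c.fp_store + c.sse_load;
}
```

With ToyCosts{4, 6}, the sketch yields 0 for FpMath::SSE and 10 for both
FpMath::X87 and FpMath::Both, so only the SSE-math case keeps the move free.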