https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125767
--- Comment #4 from Christopher Bazley <Chris.Bazley at arm dot com> ---
(In reply to Richard Sandiford from comment #3)
> I agree that this is a missing case, but it's more of a missed optimisation
> rather than a correctness issue. Returning false should always be
> conservatively correct.
Hi Richard,
It would be possible to work around the issue, but I reported it as a bug
because the function does not seem to behave according to its documented
contract.
The last time I encountered this issue, I put in a workaround:
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index cd3ba6fa1cb..367a9c63ea4 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -1664,37 +1664,41 @@ check_load_store_for_partial_vectors (vec_info *vinfo, tree vectype,
     }
   /* We might load more scalars than we need for permuting SLP loads.
      We checked in get_load_store_type that the extra elements
      don't leak into a new vector. */
   auto group_memory_nvectors = [](poly_uint64 size, poly_uint64 nunits)
   {
     unsigned int nvectors;
     if (can_div_away_from_zero_p (size, nunits, &nvectors))
       return nvectors;
-    gcc_unreachable ();
+
+    gcc_assert (known_le (size, nunits));
+    return 1u;
   };
Now, I need a similar workaround in gen_lowpart_common:
    {
      /* MODE must occupy no more of the underlying registers than X. */
      poly_uint64 regsize = REGMODE_NATURAL_SIZE (innermode);
      unsigned int mregs, xregs;
      if (!can_div_away_from_zero_p (msize, regsize, &mregs)
          || !can_div_away_from_zero_p (xsize, regsize, &xregs)
          || mregs > xregs)
        return 0;
    }
At this point, I think it is better to modify the function so that it gives
the expected result.
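
To illustrate the kind of case I mean, here is a toy model. It is not GCC code:
toy_poly, div_trunc, div_away and constant_quotient_p are hypothetical names,
and the brute-force check over sample values only stands in for the real
poly_int reasoning. It assumes the problematic inputs are those where size is
known to be no larger than nunits, as the known_le assertion in the workaround
above suggests.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a two-coefficient poly_uint64 (hypothetical, not GCC's
// poly_int): the value is c0 + c1 * x for an unknown runtime x >= 0.
struct toy_poly
{
  uint64_t c0, c1;
  uint64_t at (uint64_t x) const { return c0 + c1 * x; }
};

// Division rounding towards zero.
static uint64_t div_trunc (uint64_t a, uint64_t b) { return a / b; }

// Division rounding away from zero (a ceiling for unsigned operands).
static uint64_t div_away (uint64_t a, uint64_t b)
{
  return a / b + (a % b != 0);
}

// Return true and set *q if DIV gives the same quotient for every sampled
// runtime value of x.  This only samples x in [0, 64]; it is an
// illustration, not a proof and not how poly-int.h works.
static bool constant_quotient_p (toy_poly a, toy_poly b,
                                 uint64_t (*div) (uint64_t, uint64_t),
                                 uint64_t *q)
{
  uint64_t first = div (a.at (0), b.at (0));
  for (uint64_t x = 1; x <= 64; ++x)
    if (div (a.at (x), b.at (x)) != first)
      return false;
  *q = first;
  return true;
}
```

With size = {4, 0} (the constant 4) and nunits = {4, 4} (4 + 4x), the
truncating quotient is 1 when x is 0 and 0 otherwise, so it is not a
compile-time constant. The quotient rounded away from zero is 1 for every
sampled x, which is the result I would expect the function to produce.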