https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95018

--- Comment #33 from Jiu Fu Guo <guojiufu at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #32)
> Note I don't think the unrolling is excessive - store motion then applying
> to all count[] and all computations hoisted out of the loop may be a bit
> too much for register pressure though, especially since we're using
> flag-based store-motion.  But it causes the stores to be materialized
> on all exits of the loop which means we end up with N*N conditional stores :/

In general, param_max_peel_branches = 31 and
param_max_completely_peel_times = 16 may not be very aggressive.
For in_pack_i4.c, the loop iterates at most 13+1 times and is therefore
unrolled. For this loop, however, unrolling increases code size and does not
help performance.
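
To make the shape of the problem concrete, here is a simplified sketch of an
odometer-style carry loop. It is only an illustration, not the actual
in_pack_i4.c source: MAX_DIM, advance and visit_all are hypothetical names,
and the bound of 15 is an assumption. The inner while loop has a small bounded
trip count and an exit in every iteration, so after peeling each copy has its
own exit where pending count[] stores would have to be materialized.

```c
#include <stddef.h>

#define MAX_DIM 15  /* assumed bound; keeps the carry loop's trip count small */

/* Hypothetical odometer-style advance: bump count[0], then carry into the
   higher dimensions.  Returns 0 once every dimension has wrapped around.
   Assumes 1 <= dim <= MAX_DIM and every extent[i] > 0.  */
static int
advance (ptrdiff_t *count, const ptrdiff_t *extent, int dim)
{
  int n = 0;

  count[0]++;
  /* Runs at most dim times, so it is a candidate for complete peeling.  */
  while (count[n] == extent[n])
    {
      count[n] = 0;     /* store to count[n]: a store-motion candidate */
      n++;
      if (n == dim)
        return 0;       /* early exit, present in every peeled copy */
      count[n]++;
    }
  return 1;
}

/* Walk the whole index space and return the number of elements visited.  */
static long
visit_all (const ptrdiff_t *extent, int dim)
{
  ptrdiff_t count[MAX_DIM] = { 0 };
  long visited = 0;

  do
    visited++;
  while (advance (count, extent, dim));

  return visited;
}
```

With extents {2, 3} the walk visits 2 * 3 = 6 elements; the carry loop itself
does very little work per iteration, which is consistent with unrolling
growing the code without a speedup.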

> 
> I guess SM could be improved here.


Thanks all!
