http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51074

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-11-10 10:50:54 UTC ---
The case I was worried about is when we have a single VECTOR_CST before the
loop and then create 16 different vectors out of it using different
permutations. In that case the permutations of the same VECTOR_CST might be
cheaper than having to load 10 constants out of memory because the register
pressure was too high. But perhaps that is unlikely and we should just fold;
if it works in fold-const.c, sure.
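
To make the folding concrete, here is a minimal sketch in plain C of what
folding a permutation of a constant vector amounts to. It is not GCC code:
fold_perm is a hypothetical helper, and it assumes the single-input case where
each selector element is reduced modulo the element count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of GCC: apply a constant selector SEL to a
   constant vector SRC of N elements, writing the permuted constant to DST.
   Assumes a single input vector, so each selector index is taken modulo N.  */
void
fold_perm (const int *src, const unsigned *sel, int *dst, size_t n)
{
  assert (n > 0);
  for (size_t i = 0; i < n; i++)
    dst[i] = src[sel[i] % n];
}
```

With SRC and SEL both known at compile time, DST is itself a constant, which
is the trade-off described above: each folded result is a separate constant
to keep in a register or load from memory, instead of one constant plus a
permute instruction per use.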

Doing something about interleaved const stores in the vectorizer is desirable
anyway. Even if we leave the folding to later passes, the fact that we don't
need any interleaves means we might handle more cases, and the cost model
wouldn't reject them so often.
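
As an illustration of an interleaved constant store, the sketch below shows a
scalar loop and a hand-written equivalent that stores one pre-built constant
block instead of interleaving two vectors. The struct and function names are
made up for this example, and the block width of four ints is an assumption;
it is not what the vectorizer actually emits.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct pair { int a; int b; };

/* Scalar form: two interleaved constant stores per iteration.  */
void
fill_scalar (struct pair *p, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      p[i].a = 1;
      p[i].b = 2;
    }
}

/* Hand-written stand-in for a vectorized form: the interleaved pattern
   { 1, 2, 1, 2 } is built once as a constant and copied two pairs at a
   time, so no interleave operation is needed.  */
void
fill_blocked (struct pair *p, size_t n)
{
  static const int cst[4] = { 1, 2, 1, 2 };
  size_t i = 0;

  /* Assumes struct pair has no padding, so two pairs match CST exactly.  */
  assert (sizeof cst == 2 * sizeof (struct pair));
  for (; i + 2 <= n; i += 2)
    memcpy (&p[i], cst, sizeof cst);
  /* Scalar tail for an odd element count.  */
  for (; i < n; i++)
    {
      p[i].a = 1;
      p[i].b = 2;
    }
}
```

Both functions leave the array in the same state; the second only needs the
one constant, which is why no interleaves have to be costed.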
