--- Comment #7 from janis at gcc dot gnu dot org 2007-12-12 19:22 ---
The failure goes away with -mabi=altivec, which is not the default with -m32.
I had never seen this failure in my earlier testing on powerpc64-linux because
whenever I used -maltivec I also used -mabi=altivec. It's
--- Comment #8 from janis at gcc dot gnu dot org 2007-12-12 23:34 ---
Ira, it seems I've wasted your time; it's a well-known problem (except to me)
that -maltivec without -mabi=altivec doesn't work in the general case because
vector registers are not saved and restored. That explains
--- Comment #6 from irar at il dot ibm dot com 2007-12-11 13:24 ---
The first difference I found between the versions of the loop in
reload.c:2352 built with and without -fno-strict-aliasing is in
with -fno-strict-aliasing (the bad one):
(insn 414 413 415 43 ../src/reload.c:2354 (set (reg/f:SI
--- Comment #5 from irar at il dot ibm dot com 2007-12-06 07:49 ---
It also fails with -O2 and -O1 (and not only with -O3).
The offending loop is reload.c:2352 (in function find_reloads):
for (i = 0; i < noperands; i++)
{
constraints[i] = constraints1[i]
--- Comment #3 from janis at gcc dot gnu dot org 2007-12-03 21:57 ---
The failure occurs with -m32 -O3 -maltivec -fno-strict-aliasing, but not
without -fno-strict-aliasing. That option is sometimes necessary because of
invalid code in 176.gcc, as described in
--- Comment #4 from irar at il dot ibm dot com 2007-12-04 07:31 ---
(In reply to comment #3)
> The failure occurs with -m32 -O3 -maltivec -fno-strict-aliasing, but not
> without -fno-strict-aliasing.
Yes, I managed to reproduce the failure with -fno-strict-aliasing.
Thanks,
Ira
--- Comment #1 from irar at il dot ibm dot com 2007-11-29 09:31 ---
I can't reproduce this failure with the same flags with revision 130403 on
ppc64-redhat-linux.
(Some loops indeed get vectorized in reload.c and reload1.c:
reload1.c.104t.vect:reload1.c:2433: note: LOOP VECTORIZED.
--- Comment #2 from janis at gcc dot gnu dot org 2007-11-29 19:01 ---
I can reproduce the failure with revision 130507 on a p970 system. I compile
176.gcc with -m32 -O3 -maltivec and execute that benchmark program with test
input.
My list of vectorized loops is the same except that