[Bug tree-optimization/21591] not vectorizing a loop with access to structs
--- Comment #7 from irar at il dot ibm dot com 2006-09-13 08:32 ---
I think the problem here is that we only check the SMT (symbol memory tag) and not the NMT (name memory tag). I am preparing a patch to fix this: the NMT is stored in the ptr_info_def of the data-ref, and the SMT will be checked only if the NMT does not exist.

--

irar at il dot ibm dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |irar at il dot ibm dot com,
                   |                            |dnovillo at redhat dot com
         AssignedTo|unassigned at gcc dot gnu   |irar at il dot ibm dot com
                   |dot org                     |
             Status|NEW                         |ASSIGNED
   Last reconfirmed|2006-02-21 01:04:59         |2006-09-13 08:32:31
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21591
[Bug tree-optimization/18438] vectorizer failed for vector matrix multiplication
--- Comment #3 from irar at il dot ibm dot com 2006-09-19 07:10 ---
> t.c:20: note: not vectorized: mixed data-types
> t.c:20: note: can't determine vectorization factor.
> Removing flags[i] = true;

Multiple data-types vectorization is already supported in the autovect branch, and the patches for mainline (starting from http://gcc.gnu.org/ml/gcc-patches/2006-02/msg00941.html) will be committed as soon as 4.3 is open.

> we get:
> t.c:20: note: not consecutive access
> t.c:20: note: not vectorized: complicated access pattern.

Vectorization of strided accesses is also already implemented in the autovect branch (and will be committed to the mainline 4.3). However, this case contains stores with gaps (stores to opoints[i][0], opoints[i][1], and opoints[i][2], without a store to opoints[i][3]), and only loads with gaps are currently supported. Therefore, this loop will be vectorizable in the autovect branch (and soon in the mainline 4.3) if a store to opoints[i][3] is added.

Ira

--

irar at il dot ibm dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |irar at il dot ibm dot com
   Last reconfirmed|2005-12-21 03:49:03         |2006-09-19 07:10:15
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18438
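As a rough illustration of the "stores with gaps" restriction described in the comment above, the sketch below contrasts a loop that writes only three of every four consecutive elements with one that writes all four. The array name opoints and the columns written come from the comment; the element type, the values stored, and the function names store_with_gap and store_without_gap are assumptions made for illustration and are not taken from the PR's testcase. No claim is made about which GCC versions vectorize either loop.

```c
#include <assert.h>
#include <stddef.h>

/* Writes columns 0..2 of each row and skips column 3, so the group
   of stores has a gap (the case described as unsupported). */
void store_with_gap(float opoints[][4], size_t rows, float v)
{
    assert(opoints != NULL || rows == 0);
    for (size_t i = 0; i < rows; i++) {
        opoints[i][0] = v;
        opoints[i][1] = v + 1.0f;
        opoints[i][2] = v + 2.0f;
        /* no store to opoints[i][3]: this is the gap */
    }
}

/* Same loop with the gap closed by an explicit store to column 3.
   The stored value (0.0f) is an assumption; this is only a valid
   rewrite if the caller does not need the old contents. */
void store_without_gap(float opoints[][4], size_t rows, float v)
{
    assert(opoints != NULL || rows == 0);
    for (size_t i = 0; i < rows; i++) {
        opoints[i][0] = v;
        opoints[i][1] = v + 1.0f;
        opoints[i][2] = v + 2.0f;
        opoints[i][3] = 0.0f;
    }
}
```

Note that closing the gap changes what the program writes to memory, so it is a source-level workaround and not something the compiler could do on its own.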
[Bug tree-optimization/19049] not vectorizing a fortran loop
--- Comment #7 from irar at il dot ibm dot com 2006-09-19 07:29 --- Even though vectorization of strided accesses is already implemented in the autovect branch (and will be committed to the mainline 4.3), this case contains a store with a gap (store to a[i] without a store to a[i-1]), and such stores are not supported (the current implementation supports only loads with gaps). Note, however, that adding a store to a[i-1] will create a data dependence in the loop. Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19049
[Bug tree-optimization/26969] [4.1 Regression] ICE with -O1 -funswitch-loops -ftree-vectorize
--- Comment #15 from irar at il dot ibm dot com 2006-10-18 11:03 ---
(In reply to comment #13)
> We need to check if above patch fixes PR26969 as well.

Checked, it does not.

--

irar at il dot ibm dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |irar at il dot ibm dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26969
[Bug middle-end/28752] bootstrap comparision fails with -ftree-vectorize -maltivec on ppc
--- Comment #3 from irar at il dot ibm dot com 2006-10-30 11:33 ---
I am getting another failure:

/home/irar/main-boot/build7/./gcc/xgcc -B/home/irar/main-boot/build7/./gcc/ -B/home/irar/main-boot/ppc64-redhat-linux/bin/ -B/home/irar/main-boot/ppc64-redhat-linux/lib/ -isystem /home/irar/main-boot/ppc64-redhat-linux/include -isystem /home/irar/main-boot/ppc64-redhat-linux/sys-include -O2 -O2 -g -O2 -DIN_GCC -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -isystem ./include -I. -Inof -I../../gcc/gcc -I../../gcc/gcc/nof -I../../gcc/gcc/../include -I../../gcc/gcc/../libcpp/include -I../../gcc/gcc/../libdecnumber -I../libdecnumber -msoft-float -fPIC -mstrict-align -g0 -finhibit-size-directive -fno-inline-functions -fno-exceptions -fno-zero-initialized-in-bss -fno-toplevel-reorder -msdata=none \
  -c ../../gcc/gcc/crtstuff.c -DCRT_END \
  -o nof/crtend.o
../../gcc/gcc/crtstuff.c: In function ‘__do_global_dtors_aux’:
../../gcc/gcc/crtstuff.c:304: internal compiler error: in push_reload, at reload.c:1294
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.
../../gcc/gcc/crtstuff.c: In function ‘__do_global_ctors_aux’:
../../gcc/gcc/crtstuff.c:522: internal compiler error: in push_reload, at reload.c:1294
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.
make[5]: *** [nof/crtbegin.o] Error 1
make[5]: *** Waiting for unfinished jobs
make[5]: *** [nof/crtend.o] Error 1
make[5]: Leaving directory `/home/irar/main-boot/build7/gcc'
make[4]: *** [extranof] Error 2
make[4]: Leaving directory `/home/irar/main-boot/build7/gcc'

I found out that this happens when the loop in config/rs6000/rs6000.c:3674 is vectorized. Here it is:

    for (i = 32; i < 64; i++)
      fixed_regs[i] = call_used_regs[i] = call_really_used_regs[i] = 1;

Also, if the loop is split into

    for (i = 32; i < 64; i++)
      fixed_regs[i] = 1;

and the rest, and only this loop is vectorized, the same error occurs. I made a testcase with the original loop, but it works fine.

Ira

--

irar at il dot ibm dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |irar at il dot ibm dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28752
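For readers who want to experiment, a standalone testcase built around the loop quoted in the comment above might look roughly like the sketch below. This is not the testcase Ira wrote (which is not shown in the thread); the array size NUM_REGS, the char element type, and the helper names mark_fp_regs and only_fp_range_set are assumptions for illustration. In GCC itself the arrays are sized by the target-specific FIRST_PSEUDO_REGISTER.

```c
#include <assert.h>

/* Hypothetical size; in GCC these arrays are sized by
   FIRST_PSEUDO_REGISTER, which is target-specific. */
#define NUM_REGS 128

char fixed_regs[NUM_REGS];
char call_used_regs[NUM_REGS];
char call_really_used_regs[NUM_REGS];

/* The loop quoted in the comment, isolated in its own function. */
void mark_fp_regs(void)
{
    int i;
    for (i = 32; i < 64; i++)
        fixed_regs[i] = call_used_regs[i] = call_really_used_regs[i] = 1;
}

/* Returns 1 if exactly entries 32..63 of regs are set to 1
   and every other entry is 0; returns 0 otherwise. */
int only_fp_range_set(const char *regs)
{
    int i;
    assert(regs != 0);
    for (i = 0; i < NUM_REGS; i++)
        if (regs[i] != (i >= 32 && i < 64))
            return 0;
    return 1;
}
```

Such a file could be compiled with the options used in this thread (-O2 -ftree-vectorize -maltivec on ppc). As the comment notes, an isolated loop of this kind did not reproduce the failure.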
[Bug middle-end/28752] bootstrap comparision fails with -ftree-vectorize -maltivec on ppc
--- Comment #4 from irar at il dot ibm dot com 2006-11-02 11:18 ---
The loop at config/rs6000/rs6000.c:3674 requires versioning for alignment, so when bootstrapping with -ftree-vectorize -fno-tree-vect-loop-version it does not get vectorized. However, we still fail bootstrap... This is the failure we get:

/home/irar/main-boot/build17/./gcc/xgcc -B/home/irar/main-boot/build17/./gcc/ -B/home/irar/main-boot/ppc64-redhat-linux/bin/ -B/home/irar/main-boot/ppc64-redhat-linux/lib/ -isystem /home/irar/main-boot/ppc64-redhat-linux/include -isystem /home/irar/main-boot/ppc64-redhat-linux/sys-include -O2 -O2 -g -O2 -DIN_GCC -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -isystem ./include -I. -I. -I../../gcc/gcc -I../../gcc/gcc/. -I../../gcc/gcc/../include -I../../gcc/gcc/../libcpp/include -I../../gcc/gcc/../libdecnumber -I../libdecnumber -g0 -finhibit-size-directive -fno-inline-functions -fno-exceptions -fno-zero-initialized-in-bss -fno-toplevel-reorder -msdata=none \
  -c ../../gcc/gcc/crtstuff.c -DCRT_BEGIN \
  -o crtbegin.o
../../gcc/gcc/crtstuff.c: In function ‘__do_global_dtors_aux’:
../../gcc/gcc/crtstuff.c:304: internal compiler error: Segmentation fault
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.
make[3]: *** [crtbegin.o] Error 1
make[3]: *** Waiting for unfinished jobs
rm fsf-funding.pod gcov.pod gfdl.pod cpp.pod gcc.pod gfortran.pod gpl.pod
make[3]: Leaving directory `/home/irar/main-boot/build17/gcc'
make[2]: *** [all-stage2-gcc] Error 2
make[2]: Leaving directory `/home/irar/main-boot/build17'
make[1]: *** [stage2-bubble] Error 2
make[1]: Leaving directory `/home/irar/main-boot/build17'
make: *** [bootstrap] Error 2

I found that this time a different loop in rs6000 is related to the failure: when it is vectorized, the Stage2 compiler is bad, and when we force the loop not to be vectorized, the Stage2 compiler is good (bootstrap passes with vectorization enabled). The loop is config/rs6000/rs6000.c:17204:

    for (i = 0; i < issue_rate; i++)
      {
        group_insns[i] = 0;
      }

It looks like the problem is not tied to the vectorization of any one specific loop. I don't know whether it makes sense to try to create a reduced testcase (or how...).

Ira

--

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28752
[Bug middle-end/28752] bootstrap comparision fails with -ftree-vectorize -maltivec on ppc
--- Comment #5 from irar at il dot ibm dot com 2006-11-02 11:44 --- I found that revision 110671 passed bootstrap with vectorization enabled, and revision 110846 failed bootstrap with vectorization enabled (but passed w/o). Janis - could you help track down the patch that exposed/caused the bootstrap failure with BOOT_CFLAGS=-O2 -g -ftree-vectorize -maltivec? Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28752
[Bug middle-end/28752] bootstrap comparision fails with -ftree-vectorize -maltivec on ppc
--- Comment #7 from irar at il dot ibm dot com 2006-11-06 09:38 ---
Janis - thanks a lot for your help!

http://gcc.gnu.org/viewcvs?view=rev&rev=110705
r110705 | law | 2006-02-07 18:31:27 +0000 (Tue, 07 Feb 2006)

In r110704 bootstrap passes with and without vectorization enabled, and in r110705 bootstrap fails in both cases (also without vectorization).

The patch

2006-02-08  Jeff Law  [EMAIL PROTECTED]

        PR tree-optimization/26169
        * tree-vrp.c (execute_vrp): Perform any queued SSA updates
        before threading jumps.

(http://gcc.gnu.org/viewcvs?view=rev&revision=110758) fixes bootstrap without vectorization, but bootstrap with BOOT_CFLAGS=-O2 -g -ftree-vectorize -maltivec still fails.

Ira

--

irar at il dot ibm dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |law at redhat dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28752
[Bug middle-end/28752] bootstrap comparision fails with -ftree-vectorize -maltivec on ppc
--- Comment #9 from irar at il dot ibm dot com 2006-11-07 08:31 ---
In r110758 (and r110705) bootstrap with BOOT_CFLAGS=-O2 -g -ftree-vectorize -maltivec fails with another error:

/Develop/main-110758/build-vect/./prev-gcc/xgcc -B/Develop/main-110758/build-vect/./prev-gcc/ -B/Develop/main-110758//powerpc64-suse-linux/bin/ -c -O2 -g -ftree-vectorize -maltivec -DIN_GCC -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Wold-style-definition -Wmissing-format-attribute -Werror -DHAVE_CONFIG_H -I. -I. -I../../gcc/gcc -I../../gcc/gcc/. -I../../gcc/gcc/../include -I../../gcc/gcc/../libcpp/include -I../../gcc/gcc/../libdecnumber -I../libdecnumber ../../gcc/gcc/regmove.c -o regmove.o
../../gcc/gcc/recog.c: In function constrain_operands:
../../gcc/gcc/recog.c:2270: internal compiler error: in mark_operand_necessary, at tree-ssa-dce.c:266
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.
make[3]: *** [recog.o] Error 1
make[3]: *** Waiting for unfinished jobs
rm cpp.pod gfdl.pod gfortran.pod fsf-funding.pod gpl.pod gcc.pod gcov.pod
make[3]: Leaving directory `/Develop/main-110758/build-vect/gcc'
make[2]: *** [all-stage2-gcc] Error 2
make[2]: Leaving directory `/Develop/main-110758/build-vect'
make[1]: *** [stage2-bubble] Error 2
make[1]: Leaving directory `/Develop/main-110758/build-vect'
make: *** [bootstrap] Error 2

The guilty loop is at recog.c:2283: forcing this loop not to be vectorized makes bootstrap with vectorization pass.

The backtrace:

Breakpoint 1, fancy_abort (file=0x107b4668 "../../gcc/gcc/tree-ssa-dce.c", line=266, function=0x107b4930 "mark_operand_necessary") at ../../gcc/gcc/diagnostic.c:642
642       internal_error ("in %s, at %s:%d", function, trim_filename (file), line);
(gdb) backtrace
#0  fancy_abort (file=0x107b4668 "../../gcc/gcc/tree-ssa-dce.c", line=266, function=0x107b4930 "mark_operand_necessary") at ../../gcc/gcc/diagnostic.c:642
During symbol reading, incomplete CFI data; unspecified registers (e.g., r0) at 0x10203008.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r2) at 0x10203008.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r3) at 0x10203008.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r4) at 0x10203008.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r5) at 0x10203008.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r6) at 0x10203008.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r7) at 0x10203008.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r8) at 0x10203008.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r9) at 0x10203008.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r10) at 0x10203008.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r11) at 0x10203008.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r12) at 0x10203008.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r13) at 0x10203008.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r14) at 0x10203008.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r15) at 0x10203008.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r16) at 0x10203008.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r17) at 0x10203008.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r18) at 0x10203008.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r19) at 0x10203008.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r20) at 0x10203008.
#1  0x105d2654 in mark_operand_necessary (op=0xf6d34e40, phionly=0 '\0') at ../../gcc/gcc/tree-ssa-dce.c:266
#2  0x105d218c in propagate_necessity (el=0x10ad6000) at ../../gcc/gcc/tree-ssa-dce.c:553
#3  0x105d401c in perform_tree_ssa_dce (aggressive=1 '\001') at ../../gcc/gcc/tree-ssa-dce.c:906
#4  0x105d4148 in tree_ssa_cd_dce () at ../../gcc/gcc/tree-ssa-dce.c:947
#5  0x1051b110 in execute_one_pass (pass=0x1088c584) at ../../gcc/gcc/passes.c:853
#6  0x1051b2c8 in execute_pass_list (pass=0x1088c584) at ../../gcc/gcc/passes.c:897
#7  0x1051b2f4 in execute_pass_list (pass=0x108876dc) at ../../gcc/gcc/passes.c:898
#8  0x100b1684 in tree_rest_of_compilation (fndecl=0xf7862d00) at ../../gcc/gcc/tree-optimize.c:412
#9  0x10016914 in c_expand_body (fndecl=0xf7862d00) at ../../gcc/gcc/c-decl.c:6689
#10 0x1059b914 in cgraph_expand_function (node=0xf75b3e00) at ../../gcc/gcc/cgraphunit.c:1101
#11 0x1059bba4 in cgraph_expand_all_functions () at ../../gcc/gcc/cgraphunit.c:1166
#12 0x1059c798 in cgraph_optimize () at ../../gcc/gcc/cgraphunit.c:1434
#13
[Bug middle-end/28752] bootstrap comparision fails with -ftree-vectorize -maltivec on ppc
--- Comment #10 from irar at il dot ibm dot com 2006-11-07 08:32 ---
Created an attachment (id=12560)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12560&action=view)
recog.i

--

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28752
[Bug middle-end/28752] bootstrap comparision fails with -ftree-vectorize -maltivec on ppc
--- Comment #12 from irar at il dot ibm dot com 2006-11-08 08:40 --- Jeff, Thanks a lot! I will do the things you've suggested shortly. Meanwhile, out of curiosity, I am attaching a good recog.i (built with vectorization enabled, but the offending loop was not vectorized). BTW, here is the compiler configuration: ../gcc/configure --enable-threads=posix --prefix=/Develop/main-110758/ --enable-checking=release --enable-ssp --disable-libssp --enable-java-awt=gtk --enable-gtk-cairo --disable-libjava-multilib --with-system-zlib --enable-shared --enable-__cxa_atexit --enable-libstdcxx-allocator=new --without-system-libunwind --with-cpu=default32 --enable-secureplt --with-long-double-128 --host=powerpc64-suse-linux --enable-languages=c,c++,fortran My system compiler is gcc version 4.1.0 (SUSE Linux). Thanks, Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28752
[Bug middle-end/28752] bootstrap comparision fails with -ftree-vectorize -maltivec on ppc
--- Comment #14 from irar at il dot ibm dot com 2006-11-08 11:33 ---
(In reply to comment #11)
> 1. Put a breakpoint in tree_ssa_cd_dce when compiling the offending function
> from recog.c. When that breakpoint triggers issue:
> verify_ssa (true)
> I can't see any way for that to fail, but better safe than sorry.

It fails...

Breakpoint 6, tree_ssa_cd_dce () at ../../gcc/gcc/tree-ssa-dce.c:947
947       perform_tree_ssa_dce (/*aggressive=*/optimize >= 2);
(gdb) p verify_ssa (true)
No symbol "true" in current context.
(gdb) p verify_ssa (1)
During symbol reading, incomplete CFI data; unspecified registers (e.g., r0) at 0x105d412c.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r2) at 0x105d412c.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r3) at 0x105d412c.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r4) at 0x105d412c.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r5) at 0x105d412c.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r6) at 0x105d412c.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r7) at 0x105d412c.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r8) at 0x105d412c.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r9) at 0x105d412c.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r10) at 0x105d412c.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r11) at 0x105d412c.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r12) at 0x105d412c.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r13) at 0x105d412c.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r14) at 0x105d412c.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r15) at 0x105d412c.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r16) at 0x105d412c.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r17) at 0x105d412c.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r18) at 0x105d412c.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r19) at 0x105d412c.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r20) at 0x105d412c.
../../gcc/gcc/recog.c: In function constrain_operands:
../../gcc/gcc/recog.c:2270: error: expected an SSA_NAME object

Breakpoint 1, fancy_abort (file=0x10744d58 "../../gcc/gcc/tree-ssa-operands.c", line=1905, function=0x10745428 "verify_imm_links") at ../../gcc/gcc/diagnostic.c:642
642       internal_error ("in %s, at %s:%d", function, trim_filename (file), line);
The program being debugged stopped while in a function called from GDB.
When the function (verify_ssa) is done executing, GDB will silently
stop (instead of continuing to evaluate the expression containing
the function call).

Backtrace:

#0  fancy_abort (file=0x10744d58 "../../gcc/gcc/tree-ssa-operands.c", line=1905, function=0x10745428 "verify_imm_links") at ../../gcc/gcc/diagnostic.c:642
#1  0x100e4c10 in verify_imm_links (f=0xffec4a8, var=0xf6d34e40) at ../../gcc/gcc/tree-ssa-operands.c:1905
#2  0x100ac1fc in verify_use (bb=0xf73144d0, def_bb=0x1, use_p=0xf6beb118, stmt=0xf74510f0, check_abnormal=0 '\0', is_virtual=1 '\001', names_defined_in_bb=0x10a8d9d0) at ../../gcc/gcc/tree-ssa.c:228
#3  0x100ae404 in verify_ssa (check_modified_stmt=1 '\001') at ../../gcc/gcc/tree-ssa.c:735
#4  function called from gdb
#5  tree_ssa_cd_dce () at ../../gcc/gcc/tree-ssa-dce.c:947
#6  0x1051b110 in execute_one_pass (pass=0x1088c584) at ../../gcc/gcc/passes.c:853
#7  0x1051b2c8 in execute_pass_list (pass=0x1088c584) at ../../gcc/gcc/passes.c:897
#8  0x1051b2f4 in execute_pass_list (pass=0x108876dc) at ../../gcc/gcc/passes.c:898
#9  0x100b1684 in tree_rest_of_compilation (fndecl=0xf7862d00) at ../../gcc/gcc/tree-optimize.c:412
#10 0x10016914 in c_expand_body (fndecl=0xf7862d00) at ../../gcc/gcc/c-decl.c:6689
#11 0x1059b914 in cgraph_expand_function (node=0xf75b3e00) at ../../gcc/gcc/cgraphunit.c:1101
#12 0x1059bba4 in cgraph_expand_all_functions () at ../../gcc/gcc/cgraphunit.c:1166
#13 0x1059c798 in cgraph_optimize () at ../../gcc/gcc/cgraphunit.c:1434
#14 0x1001a024 in c_write_global_declarations () at ../../gcc/gcc/c-decl.c:7804
#15 0x104d99a8 in compile_file () at ../../gcc/gcc/toplev.c:1012
#16 0x104dbc64 in do_compile () at ../../gcc/gcc/toplev.c:1957
#17 0x104dbcf8 in toplev_main (argc=54, argv=0xffdf2c74) at ../../gcc/gcc/toplev.c:1989
#18 0x10090518 in main (argc=54, argv=0xffdf2c74) at ../../gcc/gcc/main.c:35
(gdb) p debug_tree (0xf6d34e40)
struct_field_tag 0xf6d34e40 SFT.1940 type array_type 0xf78709a0 type pointer_type 0xf7d0c3f0 rtx type record_type 0xf7d0c310 rtx_def sizes-gimplified asm_written public unsigned SI size
[Bug middle-end/28752] bootstrap comparision fails with -ftree-vectorize -maltivec on ppc
--- Comment #15 from irar at il dot ibm dot com 2006-11-08 12:05 ---
Additional behavior:

If I run bootstrap with BOOT_CFLAGS=-O2 -g -ftree-vectorize -maltivec (without -fdump-tree-vect-details), bootstrap fails with

../../gcc/gcc/recog.c: In function constrain_operands:
../../gcc/gcc/recog.c:2270: internal compiler error: in mark_operand_necessary, at tree-ssa-dce.c:266

Then I compile recog.c alone with

/Develop/main-110758/build-vect/./prev-gcc/xgcc -B/Develop/main-110758/build-vect/./prev-gcc/ -B/Develop/main-110758//powerpc64-suse-linux/bin/ -c -O2 -g -ftree-vectorize -maltivec -DIN_GCC -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Wold-style-definition -Wmissing-format-attribute -Werror -DHAVE_CONFIG_H -I. -I. -I../../gcc/gcc -I../../gcc/gcc/. -I../../gcc/gcc/../include -I../../gcc/gcc/../libcpp/include -I../../gcc/gcc/../libdecnumber -I../libdecnumber ../../gcc/gcc/recog.c

and again I get

../../gcc/gcc/recog.c: In function constrain_operands:
../../gcc/gcc/recog.c:2270: internal compiler error: in mark_operand_necessary, at tree-ssa-dce.c:266

With -quiet I get a different error:

Program received signal SIGSEGV, Segmentation fault.
0x100b1dc4 in is_gimple_min_invariant (t=0x18) at ../../gcc/gcc/tree-gimple.c:172
172       switch (TREE_CODE (t))
(gdb) backtrace
#0  0x100b1dc4 in is_gimple_min_invariant (t=0x18) at ../../gcc/gcc/tree-gimple.c:172
During symbol reading, incomplete CFI data; unspecified registers (e.g., r0) at 0x100b1dbc.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r2) at 0x100b1dbc.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r3) at 0x100b1dbc.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r4) at 0x100b1dbc.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r5) at 0x100b1dbc.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r6) at 0x100b1dbc.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r7) at 0x100b1dbc.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r8) at 0x100b1dbc.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r9) at 0x100b1dbc.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r10) at 0x100b1dbc.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r11) at 0x100b1dbc.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r12) at 0x100b1dbc.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r13) at 0x100b1dbc.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r14) at 0x100b1dbc.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r15) at 0x100b1dbc.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r16) at 0x100b1dbc.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r17) at 0x100b1dbc.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r18) at 0x100b1dbc.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r19) at 0x100b1dbc.
During symbol reading, incomplete CFI data; unspecified registers (e.g., r20) at 0x100b1dbc.
#1  0x1013f110 in replace_vuses_in (stmt=0xf74510f0, replaced_addresses_p=0xff8a9790 "", prop_value=0x10b6b238) at ../../gcc/gcc/tree-ssa-propagate.c:958
#2  0x1013f8d4 in substitute_and_fold (prop_value=0x10b6b238, use_ranges_p=0 '\0') at ../../gcc/gcc/tree-ssa-propagate.c:1139
#3  0x100dd070 in fini_copy_prop () at ../../gcc/gcc/tree-ssa-copy.c:914
#4  0x100dd158 in execute_copy_prop (store_copy_prop=0 '\0', phis_only=1 '\001') at ../../gcc/gcc/tree-ssa-copy.c:1035
#5  0x100dd204 in do_phi_only_copy_prop () at ../../gcc/gcc/tree-ssa-copy.c:1076
#6  0x1051b090 in execute_one_pass (pass=0x10915e68) at ../../gcc/gcc/passes.c:853
#7  0x1051b248 in execute_pass_list (pass=0x10915e68) at ../../gcc/gcc/passes.c:897
#8  0x1051b274 in execute_pass_list (pass=0x108875bc) at ../../gcc/gcc/passes.c:898
#9  0x100b1684 in tree_rest_of_compilation (fndecl=0xf7862d00) at ../../gcc/gcc/tree-optimize.c:412
#10 0x10016914 in c_expand_body (fndecl=0xf7862d00) at ../../gcc/gcc/c-decl.c:6689
#11 0x1059b894 in cgraph_expand_function (node=0xf75b3e00) at ../../gcc/gcc/cgraphunit.c:1101
#12 0x1059bb24 in cgraph_expand_all_functions () at ../../gcc/gcc/cgraphunit.c:1166
#13 0x1059c718 in cgraph_optimize () at ../../gcc/gcc/cgraphunit.c:1434
#14 0x1001a024 in c_write_global_declarations () at ../../gcc/gcc/c-decl.c:7804
#15 0x104d9928 in compile_file () at ../../gcc/gcc/toplev.c:1012
#16 0x104dbbe4 in do_compile () at ../../gcc/gcc/toplev.c:1957
#17 0x104dbc78 in toplev_main (argc=53, argv=0xff8a9ca4) at ../../gcc/gcc/toplev.c:1989
#18 0x10090518 in main (argc=53, argv=0xff8a9ca4) at ../../gcc/gcc/main.c:35

Moreover, if I add -fdump-tree-vect
[Bug middle-end/28752] bootstrap comparision fails with -ftree-vectorize -maltivec on ppc
--- Comment #17 from irar at il dot ibm dot com 2006-11-09 10:15 --- I applied the patch http://gcc.gnu.org/ml/gcc-patches/2006-07/msg01043.html (a fix for PR26197). The bootstrap with vectorization passes! However, the failure in comment #3 still occurs in the later revisions. So, I am going to hunt for a later patch that broke bootstrap with vectorization (applying the above patch). Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28752
[Bug middle-end/28752] bootstrap comparision fails with -ftree-vectorize -maltivec on ppc
--- Comment #19 from irar at il dot ibm dot com 2006-11-12 09:52 --- Janis, Thanks a lot! The range of the revisions is 110758 - 111615 (110758 passes bootstrap with vectorization with the patch, 111615 fails with the error in comment #3). I had to modify the patch and split it into two patches in order to make it possible to apply the patch automatically (without rejections). I am attaching the two parts. Thanks, Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28752
[Bug middle-end/28752] bootstrap comparision fails with -ftree-vectorize -maltivec on ppc
--- Comment #20 from irar at il dot ibm dot com 2006-11-12 09:55 ---
Created an attachment (id=12597)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12597&action=view)
The first part of the patch

--

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28752
[Bug middle-end/28752] bootstrap comparision fails with -ftree-vectorize -maltivec on ppc
--- Comment #21 from irar at il dot ibm dot com 2006-11-12 09:56 ---
Created an attachment (id=12598)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12598&action=view)
The second part of the patch

--

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28752
[Bug tree-optimization/20122] Wrong code with gcc 4.0 tree-vectorizer
--- Additional Comments From irar at il dot ibm dot com 2005-02-24 13:41 --- I found the problem that causes this. I'll send the patch next week. Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20122
[Bug tree-optimization/20122] Wrong code with gcc 4.0 tree-vectorizer
--

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |irar at il dot ibm dot com
                   |dot org                     |
             Status|NEW                         |ASSIGNED
   Last reconfirmed|2005-03-02 11:42:36         |2005-03-02 12:43:57
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20122
[Bug tree-optimization/20122] Wrong code with gcc 4.0 tree-vectorizer
--- Additional Comments From irar at il dot ibm dot com 2005-03-02 12:45 --- Fixed in http://gcc.gnu.org/ml/gcc-patches/2005-02/msg01788.html. Waiting for review. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20122
[Bug tree-optimization/18527] cannot determine number of iterations for loops with <=
--- Additional Comments From irar at il dot ibm dot com 2005-03-09 06:56 --- New testcase added: vect-3.f90 (in autovect branch for now). If this PR is solved, testcase vect-3.f90 will be vectorized. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18527
[Bug tree-optimization/20474] ICE while compiling openmotif-2.2.3 with -ftree-vectorize
--- Additional Comments From irar at il dot ibm dot com 2005-03-15 11:37 ---
This problem was solved in the autovect branch (http://gcc.gnu.org/ml/gcc-patches/2005-03/msg00754.html). This patch will be submitted to mainline in stage 2.

--

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20474
[Bug tree-optimization/19049] not vectorizing a fortran loop
--- Additional Comments From irar at il dot ibm dot com 2005-04-25 09:58 ---
The vectorizer fails to determine dependence between (*a_38)[D.719_49] and (*a_38)[D.718_51], since it fails to determine that both of the data-refs have the same base, *a_38. This is already fixed in the autovect branch, and I am working on a patch to bring the changes in data-refs analysis to mainline.

--

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |irar at il dot ibm dot com
                   |dot org                     |
             Status|NEW                         |ASSIGNED
   Last reconfirmed|2005-01-01 04:23:21         |2005-04-25 09:58:44
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19049
[Bug tree-optimization/21218] [4.1 regression] ICE using -ftree-vectorize
--- Additional Comments From irar at il dot ibm dot com 2005-04-26 10:04 ---
We get the following code for the loop:

    this_5 = &b_4->D.2068;
    D.2080_9 = this_5->d[i_18];
    b_4->D.2068.d[i_18] = D.2080_9;

In the analysis of the data-ref this_5->d[i_18] we don't check that the initial condition of the access_fn of *this_5 is not loop invariant (we rely on the evolution == NULL test, which is wrong). This is already fixed in the autovect branch, and I am working on a patch to bring the changes in data-refs analysis to mainline.

Another issue here is that this_5 = &b_4->D.2068; is loop invariant and can be hoisted out of the loop. Maybe it will happen with structure-aliasing-branch?

--

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dberlin at gcc dot gnu dot
                   |                            |org
         AssignedTo|unassigned at gcc dot gnu   |irar at il dot ibm dot com
                   |dot org                     |
             Status|NEW                         |ASSIGNED
   Last reconfirmed|2005-04-25 14:43:35         |2005-04-26 10:04:49
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21218
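The hoisting remark in the comment above can be pictured with a small C sketch. The struct layout, the names inner, outer, base, bump_recomputed and bump_hoisted, the length LEN, and the "+ 1" (added only so that the loop has a visible effect) are all assumptions for illustration; the PR's actual source is C++ and is not reproduced here. The first function mirrors the shape of the dump, where the address of the sub-object is recomputed on every iteration; the second computes it once before the loop.

```c
#include <assert.h>
#include <stddef.h>

#define LEN 16  /* assumed array length */

struct inner { int d[LEN]; };
struct outer { struct inner base; };  /* 'base' stands in for D.2068 */

/* Shape of the dumped loop: the address &b->base is recomputed
   on every iteration although it never changes. */
void bump_recomputed(struct outer *b, size_t n)
{
    assert(b != NULL && n <= LEN);
    for (size_t i = 0; i < n; i++) {
        struct inner *self = &b->base;  /* loop invariant */
        int tmp = self->d[i];
        b->base.d[i] = tmp + 1;         /* "+ 1" is illustrative only */
    }
}

/* Same computation with the invariant address hoisted by hand. */
void bump_hoisted(struct outer *b, size_t n)
{
    assert(b != NULL && n <= LEN);
    struct inner *self = &b->base;      /* computed once */
    for (size_t i = 0; i < n; i++)
        self->d[i] = self->d[i] + 1;
}
```

Both functions produce the same result; the difference is only where the address computation sits relative to the loop.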
[Bug tree-optimization/21630] [4.1 Regression] gcc.dg/vect/vect-none.c scan-tree-dump-times vectorized 1 loops 1 fails
--- Additional Comments From irar at il dot ibm dot com 2005-05-22 11:42 ---
The problem is in vect-none.c itself. This patch fixes the problem: http://gcc.gnu.org/ml/gcc-patches/2005-05/msg02124.html (waiting for ok).

--

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |irar at il dot ibm dot com
                   |dot org                     |
             Status|UNCONFIRMED                 |ASSIGNED
     Ever Confirmed|                            |1
   Last reconfirmed|0000-00-00 00:00:00         |2005-05-22 11:42:09
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21630
[Bug testsuite/21630] [4.1 Regression] gcc.dg/vect/vect-none.c scan-tree-dump-times vectorized 1 loops 1 fails
--- Additional Comments From irar at il dot ibm dot com 2005-05-23 05:28 --- My patch removes vect-none.c, so it's impossible to get failures on this testcase. I guess, there is a problem either in how I created the patch (I did 'cvs remove' and 'cvs add', and 'cvs diff -N' afterwards) or in how you applied it. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21630
[Bug testsuite/21630] [4.1 Regression] gcc.dg/vect/vect-none.c scan-tree-dump-times vectorized 1 loops 1 fails
--- Additional Comments From irar at il dot ibm dot com 2005-05-24 07:01 --- Thanks for fixing the patch. I can't reproduce vect-106.c failure on i686-pc-linux-gnu. Could you please give me some information? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21630
[Bug testsuite/21630] [4.1 Regression] gcc.dg/vect/vect-none.c scan-tree-dump-times vectorized 1 loops 1 fails
--- Additional Comments From irar at il dot ibm dot com 2005-05-24 11:57 --- I committed the patch, since I am not able to reproduce vect-106.c failure. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21630
[Bug tree-optimization/25211] [4.1/4.2 Regression] verify_ssa ICE for mesa with -Os -ftree-loop-linear
--- Comment #4 from irar at il dot ibm dot com 2005-12-14 13:11 --- I think the reason why this ICE occurs with my patch (http://gcc.gnu.org/viewcvs?view=rev&rev=102356) is that my patch enables data-refs analysis for INDIRECT_REFs. A similar ICE in PR 20256 also happens before my patch, since the data-refs there are ARRAY_REFs, and ARRAY_REFs were already supported before. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25211
[Bug tree-optimization/25371] -ftree-vectorize results in internal compiler error on AMD64
--- Comment #3 from irar at il dot ibm dot com 2005-12-18 08:15 --- I failed to reproduce this ICE on ppc and i686. Vectorizer's dump file can help. -- irar at il dot ibm dot com changed: What|Removed |Added CC||irar at il dot ibm dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25371
[Bug tree-optimization/25881] unsigned int loop indices don't optimize as good as int or __SIZE_TYPE__ for 64bit targets
--- Comment #2 from irar at il dot ibm dot com 2006-01-29 10:10 --- Changing double to float, the scalar evolution analyzer returns access function (float *) ((unsigned int) {0, +, 1}_1 * 4) + (float *) a_12, since it fails in type conversion: (failed conversion: type: unsigned int base: 0 step: estimated_nb_iterations: scev_not_known ) (Without type conversion we get {(float *) a_14, +, 4B}_1). Data-refs analysis fails to analyze the access pattern, therefore the loop does not get vectorized. -- irar at il dot ibm dot com changed: What|Removed |Added CC||sebastian dot pop at cri dot ||ensmp dot fr, irar at il dot ||ibm dot com Last reconfirmed|2006-01-20 20:43:27 |2006-01-29 10:10:55 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25881
[Bug tree-optimization/19347] New: Invariant load not moved out of loop
In mesa benchmark (osmesa.c:678) GLuint i, n, *ptr4; n = osmesa->rowlength * osmesa->height; ptr4 = (GLuint *) osmesa->buffer; for (i=0;i<n;i++) { *ptr4++ = osmesa->clearpixel; } The load of osmesa->clearpixel is not taken outside the loop by LIM because of aliasing limitations. This in turn also prevents vectorization. In this particular case we can actually get the load moved out of the loop even without resolving the aliasing issue (which requires whole-program), on account that even if the store aliases the load, it will not alter the value loaded (because we store the same value that we loaded). I'm looking into this in the context of the vectorizer. -- Summary: Invariant load not moved out of loop Product: gcc Version: 4.0.0 Status: UNCONFIRMED Severity: normal Priority: P2 Component: tree-optimization AssignedTo: irar at il dot ibm dot com ReportedBy: irar at il dot ibm dot com CC: gcc-bugs at gcc dot gnu dot org GCC build triplet: powerpc-apple-darwin7.0.0 GCC host triplet: powerpc-apple-darwin7.0.0 GCC target triplet: powerpc-apple-darwin7.0.0 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19347
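The argument above (the store can only write back the value that was just loaded) can be illustrated with a hand-hoisted version of the loop. This is only a sketch: struct ctx and both function names are hypothetical stand-ins and are not taken from osmesa.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the Mesa context; only the fields used here. */
struct ctx {
    unsigned int clearpixel;
    unsigned int *buffer;
};

/* Original shape: c->clearpixel is reloaded on every iteration, because
   the compiler cannot prove that the store through ptr4 does not alias it. */
void clear_in_loop(struct ctx *c, size_t n)
{
    assert(c != NULL);
    unsigned int *ptr4 = c->buffer;
    for (size_t i = 0; i < n; i++)
        *ptr4++ = c->clearpixel;
}

/* Hand-hoisted shape: even if a store did hit c->clearpixel, it would
   write the value that was just loaded, so one load is equivalent. */
void clear_hoisted(struct ctx *c, size_t n)
{
    assert(c != NULL);
    unsigned int pixel = c->clearpixel;   /* loaded once, before the loop */
    unsigned int *ptr4 = c->buffer;
    for (size_t i = 0; i < n; i++)
        *ptr4++ = pixel;
}
```

Both functions leave the buffer in the same state; only the second has a loop body without a load, which is the shape the vectorizer needs.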
[Bug tree-optimization/18527] New: Missed optimization
When loop bound condition is <=, number_of_iterations_in_loop returns 'unknown' although the loop is countable. Turns out this case is pretty common in SPEC and blocks vectorization opportunities. E.g., int foo () { int a[N]; int i; int n; for (i = 0; i <= n; i++) { ca[i] = 2; } } In order to check loop bound for overflow (in function number_of_iterations_exit), conditions before loop are checked (in simplify_using_initial_conditions). However, there is no relevant condition before the loop. The condition is n >= 0 and the expression to check n != 2147483647. -- Summary: Missed optimization Product: gcc Version: 4.0.0 Status: UNCONFIRMED Severity: normal Priority: P2 Component: tree-optimization AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: irar at il dot ibm dot com CC: dorit at il dot ibm dot com,gcc-bugs at gcc dot gnu dot org,rakdver at atrey dot karlin dot mff dot cuni dot cz GCC build triplet: powerpc-apple-darwin7.0.0 GCC host triplet: powerpc-apple-darwin7.0.0 GCC target triplet: powerpc-apple-darwin7.0.0 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18527
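To make the overflow condition concrete, here is a small sketch of the iteration count for a loop of the form for (i = 0; i <= n; i++). The function name iterations_le is hypothetical and is not a GCC routine; it only spells out why n != INT_MAX is the expression that has to hold.

```c
#include <limits.h>

/* Iteration count of: for (i = 0; i <= n; i++) with int i, n.
   Returns -1 when the loop is not countable, i.e. n == INT_MAX,
   where i <= n can never become false without i overflowing. */
long long iterations_le(int n)
{
    if (n < 0)
        return 0;            /* the guard n >= 0 fails: body never runs */
    if (n == INT_MAX)
        return -1;           /* the single value that defeats counting */
    return (long long)n + 1; /* i takes the values 0, 1, ..., n */
}
```

For every n other than INT_MAX the count is simply n + 1, which is why the loop is countable once that one value is excluded.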
[Bug tree-optimization/18607] New: Vectorizer: data_reference is overwritten in vect_analyze_data_refs
While looking for memory tag in vect_analyze_data_refs, data_reference is overwritten by temporary. -- Summary: Vectorizer: data_reference is overwritten in vect_analyze_data_refs Product: gcc Version: 4.0.0 Status: UNCONFIRMED Severity: normal Priority: P2 Component: tree-optimization AssignedTo: irar at il dot ibm dot com ReportedBy: irar at il dot ibm dot com CC: dorit at il dot ibm dot com,gcc-bugs at gcc dot gnu dot org GCC build triplet: powerpc-apple-darwin7.0.0 GCC host triplet: powerpc-apple-darwin7.0.0 GCC target triplet: powerpc-apple-darwin7.0.0 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18607
[Bug tree-optimization/18607] Vectorizer: data_reference is overwritten in vect_analyze_data_refs
--- Additional Comments From irar at il dot ibm dot com 2004-11-22 10:01 --- Created an attachment (id=7580) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=7580&action=view) Testcase -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18607
[Bug tree-optimization/18607] Vectorizer: data_reference is overwritten in vect_analyze_data_refs
--- Additional Comments From irar at il dot ibm dot com 2004-11-23 07:36 --- Fixed in http://gcc.gnu.org/ml/gcc-patches/2004-11/msg01747.html -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18607
[Bug tree-optimization/22029] [4.1 Regression] ICE with -fdump-tree-copyprop3-details
--- Additional Comments From irar at il dot ibm dot com 2005-06-30 11:38 --- Submitted a patch that fixes this: http://gcc.gnu.org/ml/gcc-patches/2005-06/msg02228.html -- What|Removed |Added AssignedTo|unassigned at gcc dot gnu |irar at il dot ibm dot com |dot org | Status|NEW |ASSIGNED Last reconfirmed|2005-06-13 13:05:51 |2005-06-30 11:38:36 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22029
[Bug tree-optimization/22184] tree vectorizer depends on context
--- Additional Comments From irar at il dot ibm dot com 2005-07-07 07:47 --- The problem occurs in the decision whether the number of loop iterations is greater than zero. The (single) predecessor edge is checked for being EDGE_TRUE_VALUE or EDGE_FALSE_VALUE, and the corresponding predicate is used to make the decision. In the first case (single loop) BB 0 contains predicate 'len > 0', its TRUE successor is BB 1, and the fallthru successor of BB 1 is BB 2 - the loop. The condition to check is 'len <= 0', which is therefore simplified to FALSE. In the second case, however, the control flow is more complicated. The loop is in BB 6, its predecessor is BB 3, which has 2 predecessors: BB 5 (with predicate 'len > 0'), and BB 2 - the first loop. The first loop is also guarded by 'len > 0', but this information is not propagated. -- What|Removed |Added CC||rakdver at atrey dot karlin ||dot mff dot cuni dot cz Last reconfirmed|2005-06-25 18:28:42 |2005-07-07 07:47:46 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22184
[Bug tree-optimization/22526] vectorizer produces mis-match types in conditionals
--- Additional Comments From irar at il dot ibm dot com 2005-07-21 05:45 --- I submitted a patch to fix this - http://gcc.gnu.org/ml/gcc-patches/2005-07/msg01388.html -- What|Removed |Added AssignedTo|unassigned at gcc dot gnu |irar at il dot ibm dot com |dot org | Status|UNCONFIRMED |ASSIGNED Ever Confirmed||1 Last reconfirmed|-00-00 00:00:00 |2005-07-21 05:45:57 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22526
[Bug tree-optimization/19049] not vectorizing a fortran loop
--- Additional Comments From irar at il dot ibm dot com 2005-07-26 07:07 --- The data dependence issue was solved by this patch http://gcc.gnu.org/ml/gcc-patches/2005-07/msg01195.html (committed). However, this loop is still not vectorizable because of noncontinuous access. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19049
[Bug tree-optimization/23320] [4.1 Regression] ICE in in base_addr_differ_p, at tree-data-ref.c:430
--- Additional Comments From irar at il dot ibm dot com 2005-08-11 08:14 --- Created an attachment (id=9469) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=9469&action=view) Patch Yes, you are right, I should check the type of the data-ref (array type in the first case). And instead of the assert I'll add the check that both data-refs are of pointer type in the second case. I'll submit the changes after testing. Thanks, Ira -- What|Removed |Added AssignedTo|unassigned at gcc dot gnu |irar at il dot ibm dot com |dot org | Status|NEW |ASSIGNED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23320
[Bug tree-optimization/24262] [4.1 Regression] ICE: verify_ssa failed with -O -msse2 -ftree-vectorize
--- Comment #3 from irar at il dot ibm dot com 2005-10-12 09:00 --- I think, it's the same bug in scev that my autovect patch http://gcc.gnu.org/ml/gcc-patches/2005-07/msg00252.html solved (and Sebastian reverted it). Here scev analyzer calculates the evolution of 'D.1703_5 * 2 + i_15', where 'D.1703_5 = i_15/2'. Scev doesn't handle division, therefore for D.1703_5 we get unknown scev, but then it's combined with the rest of the expression, erroneously leading to {D.1703_5*2, +, 1}. Ira -- irar at il dot ibm dot com changed: What|Removed |Added Last reconfirmed|2005-10-07 19:27:42 |2005-10-12 09:00:12 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24262
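A quick way to see why {D.1703_5*2, +, 1} is the wrong evolution: evaluate the expression for consecutive values of i and look at the differences. The sketch below uses hypothetical helper names and is not GCC code.

```c
/* Value of 'D * 2 + i' with 'D = i / 2', as in the expression above. */
int value_at(int i)
{
    int d = i / 2;           /* integer division: not affine in i */
    return d * 2 + i;
}

/* Returns 1 if value_at() advances by exactly 'step' between each pair
   of consecutive iterations in [0, n), which is what an evolution of
   the form {base, +, step} would claim; returns 0 otherwise. */
int has_constant_step(int n, int step)
{
    for (int i = 0; i + 1 < n; i++)
        if (value_at(i + 1) - value_at(i) != step)
            return 0;
    return 1;
}
```

The values for i = 0, 1, 2, 3 are 0, 1, 4, 5, so the increments alternate between 1 and 3 and no single step describes them.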
[Bug middle-end/28752] bootstrap comparision fails with -ftree-vectorize -maltivec on ppc and i386
--- Comment #27 from irar at il dot ibm dot com 2006-11-22 11:15 --- I committed the patch that enables vectorization of strided accesses (http://gcc.gnu.org/ml/gcc-patches/2006-03/msg01679.html), and now bootstrap with vectorization fails also on x86 with the same error as in comment #3. Here, the offending loop is cfgloopanal.c:153. Ira -- irar at il dot ibm dot com changed: What|Removed |Added GCC build triplet|powerpc64-linux |powerpc64-linux and and ||i386-linux GCC host triplet|powerpc64-linux |powerpc64-linux and i386- ||linux GCC target triplet|powerpc64-linux |powerpc64-linux and and ||i386-linux Summary|bootstrap comparision fails |bootstrap comparision fails |with -ftree-vectorize -|with -ftree-vectorize - |maltivec on ppc|maltivec on ppc and i386 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28752
[Bug tree-optimization/22372] Vectorizer produces mis-match types
--- Comment #5 from irar at il dot ibm dot com 2006-11-27 11:19 --- The patch I committed (comment #4) fixes almost all the type mismatch occurrences in the vectorizer, but there's one occurrence that still remains - one of the vectorizer testcases (vect-reduc-dot-u8b.c) still fails with modify.diff.txt on MODIFY_EXPR where the right hand side is a call to a builtin function (rs6000_builtin_mul_widen_even). For Altivec, the return value of the builtin function is always signed (while the left hand side of the assignment is unsigned). Is the check in modify.diff.txt too strict or is the problem with the return type of the Altivec builtin (shouldn't it be signed/unsigned as relevant, instead of always signed? Specifically - shouldn't builtin vmuloub return an unsigned type)? Ira -- irar at il dot ibm dot com changed: What|Removed |Added CC||irar at il dot ibm dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22372
[Bug middle-end/28752] bootstrap comparision fails with -ftree-vectorize -maltivec on ppc and i386
--- Comment #29 from irar at il dot ibm dot com 2006-12-04 09:24 --- I reproduced the wrong printings on x86. It seems to be a problem in strided access vectorization after all - no stores are generated. I am looking into this. Thanks, Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28752
[Bug middle-end/28752] bootstrap comparision fails with -ftree-vectorize -maltivec on ppc and i386
--- Comment #30 from irar at il dot ibm dot com 2006-12-07 12:14 --- I am testing a patch for the x86 bootstrap failure. It was caused by a bug in the analysis for vectorization of strided accesses and therefore has nothing to do with the bootstrap failures on ppc. Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28752
[Bug middle-end/28752] bootstrap comparision fails with -ftree-vectorize -maltivec on ppc and i386
--- Comment #31 from irar at il dot ibm dot com 2006-12-07 13:30 --- (In reply to comment #17) I applied the patch http://gcc.gnu.org/ml/gcc-patches/2006-07/msg01043.html (a fix for PR26197). The bootstrap with vectorization passes! However, the failure in comment #3 still occurs in the later revisions. So, I am going to hunt for a later patch that broke bootstrap with vectorization (applying the above patch). I found this patch: http://gcc.gnu.org/viewcvs?view=rev&revision=110852 The offending loop here is rs6000.c:17088. If I disable -fmove-loop-invariants on r110852, bootstrap with vectorization enabled passes. Ira -- irar at il dot ibm dot com changed: What|Removed |Added CC||rakdver at atrey dot karlin ||dot mff dot cuni dot cz http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28752
[Bug middle-end/28752] bootstrap comparision fails with -ftree-vectorize -maltivec on ppc and i386
--- Comment #32 from irar at il dot ibm dot com 2006-12-11 12:46 --- I am attaching the bad rs6000.s (generated with vectorization) and good rs6000.s (generated with vectorization and -fno-move-loop-invariants) using revision 110852 (from February 2006). I looked over these a bit, but I wouldn't like to hunt down a bug that had since been solved, so I think I'll switch to looking into more recent snapshots. Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28752
[Bug middle-end/28752] bootstrap comparision fails with -ftree-vectorize -maltivec on ppc and i386
--- Comment #33 from irar at il dot ibm dot com 2006-12-11 12:57 --- Created an attachment (id=12779) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12779&action=view) Bad rs6000.s -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28752
[Bug middle-end/28752] bootstrap comparision fails with -ftree-vectorize -maltivec on ppc and i386
--- Comment #34 from irar at il dot ibm dot com 2006-12-11 13:02 --- Created an attachment (id=12781) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=12781&action=view) Good rs6000.s -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28752
[Bug target/30210] New: Altivec builtins return wrong types
The return value of builtin function rs6000_builtin_mul_widen_even for Altivec is always signed, while it should be signed/unsigned as relevant (builtin vmuloub returns vector signed short, instead of vector unsigned short, as defined by the altivec PIM). It seems to be a more general problem with altivec builtins declaration. Ira -- Summary: Altivec builtins return wrong types Product: gcc Version: 4.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: irar at il dot ibm dot com GCC build triplet: powerpc*-* GCC host triplet: powerpc*-* GCC target triplet: powerpc*-* http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30210
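Why the signedness of the element type matters can be shown with a scalar model of a single lane. This is only an illustration with hypothetical function names; it does not call any Altivec builtin and says nothing about how the builtins are actually declared.

```c
#include <stdint.h>
#include <string.h>

/* Scalar model of one lane of a widening unsigned byte multiply:
   two 8-bit unsigned inputs give a 16-bit unsigned product. */
uint16_t mul_widen_u8(uint8_t a, uint8_t b)
{
    return (uint16_t)((unsigned)a * (unsigned)b);
}

/* The same 16 bits viewed through a signed 16-bit element type. */
int16_t as_signed16(uint16_t bits)
{
    int16_t s;
    memcpy(&s, &bits, sizeof s);   /* reinterpret the bits, no conversion */
    return s;
}
```

With inputs 200 and 200 the unsigned product is 40000, but the same bit pattern read as a signed 16-bit value is -25536, so a type check that compares the two sides of the assignment has a real difference to report.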
[Bug tree-optimization/22372] Vectorizer produces mis-match types
--- Comment #7 from irar at il dot ibm dot com 2006-12-14 11:53 --- So, it is an altivec bug and not vectorizer's. I opened a new PR 30210 instead. I think, this PR can be closed. Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22372
[Bug middle-end/28752] bootstrap comparision fails with -ftree-vectorize -maltivec on ppc
--- Comment #35 from irar at il dot ibm dot com 2006-12-14 11:58 --- The problem was solved for i386 by http://gcc.gnu.org/viewcvs?view=rev&revision=119779. Ira -- irar at il dot ibm dot com changed: What|Removed |Added GCC build triplet|powerpc64-linux and i386- |powerpc64-linux |linux | GCC host triplet|powerpc64-linux and i386- |powerpc64-linux |linux | GCC target triplet|powerpc64-linux and i386- |powerpc64-linux |linux | Summary|bootstrap comparision fails |bootstrap comparision fails |with -ftree-vectorize -|with -ftree-vectorize - |maltivec on ppc and i386 |maltivec on ppc http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28752
[Bug target/30211] New: missed optimization: model missing vec_extract_even/odd idioms for ia64
vec_extract_even/odd are not implemented on ia64. They are used in vectorization of strided loads, and are implemented only on powerpc (patch http://gcc.gnu.org/ml/gcc-patches/2006-03/msg01679.html (revision 119088)). The implementation exists on autovect-branch, but it possibly can be more efficient (http://gcc.gnu.org/ml/gcc-patches/2006-09/msg01278.html): 2005-12-01 Richard Henderson [EMAIL PROTECTED] * targhooks.c (interleave_vectorize_builtin_extract_evenodd): New. (interleave_vectorize_builtin_extract_even): New. (interleave_vectorize_builtin_extract_odd): New. * targhooks.h: Declare them. * config/i386/i386.c (TARGET_VECTORIZE_BUILTIN_EXTRACT_EVEN): New. (TARGET_VECTORIZE_BUILTIN_EXTRACT_ODD): New. 2005-12-02 Richard Henderson [EMAIL PROTECTED] * config/ia64/ia64.c (TARGET_VECTORIZE_BUILTIN_EXTRACT_EVEN): New. (TARGET_VECTORIZE_BUILTIN_EXTRACT_ODD): New. 2006-09-28 Ira Rosen [EMAIL PROTECTED] * targhooks.c (interleave_vectorize_builtin_extract_evenodd): Fix to produce a correct instructions sequence. * tree-vect-transform.c (vect_permute_store_chain): Choose the correct instruction according to the endianness. Call mark_new_vars_to_rename. Once the above is merged, we can add ia64 to the list of targets that support check_effective_target_vect_extract_even_odd in testsuite/lib/target-support.exp. Ira -- Summary: missed optimization: model missing vec_extract_even/odd idioms for ia64 Product: gcc Version: 4.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: irar at il dot ibm dot com GCC build triplet: ia64-*-* GCC host triplet: ia64-*-* GCC target triplet: ia64-*-* http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30211
[Bug tree-optimization/29925] Wrong code with -ftree-vectorize
--- Comment #6 from irar at il dot ibm dot com 2006-12-14 12:41 --- I couldn't reproduce the problem on x86. I ran it with valgrind --leak-check=yes, is it correct? Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29925
[Bug target/30210] Altivec builtins return wrong types
--- Comment #5 from irar at il dot ibm dot com 2006-12-20 12:20 --- Paolo, thanks for the explanation! The problem originates from PR 22372, so I will not open another bug report for it. Thanks, Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30210
[Bug tree-optimization/22372] Vectorizer produces mis-match types
--- Comment #8 from irar at il dot ibm dot com 2006-12-20 12:22 --- As explained by Paolo in PR 30210, it is not an Altivec problem after all. Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=22372
[Bug tree-optimization/18438] vectorizer failed for vector matrix multiplication
--- Comment #5 from irar at il dot ibm dot com 2007-01-07 07:40 --- On the todo list. BTW, vectorization of strided accesses was committed to the mainline 4.3. Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18438
[Bug target/30210] Altivec builtins have inaccurate return types
--- Comment #8 from irar at il dot ibm dot com 2007-01-15 10:04 --- (In reply to comment #2) I think this whole type issue is a mess and needs some improvement. Maybe next week I can get to that. Andrew, are you still planning to solve this, or should I prepare a fix for rs6000_builtin_mul_widen_even as suggested by Paolo in comment #1 to close PR 22372? Ira -- irar at il dot ibm dot com changed: What|Removed |Added CC||pinskia at gcc dot gnu dot ||org http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30210
[Bug tree-optimization/26362] ICE on the autovect-branch (gfortran example)
--- Comment #3 from irar at il dot ibm dot com 2007-01-28 10:45 --- The current versions of both mainline and autovect branch do not ICE. Strided loads are not implemented for SSE. I opened a PR 30211 for it. I think this PR can be closed. Ira -- irar at il dot ibm dot com changed: What|Removed |Added CC||irar at il dot ibm dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=26362
[Bug tree-optimization/27659] ICE on autovect-branch
--- Comment #3 from irar at il dot ibm dot com 2007-01-28 11:38 --- I tried to reproduce this on x86 with current autovect branch and mainline with .../g++ -fpreprocessed tmp.ii -S -O3 -ftree-vectorize -msse2 -ansi -fdump-tree-vect-details. It does not ICE, and the loop is vectorized. Ira -- irar at il dot ibm dot com changed: What|Removed |Added CC||irar at il dot ibm dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27659
[Bug c/30843] [4.3 Regression] ice for legal code with -ftree-vectorize -O2
--- Comment #5 from irar at il dot ibm dot com 2007-02-19 11:18 --- Subject: Re: ice for legal code with -ftree-vectorize -O2 I know what the problem is. If we don't remove the store while iterating, we can't get it later (the si), can we? Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30843
[Bug c/30843] [4.3 Regression] ice for legal code with -ftree-vectorize -O2
--- Comment #6 from irar at il dot ibm dot com 2007-02-19 12:41 --- Sorry about the last comment, it was sent by mistake. Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30843
[Bug bootstrap/30921] New: Bootstrap failure with -ftree-vectorize on i386
Bootstrap with vectorization enabled fails on i386 starting from revision 121767: http://gcc.gnu.org/viewcvs?view=rev&revision=121767 Ira -- Summary: Bootstrap failure with -ftree-vectorize on i386 Product: gcc Version: 4.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: bootstrap AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: irar at il dot ibm dot com GCC build triplet: i386-redhat-linux GCC host triplet: i386-redhat-linux GCC target triplet: i386-redhat-linux http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30921
[Bug bootstrap/30921] Bootstrap failure with -ftree-vectorize on i386
--- Comment #1 from irar at il dot ibm dot com 2007-02-22 07:58 --- Here is the ChangeLog entry for that patch: 2007-02-09 Richard Henderson [EMAIL PROTECTED] * config/i386/constraints.md (Ym): New constraint. * config/i386/i386.md (movsi_1): Change Y2 to Yi constraints. (movdi_1_rex64): Split sse and xmm general register moves from memory move alternatives. Use conditional register constraints. (movsf_1, movdf_integer): Likewise. (zero_extendsidi2_32, zero_extendsidi2_rex64): Likewise. (movdf_integer_rex64): New. (pushsf_rex64): Fix output constraints. * config/i386/sse.md (sse2_loadld): Split rm alternative, use Yi. (sse2_stored): Likewise. (sse2_storeq_rex64): New. * config/i386/i386.c (x86_inter_unit_moves): Enable for not amd and not generic. (ix86_secondary_memory_needed): Don't bypass TARGET_INTER_UNIT_MOVES for optimize_size. Remove SF/DFmode hack. Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30921
[Bug bootstrap/30921] Bootstrap failure with -ftree-vectorize on i386
--- Comment #3 from irar at il dot ibm dot com 2007-02-22 08:22 --- (In reply to comment #2) (In reply to comment #0) Bootstrap with vectorization enabled fails on i386 starting from revision 121767: http://gcc.gnu.org/viewcvs?view=rev&revision=121767 Could you post exact steps how to reproduce this failure? Run make bootstrap BOOT_CFLAGS="-O2 -g -ftree-vectorize -msse2" Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30921
[Bug tree-optimization/24309] [4.1/4.2/4.3 Regression] ICE with -O3 -ftree-loop-linear
--- Comment #15 from irar at il dot ibm dot com 2007-03-05 09:30 --- I tried the reduced testcase on powerpc with -ftree-loop-linear and both -O2 and -O3 on 4.1, 4.2 and 4.3, and it works fine. Ira -- irar at il dot ibm dot com changed: What|Removed |Added CC||irar at il dot ibm dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=24309
[Bug tree-optimization/25371] -ftree-vectorize results in internal compiler error on AMD64
--- Comment #6 from irar at il dot ibm dot com 2007-03-11 10:33 --- Harsha, could you please attach vectorizer's dump file (produced with -fdump-tree-vect-details)? Thanks, Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25371
[Bug tree-optimization/31343] New: ICE in data-refs dependence testing
An attempt to divide by zero is made (causing ICE on the attached test case) for evolution functions with zero step. For the following evolution functions of pS[i_15].x and pS[i_15].y from the attached test (chrec_a = {{0, +, 1}_1, +, 0}_2) (chrec_b = {{1, +, 1}_1, +, 0}_2) the difference (-1) is calculated, and then the check whether the step (0) divides the difference is performed in function chrec_steps_divide_constant_p (tree-data-ref.c), causing ICE. -- Summary: ICE in data-refs dependence testing Product: gcc Version: 4.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: irar at il dot ibm dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31343
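The failing check amounts to asking whether a constant step divides a constant difference. A guarded version of that question might look like the sketch below; step_divides is a hypothetical helper on plain integers, not the GCC function and not the patch that was eventually applied.

```c
#include <stdbool.h>

/* Returns true when 'step' evenly divides 'diff'.  A zero step is
   handled before the modulo, so no division by zero can occur. */
bool step_divides(long step, long diff)
{
    if (step == 0)
        return diff == 0;    /* zero divides only zero */
    if (step == 1 || step == -1)
        return true;         /* also avoids LONG_MIN % -1 overflow */
    return diff % step == 0;
}
```

For the case in this report (step 0, difference -1) the guarded check answers "does not divide" instead of trapping.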
[Bug tree-optimization/31343] ICE in data-refs dependence testing
--- Comment #1 from irar at il dot ibm dot com 2007-03-25 10:02 --- Created an attachment (id=13281) -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=13281&action=view) test case -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31343
[Bug testsuite/32014] new gcc failures
--- Comment #1 from irar at il dot ibm dot com 2007-05-21 10:43 --- On PowerPC revision 124785 from May 17 we get: FAIL: gcc.dg/vect/vect-64.c (internal compiler error) FAIL: gcc.dg/vect/vect-64.c (test for excess errors) WARNING: gcc.dg/vect/vect-64.c compilation failed to produce executable FAIL: gcc.dg/vect/vect-68.c (internal compiler error) FAIL: gcc.dg/vect/vect-68.c (test for excess errors) WARNING: gcc.dg/vect/vect-68.c compilation failed to produce executable FAIL: gcc.dg/vect/vect-70.c (internal compiler error) FAIL: gcc.dg/vect/vect-70.c (test for excess errors) WARNING: gcc.dg/vect/vect-70.c compilation failed to produce executable FAIL: gcc.dg/vect/vect-intfloat-conversion-4a.c scan-tree-dump-times vectorized 1 loops 1 FAIL: gcc.dg/vect/vect-intfloat-conversion-4b.c scan-tree-dump-times vectorized 1 loops 1 Revision 124739 from May 15 works fine. On today's snapshot (124895) we also see XPASS: gcc.dg/vect/vect-iv-4.c scan-tree-dump-times vectorized 1 loops 1 in addition to the above failures. Ira -- irar at il dot ibm dot com changed: What|Removed |Added CC||irar at il dot ibm dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32014
[Bug tree-optimization/37194] [4.3/4.4 Regression] Autovectorization of small constant iteration loop degrades performance
--- Comment #8 from irar at il dot ibm dot com 2009-01-05 13:58 --- To handle unknown alignment of data, the vectorizer creates a prolog loop to peel a statically unknown number of scalar iterations (0 <= n < VF). This loop is followed by a vectorized loop (with the remaining, multiple of VF, number of iterations), and an epilog scalar loop that completes the iterations that were not executed (0 <= n < VF). Therefore, the created scalar loops have unknown number of iterations, which prevents their unrolling (while the original scalar loop is unrolled). Vectorizer cost model does not take possible unrolling into account. Another cost model problem is that the calculation of scalar outside cost for this case is performed not for the original scalar version, but includes run-time guards, which seems to be wrong when the original loop bound is known. I am going to submit a patch to fix that. Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37194
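The prolog / vector loop / epilog split described above can be sketched as plain arithmetic on the start address. Everything here is an assumption made for illustration: 4-byte elements, VF = 4 (16-byte vectors), and the names split_iterations and struct split, none of which come from the vectorizer sources.

```c
#include <stddef.h>
#include <stdint.h>

#define ELEM_SIZE 4u   /* assumed element size in bytes */
#define VF        4u   /* assumed vectorization factor */

struct split { size_t prolog, vector, epilog; };

/* Split n scalar iterations over elements starting at byte address
   'addr' (assumed element-aligned) into a peeled prolog, whole
   vectors, and a scalar epilog. */
struct split split_iterations(uintptr_t addr, size_t n)
{
    struct split s;
    size_t lane = (size_t)((addr / ELEM_SIZE) % VF); /* position in a vector */
    s.prolog = (VF - lane) % VF;                     /* 0 <= prolog < VF */
    if (s.prolog > n)
        s.prolog = n;                                /* loop shorter than the peel */
    s.vector = (n - s.prolog) / VF * VF;             /* a multiple of VF */
    s.epilog = n - s.prolog - s.vector;              /* 0 <= epilog < VF */
    return s;
}
```

Because addr is only known at run time, prolog and epilog are run-time values even when n is a compile-time constant, which is why the two scalar loops cannot be unrolled the way the original loop can.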
[Bug tree-optimization/38721] [alias-improvements] vectorizer miscompiles gfortran.fortran-torture/execute/elemental.f90 at -O3
--- Comment #1 from irar at il dot ibm dot com 2009-01-05 13:19 --- Here is a reduced testcase: program test_elemental implicit none integer, dimension (2, 4) :: a integer, dimension (2, 4) :: b integer(kind = 8), dimension(2) :: c a = reshape ((/2, 3, 4, 5, 6, 7, 8, 9/), (/2, 4/)) b = 0 a = e_fn (a(:, 4:1:-1), 1 + b) ! This tests intrinsic elemental conversion functions. c = 2 * a(1, 1) if (any (c .ne. 14)) call abort ! This triggered bug due to building ss chains in the wrong order. b = 0; a = a - e_fn (a, b) if (any (a .ne. 0)) call abort contains elemental integer(kind=4) function e_fn (p, q) integer, intent(in) :: p, q e_fn = p - q end function end program The problem is that dse2 removes the stores to array A.4 which is used by the vectorized code: A.4[0] = D.1635_155; ... A.4[7] = D.1635_165; vect_pA.67_156 = (vector integer(kind=4) *) A.4; vect_pa.73_197 = (vector integer(kind=4) *) a; vect_var_.68_254 = *vect_pA.67_156; *vect_pa.73_197 = vect_var_.68_254; vect_pA.63_256 = vect_pA.67_156 + 16; vect_pa.69_257 = vect_pa.73_197 + 16; vect_var_.68_170 = *vect_pA.63_256; *vect_pa.69_257 = vect_var_.68_170; We propagate alias info from the scalar to vector ref in vect_create_data_ref_ptr() (in tree-vect-transform.c): /** (2) Add aliasing information to the new vector-pointer: (The points-to info (DR_PTR_INFO) may be defined later.) **/ tag = DR_SYMBOL_TAG (dr); gcc_assert (tag); /* If tag is a variable (and NOT_A_TAG) than a new symbol memory tag must be created with tag added to its may alias list. */ if (!MTAG_P (tag)) new_type_alias (vect_ptr, tag, DR_REF (dr)); else set_symbol_mem_tag (vect_ptr, tag); Those lines do not exist on the branch. Do you take care of this somewhere else? Ira -- irar at il dot ibm dot com changed: What|Removed |Added Status|UNCONFIRMED |NEW Ever Confirmed|0 |1 Last reconfirmed|-00-00 00:00:00 |2009-01-05 13:19:53 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38721
[Bug tree-optimization/37194] [4.3/4.4 Regression] Autovectorization of small constant iteration loop degrades performance
--- Comment #12 from irar at il dot ibm dot com 2009-01-08 09:25 --- (In reply to comment #11) fixed for 4.3.3? Thanks. No, still waiting for approval. -- irar at il dot ibm dot com changed: What|Removed |Added Status|RESOLVED|REOPENED Resolution|FIXED | http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37194
[Bug tree-optimization/38529] [4.3 regression] ICE with nested loops
--- Comment #4 from irar at il dot ibm dot com 2009-01-11 07:48 --- Fixed on 4.3 branch as well. -- irar at il dot ibm dot com changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38529
[Bug tree-optimization/37194] [4.3 Regression] Autovectorization of small constant iteration loop degrades performance
--- Comment #14 from irar at il dot ibm dot com 2009-01-11 07:57 --- Fixed. -- irar at il dot ibm dot com changed: What|Removed |Added Status|REOPENED|RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37194
[Bug tree-optimization/37021] Fortran Complex reduction / multiplication not vectorized
--- Comment #6 from irar at il dot ibm dot com 2009-01-25 09:12 --- (In reply to comment #5) So, 4) The vectorized version sucks because we have to use peeling for niters because we need to unroll the loop once and cannot apply SLP here. What do you mean by unroll the loop once? Q1: does SLP work with reductions at all? No. SLP currently originates from groups of strided stores. Q2: does SLP do pattern recognition? Pattern recognition is done before SLP, and SLP handles stmts that were marked as a part of a pattern. There is no SLP specific pattern recognition. First of all we would need to recognize a complex reduction as a single vectorized reduction. Second we need to vectorize the complex multiplication with SLP, feeding the reduction with one resulting complex vector. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37021
[Bug tree-optimization/37021] Fortran Complex reduction / multiplication not vectorized
--- Comment #8 from irar at il dot ibm dot com 2009-01-25 12:17 ---
(In reply to comment #7)
> > > Q1: does SLP work with reductions at all?
> > No. SLP currently originates from groups of strided stores.
> Ah, I see. In this loop we have two reductions, so to apply SLP we would need to see that we can use a group of reductions for SLP?

Yes, I think this will work.

> > > Q2: does SLP do pattern recognition?
> > Pattern recognition is done before SLP, and SLP handles stmts that were marked as a part of a pattern. There is no SLP-specific pattern recognition.
> Ok, but with a reduction it won't help me here. Can a loop be vectorized with just pattern recognition? Hm, if I remember correctly we detect scalar patterns and then vectorize them. We don't support detecting vector patterns from scalar code, correct?

Yes, if I understand you correctly, we detect scalar patterns, but adding vector pattern detection does not seem to be complicated.

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37021
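To make the "two reductions" in this exchange concrete, here is a minimal C sketch of a complex dot product written with scalar floats. The PR itself concerns Fortran code; the function name `cdot` and the interleaved (re, im) storage layout are assumptions made only for this illustration. The real and imaginary accumulators are the two independent reductions that a group-of-reductions SLP would have to treat together.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from the PR's testcase: complex dot
   product over n elements stored as interleaved (re, im) float pairs. */
void cdot(const float *a, const float *b, size_t n, float *re, float *im)
{
    assert(a && b && re && im);

    float sum_re = 0.0f;   /* reduction 1: real part */
    float sum_im = 0.0f;   /* reduction 2: imaginary part */

    for (size_t i = 0; i < n; i++) {
        float ar = a[2 * i], ai = a[2 * i + 1];
        float br = b[2 * i], bi = b[2 * i + 1];

        /* (ar + ai*i) * (br + bi*i), accumulated per component */
        sum_re += ar * br - ai * bi;
        sum_im += ar * bi + ai * br;
    }

    *re = sum_re;
    *im = sum_im;
}
```

Both accumulators are updated in every iteration from the same loads, which is why handling them as one group looks natural here.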
[Bug tree-optimization/38968] Complex matrix product is not vectorized
--- Comment #3 from irar at il dot ibm dot com 2009-01-26 13:09 ---
(In reply to comment #2)
> Now, I wonder why we do not just use alignment + misalign in that case.

I think you are right.

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38968
[Bug tree-optimization/39248] FAIL: gcc.dg/vect/vect-complex-1.c
--- Comment #4 from irar at il dot ibm dot com 2009-02-25 07:06 ---
Does adding attribute aligned, as below, help?

Index: vect-complex-1.c
===
--- vect-complex-1.c (revision 144030)
+++ vect-complex-1.c (working copy)
@@ -6,19 +6,19 @@
 #define N 16
-_Complex float a[N] =
+_Complex float a[N] __attribute__ ((__aligned__(16))) =
 { 10.0F + 20.0iF, 11.0F + 21.0iF, 12.0F + 22.0iF, 13.0F + 23.0iF,
   14.0F + 24.0iF, 15.0F + 25.0iF, 16.0F + 26.0iF, 17.0F + 27.0iF,
   18.0F + 28.0iF, 19.0F + 29.0iF, 20.0F + 30.0iF, 21.0F + 31.0iF,
   22.0F + 32.0iF, 23.0F + 33.0iF, 24.0F + 34.0iF, 25.0F + 35.0iF };
-_Complex float b[N] =
+_Complex float b[N] __attribute__ ((__aligned__(16))) =
 { 30.0F + 40.0iF, 31.0F + 41.0iF, 32.0F + 42.0iF, 33.0F + 43.0iF,
   34.0F + 44.0iF, 35.0F + 45.0iF, 36.0F + 46.0iF, 37.0F + 47.0iF,
   38.0F + 48.0iF, 39.0F + 49.0iF, 40.0F + 50.0iF, 41.0F + 51.0iF,
   42.0F + 52.0iF, 43.0F + 53.0iF, 44.0F + 54.0iF, 45.0F + 55.0iF };
-_Complex float c[N];
-_Complex float res[N] =
+_Complex float c[N] __attribute__ ((__aligned__(16)));
+_Complex float res[N] __attribute__ ((__aligned__(16))) =
 { 40.0F + 60.0iF, 42.0F + 62.0iF, 44.0F + 64.0iF, 46.0F + 66.0iF,
   48.0F + 68.0iF, 50.0F + 70.0iF, 52.0F + 72.0iF, 54.0F + 74.0iF,
   56.0F + 76.0iF, 58.0F + 78.0iF, 60.0F + 80.0iF, 62.0F + 82.0iF,

Could you please attach slp-7.c's dump as well? I think it is a different problem there.

Thanks,
Ira

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39248
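For readers unfamiliar with the attribute used in the proposed change, the following is a small sketch of what it guarantees. It relies on the GCC `__aligned__` attribute exactly as spelled in the diff; the helper `is_aligned` is a hypothetical name introduced here and is not part of the testsuite.

```c
#include <assert.h>
#include <stdint.h>

#define N 16

/* Same declaration style as in the proposed testsuite change:
   request 16-byte alignment for the arrays (GCC extension). */
_Complex float a[N] __attribute__ ((__aligned__ (16)));
_Complex float c[N] __attribute__ ((__aligned__ (16)));

/* Hypothetical helper: returns nonzero if p is a multiple of align,
   where align must be a power of two. */
int is_aligned(const void *p, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t) p & (align - 1)) == 0;
}
```

With the attribute in place, `is_aligned(a, 16)` holds on any target that honors it, so the vectorizer would not need to assume a misaligned array start for these globals.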
[Bug tree-optimization/39300] vectorizer confused by predictive commoning and PRE
--- Comment #4 from irar at il dot ibm dot com 2009-02-25 14:08 ---
Looks similar to PR 35229. We get here:

  # pre.1 = PHI <D.1, D.2>
  ..
  load D.2
  D.3 = D.2 + pre.1 + ...
  store D.3

-- irar at il dot ibm dot com changed: What|Removed |Added CC||irar at il dot ibm dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39300
[Bug tree-optimization/39248] FAIL: gcc.dg/vect/vect-complex-1.c
--- Comment #7 from irar at il dot ibm dot com 2009-02-26 09:57 ---
In slp-7.c all three loops get vectorized, including the loop that requires vector multiplication for shorts. This patch http://gcc.gnu.org/ml/gcc-patches/2008-07/msg00044.html added ARM to vect_int_mult, but not to vect_short_mult, so I guess the fix should be:

Index: target-supports.exp
===
--- target-supports.exp (revision 144030)
+++ target-supports.exp (working copy)
@@ -2275,7 +2275,8 @@ proc check_effective_target_vect_short_m
              || [istarget spu-*-*]
              || [istarget i?86-*-*]
              || [istarget x86_64-*-*]
-             || [istarget powerpc*-*-*] } {
+             || [istarget powerpc*-*-*]
+             || [check_effective_target_arm32] } {
            set et_vect_short_mult_saved 1
        }
     }

Does it make sense?

Thanks,
Ira

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39248
[Bug middle-end/39318] internal compiler error: verify_stmts failed
--- Comment #6 from irar at il dot ibm dot com 2009-03-01 10:34 ---
Reduced it a bit more:

      subroutine adw_trajsp (F_u,i0,in,j0,jn)
      implicit none
      real F_u(*)
      integer i0,in,j0,jn
      integer n,i,j
      real*8 xsin(i0:in,j0:jn)
!$omp parallel do private(xsin)
      do j=j0,jn
        do i=i0,in
          xsin(i,j) = sqrt(F_u(n))
        end do
      end do
!$omp end parallel do
      return
      end

on x86_64-suse-linux with gfortran -c -fopenmp -fcray-pointer -fexceptions -O2 -ftree-vectorize.

When we vectorize a function call, we replace the RHS of the stmt with something harmless:

  D.1692_41 = __builtin_sqrtf (pretmp.45_79);

is replaced with

  D.1692_41 = 0.0;

We don't remove the original stmt from the EH table. The question is: is it OK to vectorize function calls that are in the EH table?

-- irar at il dot ibm dot com changed: What|Removed |Added CC||rguenther at suse dot de Status|UNCONFIRMED |NEW Ever Confirmed|0 |1 Last reconfirmed|-00-00 00:00:00 |2009-03-01 10:34:19 date||
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39318
[Bug middle-end/39318] internal compiler error: verify_stmts failed
--- Comment #10 from irar at il dot ibm dot com 2009-03-01 12:27 ---
(In reply to comment #9)
> Ok. Then
>
>   if (maybe_clean_or_replace_eh_stmt (old_stmt, new_stmt))
>     gimple_purge_dead_eh_edges (bb);
>
> should be enough to fix this.
>
> Richard.

Yes, it fixes the ICE. Thanks! I'll submit a patch after testing.

Thanks,
Ira

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39318
[Bug tree-optimization/39248] FAIL: gcc.dg/vect/vect-complex-1.c
--- Comment #10 from irar at il dot ibm dot com 2009-03-08 07:25 --- Fixed. -- irar at il dot ibm dot com changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39248
[Bug testsuite/39422] [4.4 regression] Failing SPU vectorizer testcases
--- Comment #1 from irar at il dot ibm dot com 2009-03-10 13:55 --- I am preparing a patch. -- irar at il dot ibm dot com changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |irar at il dot ibm dot com |dot org | Status|UNCONFIRMED |ASSIGNED Ever Confirmed|0 |1 Last reconfirmed|-00-00 00:00:00 |2009-03-10 13:55:31 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39422
[Bug tree-optimization/35229] Vectorizer doesn't support dependence created by predictive commoning
--- Comment #3 from irar at il dot ibm dot com 2009-03-17 13:33 ---
(In reply to comment #2)
> Or like the following, which is just a bunch of reductions of two elements
>
> float data[1024];
> void foo(void)
> {
>   int i;
>   for (i = 1; i < 1024; ++i)
>     data[i] = data[i] + data[i-1];
> }

Actually, this loop is not vectorizable. res and data have to be different arrays, otherwise we get a read-after-write dependence with distance 1.

Ira

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35229
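The read-after-write dependence with distance 1 can be seen with a small experiment. The sketch below is only an emulation: `prefix_blocked` imitates a naive vector statement by performing all loads of a block before any store, and `VF = 4` is an assumed vectorization factor. Neither function name comes from the PR.

```c
#include <assert.h>
#include <stddef.h>

enum { VF = 4 };   /* assumed vectorization factor, for illustration only */

/* The loop from the quoted comment: iteration i reads data[i-1],
   which iteration i-1 has just written. */
void prefix_scalar(float *data, size_t n)
{
    assert(data != NULL);
    for (size_t i = 1; i < n; ++i)
        data[i] = data[i] + data[i - 1];
}

/* Emulation of a naive VF-wide vector statement: every load of a block
   is done before any store of that block. */
void prefix_blocked(float *data, size_t n)
{
    assert(data != NULL);
    size_t i = 1;
    for (; i + VF <= n; i += VF) {
        float cur[VF], prev[VF];
        for (size_t k = 0; k < VF; ++k) {      /* "vector loads" */
            cur[k] = data[i + k];
            prev[k] = data[i + k - 1];
        }
        for (size_t k = 0; k < VF; ++k)        /* "vector store" */
            data[i + k] = cur[k] + prev[k];
    }
    for (; i < n; ++i)                         /* scalar epilogue */
        data[i] = data[i] + data[i - 1];
}
```

Starting from an array of ones, the scalar loop produces a running sum (1, 2, 3, ...), while the blocked version produces 2 at index 2 because it reads the stale value of data[1]. If the result went to a separate array, both orders would read only unmodified inputs and agree.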
[Bug tree-optimization/39529] ICE on valid code
--- Comment #2 from irar at il dot ibm dot com 2009-03-24 08:23 ---
I am testing this patch:

Index: tree-vect-transform.c
===
--- tree-vect-transform.c (revision 145027)
+++ tree-vect-transform.c (working copy)
@@ -1099,7 +1099,10 @@ vect_create_data_ref_ptr (gimple stmt, s
   if (!MTAG_P (tag))
     new_type_alias (vect_ptr, tag, DR_REF (dr));
   else
-    set_symbol_mem_tag (vect_ptr, tag);
+    {
+      set_symbol_mem_tag (vect_ptr, tag);
+      mark_sym_for_renaming (tag);
+    }
 
   /** Note: If the dataref is in an inner-loop nested in LOOP, and we are
       vectorizing LOOP (i.e. outer-loop vectorization), we need to create two

-- irar at il dot ibm dot com changed: What|Removed |Added Status|UNCONFIRMED |NEW Ever Confirmed|0 |1 Last reconfirmed|-00-00 00:00:00 |2009-03-24 08:23:00 date||
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39529
[Bug tree-optimization/39529] ICE on valid code
--- Comment #3 from irar at il dot ibm dot com 2009-03-24 11:42 ---
(In reply to comment #0)
> My solution: After each loop is vectorized, and SSA is updated, I re-compute alias info. I am not familiar with the vectorizer sources, so I don't know if there is a more efficient way to fix this problem, and still be sure it would be correct for all inputs.

I think that just marking the created pointer for renaming can be enough...

Ira

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39529
[Bug tree-optimization/39529] ICE on valid code
--- Comment #5 from irar at il dot ibm dot com 2009-03-25 12:27 --- Fixed. -- irar at il dot ibm dot com changed: What|Removed |Added Status|NEW |RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39529
[Bug tree-optimization/39595] [4.4/4.5 Regression]ICE in vectorizable_store at tree-vect-transform.c:5361
-- irar at il dot ibm dot com changed: What|Removed |Added AssignedTo|unassigned at gcc dot gnu |irar at il dot ibm dot com |dot org | Status|NEW |ASSIGNED Last reconfirmed|2009-03-31 09:49:03 |2009-03-31 12:21:04 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39595
[Bug tree-optimization/39595] [4.4/4.5 Regression]ICE in vectorizable_store at tree-vect-transform.c:5361
--- Comment #9 from irar at il dot ibm dot com 2009-04-02 10:07 ---
Will the following test do the job? (I added -m64 for target i686-*-*)

! { dg-do compile }
! { dg-options -c -O3 -fdump-tree-vect-details -m64 { target i686-*-* } }
      subroutine foo(a,c,i,m)
      dimension a(4,*),b(3,64),c(3,200),d(64)
      integer*8 i,j,k,l,m
      do j=1,m,64
         do k=1,m-j+1
            d(k)=a(4,j-1+k)
            do l=1,3
               b(l,k)=c(l,i)+a(l,j-1+k)
            end do
         end do
         call bar(b,d,i)
      end do
      end
! { dg-final { cleanup-tree-dump vect } }

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39595
[Bug tree-optimization/39595] [4.4/4.5 Regression]ICE in vectorizable_store at tree-vect-transform.c:5361
--- Comment #11 from irar at il dot ibm dot com 2009-04-02 11:16 ---
(In reply to comment #10)
> No, please don't ever add -m64 or -m32 to dg-options, that is something the tester decides on in how it invokes make check. If a test is specific to -m64 or -m32, you should be using ilp32 or lp64 etc. effective target requirements, but in this case there is nothing in the testcase that requires -m64, the test just passes for some targets and fails for others. Don't add -c, that's implicit for dg-do compile, you're adding it for second time.

OK, thanks for the explanation!

> Also, I don't like the s/double precision/dimension/ change, the type of the vars should be if possible explicit when you aren't testing the Fortran FE. On x86_64-linux it fails with double precision, but also real, integer or integer*8 instead of double precision, just don't leave the explicit type out.

I will change it to real then (double does not get vectorized on PowerPC).

> The testcase as is in #c3 fails on x86_64-linux and succeeds on i686-linux and RUNTESTFLAGS=--target_board=unix/-m32 on x86_64-linux, I guess on Darwin similarly, it will fail with RUNTESTFLAGS=--target_board=unix/-m64.

Here is the final version (the test name will be O3-pr39595.f, so vect.exp will append -O3 to the flags):

! { dg-do compile }
      subroutine foo(a,c,i,m)
      real a(4,*),b(3,64),c(3,200),d(64)
      integer*8 i,j,k,l,m
      do j=1,m,64
         do k=1,m-j+1
            d(k)=a(4,j-1+k)
            do l=1,3
               b(l,k)=c(l,i)+a(l,j-1+k)
            end do
         end do
         call bar(b,d,i)
      end do
      end
! { dg-final { cleanup-tree-dump vect } }

Thanks,
Ira

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39595
[Bug tree-optimization/32230] [4.3 Regression] Segfault in set_bb_for_stmt with -O -ftree-vectorize
--- Comment #5 from irar at il dot ibm dot com 2007-06-28 09:02 --- I think it is better to check that the statement is not NULL before calling bsi_insert_on_edge_immediate. I am going to prepare a patch for this. Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32230
[Bug tree-optimization/32230] [4.3 Regression] Segfault in set_bb_for_stmt with -O -ftree-vectorize
--- Comment #6 from irar at il dot ibm dot com 2007-06-28 11:41 ---
  ((float*) (((sbuf_header_t *) ((buf) == (buf)->buf[0]))->buf[0]))[i] = val;

is (after omitting the casts)

  *(1B + (i * 4)) = val;

Is that legal?

The vectorizer assumes that every data-ref has a base_address. In the above case we get the following data-ref structure:

  base_address: 0B
  offset from base address: 0
  constant offset from base address: 1
  step: 4
  aligned to: 128
  base_object: *0B
  symbol tag: SMT.5

and therefore an empty stmt is created for the first access of the data-ref in the loop. Before Zdenek's rewrite of data-refs analysis, it failed to create a dr here, and thus no segfault occurred.

Ira

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32230
[Bug tree-optimization/32230] [4.3 Regression] Segfault in set_bb_for_stmt with -O -ftree-vectorize
--- Comment #8 from irar at il dot ibm dot com 2007-06-28 12:29 ---
(In reply to comment #7)
> I suppose rejecting NULL bases should work here?

Yes, only it's not NULL, it's zero (0B). We can reject it in the vectorizer or not create a dr for it...

Ira

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32230
[Bug tree-optimization/32230] [4.3 Regression] Segfault in set_bb_for_stmt with -O -ftree-vectorize
--- Comment #10 from irar at il dot ibm dot com 2007-06-28 12:38 ---
(In reply to comment #9)
> I suppose all INTEGER_CST bases should be rejected.
>
> Richard.

Right. The value actually doesn't matter, since the constant part is split off into the init part in (tree-data-ref.c:656):

  split_constant_offset (base_iv.base, base_iv.base, dinit);

I just don't know where it is better to fail: in dr analysis or in the vectorizer.

Thanks,
Ira

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32230
[Bug tree-optimization/32477] ice for legal code with -O2 -ftree-vectorize
--- Comment #3 from irar at il dot ibm dot com 2007-07-01 13:21 --- A fix to PR 32230 http://gcc.gnu.org/ml/gcc-patches/2007-07/msg00018.html fixes this one too. Ira -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32477
[Bug tree-optimization/32377] can't determine dependence (source/destination overlap without more than size)
--- Comment #5 from irar at il dot ibm dot com 2007-07-02 12:20 ---
(In reply to comment #4)
> Looks like the data-dependence analysis is doing its job

I am not sure about that. I tried the following cases and got distance 1 (and direction positive) in all of them for the pair of load and store to ia.

  for (i = 0; i < N; i++){
    ia[i+1] = ia[i] * 4;
  }

  for (i = 0; i < N; i++){
    ia[i] = ia[i+1] * 4;
  }

  for (i = 0; i < N; i++){
    ia[i+1] = 0;
    ic[i] = ia[i] * 4;
  }

  for (i = 0; i < N; i++){
    ia[i] = 0;
    ic[i] = ia[i+1] * 4;
  }

What am I missing?

Thanks,
Ira

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32377
[Bug tree-optimization/32377] can't determine dependence (source/destination overlap without more than size)
--- Comment #7 from irar at il dot ibm dot com 2007-07-03 12:57 ---
(In reply to comment #6)
> Distance vectors are lexicographically positive vectors, that is why you get the 1 in all these cases. If you want to know which one comes first, you have to look at the DR_IS_READ for both references in the dependence relation.

I am sorry, but I still don't understand. For

  for (i = 0; i < N; i++){
    ia[i+1] = ia[i] * 4;
  }

the ddr is {ld, st} with distance 1, and for

  for (i = 0; i < N; i++){
    ia[i] = ia[i+1] * 4;
  }

the ddr is also {ld, st} with distance 1. How can we distinguish between these cases?

Thanks,
Ira

-- irar at il dot ibm dot com changed: What|Removed |Added CC||irar at il dot ibm dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32377
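Why the distinction matters can be shown with a small experiment on the two loops above. The sketch below runs `ia[i + wr] = ia[i + rd] * 4` once iteration by iteration and once in blocks that do all loads before any store, which is a rough emulation of a vector statement. `N = 8`, `VF = 4` and all function names are assumptions made for this illustration, not anything taken from the analysis code.

```c
#include <assert.h>
#include <string.h>

enum { N = 8, VF = 4 };   /* illustrative sizes; VF is an assumed vector width */

/* ia[i + wr] = ia[i + rd] * 4, one iteration at a time. */
static void run_scalar(int *ia, int wr, int rd)
{
    for (int i = 0; i < N; i++)
        ia[i + wr] = ia[i + rd] * 4;
}

/* Same statement, VF iterations at a time, all loads before all stores. */
static void run_blocked(int *ia, int wr, int rd)
{
    assert(N % VF == 0);
    for (int i = 0; i < N; i += VF) {
        int tmp[VF];
        for (int k = 0; k < VF; k++)
            tmp[k] = ia[i + k + rd] * 4;
        for (int k = 0; k < VF; k++)
            ia[i + k + wr] = tmp[k];
    }
}

/* Returns 1 if both execution orders leave the array in the same state. */
int orders_agree(int wr, int rd)
{
    int a[N + 1], b[N + 1];
    for (int i = 0; i <= N; i++)
        a[i] = b[i] = i + 1;
    run_scalar(a, wr, rd);
    run_blocked(b, wr, rd);
    return memcmp(a, b, sizeof a) == 0;
}
```

Here `orders_agree(1, 0)`, the `ia[i+1] = ia[i] * 4` case, returns 0 because each iteration consumes the value stored by the previous one, while `orders_agree(0, 1)`, the `ia[i] = ia[i+1] * 4` case, returns 1 because each iteration only reads a location that is overwritten later. Both pairs have distance 1, so the distance alone cannot separate them.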