[Bug target/38708] [4.4 Regression] Revision 137646 caused gcc.c-torture/execute/memset-[23].c fail with -mtune=pentiumpro
--- Comment #12 from jakub at gcc dot gnu dot org 2009-01-09 17:13 ---
Subject: Bug 38708

Author: jakub
Date: Fri Jan 9 17:12:40 2009
New Revision: 143211

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=143211
Log:
	PR target/38686
	PR target/38708
	* config/i386/i386.c (override_options): Reject
	-mstringop-strategy=rep_8byte with -m32.
	(ix86_expand_movmem): For size_needed == 1 set epilogue_size_needed
	to 1.  Do count comparison against epilogue_size_needed at compile
	time even when count_exp was constant forced into register.  For
	size_needed don't jump to epilogue, instead just avoid aligning
	and invoke the body algorithm.  If need_zero_guard, add zero guard
	even if count is non-zero, but smaller than size_needed + number of
	bytes that could be stored for alignment.
	(ix86_expand_setmem): For size_needed == 1 set epilogue_size_needed
	to 1.  If need_zero_guard, add zero guard even if count is non-zero,
	but smaller than size_needed + number of bytes that could be stored
	for alignment.  Compare size_needed with epilogue_size_needed instead
	of desired_align - align, don't adjust size_needed, pass
	epilogue_size_needed to the epilogue expanders.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38708
[Bug target/38708] [4.4 Regression] Revision 137646 caused gcc.c-torture/execute/memset-[23].c fail with -mtune=pentiumpro
--- Comment #13 from jakub at gcc dot gnu dot org 2009-01-09 17:13 ---
Fixed.

-- 
jakub at gcc dot gnu dot org changed:

           What    |Removed                     |Added
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38708
[Bug target/38708] [4.4 Regression] Revision 137646 caused gcc.c-torture/execute/memset-[23].c fail with -mtune=pentiumpro
--- Comment #11 from jakub at gcc dot gnu dot org 2009-01-06 11:13 --- *** Bug 38686 has been marked as a duplicate of this bug. *** -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38708
[Bug target/38708] [4.4 Regression] Revision 137646 caused gcc.c-torture/execute/memset-[23].c fail with -mtune=pentiumpro
-- 
rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
           Priority|P3                          |P2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38708
[Bug target/38708] [4.4 Regression] Revision 137646 caused gcc.c-torture/execute/memset-[23].c fail with -mtune=pentiumpro
-- 
rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
           Priority|P2                          |P1

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38708
[Bug target/38708] [4.4 Regression] Revision 137646 caused gcc.c-torture/execute/memset-[23].c fail with -mtune=pentiumpro
--- Comment #10 from jakub at gcc dot gnu dot org 2009-01-04 14:26 ---
The problem is that if size is smaller than epilogue_size_needed,
desired_align > align and size_needed == 1, then ix86_expand_setmem jumps
around the initialization of promoted_val (when val_exp isn't CONST_INT) and
around the alignment code, but then uses promoted_val for rep stosb. At that
point promoted_val is uninitialized.
I have a fix.

-- 
jakub at gcc dot gnu dot org changed:

           What    |Removed                     |Added
         AssignedTo|unassigned at gcc dot gnu   |jakub at gcc dot gnu dot org
                   |dot org                     |
             Status|NEW                         |ASSIGNED
   Last reconfirmed|2009-01-02 19:01:18         |2009-01-04 14:26:28
               date|                            |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38708
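The ordering problem described in comment #10 can be modeled in plain C. This is only a sketch under stated assumptions, not GCC code: model_promote_byte and model_guarded_setmem are hypothetical names, the 4-byte alignment is an arbitrary choice, and the byte loop merely stands in for rep stosb. The point it illustrates is that the promoted value has to be computed before the branch that skips the alignment step, because the body uses it on both paths.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: broadcast one byte into all four bytes of a word.  */
static uint32_t
model_promote_byte (unsigned char val)
{
  return (uint32_t) val * 0x01010101u;
}

/* Hypothetical model of a setmem expansion with a short-count path.  */
static void
model_guarded_setmem (unsigned char *dst, unsigned char val, size_t count,
                      size_t epilogue_size_needed)
{
  assert (dst != NULL);

  /* Computed before any branch, so the short-count path below never
     reaches the body with an undefined value.  */
  uint32_t promoted_val = model_promote_byte (val);
  size_t i = 0;

  if (count >= epilogue_size_needed)
    {
      /* Alignment step: byte stores until dst + i is 4-byte aligned.  */
      while (i < count && ((uintptr_t) (dst + i) & 3) != 0)
        dst[i++] = (unsigned char) promoted_val;
    }

  /* Body: stands in for rep stosb and runs on both paths.  */
  for (; i < count; i++)
    dst[i] = (unsigned char) promoted_val;
}
```

Calling model_guarded_setmem (buf + 2, 'A', 1, 4) takes the short-count path and still stores exactly one 'A'.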
[Bug target/38708] [4.4 Regression] Revision 137646 caused gcc.c-torture/execute/memset-[23].c fail with -mtune=pentiumpro
--- Comment #5 from jakub at gcc dot gnu dot org 2009-01-03 08:46 ---
There are several issues. One is what H.J. mentioned, seen e.g. on:

char buf[8] __attribute__((aligned));
char A = 'A';
int len = 1;

void __attribute__((noinline))
check (void)
{
  if (__builtin_memcmp (buf, "\0\0A\0\0\0\0\0", 8))
    __builtin_abort ();
}

int
main ()
{
  __builtin_memset (buf + 2, A, len);
  check ();
  return 0;
}

with -O -mtune=pentium-m -m32. This can be fixed by adding
  max_size = smallest_pow2_greater_than (max_size - 1);
at the start of expand_setmem_epilogue_via_loop.

But another testcase with the same options that still fails is:

char buf[8] __attribute__((aligned)) = ;
char A = 'A';
int len = 4;

void __attribute__((noinline))
check (void)
{
  if (__builtin_memcmp (buf, \0\0\0\0, 8))
    __builtin_abort ();
}

int
main ()
{
  __builtin_memset (buf + 4, '\0', len);
  check ();
  return 0;
}

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38708
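The rounding suggested in comment #5 can be illustrated with a small sketch. It assumes that smallest_pow2_greater_than returns the smallest power of two strictly greater than its argument, which is what the name suggests but is not shown in this thread. The model_ names are hypothetical stand-ins, not the functions in i386.c.

```c
#include <assert.h>

/* Assumed behaviour: smallest power of two strictly greater than VAL.
   Only meant for small non-negative values.  */
static int
model_smallest_pow2_greater_than (int val)
{
  assert (val >= 0 && val < (1 << 30));
  int ret = 1;
  while (ret <= val)
    ret <<= 1;
  return ret;
}

/* Round MAX_SIZE up to a power of two, leaving exact powers unchanged.  */
static int
model_round_max_size (int max_size)
{
  return model_smallest_pow2_greater_than (max_size - 1);
}
```

Under that assumption a max_size of 3 becomes 4, while 4 stays 4, so a loop that masks the count with max_size - 1 always works with a proper bit mask.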
[Bug target/38708] [4.4 Regression] Revision 137646 caused gcc.c-torture/execute/memset-[23].c fail with -mtune=pentiumpro
--- Comment #6 from jakub at gcc dot gnu dot org 2009-01-03 09:09 ---
--- i386.c.jj4	2008-12-27 10:12:25.0 +0100
+++ i386.c	2009-01-03 10:03:05.0 +0100
@@ -18012,13 +18012,12 @@ ix86_expand_setmem (rtx dst, rtx count_e
          Epilogue code will actually copy COUNT_EXP & EPILOGUE_SIZE_NEEDED
          bytes. Compensate if needed.  */
 
-      if (size_needed < desired_align - align)
+      if (size_needed < epilogue_size_needed)
         {
           tmp =
             expand_simple_binop (counter_mode (count_exp), AND, count_exp,
                                  GEN_INT (size_needed - 1), count_exp, 1,
                                  OPTAB_DIRECT);
-          size_needed = desired_align - align + 1;
           if (tmp != count_exp)
             emit_move_insn (count_exp, tmp);
         }
@@ -18029,10 +18028,10 @@ ix86_expand_setmem (rtx dst, rtx count_e
     {
       if (force_loopy_epilogue)
         expand_setmem_epilogue_via_loop (dst, destreg, val_exp, count_exp,
-                                         size_needed);
+                                         epilogue_size_needed);
       else
         expand_setmem_epilogue (dst, destreg, promoted_val, count_exp,
-                                size_needed);
+                                epilogue_size_needed);
     }
   if (jump_around_label)
     emit_label (jump_around_label);

instead seems to fix memset-3.c with -mtune=pentium-m -m32 at all optimization
levels (and is what ix86_expand_movmem does). But memset-2.c still fails, and
not just at -O3, but also at -O2.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38708
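The count masking touched by this hunk can be sketched outside the compiler. The following is a hypothetical model, not the RTL that GCC emits: model_chunked_setmem is an invented name, and size_needed is assumed to be a power of two so that count & (size_needed - 1) yields the bytes left over after the main loop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model: main loop in chunks of SIZE_NEEDED bytes, then an
   epilogue for the remaining COUNT & (SIZE_NEEDED - 1) bytes.  */
static void
model_chunked_setmem (unsigned char *dst, int val, size_t count,
                      size_t size_needed)
{
  /* The mask below is only valid for a power of two.  */
  assert (size_needed != 0 && (size_needed & (size_needed - 1)) == 0);

  /* Main body: whole chunks only.  */
  size_t body_bytes = count & ~(size_needed - 1);
  for (size_t i = 0; i < body_bytes; i += size_needed)
    memset (dst + i, val, size_needed);

  /* Epilogue: the bytes the main loop did not cover.  */
  size_t tail = count & (size_needed - 1);
  for (size_t i = 0; i < tail; i++)
    dst[body_bytes + i] = (unsigned char) val;
}
```

If the epilogue were handed a size that does not match the mask applied to the count, it could store too few or too many bytes, which is the kind of mismatch the patch avoids by passing epilogue_size_needed consistently.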
[Bug target/38708] [4.4 Regression] Revision 137646 caused gcc.c-torture/execute/memset-[23].c fail with -mtune=pentiumpro
--- Comment #7 from jakub at gcc dot gnu dot org 2009-01-03 10:26 ---
Created an attachment (id=17025)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17025&action=view)
gcc44-pr38708.patch

For size_needed, we never want any epilogue. This cures memset-2.c at -O2, but
-O3 still fails.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38708
[Bug target/38708] [4.4 Regression] Revision 137646 caused gcc.c-torture/execute/memset-[23].c fail with -mtune=pentiumpro
--- Comment #8 from jakub at gcc dot gnu dot org 2009-01-03 10:26 --- I meant for size_needed == 1. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38708
[Bug target/38708] [4.4 Regression] Revision 137646 caused gcc.c-torture/execute/memset-[23].c fail with -mtune=pentiumpro
--- Comment #9 from jakub at gcc dot gnu dot org 2009-01-03 11:39 ---
I ran
check-gcc RUNTESTFLAGS='execute.exp --target_board=unix/{-m32,-m32/-mtune=pentium-m,-m64}/-mstringop-strategy={rep_byte,libcall,rep_4byte,rep_8byte,byte_loop,loop,unrolled_loop}'
before and after the patch. In all cases, -m32 together with
-mstringop-strategy=rep_8byte ICEs in many testcases, so we obviously need to
reject -m32 -mstringop-strategy=rep_8byte.

Other than that, we have:

unix/-m64/-mstringop-strategy=loop
 FAIL: gcc.c-torture/execute/memset-2.c execution, -O2
 FAIL: gcc.c-torture/execute/memset-2.c execution, -O3 -fomit-frame-pointer
 FAIL: gcc.c-torture/execute/memset-2.c execution, -O3 -fomit-frame-pointer -funroll-loops
 FAIL: gcc.c-torture/execute/memset-2.c execution, -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions
 FAIL: gcc.c-torture/execute/memset-2.c execution, -O3 -g

(before and after the patch) and:

unix/-m32/-mtune=pentium-m/-mstringop-strategy=rep_byte
-FAIL: gcc.c-torture/execute/memcpy-2.c execution, -O1
-FAIL: gcc.c-torture/execute/memcpy-2.c execution, -O2
-FAIL: gcc.c-torture/execute/memcpy-2.c execution, -O3 -fomit-frame-pointer
-FAIL: gcc.c-torture/execute/memcpy-2.c execution, -O3 -fomit-frame-pointer -funroll-loops
-FAIL: gcc.c-torture/execute/memcpy-2.c execution, -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions
-FAIL: gcc.c-torture/execute/memcpy-2.c execution, -O3 -g
 FAIL: gcc.c-torture/execute/memset-1.c execution, -O1
 FAIL: gcc.c-torture/execute/memset-1.c execution, -O2
 FAIL: gcc.c-torture/execute/memset-1.c execution, -O3 -fomit-frame-pointer
 FAIL: gcc.c-torture/execute/memset-1.c execution, -O3 -fomit-frame-pointer -funroll-loops
 FAIL: gcc.c-torture/execute/memset-1.c execution, -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions
 FAIL: gcc.c-torture/execute/memset-1.c execution, -O3 -g
 FAIL: gcc.c-torture/execute/memset-2.c execution, -O3 -fomit-frame-pointer
 FAIL: gcc.c-torture/execute/memset-2.c execution, -O3 -fomit-frame-pointer -funroll-loops
 FAIL: gcc.c-torture/execute/memset-2.c execution, -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions
 FAIL: gcc.c-torture/execute/memset-2.c execution, -O3 -g
 FAIL: gcc.c-torture/execute/memset-3.c execution, -O1
 FAIL: gcc.c-torture/execute/memset-3.c execution, -O2
 FAIL: gcc.c-torture/execute/memset-3.c execution, -O3 -fomit-frame-pointer
 FAIL: gcc.c-torture/execute/memset-3.c execution, -O3 -fomit-frame-pointer -funroll-loops
 FAIL: gcc.c-torture/execute/memset-3.c execution, -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions
 FAIL: gcc.c-torture/execute/memset-3.c execution, -O3 -g

unix/-m32/-mtune=pentium-m/-mstringop-strategy=rep_4byte
-FAIL: gcc.c-torture/execute/memset-3.c execution, -O1
-FAIL: gcc.c-torture/execute/memset-3.c execution, -O2
-FAIL: gcc.c-torture/execute/memset-3.c execution, -O3 -fomit-frame-pointer
-FAIL: gcc.c-torture/execute/memset-3.c execution, -O3 -fomit-frame-pointer -funroll-loops
-FAIL: gcc.c-torture/execute/memset-3.c execution, -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions
-FAIL: gcc.c-torture/execute/memset-3.c execution, -O3 -g

(" FAIL" is before+after the patch, "-FAIL" is before the patch, cured by the
patch).

To reject rep_8byte for -m32 we IMHO want:

--- i386.c.jj4	2008-12-27 10:12:25.0 +0100
+++ i386.c	2009-01-03 11:53:15.0 +0100
@@ -2686,7 +2686,8 @@ override_options (bool main_args_p)
         stringop_alg = libcall;
       else if (!strcmp (ix86_stringop_string, "rep_4byte"))
         stringop_alg = rep_prefix_4_byte;
-      else if (!strcmp (ix86_stringop_string, "rep_8byte"))
+      else if (!strcmp (ix86_stringop_string, "rep_8byte")
+               && TARGET_64BIT)
         stringop_alg = rep_prefix_8_byte;
       else if (!strcmp (ix86_stringop_string, "byte_loop"))
         stringop_alg = loop_1_byte;

and obviously the remaining FAILs need to be investigated.

-- 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38708
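The effect of the override_options change can be sketched as a standalone parser. This is a hypothetical model rather than the code in i386.c: the enum, the function name and the target_64bit parameter are invented for illustration, and only the strategy names quoted in this thread are handled. With the extra condition, "rep_8byte" falls through to the error path on a 32-bit target instead of selecting an algorithm the target cannot expand.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the stringop algorithm enumeration.  */
enum model_stringop_alg
{
  MODEL_ALG_INVALID = -1,
  MODEL_LIBCALL,
  MODEL_REP_4BYTE,
  MODEL_REP_8BYTE,
  MODEL_BYTE_LOOP
};

/* Map a strategy name to an algorithm; rep_8byte needs a 64-bit target.  */
static enum model_stringop_alg
model_parse_strategy (const char *name, int target_64bit)
{
  assert (name != NULL);

  if (!strcmp (name, "libcall"))
    return MODEL_LIBCALL;
  else if (!strcmp (name, "rep_4byte"))
    return MODEL_REP_4BYTE;
  else if (!strcmp (name, "rep_8byte") && target_64bit)
    return MODEL_REP_8BYTE;
  else if (!strcmp (name, "byte_loop"))
    return MODEL_BYTE_LOOP;

  /* Unknown name, or rep_8byte requested without 64-bit support.  */
  return MODEL_ALG_INVALID;
}
```

In this model, model_parse_strategy ("rep_8byte", 0) returns MODEL_ALG_INVALID, while the same name with target_64bit set returns MODEL_REP_8BYTE.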
[Bug target/38708] [4.4 Regression] Revision 137646 caused gcc.c-torture/execute/memset-[23].c fail with -mtune=pentiumpro
--- Comment #4 from pinskia at gcc dot gnu dot org 2009-01-02 22:08 ---
-march=pentiumpro is enough to reproduce the failure.

-- 
pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
            Summary|[4.4 Regression] Revision   |[4.4 Regression] Revision
                   |137646 caused gcc.c-        |137646 caused gcc.c-
                   |torture/execute/memset-     |torture/execute/memset-
                   |[23].c fail with -          |[23].c fail with -
                   |mtune=pentium-m             |mtune=pentiumpro

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38708