[Bug target/38708] [4.4 Regression] Revision 137646 caused gcc.c-torture/execute/memset-[23].c fail with -mtune=pentiumpro

2009-01-09 Thread jakub at gcc dot gnu dot org


--- Comment #12 from jakub at gcc dot gnu dot org  2009-01-09 17:13 ---
Subject: Bug 38708

Author: jakub
Date: Fri Jan  9 17:12:40 2009
New Revision: 143211

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=143211
Log:
PR target/38686
PR target/38708
* config/i386/i386.c (override_options): Reject
-mstringop-strategy=rep_8byte with -m32.
(ix86_expand_movmem): For size_needed == 1 set epilogue_size_needed
to 1.  Do count comparison against epilogue_size_needed at compile
time even when count_exp was constant forced into register.  For
size_needed don't jump to epilogue, instead just avoid aligning
and invoke the body algorithm.  If need_zero_guard, add zero guard
even if count is non-zero, but smaller than size_needed + number of
bytes that could be stored for alignment.
(ix86_expand_setmem): For size_needed == 1 set epilogue_size_needed
to 1.  If need_zero_guard, add zero guard even if count is non-zero,
but smaller than size_needed + number of bytes that could be stored
for alignment.  Compare size_needed with epilogue_size_needed instead
of desired_align - align, don't adjust size_needed, pass
epilogue_size_needed to the epilogue expanders.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.c


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38708



[Bug target/38708] [4.4 Regression] Revision 137646 caused gcc.c-torture/execute/memset-[23].c fail with -mtune=pentiumpro

2009-01-09 Thread jakub at gcc dot gnu dot org


--- Comment #13 from jakub at gcc dot gnu dot org  2009-01-09 17:13 ---
Fixed.


-- 

jakub at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38708



[Bug target/38708] [4.4 Regression] Revision 137646 caused gcc.c-torture/execute/memset-[23].c fail with -mtune=pentiumpro

2009-01-06 Thread jakub at gcc dot gnu dot org


--- Comment #11 from jakub at gcc dot gnu dot org  2009-01-06 11:13 ---
*** Bug 38686 has been marked as a duplicate of this bug. ***


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38708



[Bug target/38708] [4.4 Regression] Revision 137646 caused gcc.c-torture/execute/memset-[23].c fail with -mtune=pentiumpro

2009-01-05 Thread rguenth at gcc dot gnu dot org


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38708



[Bug target/38708] [4.4 Regression] Revision 137646 caused gcc.c-torture/execute/memset-[23].c fail with -mtune=pentiumpro

2009-01-05 Thread rguenth at gcc dot gnu dot org


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P2                          |P1


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38708



[Bug target/38708] [4.4 Regression] Revision 137646 caused gcc.c-torture/execute/memset-[23].c fail with -mtune=pentiumpro

2009-01-04 Thread jakub at gcc dot gnu dot org


--- Comment #10 from jakub at gcc dot gnu dot org  2009-01-04 14:26 ---
The problem is that if size is smaller than epilogue_size_needed,
desired_align > align and size_needed == 1, then ix86_expand_setmem jumps
around the
initialization of promoted_val (when val_exp isn't CONST_INT) and alignment,
but then uses promoted_val for rep stosb (which is thus uninitialized).
I have a fix.
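
For readers unfamiliar with the term, "promoting" the fill value here means replicating the byte so that every lane of a wider store carries it. Below is a minimal standalone C sketch of that idea, assuming a 32-bit word; promote_byte is a hypothetical helper for illustration and is not the GCC internal. It only shows what has to be computed before the store sequence runs, which is the step that the jump described above skips.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from i386.c): replicate one byte into all
   four bytes of a 32-bit word, e.g. 0x41 -> 0x41414141.  */
static uint32_t
promote_byte (unsigned char val)
{
  uint32_t word = (uint32_t) val * 0x01010101u;

  /* Lowest and highest byte lanes must both hold VAL.  */
  assert ((word & 0xffu) == val && (word >> 24) == val);
  return word;
}
```

If a code path reaches the store without this step having run, the stored pattern is whatever happened to be in the register.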


-- 

jakub at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |jakub at gcc dot gnu dot org
                   |dot org                     |
             Status|NEW                         |ASSIGNED
   Last reconfirmed|2009-01-02 19:01:18         |2009-01-04 14:26:28
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38708



[Bug target/38708] [4.4 Regression] Revision 137646 caused gcc.c-torture/execute/memset-[23].c fail with -mtune=pentiumpro

2009-01-03 Thread jakub at gcc dot gnu dot org


--- Comment #5 from jakub at gcc dot gnu dot org  2009-01-03 08:46 ---
There are several issues.  One is what H.J. mentioned, seen e.g. on:
char buf[8] __attribute__((aligned));
char A = 'A';
int len = 1;

void __attribute__((noinline))
check (void)
{
  if (__builtin_memcmp (buf, "\0\0A\0\0\0\0\0", 8))
    __builtin_abort ();
}

int
main ()
{
  __builtin_memset (buf + 2, A, len);
  check ();
  return 0;
}
with -O -mtune=pentium-m -m32.  This can be fixed by adding
max_size = smallest_pow2_greater_than (max_size - 1); at the start of
expand_setmem_epilogue_via_loop.  But another testcase with the same options
that still fails is:
char buf[8] __attribute__((aligned)) = ;
char A = 'A';
int len = 4;

void __attribute__((noinline))
check (void)
{
  if (__builtin_memcmp (buf, \0\0\0\0, 8))
    __builtin_abort ();
}

int
main ()
{
  __builtin_memset (buf + 4, '\0', len);
  check ();
  return 0;
}
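
The rounding suggested for expand_setmem_epilogue_via_loop can be looked at in isolation. The sketch below assumes that a helper of that name returns the smallest power of two strictly greater than its argument; it is written as standalone C for illustration, not copied from i386.c, and round_max_size is a hypothetical wrapper name. Applied to max_size - 1, it rounds max_size up to a power of two and leaves exact powers of two unchanged.

```c
#include <assert.h>

/* Sketch: smallest power of two strictly greater than VAL.
   The upper bound keeps the shift from overflowing an int.  */
static int
smallest_pow2_greater_than (int val)
{
  int ret = 1;

  assert (val >= 0 && val < (1 << 30));
  while (ret <= val)
    ret <<= 1;
  return ret;
}

/* Hypothetical wrapper: the adjustment proposed in the comment,
   max_size = smallest_pow2_greater_than (max_size - 1).  */
static int
round_max_size (int max_size)
{
  return smallest_pow2_greater_than (max_size - 1);
}
```

With a power-of-two max_size, masking a count with max_size - 1 yields exactly the number of leftover bytes, which is what a loop-based epilogue relies on.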


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38708



[Bug target/38708] [4.4 Regression] Revision 137646 caused gcc.c-torture/execute/memset-[23].c fail with -mtune=pentiumpro

2009-01-03 Thread jakub at gcc dot gnu dot org


--- Comment #6 from jakub at gcc dot gnu dot org  2009-01-03 09:09 ---
--- i386.c.jj4	2008-12-27 10:12:25.0 +0100
+++ i386.c	2009-01-03 10:03:05.0 +0100
@@ -18012,13 +18012,12 @@ ix86_expand_setmem (rtx dst, rtx count_e
  Epilogue code will actually copy COUNT_EXP & EPILOGUE_SIZE_NEEDED
  bytes. Compensate if needed.  */

-  if (size_needed < desired_align - align)
+  if (size_needed < epilogue_size_needed)
 {
   tmp =
 expand_simple_binop (counter_mode (count_exp), AND, count_exp,
  GEN_INT (size_needed - 1), count_exp, 1,
  OPTAB_DIRECT);
-  size_needed = desired_align - align + 1;
   if (tmp != count_exp)
 emit_move_insn (count_exp, tmp);
 }
@@ -18029,10 +18028,10 @@ ix86_expand_setmem (rtx dst, rtx count_e
 {
   if (force_loopy_epilogue)
 expand_setmem_epilogue_via_loop (dst, destreg, val_exp, count_exp,
- size_needed);
+ epilogue_size_needed);
   else
 expand_setmem_epilogue (dst, destreg, promoted_val, count_exp,
-size_needed);
+epilogue_size_needed);
 }
   if (jump_around_label)
 emit_label (jump_around_label);

instead seems to fix memset-3.c with -mtune=pentium-m -m32 at all optimizations
levels (and is what ix86_expand_movmem does).  But memset-2.c still fails, and
not just at -O3, but also at -O2.
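
To make the body/epilogue split concrete: the main body stores whole chunks of size_needed bytes, and the epilogue has to cover the count & (size_needed - 1) bytes that are left. The following is a minimal C model of that split, assuming the chunk size is a power of two; sketch_setmem and its parameters are hypothetical names for illustration and do not correspond to the RTL expanders in i386.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of an inline memset expansion: a main body that
   stores CHUNK bytes at a time, followed by a byte-wise epilogue for
   the leftover COUNT & (CHUNK - 1) bytes.  CHUNK must be a power of
   two so that the mask equals COUNT % CHUNK.  */
static void
sketch_setmem (unsigned char *dst, unsigned char val, size_t count,
               size_t chunk)
{
  size_t body_bytes, tail, i;

  assert (chunk != 0 && (chunk & (chunk - 1)) == 0);

  tail = count & (chunk - 1);
  body_bytes = count - tail;

  /* Main body: whole chunks only (memset stands in for a wide store).  */
  for (i = 0; i < body_bytes; i += chunk)
    memset (dst + i, val, chunk);

  /* Epilogue: strictly fewer than CHUNK bytes remain.  */
  for (i = 0; i < tail; i++)
    dst[body_bytes + i] = val;
}
```

If the epilogue is handed a size bound that does not match the mask used to compute the leftover count, it can store too few or too many bytes, which is the kind of mismatch the memset testcases detect.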


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38708



[Bug target/38708] [4.4 Regression] Revision 137646 caused gcc.c-torture/execute/memset-[23].c fail with -mtune=pentiumpro

2009-01-03 Thread jakub at gcc dot gnu dot org


--- Comment #7 from jakub at gcc dot gnu dot org  2009-01-03 10:26 ---
Created an attachment (id=17025)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=17025&action=view)
gcc44-pr38708.patch

For size_needed, we never want any epilogue.  This cures memset-2.c at -O2, but
-O3 still fails.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38708



[Bug target/38708] [4.4 Regression] Revision 137646 caused gcc.c-torture/execute/memset-[23].c fail with -mtune=pentiumpro

2009-01-03 Thread jakub at gcc dot gnu dot org


--- Comment #8 from jakub at gcc dot gnu dot org  2009-01-03 10:26 ---
I meant for size_needed == 1.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38708



[Bug target/38708] [4.4 Regression] Revision 137646 caused gcc.c-torture/execute/memset-[23].c fail with -mtune=pentiumpro

2009-01-03 Thread jakub at gcc dot gnu dot org


--- Comment #9 from jakub at gcc dot gnu dot org  2009-01-03 11:39 ---
I ran check-gcc RUNTESTFLAGS='execute.exp
--target_board=unix/{-m32,-m32/-mtune=pentium-m,-m64}/-mstringop-strategy={rep_byte,libcall,rep_4byte,rep_8byte,byte_loop,loop,unrolled_loop}'
before and after the patch.  In all cases, -m32 together with
-mstringop-strategy=rep_8byte ICEs in many testcases, so we obviously need
to reject -m32 -mstringop-strategy=rep_8byte.
Other than that, we have:
 unix/-m64/-mstringop-strategy=loop
 FAIL: gcc.c-torture/execute/memset-2.c execution,  -O2 
 FAIL: gcc.c-torture/execute/memset-2.c execution,  -O3 -fomit-frame-pointer 
 FAIL: gcc.c-torture/execute/memset-2.c execution,  -O3 -fomit-frame-pointer
-funroll-loops 
 FAIL: gcc.c-torture/execute/memset-2.c execution,  -O3 -fomit-frame-pointer
-funroll-all-loops -finline-functions 
 FAIL: gcc.c-torture/execute/memset-2.c execution,  -O3 -g 
(before and after the patch) and:
 unix/-m32/-mtune=pentium-m/-mstringop-strategy=rep_byte
-FAIL: gcc.c-torture/execute/memcpy-2.c execution,  -O1 
-FAIL: gcc.c-torture/execute/memcpy-2.c execution,  -O2 
-FAIL: gcc.c-torture/execute/memcpy-2.c execution,  -O3 -fomit-frame-pointer 
-FAIL: gcc.c-torture/execute/memcpy-2.c execution,  -O3 -fomit-frame-pointer
-funroll-loops 
-FAIL: gcc.c-torture/execute/memcpy-2.c execution,  -O3 -fomit-frame-pointer
-funroll-all-loops -finline-functions 
-FAIL: gcc.c-torture/execute/memcpy-2.c execution,  -O3 -g 
 FAIL: gcc.c-torture/execute/memset-1.c execution,  -O1 
 FAIL: gcc.c-torture/execute/memset-1.c execution,  -O2 
 FAIL: gcc.c-torture/execute/memset-1.c execution,  -O3 -fomit-frame-pointer 
 FAIL: gcc.c-torture/execute/memset-1.c execution,  -O3 -fomit-frame-pointer
-funroll-loops 
 FAIL: gcc.c-torture/execute/memset-1.c execution,  -O3 -fomit-frame-pointer
-funroll-all-loops -finline-functions 
 FAIL: gcc.c-torture/execute/memset-1.c execution,  -O3 -g 
 FAIL: gcc.c-torture/execute/memset-2.c execution,  -O3 -fomit-frame-pointer 
 FAIL: gcc.c-torture/execute/memset-2.c execution,  -O3 -fomit-frame-pointer
-funroll-loops 
 FAIL: gcc.c-torture/execute/memset-2.c execution,  -O3 -fomit-frame-pointer
-funroll-all-loops -finline-functions 
 FAIL: gcc.c-torture/execute/memset-2.c execution,  -O3 -g 
 FAIL: gcc.c-torture/execute/memset-3.c execution,  -O1 
 FAIL: gcc.c-torture/execute/memset-3.c execution,  -O2 
 FAIL: gcc.c-torture/execute/memset-3.c execution,  -O3 -fomit-frame-pointer 
 FAIL: gcc.c-torture/execute/memset-3.c execution,  -O3 -fomit-frame-pointer
-funroll-loops 
 FAIL: gcc.c-torture/execute/memset-3.c execution,  -O3 -fomit-frame-pointer
-funroll-all-loops -finline-functions 
 FAIL: gcc.c-torture/execute/memset-3.c execution,  -O3 -g 
 unix/-m32/-mtune=pentium-m/-mstringop-strategy=rep_4byte
-FAIL: gcc.c-torture/execute/memset-3.c execution,  -O1 
-FAIL: gcc.c-torture/execute/memset-3.c execution,  -O2 
-FAIL: gcc.c-torture/execute/memset-3.c execution,  -O3 -fomit-frame-pointer 
-FAIL: gcc.c-torture/execute/memset-3.c execution,  -O3 -fomit-frame-pointer
-funroll-loops 
-FAIL: gcc.c-torture/execute/memset-3.c execution,  -O3 -fomit-frame-pointer
-funroll-all-loops -finline-functions 
-FAIL: gcc.c-torture/execute/memset-3.c execution,  -O3 -g 

( FAIL is before+after the patch, -FAIL is before the patch, cured by the
patch).
To reject rep_8byte for -m32 we IMHO want:
--- i386.c.jj4	2008-12-27 10:12:25.0 +0100
+++ i386.c	2009-01-03 11:53:15.0 +0100
@@ -2686,7 +2686,8 @@ override_options (bool main_args_p)
 stringop_alg = libcall;
   else if (!strcmp (ix86_stringop_string, "rep_4byte"))
 stringop_alg = rep_prefix_4_byte;
-  else if (!strcmp (ix86_stringop_string, "rep_8byte"))
+  else if (!strcmp (ix86_stringop_string, "rep_8byte")
+           && TARGET_64BIT)
 stringop_alg = rep_prefix_8_byte;
   else if (!strcmp (ix86_stringop_string, "byte_loop"))
 stringop_alg = loop_1_byte;

and obviously the remaining FAILs need to be investigated.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38708



[Bug target/38708] [4.4 Regression] Revision 137646 caused gcc.c-torture/execute/memset-[23].c fail with -mtune=pentiumpro

2009-01-02 Thread pinskia at gcc dot gnu dot org


--- Comment #4 from pinskia at gcc dot gnu dot org  2009-01-02 22:08 ---
-march=pentiumpro  is enough to reproduce the failure. 


-- 

pinskia at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[4.4 Regression] Revision   |[4.4 Regression] Revision
                   |137646 caused gcc.c-        |137646 caused gcc.c-
                   |torture/execute/memset-     |torture/execute/memset-
                   |[23].c fail with -          |[23].c fail with -
                   |mtune=pentium-m             |mtune=pentiumpro


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38708