Hello!

Testing with -march=corei7 -mtune=intel -m32 uncovered a problem in the decide_alg subroutine through the i386/memcpy-[23] testcases [1]. Although the maximum size of the transfer was known, the memcpy was not inlined as the testcases expect.
The core of the problem can be seen in the definition of the 32-bit intel_memcpy stringop algorithm:

  {libcall, {{11, loop, false}, {-1, rep_prefix_4_byte, false}}},

Note that the last algorithm sets its maximum size to -1, i.e. "unlimited". However, in decide_alg the same value also signals that no algorithm has set a size, so expected_size is never calculated.

The loop that sets the maximal size for a user-defined algorithm assumes that "-1" belongs exclusively to libcall, which is not the case in the intel_memcpy definition above:

      if (candidate != libcall && candidate && usable)
        max = algs->size[i].max;

When the last non-libcall algorithm sets its maximum to "-1" (aka "unlimited"), this value fails the following test:

  if (max > 1 && (unsigned HOST_WIDE_INT) max >= max_size

and expected_size is never calculated.

The attached patch fixes this oversight, so that "-1" means unlimited size and "0" means that the size was never set. The patch also takes these two special values into account when choosing a maximum size for the dynamic check.

2014-06-02  Uros Bizjak  <ubiz...@gmail.com>

	* config/i386/i386.c (decide_alg): Correctly handle maximum size
	of stringop algorithm.

The patch was bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}, also with RUNTESTFLAGS="--target_board=unix/-march=corei7/-mtune=intel\{,-m32\}", where it fixes both memcpy failures from [1].

[1] https://gcc.gnu.org/ml/gcc-testresults/2014-06/msg00127.html

Jan, can you please review the patch, to check if the logic is OK?

Uros.
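To make the sentinel collision described in the message concrete, here is a minimal, self-contained sketch. It is not the code from config/i386/i386.c: the `sketch_*` names, the simplified table layout, and the use of `long` in place of HOST_WIDE_INT are all assumptions made for illustration. It only shows why a scan result of -1 is ambiguous under the old convention and how keeping "0 = never set" apart from "-1 = unlimited" resolves it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the stringop tables;
   these names do not exist in GCC.  */
enum sketch_alg { SK_NONE = 0, SK_LIBCALL, SK_LOOP, SK_REP_PREFIX_4_BYTE };

struct sketch_entry
{
  long max;             /* -1 means "unlimited" */
  enum sketch_alg alg;
};

/* Mirrors the 32-bit intel_memcpy entries quoted in the message.  */
const struct sketch_entry sketch_intel_memcpy[2] = {
  { 11, SK_LOOP },
  { -1, SK_REP_PREFIX_4_BYTE },
};

/* Return the max of the last non-libcall entry, or NEVER_SET when the
   table has no such entry.  Passing -1 as NEVER_SET reproduces the old
   ambiguity; passing 0 keeps the two meanings apart.  */
long
sketch_scan_max (const struct sketch_entry *e, size_t n, long never_set)
{
  long max = never_set;
  assert (e != NULL);
  for (size_t i = 0; i < n; i++)
    if (e[i].alg != SK_LIBCALL && e[i].alg != SK_NONE)
      max = e[i].max;
  return max;
}

/* The test quoted in the message: an unlimited max (-1) is rejected
   by "max > 1", so expected_size would never be estimated.  */
int
sketch_old_test (long max, unsigned long max_size)
{
  return max > 1 && (unsigned long) max >= max_size;
}

/* One way to express the new convention: -1 is unlimited and always
   covers max_size, 0 means no algorithm set a size.  */
int
sketch_new_test (long max, unsigned long max_size)
{
  return max == -1 || (max > 1 && (unsigned long) max >= max_size);
}
```

With the table above, `sketch_scan_max (sketch_intel_memcpy, 2, 0)` returns -1; `sketch_old_test` rejects that value for any max_size, while `sketch_new_test` accepts it and still rejects 0 ("never set").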
Index: fuse-caller-save.c
===================================================================
--- fuse-caller-save.c	(revision 211112)
+++ fuse-caller-save.c	(working copy)
@@ -1,5 +1,7 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fuse-caller-save" } */
+/* { dg-additional-options "-mregparm=1" { target ia32 } } */
+
 /* Testing -fuse-caller-save optimization option.  */
 
 static int __attribute__((noinline))
Index: sibcall-1.c
===================================================================
--- sibcall-1.c	(revision 211112)
+++ sibcall-1.c	(working copy)
@@ -1,5 +1,4 @@
-/* { dg-do compile } */
-/* { dg-require-effective-target ia32 } */
+/* { dg-do compile { target ia32 } } */
 /* { dg-options "-O2" } */
 
 extern int (*foo)(int);
Index: sibcall-2.c
===================================================================
--- sibcall-2.c	(revision 211118)
+++ sibcall-2.c	(working copy)
@@ -1,5 +1,4 @@
-/* { dg-do compile { xfail { *-*-* } } } */
-/* { dg-require-effective-target ia32 } */
+/* { dg-do compile { target ia32 } } */
 /* { dg-options "-O2" } */
 
 extern int doo1 (int);
@@ -13,4 +12,4 @@
   return (a < 0 ? doo1 : doo2) (a);
 }
 
-/* { dg-final { scan-assembler-not "call\[ \t\]*.%eax" } } */
+/* { dg-final { scan-assembler-not "call\[ \t\]*.%eax" { xfail *-*-* } } } */
Index: sibcall-3.c
===================================================================
--- sibcall-3.c	(revision 211118)
+++ sibcall-3.c	(working copy)
@@ -1,5 +1,4 @@
-/* { dg-do compile } */
-/* { dg-require-effective-target ia32 } */
+/* { dg-do compile { target ia32 } } */
 /* { dg-options "-O2" } */
 
 extern
Index: sibcall-4.c
===================================================================
--- sibcall-4.c	(revision 211118)
+++ sibcall-4.c	(working copy)
@@ -1,6 +1,5 @@
 /* Testcase for PR target/46219.  */
-/* { dg-do compile { xfail { *-*-* } } } */
-/* { dg-require-effective-target ia32 } */
+/* { dg-do compile { target ia32 } } */
 /* { dg-options "-O2" } */
 
 typedef void (*dispatch_t)(long offset);
@@ -12,4 +11,4 @@
   dispatch[offset](offset);
 }
 
-/* { dg-final { scan-assembler-not "jmp\[ \t\]*.%eax" } } */
+/* { dg-final { scan-assembler-not "jmp\[ \t\]*.%eax" { xfail *-*-* } } } */
Index: sibcall-5.c
===================================================================
--- sibcall-5.c	(revision 211112)
+++ sibcall-5.c	(working copy)
@@ -1,6 +1,5 @@
 /* Check that indirect sibcalls understand regparm.  */
-/* { dg-do run } */
-/* { dg-require-effective-target ia32 } */
+/* { dg-do run { target ia32 } } */
 /* { dg-options "-O2" } */
 
 extern void abort (void);
Index: sibcall-6.c
===================================================================
--- sibcall-6.c	(revision 211118)
+++ sibcall-6.c	(working copy)
@@ -1,5 +1,4 @@
-/* { dg-do compile } */
-/* { dg-require-effective-target ia32 } */
+/* { dg-do compile { target ia32 } } */
 /* { dg-options "-O2" } */
 
 typedef void *ira_loop_tree_node_t;