[Bug rtl-optimization/33552] [4.3 Regression] wrong code for multiple output asm, wrong df?
--- Comment #23 from matz at gcc dot gnu dot org 2007-09-28 13:32 ---
Subject: Bug 33552

Author: matz
Date: Fri Sep 28 13:31:50 2007
New Revision: 128864

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=128864
Log:
        PR rtl-optimization/33552
        * function.c (match_asm_constraints_1): Check for overlap in
        inputs and replace all occurences.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/function.c

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33552
--- Comment #24 from matz at gcc dot gnu dot org 2007-09-28 13:33 ---
Subject: Bug 33552

Author: matz
Date: Fri Sep 28 13:33:09 2007
New Revision: 128865

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=128865
Log:
        PR rtl-optimization/33552
        * gcc.target/i386/pr33552.c: New runtime test.
        * gcc.target/i386/strinline.c: New compile time test.

Added:
    trunk/gcc/testsuite/gcc.target/i386/pr33552.c
    trunk/gcc/testsuite/gcc.target/i386/strinline.c
Modified:
    trunk/gcc/testsuite/ChangeLog
--- Comment #22 from bonzini at gnu dot org 2007-09-28 10:04 ---
I don't think so, as you don't know which input will match the output (i.e. whether the two inputs will be swapped) if you have a % constraint in the asm.
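To illustrate the point about the % constraint, here is a minimal sketch. The function name `add_commutative` is hypothetical and does not come from the bug report. As I understand the GCC constraint documentation, "%0" ties one of the two commutative inputs to output 0 and leaves the choice to the compiler, so the asm body must give the same result whichever input ends up in the tied register. The inline asm is only used with GCC on x86; other configurations fall back to plain C.

```c
/* Add two values.  With GCC on x86, "%0" marks the first input as
   commutative with the following input and ties one of the two to
   output 0; which one is tied is up to the compiler.  Since addition
   is symmetric, either choice yields a + b.  */
static unsigned int
add_commutative (unsigned int a, unsigned int b)
{
#if defined(__GNUC__) && !defined(__clang__) \
    && (defined(__i386__) || defined(__x86_64__))
  unsigned int sum;
  __asm__ ("addl %2, %0" : "=r" (sum) : "%0" (a), "r" (b) : "cc");
  return sum;
#else
  /* Portable fallback so the sketch builds everywhere.  */
  return a + b;
#endif
}
```

Because the asm template cannot tell which input was tied to the output, any pass that rewrites the matched input has to allow for both possibilities.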
--- Comment #25 from matz at gcc dot gnu dot org 2007-09-28 14:45 ---
Fixed now.

matz at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED
--- Comment #16 from bonzini at gnu dot org 2007-09-27 06:28 ---
I will implement something along the lines that Jakub discussed. In the meantime, could anybody come up with a self-contained execution testcase based on comment #14? Thanks!

bonzini at gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |bonzini at gnu dot org
                   |dot org                     |
             Status|NEW                         |ASSIGNED
   Last reconfirmed|2007-09-25 18:57:01         |2007-09-27 06:28:00
               date|                            |
--- Comment #17 from belyshev at depni dot sinp dot msu dot ru 2007-09-27 08:01 ---
(In reply to comment #16)

something like this (for gcc.target/i386):

/* { dg-do run } */
/* { dg-require-effective-target lp64 } */
/* { dg-options "-O2" } */

unsigned long a [100];

int main (void)
{
  unsigned long v = 0x5UL, c = 13, low, high;
  long j;

  a [0] = 0x2UL;
  a [1] = 0x3UL;
  for (j = 0; j < 100; j++)
    {
      asm ("mulq %3"
           : "=a" (low), "=d" (high)
           : "0" (a [j]), "rm" (v));
      asm ("addq %5,%q1\n\tadcq %3,%q0"
           : "=r" (c), "=r" (a [j])
           : "0" (high), "rme" (0), "1" (low), "rme" (c));
    }
  if (a [0] != 13 || a [1] != 10 || a [2] != 15 || a [3] != 0)
    __builtin_abort ();
  return 0;
}

belyshev at depni dot sinp dot msu dot ru changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |belyshev at depni dot sinp
                   |                            |dot msu dot ru
--- Comment #18 from jakub at gcc dot gnu dot org 2007-09-27 08:12 ---
Sure, no problem.

/* PR rtl-optimization/33552 */
/* { dg-do run } */
/* { dg-options "-O2" } */

extern void abort (void);

void __attribute__((noinline))
foo (unsigned long *wp, unsigned long *up, long un, unsigned long *vp)
{
  long j;
  unsigned long prod_low, prod_high;
  unsigned long cy_dig;
  unsigned long v_limb;
  v_limb = vp[0];
  cy_dig = 64;
  for (j = un; j > 0; j--)
    {
      unsigned long u_limb, w_limb;
      u_limb = *up++;
      __asm__ (""
               : "=r" (prod_low), "=r" (prod_high)
               : "0" (u_limb), "1" (v_limb));
      __asm__ ("mov %5, %1; add %5, %0"
               : "=r" (cy_dig), "=r" (w_limb)
               : "0" (prod_high), "rm" (0), "1" (prod_low), "rm" (cy_dig));
      *wp++ = w_limb;
    }
}

int
main (void)
{
  unsigned long wp[4];
  unsigned long up[4] = { 0x1248, 0x248a, 0x1745, 0x1853 };
  unsigned long vp = 0xdead;
  foo (wp, up, 4, &vp);
  if (wp[0] != 0x40
      || wp[1] != 0xdeed
      || wp[2] != 0x1bd9a
      || wp[3] != 0x29c47)
    abort ();
  return 0;
}

For dg.target/i386, unless we add a bunch of target variants for the only asm arch dependent string in there.
--- Comment #19 from matz at gcc dot gnu dot org 2007-09-27 14:52 ---
Have a patch on http://gcc.gnu.org/ml/gcc-patches/2007-09/msg01968.html .
Fixes also the reload failure on x86 -O2 -fPIC on this testcase (which hits glibc):

/* { dg-do compile } */
/* { dg-options "-O2 -fPIC" } */
typedef unsigned int size_t;
char *
__mempcpy_by2 (char *__dest, __const char *__src, size_t __srclen)
{
  register char *__tmp = __dest;
  register unsigned long int __d0, __d1;
  __asm__ __volatile__
    ("shrl $1,%3\n\t"
     "jz 2f\n"
     "1:\n\t"
     "movl (%2),%0\n\t"
     "leal 4(%2),%2\n\t"
     "movl %0,(%1)\n\t"
     "leal 4(%1),%1\n\t"
     "decl %3\n\t"
     "jnz 1b\n"
     "2:\n\t"
     "movw (%2),%w0\n\t"
     "movw %w0,(%1)"
     : "=&q" (__d0), "=r" (__tmp), "=r" (__src), "=r" (__d1),
       "=m" ( *(struct { __extension__ char __x[__srclen]; } *)__dest)
     : "1" (__tmp), "2" (__src), "3" (__srclen / 2),
       "m" ( *(struct { __extension__ char __x[__srclen]; } *)__src)
     : "cc");
  return __tmp + 2;
}

matz at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |matz at gcc dot gnu dot org
--- Comment #20 from bonzini at gnu dot org 2007-09-27 15:01 ---
Thanks.

bonzini at gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|bonzini at gnu dot org      |unassigned at gcc dot gnu
                   |                            |dot org
             Status|ASSIGNED                    |NEW
--- Comment #21 from ubizjak at gmail dot com 2007-09-27 19:59 ---
(In reply to comment #19)
> Have a patch on http://gcc.gnu.org/ml/gcc-patches/2007-09/msg01968.html .
> Fixes also the reload failure on x86 -O2 -fPIC on this testcase (which hits
> glibc):

Do we need a solution for %. decorations of matched operands? strtoul() will not return 0 for %0 operands, as described in Comment 11.
--- Comment #14 from jakub at gcc dot gnu dot org 2007-09-26 21:19 ---
This isn't related to commutative constraints; it can be reproduced with:

void
mul_basecase (unsigned long *wp, unsigned long *up, long un,
              unsigned long *vp, long vn)
{
  long j;
  unsigned long prod_low, prod_high;
  unsigned long cy_dig;
  unsigned long v_limb;
  v_limb = vp[0];
  cy_dig = 0;
  for (j = un; j > 0; j--)
    {
      unsigned long u_limb, w_limb;
      u_limb = *up++;
      __asm__ ("mulq %3"
               : "=a" (prod_low), "=d" (prod_high)
               : "0" (u_limb), "rm" (v_limb));
      __asm__ ("addq %5,%q1\n\tadcq %3,%q0"
               : "=r" (cy_dig), "=r" (w_limb)
               : "0" (prod_high), "rme" (0), "1" (prod_low), "rme" (cy_dig));
      *wp++ = w_limb;
    }
}

The problem is that match_asm_constraints_1 doesn't check whether the change it wants to make is actually valid. In particular, it must check, but does not, whether the output (whose value it will kill in the new insn prepended before the asm) is among the inputs of the asm. Also, I wonder whether it shouldn't limit any changes to REGs; ATM it will happily copy around MEMs etc., which IMHO is highly undesirable. When the output is present among the inputs (other than the one with the matching constraint), we can either create a new pseudo or simply do nothing.
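The missing validity check can be sketched outside of GCC as a toy model. Everything below is hypothetical: `struct toy_asm` and `output_overlaps_other_input` are illustration names, not GCC internals, and operands are plain integer register numbers rather than RTL. The sketch only shows the condition under which copying the matched input into the output ahead of the asm would destroy a value the asm still reads.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model (not GCC internals): each operand is a pseudo register
   number.  */
struct toy_asm
{
  int output;          /* register written by the output operand */
  size_t matched;      /* index of the input tied to that output */
  const int *inputs;   /* registers read by the input operands */
  size_t n_inputs;
};

/* Return true if the output register is also read by an input other
   than the one carrying the matching constraint.  In that case,
   prepending "output = inputs[matched]" before the asm would clobber
   a value the asm still needs.  */
static bool
output_overlaps_other_input (const struct toy_asm *a)
{
  for (size_t i = 0; i < a->n_inputs; i++)
    if (i != a->matched && a->inputs[i] == a->output)
      return true;
  return false;
}
```

In the second asm of the testcase, the output cy_dig is matched with prod_high but cy_dig is also the last input, so a check of this kind would report an overlap; the two options are then a fresh pseudo or leaving the asm alone.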
--- Comment #15 from jakub at gcc dot gnu dot org 2007-09-26 21:24 ---
The restriction at least not to allow MEM_Ps was posted in:
http://gcc.gnu.org/ml/gcc-patches/2007-07/msg01329.html
but never applied to the trunk. But I believe it should check for REG_P rather than !MEM_P.
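The difference between the two filters can be shown with a toy model. The enum and function names below are hypothetical and only loosely modelled on RTL codes; they are not GCC's REG_P or MEM_P macros. The point is that "not a MEM" still admits operands such as subregs and constants, while "is a REG" admits only plain registers.

```c
#include <stdbool.h>

/* Toy operand kinds, loosely modelled on RTL codes (hypothetical).  */
enum toy_kind { TOY_REG, TOY_MEM, TOY_SUBREG, TOY_CONST };

/* Filter along the lines of the earlier patch: reject only memory.  */
static bool
allowed_not_mem (enum toy_kind k)
{
  return k != TOY_MEM;
}

/* Stricter filter suggested in this comment: accept only registers.  */
static bool
allowed_reg_only (enum toy_kind k)
{
  return k == TOY_REG;
}
```

With these definitions a TOY_SUBREG operand passes the first filter but not the second, which is the kind of case the stricter check would exclude.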
--- Comment #12 from ubizjak at gmail dot com 2007-09-25 18:57 ---
(In reply to comment #1)
> marking %0 early-clobbered fixes the problem.

Please look at comment #7.

Confirmed as a 4.3 regression, something is wrong in match_asm_constraints_1.

ubizjak at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2007-09-25 18:57:01
               date|                            |
            Summary|wrong code for multiple     |[4.3 Regression] wrong code
                   |output asm, wrong df?       |for multiple output asm,
                   |                            |wrong df?
   Target Milestone|---                         |4.3.0
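For reference, an early-clobbered output is written with the "&" modifier. The sketch below is a generic illustration, not the asm from this bug; the function name `sum_via_scratch` is hypothetical. The output is written by the first instruction while the second input is still needed, so "=&r" tells the compiler not to place the output in a register that holds an input. The inline asm is only used on x86 with a GNU-compatible compiler; other configurations fall back to plain C.

```c
/* Compute a + b in two steps.  The first instruction writes the
   output before b has been read, so the output is marked
   early-clobber ("=&r") to keep it out of the registers holding
   the inputs.  */
static unsigned long
sum_via_scratch (unsigned long a, unsigned long b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
  unsigned long out;
  __asm__ ("mov %1, %0\n\t"
           "add %2, %0"
           : "=&r" (out)
           : "r" (a), "r" (b)
           : "cc");
  return out;
#else
  /* Portable fallback so the sketch builds everywhere.  */
  return a + b;
#endif
}
```

Without the "&", the compiler would be free to assign the output and the second input to the same register, and the first instruction would overwrite b before it is added.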
--- Comment #13 from ubizjak at gmail dot com 2007-09-25 19:29 ---
(In reply to comment #12)
> > marking %0 early-clobbered fixes the problem.
>
> Please look at comment #7.

I think I need some sleep. I was thinking of comment #11.