Re: [PATCH] Fix ssa coalescing with inline asm (PR middle-end/70593)
On 04/12/2016 09:06 AM, Jakub Jelinek wrote: On Tue, Apr 12, 2016 at 09:01:58AM -0400, Andrew MacLeod wrote: I poked around for a while and found a patch from late 2012 where I was attempting to implement atomics as a gimple primitive (GIMPLE_ATOMIC). That work did indeed add the ability of the CAS primitive to return 2 values to op_iter_init and friends. I recall there were maybe a half dozen places in the compiler that needed to be updated to deal with 2 results for a gimple statement. It never went anywhere because I ran into a bunch of other problems that frustrated me and led me down the frontend/backend separation path :-P Today, we still just let RTL patterns optimize the conditional result and expected-value address parameter from a compare_exchange... so we simply bypass the 2 results issue for a CAS and push it to the target :-). For the __builtin_*_overflow builtins, we lower them into internal functions that return complex integer that contains both the values we want (result and overflow flag) and then just extract the real and imag part of that. Ugly, but it works without major changes to all optimizers. Jakub I suppose it shouldn't be too hard to abstract it a bit so that we aren't special casing GIMPLE_ASM, but rather check a flag on the statement kind and use a call back to handle the multiple defs that need handling as an exception.. And do it in some way so it doesn't slow down the normal case. I expect everything required is buried somewhere in my patch. Maybe someday when I'm looking for a something smallish and different to do I'll have a look and see if there is anything worth pursuing. It does seem to come up every year or two. Andrew
Re: [PATCH] Fix ssa coalescing with inline asm (PR middle-end/70593)
On Tue, Apr 12, 2016 at 09:01:58AM -0400, Andrew MacLeod wrote: > I poked around for a while and found a patch from late 2012 where I was > attempting to implement atomics as a gimple primitive (GIMPLE_ATOMIC). That > work did indeed add the ability of the CAS primitive to return 2 values to > op_iter_init and friends. I recall there were maybe a half dozen places in > the compiler that needed to be updated to deal with 2 results for a gimple > statement. > > It never went anywhere because I ran into a bunch of other problems that > frustrated me and led me down the frontend/backend separation path :-P > > Today, we still just let RTL patterns optimize the conditional result and > expected-value address parameter from a compare_exchange... so we simply > bypass the 2 results issue for a CAS and push it to the target :-). For the __builtin_*_overflow builtins, we lower them into internal functions that return complex integer that contains both the values we want (result and overflow flag) and then just extract the real and imag part of that. Ugly, but it works without major changes to all optimizers. Jakub
Re: [PATCH] Fix ssa coalescing with inline asm (PR middle-end/70593)
On 04/12/2016 04:35 AM, Richard Biener wrote: On Mon, 11 Apr 2016, Jeff Law wrote: On 04/08/2016 10:10 AM, Jakub Jelinek wrote: On Fri, Apr 08, 2016 at 06:04:38PM +0200, Richard Biener wrote: Hmm, don't we simply want to do this for all stmts (OK, only asm have multiple defs...)? For all stmts that have multiple defs (which is only GIMPLE_ASM right now). I thought the atomic support added support for statements with multiple outputs? Maybe on a branch. Historically all gimples had the ability to have multiple DEFs but this was removed as having excessive memory requirement. Later DEFs handling was simplified and we no longer keep def "operands" explicit but the iterators know how to get at all possible defs. Which makes it "currently only ASMs can have multiple (non-virtual) defs". See ssa-iterators.h:op_iter_init and op_iter_next_def. Richard. I poked around for a while and found a patch from late 2012 where I was attempting to implement atomics as a gimple primitive (GIMPLE_ATOMIC). That work did indeed add the ability of the CAS primitive to return 2 values to op_iter_init and friends. I recall there were maybe a half dozen places in the compiler that needed to be updated to deal with 2 results for a gimple statement. It never went anywhere because I ran into a bunch of other problems that frustrated me and led me down the frontend/backend separation path :-P Today, we still just let RTL patterns optimize the conditional result and expected-value address parameter from a compare_exchange... so we simply bypass the 2 results issue for a CAS and push it to the target :-). Andrew
Re: [PATCH] Fix ssa coalescing with inline asm (PR middle-end/70593)
On Mon, 11 Apr 2016, Jeff Law wrote: > On 04/08/2016 10:10 AM, Jakub Jelinek wrote: > > On Fri, Apr 08, 2016 at 06:04:38PM +0200, Richard Biener wrote: > > > Hmm, don't we simply want to do this for all stmts (OK, only asm have > > > multiple defs...)? > > > > For all stmts that have multiple defs (which is only GIMPLE_ASM right now). > I thought the atomic support added support for statements with multiple > outputs? Maybe on a branch. Historically all gimples had the ability to have multiple DEFs but this was removed as having excessive memory requirement. Later DEFs handling was simplified and we no longer keep def "operands" explicit but the iterators know how to get at all possible defs. Which makes it "currently only ASMs can have multiple (non-virtual) defs". See ssa-iterators.h:op_iter_init and op_iter_next_def. Richard.
Re: [PATCH] Fix ssa coalescing with inline asm (PR middle-end/70593)
On 04/08/2016 10:10 AM, Jakub Jelinek wrote: On Fri, Apr 08, 2016 at 06:04:38PM +0200, Richard Biener wrote: Hmm, don't we simply want to do this for all stmts (OK, only asm have multiple defs...)? For all stmts that have multiple defs (which is only GIMPLE_ASM right now). I thought the atomic support added support for statements with multiple outputs? Jeff
Re: [PATCH] Fix ssa coalescing with inline asm (PR middle-end/70593)
On April 8, 2016 6:10:05 PM GMT+02:00, Jakub Jelinekwrote: >On Fri, Apr 08, 2016 at 06:04:38PM +0200, Richard Biener wrote: >> Hmm, don't we simply want to do this for all stmts (OK, only asm have >multiple defs...)? > >For all stmts that have multiple defs (which is only GIMPLE_ASM right >now). >Though, of course, if you want, unconditionally doing: > bool first = true; > FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_DEF) > if (first) > first = false; > else > live_track_process_use (live, var); >would work too and would be prepared for future hypothetical stmts >with multiple defs. Yes, that's what I am suggesting. The reason why I'm not calling >live_track_process_use >on the first SSA_OP_DEF operand is that it is completely useless, >because the first live_track_process_def will undo it immediately. >For the above, I guess I'd just slightly adjust the comment, instead of > For GIMPLE_ASM as the only statement which can have > more than one SSA_NAME definition, ... >say > For stmts with more than one SSA_NAME definition, ... Sounds good. Richard. > Jakub
Re: [PATCH] Fix ssa coalescing with inline asm (PR middle-end/70593)
On Fri, Apr 08, 2016 at 06:04:38PM +0200, Richard Biener wrote: > Hmm, don't we simply want to do this for all stmts (OK, only asm have > multiple defs...)? For all stmts that have multiple defs (which is only GIMPLE_ASM right now). Though, of course, if you want, unconditionally doing: bool first = true; FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_DEF) if (first) first = false; else live_track_process_use (live, var); would work too and would be prepared for future hypothetical stmts with multiple defs. The reason why I'm not calling live_track_process_use on the first SSA_OP_DEF operand is that it is completely useless, because the first live_track_process_def will undo it immediately. For the above, I guess I'd just slightly adjust the comment, instead of For GIMPLE_ASM as the only statement which can have more than one SSA_NAME definition, ... say For stmts with more than one SSA_NAME definition, ... Jakub
Re: [PATCH] Fix ssa coalescing with inline asm (PR middle-end/70593)
On April 8, 2016 4:54:22 PM GMT+02:00, Jakub Jelinekwrote: >Hi! > >xen is miscompiled since Alex' SSA coalescing changes. >The problem is that we have there inline asm that sets more than one >SSA_NAME, one of them is dead though (has zero uses) and because of >the zero uses coalescing doesn't see any conflicts and puts both >the SSA_NAMEs in the two GIMPLE_ASM outputs into the same partition. >During expansion we then emit the ASM_OPERANDS followed by copying of >the >two outputs into DECL_RTL of their SSA_NAMEs. But as both are in the >same >partition, that is the same pseudo. The first one is what we want, >and the second one is for the zero uses SSA_NAME, which then overwrites >the right one with unrelated value. >(insn 9 6 7 2 (parallel [ >(set (reg:DI 89 [ c ]) >(asm_operands/v:DI ("xorl %k1, %k1 > movl$7, %k0") ("=c") 0 [ >(reg/v:DI 87 [ ]) >(reg/v:DI 87 [ ]) >] > [ >(asm_input:DI ("0") pr70593.c:9) >(asm_input:DI ("1") pr70593.c:9) >] > [] pr70593.c:9)) >(set (reg:DI 90 [ a ]) >(asm_operands/v:DI ("xorl %k1, %k1 > movl$7, %k0") ("=a") 1 [ >(reg/v:DI 87 [ ]) >(reg/v:DI 87 [ ]) >] > [ >(asm_input:DI ("0") pr70593.c:9) >(asm_input:DI ("1") pr70593.c:9) >] > [] pr70593.c:9)) >(clobber (mem:BLK (scratch) [0 A8])) >(clobber (reg:CCFP 18 fpsr)) >(clobber (reg:CC 17 flags)) >]) pr70593.c:9 -1 > (nil)) >(insn 7 9 8 2 (set (reg/v:DI 87 [ ]) >(reg:DI 89 [ c ])) pr70593.c:9 -1 > (nil)) >(insn 8 7 13 2 (set (reg/v:DI 87 [ ]) >(reg:DI 90 [ a ])) pr70593.c:9 -1 > (nil)) >The following patch handles this by making sure we record conflicts >between multiple SSA_NAME outputs of the GIMPLE_ASM, even when some of >them >are not live, by emulating what really happens during expansion - that >they are live in the moves after the asm insn. The >live_track_process_def calls later on remove it again from the live >partitions and thus undo the live_track_process_use effect, so all >the patch changes is that during the processing of e.g. the first >SSA_NAME output it sees the second (and perhaps third ...) SSA_NAME >output >as live and adds conflict, similarly when processing the second one >if there are more than two, it again sees the third one and adds >conflict >etc. > >Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Hmm, don't we simply want to do this for all stmts (OK, only asm have multiple defs...)? Otherwise OK.thanks, Richard. >2016-04-08 Jakub Jelinek > > PR middle-end/70593 > * tree-ssa-coalesce.c (build_ssa_conflict_graph): For GIMPLE_ASM > with multiple SSA_NAME defs, force the outputs other than first > to be live before calling live_track_process_def on each output. > > * gcc.target/i386/pr70593.c: New test. > >--- gcc/tree-ssa-coalesce.c.jj 2016-03-30 16:00:17.0 +0200 >+++ gcc/tree-ssa-coalesce.c2016-04-08 12:36:26.409403204 +0200 >@@ -905,6 +905,27 @@ build_ssa_conflict_graph (tree_live_info > } > else if (is_gimple_debug (stmt)) > continue; >+else if (gimple_code (stmt) == GIMPLE_ASM >+ && gimple_asm_noutputs (as_a (stmt)) > 1) >+ { >+/* For GIMPLE_ASM as the only statement which can have >+ more than one SSA_NAME definition, pretend all the >+ SSA_NAME outputs but the first one are live at this point, >+ so that conflicts are added in between all those even >+ when they are actually not really live after the asm, >+ because expansion might copy those into pseudos after >+ the asm and if multiple outputs share the same partition, >+ it might overwrite those that should be live. E.g. >+ asm volatile (".." : "=r" (a) : "=r" (b) : "0" (a), "1" (a)); >+ return a; >+ See PR70593. */ >+bool first = true; >+FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_DEF) >+ if (first) >+first = false; >+ else >+live_track_process_use (live, var); >+ } > > FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_DEF) > live_track_process_def (live, var, graph); >--- gcc/testsuite/gcc.target/i386/pr70593.c.jj 2016-04-08 >12:59:27.352278471 +0200 >+++ gcc/testsuite/gcc.target/i386/pr70593.c2016-04-08 >12:48:27.0 +0200 >@@ -0,0 +1,19 @@ >+/* PR middle-end/70593 */ >+/* { dg-do run }
[PATCH] Fix ssa coalescing with inline asm (PR middle-end/70593)
Hi! xen is miscompiled since Alex' SSA coalescing changes. The problem is that we have there inline asm that sets more than one SSA_NAME, one of them is dead though (has zero uses) and because of the zero uses coalescing doesn't see any conflicts and puts both the SSA_NAMEs in the two GIMPLE_ASM outputs into the same partition. During expansion we then emit the ASM_OPERANDS followed by copying of the two outputs into DECL_RTL of their SSA_NAMEs. But as both are in the same partition, that is the same pseudo. The first one is what we want, and the second one is for the zero uses SSA_NAME, which then overwrites the right one with unrelated value. (insn 9 6 7 2 (parallel [ (set (reg:DI 89 [ c ]) (asm_operands/v:DI ("xorl %k1, %k1 movl$7, %k0") ("=c") 0 [ (reg/v:DI 87 [ ]) (reg/v:DI 87 [ ]) ] [ (asm_input:DI ("0") pr70593.c:9) (asm_input:DI ("1") pr70593.c:9) ] [] pr70593.c:9)) (set (reg:DI 90 [ a ]) (asm_operands/v:DI ("xorl %k1, %k1 movl$7, %k0") ("=a") 1 [ (reg/v:DI 87 [ ]) (reg/v:DI 87 [ ]) ] [ (asm_input:DI ("0") pr70593.c:9) (asm_input:DI ("1") pr70593.c:9) ] [] pr70593.c:9)) (clobber (mem:BLK (scratch) [0 A8])) (clobber (reg:CCFP 18 fpsr)) (clobber (reg:CC 17 flags)) ]) pr70593.c:9 -1 (nil)) (insn 7 9 8 2 (set (reg/v:DI 87 [ ]) (reg:DI 89 [ c ])) pr70593.c:9 -1 (nil)) (insn 8 7 13 2 (set (reg/v:DI 87 [ ]) (reg:DI 90 [ a ])) pr70593.c:9 -1 (nil)) The following patch handles this by making sure we record conflicts between multiple SSA_NAME outputs of the GIMPLE_ASM, even when some of them are not live, by emulating what really happens during expansion - that they are live in the moves after the asm insn. The live_track_process_def calls later on remove it again from the live partitions and thus undo the live_track_process_use effect, so all the patch changes is that during the processing of e.g. the first SSA_NAME output it sees the second (and perhaps third ...) SSA_NAME output as live and adds conflict, similarly when processing the second one if there are more than two, it again sees the third one and adds conflict etc. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2016-04-08 Jakub JelinekPR middle-end/70593 * tree-ssa-coalesce.c (build_ssa_conflict_graph): For GIMPLE_ASM with multiple SSA_NAME defs, force the outputs other than first to be live before calling live_track_process_def on each output. * gcc.target/i386/pr70593.c: New test. --- gcc/tree-ssa-coalesce.c.jj 2016-03-30 16:00:17.0 +0200 +++ gcc/tree-ssa-coalesce.c 2016-04-08 12:36:26.409403204 +0200 @@ -905,6 +905,27 @@ build_ssa_conflict_graph (tree_live_info } else if (is_gimple_debug (stmt)) continue; + else if (gimple_code (stmt) == GIMPLE_ASM + && gimple_asm_noutputs (as_a (stmt)) > 1) + { + /* For GIMPLE_ASM as the only statement which can have +more than one SSA_NAME definition, pretend all the +SSA_NAME outputs but the first one are live at this point, +so that conflicts are added in between all those even +when they are actually not really live after the asm, +because expansion might copy those into pseudos after +the asm and if multiple outputs share the same partition, +it might overwrite those that should be live. E.g. +asm volatile (".." : "=r" (a) : "=r" (b) : "0" (a), "1" (a)); +return a; +See PR70593. */ + bool first = true; + FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_DEF) + if (first) + first = false; + else + live_track_process_use (live, var); + } FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_DEF) live_track_process_def (live, var, graph); --- gcc/testsuite/gcc.target/i386/pr70593.c.jj 2016-04-08 12:59:27.352278471 +0200 +++ gcc/testsuite/gcc.target/i386/pr70593.c 2016-04-08 12:48:27.0 +0200 @@ -0,0 +1,19 @@ +/* PR middle-end/70593 */ +/* { dg-do run } */ +/* { dg-options "-O2" } */ + +__attribute__((noinline, noclone)) unsigned long +foo (unsigned x) +{ + unsigned long a, c = x; + asm volatile ("xorl\t%k1, %k1\n\tmovl\t$7, %k0" : "=c" (c), "=a" (a) : "0" (c), "1" (c) : "memory"); + return c; +} + +int +main ()