Re: RFA: one more version of the patch for PR61360
Uros Bizjak writes: > On Sat, Oct 4, 2014 at 12:49 PM, Richard Sandiford > wrote: >> Uros Bizjak writes: >>> On Thu, Oct 2, 2014 at 10:13 PM, Vladimir Makarov >>> wrote: >>> >>> I guess we achieved the consensus about the following patch to fix >>> PR61360 >>> >>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61360 >>> >>> The patch was successfully bootstrapped and tested (w/wo >>> -march=amdfam10) on x86/x86-64. >>> >>> Is it ok to commit to trunk? >> >> >> I've tested your patch and unfortunately it doesn't work: >> >> In file included from >> /var/tmp/moz-build-dir/js/src/shell/Unified_cpp_js_src_shell0.cpp:15:0: >> /var/tmp/mozilla-central/js/src/shell/js.cpp: In function ‘void >> Process(JSContext*, JSObject*, const char*, bool)’: >> /var/tmp/mozilla-central/js/src/shell/js.cpp:592:1: internal compiler >> error: in lra_update_insn_recog_data, at lra.c:1221 >> } >> ^ >> 0xa9d9ec lra_update_insn_recog_data(rtx_insn*) >> ../../gcc/gcc/lra.c:1220 >> 0xab450f eliminate_regs_in_insn >> ../../gcc/gcc/lra-eliminations.c:1077 >> 0xab450f process_insn_for_elimination >> ../../gcc/gcc/lra-eliminations.c:1344 >> 0xab450f lra_eliminate(bool, bool) >> ../../gcc/gcc/lra-eliminations.c:1408 >> 0xa9f2da lra(_IO_FILE*) >> ../../gcc/gcc/lra.c:2270 >> 0xa5d659 do_reload >> ../../gcc/gcc/ira.c:5311 >> 0xa5d659 execute >> ../../gcc/gcc/ira.c:5470 > > > Testcase is attached: > > % g++ -c -march=amdfam10 -w -O2 js.ii > js.ii: In function ‘void RunFile(C)’: > js.ii:64:1: internal compiler error: in lra_update_insn_recog_data, at > lra.c:1221 > Thanks for reporting this, Marcus. The problem now is in optimize_function_for_size_p. It is false, when we define and cache enable attributes at early stages (instantation of virtual regs) and true later. It is returning us to the same problem. I believe that we should not have enable attributes depending on optimization options or on the current running pass if we don't want the current solution in the trunk (with recog_init). Setting right value for optimize_function_for_size_p does not solve the problem as we can have different options for different functions in the same compilation file. So minimal solution would be removing optimize_function_for_size_p from the attribute definition. But I guess we could remove all condition. Unfortunately, Ganesh did not post is it really beneficial to switch off this alternative for AMD CPUs even if AMD optimization guide recommends it. >>> >>> I propose to split the pattern into size-optimized and normal pattern. >>> The patch implements this proposal. >> >> An alternative would be to add two new enabled-like attributes, >> "good_for_size" and "good_for_speed", that say whether it is efficient >> to use a particular alternative. These attributes wouldn't ever stop >> an existing instruction from being recognised; they would just say >> whether the RA and optimisers should consider the alternative when >> optimising for size or speed respectively. The attributes would have >> the same restrictions as the enabled attribute and could be cached in >> the same way. >> >> The "enabled" attribute would then be purely about whether an >> instruction is available, not whether it's efficient in a particular >> situation. >> >> The main advantage is that it would be possible to make the size/speed >> choice at a basic block level rather than a function level. In the worst >> case we might move an instruction from a hot block to a cold block or >> vice versa, but with intelligent spilling that shouldn't happen too >> often and at least it wouldn't trigger an ICE. >> >> If that sounds OK, I'll try to get something together next week. >> I think Ramana said he had a use for this on ARM too. > > I think that this would be way better than duplication of patterns. > Perhaps we can name these attributes "enable_for_size" and > "enable_for_speed", and have them in addition to "enable" attribute. > The final enable condition would be an union of "enable", > "enable_for_speed" and "enable_for_size" attributes. I was trying to avoid having "enable" in the names because the new attributes would only be strong optimisation hints whereas "enabled" is a correctness thing. We'd need to ignore the new attributes when constraining an existing instruction (which might have been optimised the opposite way and then moved) but we can never ignore "enabled". E.g. to take two examples, when constrain_operands is used like this: if (! constrain_operands (1)) fatal_insn_not_found (insn); then only the enabled attribute matters, whereas for validate_change I think we'd want to take the new attributes into account as well. "good_
Re: RFA: one more version of the patch for PR61360
On Sat, Oct 4, 2014 at 12:49 PM, Richard Sandiford wrote: > Uros Bizjak writes: >> On Thu, Oct 2, 2014 at 10:13 PM, Vladimir Makarov >> wrote: >> >> I guess we achieved the consensus about the following patch to fix >> PR61360 >> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61360 >> >> The patch was successfully bootstrapped and tested (w/wo >> -march=amdfam10) on x86/x86-64. >> >> Is it ok to commit to trunk? > > > I've tested your patch and unfortunately it doesn't work: > > In file included from > /var/tmp/moz-build-dir/js/src/shell/Unified_cpp_js_src_shell0.cpp:15:0: > /var/tmp/mozilla-central/js/src/shell/js.cpp: In function ‘void > Process(JSContext*, JSObject*, const char*, bool)’: > /var/tmp/mozilla-central/js/src/shell/js.cpp:592:1: internal compiler > error: in lra_update_insn_recog_data, at lra.c:1221 > } > ^ > 0xa9d9ec lra_update_insn_recog_data(rtx_insn*) > ../../gcc/gcc/lra.c:1220 > 0xab450f eliminate_regs_in_insn > ../../gcc/gcc/lra-eliminations.c:1077 > 0xab450f process_insn_for_elimination > ../../gcc/gcc/lra-eliminations.c:1344 > 0xab450f lra_eliminate(bool, bool) > ../../gcc/gcc/lra-eliminations.c:1408 > 0xa9f2da lra(_IO_FILE*) > ../../gcc/gcc/lra.c:2270 > 0xa5d659 do_reload > ../../gcc/gcc/ira.c:5311 > 0xa5d659 execute > ../../gcc/gcc/ira.c:5470 Testcase is attached: % g++ -c -march=amdfam10 -w -O2 js.ii js.ii: In function ‘void RunFile(C)’: js.ii:64:1: internal compiler error: in lra_update_insn_recog_data, at lra.c:1221 >>> >>> Thanks for reporting this, Marcus. The problem now is in >>> optimize_function_for_size_p. It is false, when we define and cache enable >>> attributes at early stages (instantation of virtual regs) and true later. >>> >>> It is returning us to the same problem. I believe that we should not have >>> enable attributes depending on optimization options or on the current >>> running pass if we don't want the current solution in the trunk (with >>> recog_init). Setting right value for optimize_function_for_size_p does not >>> solve the problem as we can have different options for different functions >>> in the same compilation file. >>> >>> So minimal solution would be removing optimize_function_for_size_p from the >>> attribute definition. But I guess we could remove all condition. >>> Unfortunately, Ganesh did not post is it really beneficial to switch off >>> this alternative for AMD CPUs even if AMD optimization guide recommends it. >> >> I propose to split the pattern into size-optimized and normal pattern. >> The patch implements this proposal. > > An alternative would be to add two new enabled-like attributes, > "good_for_size" and "good_for_speed", that say whether it is efficient > to use a particular alternative. These attributes wouldn't ever stop > an existing instruction from being recognised; they would just say > whether the RA and optimisers should consider the alternative when > optimising for size or speed respectively. The attributes would have > the same restrictions as the enabled attribute and could be cached in > the same way. > > The "enabled" attribute would then be purely about whether an > instruction is available, not whether it's efficient in a particular > situation. > > The main advantage is that it would be possible to make the size/speed > choice at a basic block level rather than a function level. In the worst > case we might move an instruction from a hot block to a cold block or > vice versa, but with intelligent spilling that shouldn't happen too > often and at least it wouldn't trigger an ICE. > > If that sounds OK, I'll try to get something together next week. > I think Ramana said he had a use for this on ARM too. I think that this would be way better than duplication of patterns. Perhaps we can name these attributes "enable_for_size" and "enable_for_speed", and have them in addition to "enable" attribute. The final enable condition would be an union of "enable", "enable_for_speed" and "enable_for_size" attributes. Uros.
Re: RFA: one more version of the patch for PR61360
Uros Bizjak writes: > On Thu, Oct 2, 2014 at 10:13 PM, Vladimir Makarov wrote: > > I guess we achieved the consensus about the following patch to fix > PR61360 > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61360 > > The patch was successfully bootstrapped and tested (w/wo > -march=amdfam10) on x86/x86-64. > > Is it ok to commit to trunk? I've tested your patch and unfortunately it doesn't work: In file included from /var/tmp/moz-build-dir/js/src/shell/Unified_cpp_js_src_shell0.cpp:15:0: /var/tmp/mozilla-central/js/src/shell/js.cpp: In function ‘void Process(JSContext*, JSObject*, const char*, bool)’: /var/tmp/mozilla-central/js/src/shell/js.cpp:592:1: internal compiler error: in lra_update_insn_recog_data, at lra.c:1221 } ^ 0xa9d9ec lra_update_insn_recog_data(rtx_insn*) ../../gcc/gcc/lra.c:1220 0xab450f eliminate_regs_in_insn ../../gcc/gcc/lra-eliminations.c:1077 0xab450f process_insn_for_elimination ../../gcc/gcc/lra-eliminations.c:1344 0xab450f lra_eliminate(bool, bool) ../../gcc/gcc/lra-eliminations.c:1408 0xa9f2da lra(_IO_FILE*) ../../gcc/gcc/lra.c:2270 0xa5d659 do_reload ../../gcc/gcc/ira.c:5311 0xa5d659 execute ../../gcc/gcc/ira.c:5470 >>> >>> >>> Testcase is attached: >>> >>> % g++ -c -march=amdfam10 -w -O2 js.ii >>> js.ii: In function ‘void RunFile(C)’: >>> js.ii:64:1: internal compiler error: in lra_update_insn_recog_data, at >>> lra.c:1221 >>> >> >> Thanks for reporting this, Marcus. The problem now is in >> optimize_function_for_size_p. It is false, when we define and cache enable >> attributes at early stages (instantation of virtual regs) and true later. >> >> It is returning us to the same problem. I believe that we should not have >> enable attributes depending on optimization options or on the current >> running pass if we don't want the current solution in the trunk (with >> recog_init). Setting right value for optimize_function_for_size_p does not >> solve the problem as we can have different options for different functions >> in the same compilation file. >> >> So minimal solution would be removing optimize_function_for_size_p from the >> attribute definition. But I guess we could remove all condition. >> Unfortunately, Ganesh did not post is it really beneficial to switch off >> this alternative for AMD CPUs even if AMD optimization guide recommends it. > > I propose to split the pattern into size-optimized and normal pattern. > The patch implements this proposal. An alternative would be to add two new enabled-like attributes, "good_for_size" and "good_for_speed", that say whether it is efficient to use a particular alternative. These attributes wouldn't ever stop an existing instruction from being recognised; they would just say whether the RA and optimisers should consider the alternative when optimising for size or speed respectively. The attributes would have the same restrictions as the enabled attribute and could be cached in the same way. The "enabled" attribute would then be purely about whether an instruction is available, not whether it's efficient in a particular situation. The main advantage is that it would be possible to make the size/speed choice at a basic block level rather than a function level. In the worst case we might move an instruction from a hot block to a cold block or vice versa, but with intelligent spilling that shouldn't happen too often and at least it wouldn't trigger an ICE. If that sounds OK, I'll try to get something together next week. I think Ramana said he had a use for this on ARM too. Thanks, Richard
Re: RFA: one more version of the patch for PR61360
On Thu, Oct 2, 2014 at 10:13 PM, Vladimir Makarov wrote: I guess we achieved the consensus about the following patch to fix PR61360 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61360 The patch was successfully bootstrapped and tested (w/wo -march=amdfam10) on x86/x86-64. Is it ok to commit to trunk? >>> >>> >>> I've tested your patch and unfortunately it doesn't work: >>> >>> In file included from >>> /var/tmp/moz-build-dir/js/src/shell/Unified_cpp_js_src_shell0.cpp:15:0: >>> /var/tmp/mozilla-central/js/src/shell/js.cpp: In function ‘void >>> Process(JSContext*, JSObject*, const char*, bool)’: >>> /var/tmp/mozilla-central/js/src/shell/js.cpp:592:1: internal compiler >>> error: in lra_update_insn_recog_data, at lra.c:1221 >>> } >>> ^ >>> 0xa9d9ec lra_update_insn_recog_data(rtx_insn*) >>> ../../gcc/gcc/lra.c:1220 >>> 0xab450f eliminate_regs_in_insn >>> ../../gcc/gcc/lra-eliminations.c:1077 >>> 0xab450f process_insn_for_elimination >>> ../../gcc/gcc/lra-eliminations.c:1344 >>> 0xab450f lra_eliminate(bool, bool) >>> ../../gcc/gcc/lra-eliminations.c:1408 >>> 0xa9f2da lra(_IO_FILE*) >>> ../../gcc/gcc/lra.c:2270 >>> 0xa5d659 do_reload >>> ../../gcc/gcc/ira.c:5311 >>> 0xa5d659 execute >>> ../../gcc/gcc/ira.c:5470 >> >> >> Testcase is attached: >> >> % g++ -c -march=amdfam10 -w -O2 js.ii >> js.ii: In function ‘void RunFile(C)’: >> js.ii:64:1: internal compiler error: in lra_update_insn_recog_data, at >> lra.c:1221 >> > > Thanks for reporting this, Marcus. The problem now is in > optimize_function_for_size_p. It is false, when we define and cache enable > attributes at early stages (instantation of virtual regs) and true later. > > It is returning us to the same problem. I believe that we should not have > enable attributes depending on optimization options or on the current > running pass if we don't want the current solution in the trunk (with > recog_init). Setting right value for optimize_function_for_size_p does not > solve the problem as we can have different options for different functions > in the same compilation file. > > So minimal solution would be removing optimize_function_for_size_p from the > attribute definition. But I guess we could remove all condition. > Unfortunately, Ganesh did not post is it really beneficial to switch off > this alternative for AMD CPUs even if AMD optimization guide recommends it. I propose to split the pattern into size-optimized and normal pattern. The patch implements this proposal. Uros. Index: i386.md === --- i386.md (revision 215797) +++ i386.md (working copy) @@ -4766,6 +4766,38 @@ } }) +(define_insn "*float2_sse_size" + [(set (match_operand:MODEF 0 "register_operand" "=f,x,x") + (float:MODEF + (match_operand:SWI48 1 "nonimmediate_operand" "m,r,m")))] + "SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH + && optimize_function_for_size_p (cfun)" + "@ + fild%Z1\t%1 + %vcvtsi2\t{%1, %d0|%d0, %1} + %vcvtsi2\t{%1, %d0|%d0, %1}" + [(set_attr "type" "fmov,sseicvt,sseicvt") + (set_attr "prefix" "orig,maybe_vex,maybe_vex") + (set_attr "mode" "") + (set (attr "prefix_rex") + (if_then_else + (and (eq_attr "prefix" "maybe_vex") + (match_test "mode == DImode")) + (const_string "1") + (const_string "*"))) + (set_attr "unit" "i387,*,*") + (set_attr "athlon_decode" "*,double,direct") + (set_attr "amdfam10_decode" "*,vector,double") + (set_attr "bdver1_decode" "*,double,direct") + (set_attr "fp_int_src" "true") + (set (attr "enabled") + (cond [(eq_attr "alternative" "0") + (symbol_ref "TARGET_MIX_SSE_I387 + && X87_ENABLE_FLOAT (mode, +mode)") + ] + (const_int 1)))]) + (define_insn "*float2_sse" [(set (match_operand:MODEF 0 "register_operand" "=f,x,x") (float:MODEF @@ -4795,16 +4827,9 @@ && X87_ENABLE_FLOAT (mode, mode)") (eq_attr "alternative" "1") - /* ??? For sched1 we need constrain_operands to be able to - select an alternative. Leave this enabled before RA. */ - (symbol_ref "TARGET_INTER_UNIT_CONVERSIONS - || optimize_function_for_size_p (cfun) - || !(reload_completed -|| reload_in_progress -|| lra_in_progress)") + (symbol_ref "TARGET_INTER_UNIT_CONVERSIONS") ] - (symbol_ref "true"))) - ]) + (const_int 1)))]) (define_insn "*float2_i387" [(set (match_operand:MODEF 0 "register_operand" "=f")
Re: RFA: one more version of the patch for PR61360
On 2014-10-02 7:31 AM, Markus Trippelsdorf wrote: On 2014.10.02 at 09:17 +0200, Markus Trippelsdorf wrote: On 2014.09.26 at 16:31 -0400, Vladimir Makarov wrote: I guess we achieved the consensus about the following patch to fix PR61360 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61360 The patch was successfully bootstrapped and tested (w/wo -march=amdfam10) on x86/x86-64. Is it ok to commit to trunk? I've tested your patch and unfortunately it doesn't work: In file included from /var/tmp/moz-build-dir/js/src/shell/Unified_cpp_js_src_shell0.cpp:15:0: /var/tmp/mozilla-central/js/src/shell/js.cpp: In function ‘void Process(JSContext*, JSObject*, const char*, bool)’: /var/tmp/mozilla-central/js/src/shell/js.cpp:592:1: internal compiler error: in lra_update_insn_recog_data, at lra.c:1221 } ^ 0xa9d9ec lra_update_insn_recog_data(rtx_insn*) ../../gcc/gcc/lra.c:1220 0xab450f eliminate_regs_in_insn ../../gcc/gcc/lra-eliminations.c:1077 0xab450f process_insn_for_elimination ../../gcc/gcc/lra-eliminations.c:1344 0xab450f lra_eliminate(bool, bool) ../../gcc/gcc/lra-eliminations.c:1408 0xa9f2da lra(_IO_FILE*) ../../gcc/gcc/lra.c:2270 0xa5d659 do_reload ../../gcc/gcc/ira.c:5311 0xa5d659 execute ../../gcc/gcc/ira.c:5470 Testcase is attached: % g++ -c -march=amdfam10 -w -O2 js.ii js.ii: In function ‘void RunFile(C)’: js.ii:64:1: internal compiler error: in lra_update_insn_recog_data, at lra.c:1221 Thanks for reporting this, Marcus. The problem now is in optimize_function_for_size_p. It is false, when we define and cache enable attributes at early stages (instantation of virtual regs) and true later. It is returning us to the same problem. I believe that we should not have enable attributes depending on optimization options or on the current running pass if we don't want the current solution in the trunk (with recog_init). Setting right value for optimize_function_for_size_p does not solve the problem as we can have different options for different functions in the same compilation file. So minimal solution would be removing optimize_function_for_size_p from the attribute definition. But I guess we could remove all condition. Unfortunately, Ganesh did not post is it really beneficial to switch off this alternative for AMD CPUs even if AMD optimization guide recommends it.
Re: RFA: one more version of the patch for PR61360
On 2014.10.02 at 09:17 +0200, Markus Trippelsdorf wrote: > On 2014.09.26 at 16:31 -0400, Vladimir Makarov wrote: > > I guess we achieved the consensus about the following patch to fix PR61360 > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61360 > > > > The patch was successfully bootstrapped and tested (w/wo > > -march=amdfam10) on x86/x86-64. > > > > Is it ok to commit to trunk? > > I've tested your patch and unfortunately it doesn't work: > > In file included from > /var/tmp/moz-build-dir/js/src/shell/Unified_cpp_js_src_shell0.cpp:15:0: > /var/tmp/mozilla-central/js/src/shell/js.cpp: In function ‘void > Process(JSContext*, JSObject*, const char*, bool)’: > /var/tmp/mozilla-central/js/src/shell/js.cpp:592:1: internal compiler error: > in lra_update_insn_recog_data, at lra.c:1221 > } > ^ > 0xa9d9ec lra_update_insn_recog_data(rtx_insn*) > ../../gcc/gcc/lra.c:1220 > 0xab450f eliminate_regs_in_insn > ../../gcc/gcc/lra-eliminations.c:1077 > 0xab450f process_insn_for_elimination > ../../gcc/gcc/lra-eliminations.c:1344 > 0xab450f lra_eliminate(bool, bool) > ../../gcc/gcc/lra-eliminations.c:1408 > 0xa9f2da lra(_IO_FILE*) > ../../gcc/gcc/lra.c:2270 > 0xa5d659 do_reload > ../../gcc/gcc/ira.c:5311 > 0xa5d659 execute > ../../gcc/gcc/ira.c:5470 Testcase is attached: % g++ -c -march=amdfam10 -w -O2 js.ii js.ii: In function ‘void RunFile(C)’: js.ii:64:1: internal compiler error: in lra_update_insn_recog_data, at lra.c:1221 -- Markus void printf(...); class A; template class B; class C { public: template C(S); }; template class D { public: void m_fn1(int); }; template <> class B : public D {}; typedef bool (*JSNative)(int *, unsigned, A *); class F { public: B m_fn2(); }; struct G { JSNative a; }; struct H { H(int, int, int); }; class I { public: int m_fn3(); }; class J { public: enum Result {}; H helpOption; int usage; int ver; int descr; int descrWidth; int helpWidth; int nextArgument; int restArgument; J(int) : helpOption('h', 0, 0), usage(ver), descr(descrWidth), helpWidth(nextArgument), restArgument() {} Result m_fn4(); int m_fn5(); }; class OffThreadState { public: int m_fn6(); } d; int f, h; long PRMJ_Now(); static void RunFile(C) { long a = PRMJ_Now(); printf(double(a) / 0); } static void Process(int, int) { RunFile(0); } bool Now(int *, unsigned, A *) { F b; double c = PRMJ_Now(); b.m_fn2().m_fn1(c); } int InitWatchdog(); G e{Now}; static int Shell() { I g; if (g.m_fn3()) Process(0, 0); if (f) Process(0, 0); } main() { J i(0); switch (i.m_fn4()) case 0: return 0; if (i.m_fn5()) return 0; if (h) return 0; if (d.m_fn6()) return 0; if (InitWatchdog()) return 0; Shell(); }
Re: RFA: one more version of the patch for PR61360
On 2014.09.26 at 16:31 -0400, Vladimir Makarov wrote: > I guess we achieved the consensus about the following patch to fix PR61360 > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61360 > > The patch was successfully bootstrapped and tested (w/wo > -march=amdfam10) on x86/x86-64. > > Is it ok to commit to trunk? I've tested your patch and unfortunately it doesn't work: In file included from /var/tmp/moz-build-dir/js/src/shell/Unified_cpp_js_src_shell0.cpp:15:0: /var/tmp/mozilla-central/js/src/shell/js.cpp: In function ‘void Process(JSContext*, JSObject*, const char*, bool)’: /var/tmp/mozilla-central/js/src/shell/js.cpp:592:1: internal compiler error: in lra_update_insn_recog_data, at lra.c:1221 } ^ 0xa9d9ec lra_update_insn_recog_data(rtx_insn*) ../../gcc/gcc/lra.c:1220 0xab450f eliminate_regs_in_insn ../../gcc/gcc/lra-eliminations.c:1077 0xab450f process_insn_for_elimination ../../gcc/gcc/lra-eliminations.c:1344 0xab450f lra_eliminate(bool, bool) ../../gcc/gcc/lra-eliminations.c:1408 0xa9f2da lra(_IO_FILE*) ../../gcc/gcc/lra.c:2270 0xa5d659 do_reload ../../gcc/gcc/ira.c:5311 0xa5d659 execute ../../gcc/gcc/ira.c:5470 Please submit a full bug report, -- Markus
Re: RFA: one more version of the patch for PR61360
On Fri, Sep 26, 2014 at 10:31 PM, Vladimir Makarov wrote: > I guess we achieved the consensus about the following patch to fix PR61360 > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61360 > > The patch was successfully bootstrapped and tested (w/wo -march=amdfam10) on > x86/x86-64. > > Is it ok to commit to trunk? > > 2014-09-26 Vladimir Makarov > > PR target/61360 > * lra.c (lra): Remove call of recog_init. > * recog.c (constrain_operands): Permit reg for memory constraint > when LRA is used. > * config/i386/i386.md (*float2_sse): > Enable first alternative independently on RA stage. Please also mention that this patch reverts: 2014-04-01 Richard Henderson PR target/60704 * config/i386/i386.md (*float2_sse): Leave the second alternative enabled before register allocation. The x86 part is OK. BTW: I think that this patch should also be backported to 4.9 branch, since the original patch and the PR60704 fix were also installed there. Thanks, Uros.
Re: RFA: one more version of the patch for PR61360
Hi Vlad, Vladimir Makarov writes: > I guess we achieved the consensus about the following patch to fix PR61360 > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61360 > > The patch was successfully bootstrapped and tested (w/wo > -march=amdfam10) on x86/x86-64. > > Is it ok to commit to trunk? > > 2014-09-26 Vladimir Makarov > > PR target/61360 > * lra.c (lra): Remove call of recog_init. > * recog.c (constrain_operands): Permit reg for memory constraint > when LRA is used. > * config/i386/i386.md (*float2_sse): > Enable first alternative independently on RA stage. Many thanks for doing this! I hadn't realised when seeing the bug originally that LRA made the recog.c part possible. Obviously I can't approve it, but this approach seems much cleaner to me. Richard