On Fri, May 10, 2019 at 11:38:50AM +0000, Wilco Dijkstra wrote: > > __builtin_eh_return is a noreturn call, and we never tail call optimize > > noreturn calls: > > if (flags & ECF_NORETURN) > > { > > maybe_complain_about_tail_call (exp, "callee does not return"); > > return false; > > } > > As both the __builtin_eh_return and other tail calls are points in the CFG > > that are followed by EXIT only, it doesn't make sense to talk about > > tailcalls ignoring __builtin_eh_return. The tailcall is in one epilogue in > > the function, __builtin_eh_return is in another epilogue. We do want that > > add sp, sp, x4 in the eh return epilogue, not in the tail call epilogue. > > Thanks that makes sense, so this change would work fine. > > However I think almost all targets are affected by this bug - I tried 4 random > backends and all had code similar to AArch64: > > if (crtl->calls_eh_return) > { > rtx sa = EH_RETURN_STACKADJ_RTX; > emit_insn (gen_add3_insn (sp_reg_rtx, sp_reg_rtx, sa)); > } > ... with sibcall case handled after this... > > And no code in <target>_function_ok_for_sibcall to block tailcalls... > > So given this and the fact there is no real gain from emitting tailcalls in > eh_return > functions, wouldn't it make more sense to always block tailcalls in the > mid-end?
No. E.g. x86 handles that just fine, the EH_RETURN_STACKADJ_RTX addition is guarded with style == 2 argument to ix86_expand_epilogue, which is only set in eh_return epilogue, not in normal epilogue or sibcall_epilogue. That seems to be the cleanest approach to me, although guarding it with !sibcall or similar as done in this patch should be enough for most of the targets that have the problem; of course there needs to be a way to let targets disallow tail calls in __builtin_eh_return functions, ATM there is not an easy one, because crtl->calls_eh_return is set only during expansion and in the case of if (cond) return tail_call (...); ... __builtin_eh_return (...); the tail call is expanded before crtl->calls_eh_return is set. The question is if we want to replace crtl->calls_eh_return with cfun->calls_eh_return and where we should set it (say in builtins folding and propagate through inlining), or some pass should do that? Can't find a good match ATM though, perhaps just the tailcall pass should check that, like below (untested): --- gcc/tree-tailcall.c.jj 2019-05-08 19:04:58.947797821 +0200 +++ gcc/tree-tailcall.c 2019-05-10 14:16:57.203930630 +0200 @@ -43,6 +43,8 @@ along with GCC; see the file COPYING3. #include "common/common-target.h" #include "ipa-utils.h" #include "tree-ssa-live.h" +#include "memmodel.h" +#include "emit-rtl.h" /* The file implements the tail recursion elimination. It is also used to analyze the tail calls in general, passing the results to the rtl level @@ -1171,6 +1173,25 @@ tree_optimize_tail_calls_1 (bool opt_tai if (phis_constructed) mark_virtual_operands_for_renaming (cfun); + /* If there are any tail calls, check if the function has any + __builtin_eh_return calls, as some targets can't handle tail calls + in functions that call __builtin_eh_return. */ + if (cfun->tail_call_marked) + { + basic_block bb; + + FOR_EACH_BB_FN (bb, cfun) + if (EDGE_COUNT (bb->succs) == 0) + { + gimple *stmt = last_stmt (bb); + if (stmt && gimple_call_builtin_p (stmt, BUILT_IN_EH_RETURN)) + { + crtl->calls_eh_return = 1; + break; + } + } + } + if (changed) return TODO_cleanup_cfg | TODO_update_ssa_only_virtuals; return 0; Jakub