Re: [PATCH] Add XINSN macro and use it within NEXT_INSN/PREV_INSN (was Re: [PATCH] Force rtl templates to be inlined)
On 09/04/14 14:22, Jakub Jelinek wrote: On Thu, Sep 04, 2014 at 04:04:17PM -0400, David Malcolm wrote: --- a/gcc/rtl.h +++ b/gcc/rtl.h @@ -195,6 +195,7 @@ union rtunion unsigned int rt_uint; const char *rt_str; rtx rt_rtx; + rtx_insn *rt_insn; rtvec rt_rtvec; enum machine_mode rt_type; addr_diff_vec_flags rt_addr_diff_vec_flags; @@ -1208,6 +1209,7 @@ extern void rtl_check_failed_flag (const char *, const_rtx, const char *, #define XUINT(RTX, N) (RTL_CHECK2 (RTX, N, 'i', 'n').rt_uint) #define XSTR(RTX, N) (RTL_CHECK2 (RTX, N, 's', 'S').rt_str) #define XEXP(RTX, N) (RTL_CHECK2 (RTX, N, 'e', 'u').rt_rtx) Shouldn't XEXP be changed into RTL_CHECK1 (RTX, N, 'e').rt_rtx then and the accessors of first operand of INSN_LIST and only operand of LABEL_REF be changed to XINSN too? Of course doesn't have to be done immediately, can be done as a followup. Yea, let's chase this as a follow-up. Jeff
Re: [PATCH] Add XINSN macro and use it within NEXT_INSN/PREV_INSN (was Re: [PATCH] Force rtl templates to be inlined)
On 09/04/14 15:19, David Malcolm wrote: I suppose so, but I don't see an easy way of locating such XEXP users beyond building with ENABLE_RTL_CHECKING, and checking on every configuration, and trying to exercise all code paths. Can we consider that a *long-term* followup? Well, XEXP isn't supposed to be used to access integer fields (we have XINT for that). Any code doing so is already broken, IMHO. And anything doing so is going to have some ugly cast from an rtx to some integer type -- a strong hint that the original author any any reviewer missed something :-) So I'd be comfortable with a more limited set of testing (say bootstrap x86_64 and perhaps one or two other platforms). Evaluate anything found, and assuming we're not finding anything terribly unexpected, go for it. Use of XINSN does introduce a hole in the rtx-classes type-safety scheme, in that it's only checked with ENABLE_RTL_CHECKING. I see XINSN as a relatively short term wart. Our goal ought to be to make it go away, but I can see how we need it in the immediate term. Jeff
Re: [PATCH] Add XINSN macro and use it within NEXT_INSN/PREV_INSN (was Re: [PATCH] Force rtl templates to be inlined)
On 09/04/14 14:04, David Malcolm wrote: On Tue, 2014-09-02 at 19:50 +0200, Andi Kleen wrote: I suspect the bulk of them currently are coming from the safe_as_a rtx_insn * calls within NEXT_INSN and PREV_INSN; do you happen to have information handy on that? Yes that's right: - 1.03% lto1[.] bool is_a_helperrtx_insn*::testrtx_def(rtx_def*) ▒ - bool is_a_helperrtx_insn*::testrtx_def(rtx_def*) ▒ - 92.20% bool is_artx_insn*, rtx_def(rtx_def*) ▒ - 98.53% rtx_insn* safe_as_artx_insn*, rtx_def(rtx_def*) ▒ - 73.28% NEXT_INSN(rtx_insn const*) ▒ The is_a_helper for rtx_insn * is non-trivial, so it may be worth avoiding it, even when inlined. The attached patch rewrites the inline NEXT_INSN/PREV_INSN to avoid doing the safe_as_a, instead tightening up the interface so that one can only set them to an insn, and introducing a new XINSN access macro and corresponding rt_insn member of the union. Bootstrapped on x86_64-unknown-linux-gnu (Fedora 20), and has been rebuilt as part of a config-list.mk build for all working configurations (albeit with other patches for the latter case). OK for trunk? So is this just to deal with the overhead in the safe_as_a helper until we can strengthen more code? And is that overhead significant in an optimized build? Would an alternate approach be to make the checking in safe_as_a conditionalized on ENABLE_CHECKING? jeff
Re: [PATCH] Add XINSN macro and use it within NEXT_INSN/PREV_INSN (was Re: [PATCH] Force rtl templates to be inlined)
On Fri, 2014-09-05 at 12:45 -0600, Jeff Law wrote: On 09/04/14 14:04, David Malcolm wrote: On Tue, 2014-09-02 at 19:50 +0200, Andi Kleen wrote: I suspect the bulk of them currently are coming from the safe_as_a rtx_insn * calls within NEXT_INSN and PREV_INSN; do you happen to have information handy on that? Yes that's right: - 1.03% lto1[.] bool is_a_helperrtx_insn*::testrtx_def(rtx_def*) ▒ - bool is_a_helperrtx_insn*::testrtx_def(rtx_def*) ▒ - 92.20% bool is_artx_insn*, rtx_def(rtx_def*) ▒ - 98.53% rtx_insn* safe_as_artx_insn*, rtx_def(rtx_def*) ▒ - 73.28% NEXT_INSN(rtx_insn const*) ▒ The is_a_helper for rtx_insn * is non-trivial, so it may be worth avoiding it, even when inlined. The attached patch rewrites the inline NEXT_INSN/PREV_INSN to avoid doing the safe_as_a, instead tightening up the interface so that one can only set them to an insn, and introducing a new XINSN access macro and corresponding rt_insn member of the union. Bootstrapped on x86_64-unknown-linux-gnu (Fedora 20), and has been rebuilt as part of a config-list.mk build for all working configurations (albeit with other patches for the latter case). OK for trunk? So is this just to deal with the overhead in the safe_as_a helper until we can strengthen more code? And is that overhead significant in an optimized build? Would an alternate approach be to make the checking in safe_as_a conditionalized on ENABLE_CHECKING? I wrote the attached patch to do that, but I don't yet have numbers to back it up. Bootstrapped with current settings, and smoketested with --enable-checking=release (both on Fedora 20 x86_64). From 24b98616c4425a8ef380d2d5fef00f82af9df985 Mon Sep 17 00:00:00 2001 From: David Malcolm dmalc...@redhat.com Date: Fri, 29 Aug 2014 17:04:49 -0400 Subject: [PATCH] Ensure that safe_as_a can have no performance overhead in a release build gcc/ * is-a.h (safe_as_a): Eliminate the conditional from the !defined(ENABLE_CHECKING) case, to ensure there can be no performance degradation of the release build. --- gcc/is-a.h | 4 1 file changed, 4 insertions(+) diff --git a/gcc/is-a.h b/gcc/is-a.h index 176066b..f0c05b0 100644 --- a/gcc/is-a.h +++ b/gcc/is-a.h @@ -200,6 +200,7 @@ template typename T, typename U inline T safe_as_a (U *p) { +#ifdef ENABLE_CHECKING if (p) { gcc_checking_assert (is_a T (p)); @@ -207,6 +208,9 @@ safe_as_a (U *p) } else return NULL; +#else + return is_a_helper T::cast (p); +#endif } /* A generic checked conversion from a base type U to a derived type T. See -- 1.8.5.3
Re: [PATCH] Force rtl templates to be inlined
On Thu, Sep 4, 2014 at 5:58 AM, Andi Kleen a...@firstfloor.org wrote: Anyway, removing !optimize checks in favor of flag_no_inline checks and initializing that properly is a cleanup as well. Patch looks good to me. Unfortunately it doesn't pass bootstrap (inline-summary re-use between IPA passes is a maze...). So I need to tweak it more (pushed back for now). Richard. -Andi
Re: [PATCH] Force rtl templates to be inlined
On Thu, Sep 4, 2014 at 5:58 AM, Andi Kleen a...@firstfloor.org wrote: Anyway, removing !optimize checks in favor of flag_no_inline checks and initializing that properly is a cleanup as well. Patch looks good to me. Unfortunately it doesn't pass bootstrap (inline-summary re-use between IPA passes is a maze...). So I need to tweak it more (pushed back for now). From a quick glance it looks resonable thing to do. I can fix the summary issues after returning from trip this Sunday. Honza Richard. -Andi
[PATCH] Add XINSN macro and use it within NEXT_INSN/PREV_INSN (was Re: [PATCH] Force rtl templates to be inlined)
On Tue, 2014-09-02 at 19:50 +0200, Andi Kleen wrote: I suspect the bulk of them currently are coming from the safe_as_a rtx_insn * calls within NEXT_INSN and PREV_INSN; do you happen to have information handy on that? Yes that's right: - 1.03% lto1[.] bool is_a_helperrtx_insn*::testrtx_def(rtx_def*) ▒ - bool is_a_helperrtx_insn*::testrtx_def(rtx_def*) ▒ - 92.20% bool is_artx_insn*, rtx_def(rtx_def*) ▒ - 98.53% rtx_insn* safe_as_artx_insn*, rtx_def(rtx_def*) ▒ - 73.28% NEXT_INSN(rtx_insn const*) ▒ The is_a_helper for rtx_insn * is non-trivial, so it may be worth avoiding it, even when inlined. The attached patch rewrites the inline NEXT_INSN/PREV_INSN to avoid doing the safe_as_a, instead tightening up the interface so that one can only set them to an insn, and introducing a new XINSN access macro and corresponding rt_insn member of the union. Bootstrapped on x86_64-unknown-linux-gnu (Fedora 20), and has been rebuilt as part of a config-list.mk build for all working configurations (albeit with other patches for the latter case). OK for trunk? gcc/ * rtl.h (union rtunion): Add new member rt_insn, of type rtx_insn *. (XINSN): New accessor macro, accessing as an rtx_insn *, requiring code u. (PREV_INSN, NEXT_INSN): Eliminate the checked cast to rtx_insn * and instead directly use XINSN. (SET_PREV_INSN, SET_NEXT_INSN): Strengthen the return type from rtx to rtx_insn *, using XINSN internally. (NEXT_INSN): Eliminate the checked cast and instead directly use XINSN. * cfgrtl.c (fixup_abnormal_edges): Use NULL rather than NULL_RTX when assigning to SET_PREV_INSN/SET_NEXT_INSN. * haifa-sched.c (remove_notes): Likewise. * sel-sched-ir.c (sel_remove_insn): Likewise. (get_bb_note_from_pool): Likewise. * config/ia64/ia64.c (ia64_init_dfa_pre_cycle_insn): Likewise. (ia64_reorg): Likewise. From 6e60e29211314b5865bc7b5b05d586777d96815f Mon Sep 17 00:00:00 2001 From: David Malcolm dmalc...@redhat.com Date: Wed, 3 Sep 2014 11:01:37 -0400 Subject: [PATCH 01/32] Add XINSN macro and use it within NEXT_INSN/PREV_INSN gcc/ * rtl.h (union rtunion): Add new member rt_insn, of type rtx_insn *. (XINSN): New accessor macro, accessing as an rtx_insn *, requiring code u. (PREV_INSN, NEXT_INSN): Eliminate the checked cast to rtx_insn * and instead directly use XINSN. (SET_PREV_INSN, SET_NEXT_INSN): Strengthen the return type from rtx to rtx_insn *, using XINSN internally. (NEXT_INSN): Eliminate the checked cast and instead directly use XINSN. * cfgrtl.c (fixup_abnormal_edges): Use NULL rather than NULL_RTX when assigning to SET_PREV_INSN/SET_NEXT_INSN. * haifa-sched.c (remove_notes): Likewise. * sel-sched-ir.c (sel_remove_insn): Likewise. (get_bb_note_from_pool): Likewise. * config/ia64/ia64.c (ia64_init_dfa_pre_cycle_insn): Likewise. (ia64_reorg): Likewise. --- gcc/cfgrtl.c | 4 ++-- gcc/config/ia64/ia64.c | 6 +++--- gcc/haifa-sched.c | 2 +- gcc/rtl.h | 16 gcc/sel-sched-ir.c | 8 5 files changed, 18 insertions(+), 18 deletions(-) diff --git a/gcc/cfgrtl.c b/gcc/cfgrtl.c index bc6c965..7a03d78 100644 --- a/gcc/cfgrtl.c +++ b/gcc/cfgrtl.c @@ -3279,8 +3279,8 @@ fixup_abnormal_edges (void) { /* We're not deleting it, we're moving it. */ INSN_DELETED_P (insn) = 0; - SET_PREV_INSN (insn) = NULL_RTX; - SET_NEXT_INSN (insn) = NULL_RTX; + SET_PREV_INSN (insn) = NULL; + SET_NEXT_INSN (insn) = NULL; insert_insn_on_edge (insn, e); inserted = true; diff --git a/gcc/config/ia64/ia64.c b/gcc/config/ia64/ia64.c index 2ed5ddd..e73a489 100644 --- a/gcc/config/ia64/ia64.c +++ b/gcc/config/ia64/ia64.c @@ -9496,10 +9496,10 @@ ia64_init_dfa_pre_cycle_insn (void) prev_cycle_state = xmalloc (dfa_state_size); } dfa_pre_cycle_insn = make_insn_raw (gen_pre_cycle ()); - SET_PREV_INSN (dfa_pre_cycle_insn) = SET_NEXT_INSN (dfa_pre_cycle_insn) = NULL_RTX; + SET_PREV_INSN (dfa_pre_cycle_insn) = SET_NEXT_INSN (dfa_pre_cycle_insn) = NULL; recog_memoized (dfa_pre_cycle_insn); dfa_stop_insn = make_insn_raw (gen_insn_group_barrier (GEN_INT (3))); - SET_PREV_INSN (dfa_stop_insn) = SET_NEXT_INSN (dfa_stop_insn) = NULL_RTX; + SET_PREV_INSN (dfa_stop_insn) = SET_NEXT_INSN (dfa_stop_insn) = NULL; recog_memoized (dfa_stop_insn); } @@ -9687,7 +9687,7 @@ ia64_reorg (void) initiate_bundle_states
Re: [PATCH] Add XINSN macro and use it within NEXT_INSN/PREV_INSN (was Re: [PATCH] Force rtl templates to be inlined)
On Thu, Sep 04, 2014 at 04:04:17PM -0400, David Malcolm wrote: --- a/gcc/rtl.h +++ b/gcc/rtl.h @@ -195,6 +195,7 @@ union rtunion unsigned int rt_uint; const char *rt_str; rtx rt_rtx; + rtx_insn *rt_insn; rtvec rt_rtvec; enum machine_mode rt_type; addr_diff_vec_flags rt_addr_diff_vec_flags; @@ -1208,6 +1209,7 @@ extern void rtl_check_failed_flag (const char *, const_rtx, const char *, #define XUINT(RTX, N) (RTL_CHECK2 (RTX, N, 'i', 'n').rt_uint) #define XSTR(RTX, N) (RTL_CHECK2 (RTX, N, 's', 'S').rt_str) #define XEXP(RTX, N) (RTL_CHECK2 (RTX, N, 'e', 'u').rt_rtx) Shouldn't XEXP be changed into RTL_CHECK1 (RTX, N, 'e').rt_rtx then and the accessors of first operand of INSN_LIST and only operand of LABEL_REF be changed to XINSN too? Of course doesn't have to be done immediately, can be done as a followup. +#define XINSN(RTX, N)(RTL_CHECK1 (RTX, N, 'u').rt_insn) #define XVEC(RTX, N) (RTL_CHECK2 (RTX, N, 'E', 'V').rt_rtvec) #define XMODE(RTX, N)(RTL_CHECK1 (RTX, N, 'M').rt_type) #define XTREE(RTX, N) (RTL_CHECK1 (RTX, N, 't').rt_tree) Jakub
Re: [PATCH] Add XINSN macro and use it within NEXT_INSN/PREV_INSN (was Re: [PATCH] Force rtl templates to be inlined)
On Thu, 2014-09-04 at 22:22 +0200, Jakub Jelinek wrote: On Thu, Sep 04, 2014 at 04:04:17PM -0400, David Malcolm wrote: --- a/gcc/rtl.h +++ b/gcc/rtl.h @@ -195,6 +195,7 @@ union rtunion unsigned int rt_uint; const char *rt_str; rtx rt_rtx; + rtx_insn *rt_insn; rtvec rt_rtvec; enum machine_mode rt_type; addr_diff_vec_flags rt_addr_diff_vec_flags; @@ -1208,6 +1209,7 @@ extern void rtl_check_failed_flag (const char *, const_rtx, const char *, #define XUINT(RTX, N) (RTL_CHECK2 (RTX, N, 'i', 'n').rt_uint) #define XSTR(RTX, N) (RTL_CHECK2 (RTX, N, 's', 'S').rt_str) #define XEXP(RTX, N) (RTL_CHECK2 (RTX, N, 'e', 'u').rt_rtx) Shouldn't XEXP be changed into RTL_CHECK1 (RTX, N, 'e').rt_rtx then and the accessors of first operand of INSN_LIST and only operand of LABEL_REF be changed to XINSN too? Of course doesn't have to be done immediately, can be done as a followup. I suppose so, but I don't see an easy way of locating such XEXP users beyond building with ENABLE_RTL_CHECKING, and checking on every configuration, and trying to exercise all code paths. Can we consider that a *long-term* followup? Use of XINSN does introduce a hole in the rtx-classes type-safety scheme, in that it's only checked with ENABLE_RTL_CHECKING. I considered it acceptable in the given patch since the SET_NEXT_INSN/SET_PREV_INSN lvalue accessors return an rtx_insn * and thus will only accept an rtx_insn * as an rvalue - but I'm a bit nervous about its use elsewhere in the code; I'd rather we used rtx_insn_list for homogeneous lists of INSN_LIST nodes, and maybe an rtx_label_ref class for LABEL_REF, and perhaps an rtx_reg_note class for heterogeneous lists of INSN_LIST/EXPR_LIST/INT_LIST nodes for REG_NOTES. +#define XINSN(RTX, N) (RTL_CHECK1 (RTX, N, 'u').rt_insn) #define XVEC(RTX, N) (RTL_CHECK2 (RTX, N, 'E', 'V').rt_rtvec) #define XMODE(RTX, N) (RTL_CHECK1 (RTX, N, 'M').rt_type) #define XTREE(RTX, N) (RTL_CHECK1 (RTX, N, 't').rt_tree) Jakub
Re: [PATCH] Force rtl templates to be inlined
On Tue, Sep 2, 2014 at 6:52 PM, Andi Kleen a...@linux.intel.com wrote: Or we simply should make -finline work at -O0 (I suppose it might already work?) and use it. Yes that's probably better. There are more hot inlines in the stage 1 profile (like wi::storage_ref or vec::length) I suspect with the ongoing C++'ification that will get worse. Like with this (untested apart from on a small testcase). Of course it usually won't help bootstrap because your host compiler doesn't support -O0 -finline yet. Also it might not help that much because while it certainly removes call overhead it will still not optimize anything else. Also inline analysis correctly assumes that no stmts go away so you only have the call overhead as room to allow inlining (so with checking enabled the is_a/as_a stuff is probably too large - didn't check). It may also negatively affect debugging (no var-tracking at -O0). Anyway, removing !optimize checks in favor of flag_no_inline checks and initializing that properly is a cleanup as well. Now to see how to best test this ... Richard. -Andi -- a...@linux.intel.com -- Speaking for myself only make-inlining-work-at-O0 Description: Binary data
Re: [PATCH] Force rtl templates to be inlined
On Tue, Sep 2, 2014 at 6:52 PM, Andi Kleen a...@linux.intel.com wrote: Or we simply should make -finline work at -O0 (I suppose it might already work?) and use it. Yes that's probably better. There are more hot inlines in the stage 1 profile (like wi::storage_ref or vec::length) I suspect with the ongoing C++'ification that will get worse. Btw, it's C++ which I considered that -Og might replace -O0 exactly because of all the abstraction penalty which usually doesn't help debugging at -O0. Also the idea was that -Og might even compile faster than -O0, but that's far from true unfortunately ... Richard. -Andi -- a...@linux.intel.com -- Speaking for myself only
Re: [PATCH] Force rtl templates to be inlined
Anyway, removing !optimize checks in favor of flag_no_inline checks and initializing that properly is a cleanup as well. Patch looks good to me. -Andi
[PATCH] Force rtl templates to be inlined
From: Andi Kleen a...@linux.intel.com I noticed that with the trunk compiler a range of the new rtl inlines show up as hot in a profiler during stage1. I think that happens because stage1 is not using optimization and does not inline plain inline. And these rtl inlines are very frequently called. Mark them all with __attribute__((always_inline)) which forces inlining even with -O0. Passes bootstrap and testing on x86_64-linux. Cc: dmalc...@redhat.com include/: 2014-09-01 Andi Kleen a...@linux.intel.com * ansidecl.h (ALWAYS_INLINE): Add. gcc/: 2014-09-01 Andi Kleen a...@linux.intel.com * rtl.h (is_a_helper): Change inline to ALWAYS_INLINE. (rhs_regno): Dito. (init_costs_to_max): Dito. (init_costs_to_zero): Dito. (costs_lt_p): Dito. (costs_add_n_insns): Dito. (wi::int_traits ::get_precision): Dito. (wi::shwi): Dito. (wi::min_value): Dito. (wi::max_value): Dito. (set_rtx_cost): Dito. (get_full_set_rtx_cost): Dito. (set_src_cost): Dito. (get_full_set_src_cost): Dito. (get_mem_attrs): Dito. --- gcc/rtl.h | 111 +++-- include/ansidecl.h | 6 +++ 2 files changed, 62 insertions(+), 55 deletions(-) diff --git a/gcc/rtl.h b/gcc/rtl.h index beeed2f..d711e43 100644 --- a/gcc/rtl.h +++ b/gcc/rtl.h @@ -21,6 +21,7 @@ along with GCC; see the file COPYING3. If not see #define GCC_RTL_H #include utility +#include ansidecl.h #include statistics.h #include machmode.h #include input.h @@ -418,7 +419,7 @@ public: template template -inline bool +ALWAYS_INLINE bool is_a_helper rtx_expr_list *::test (rtx rt) { return rt-code == EXPR_LIST; @@ -447,7 +448,7 @@ public: template template -inline bool +ALWAYS_INLINE bool is_a_helper rtx_insn_list *::test (rtx rt) { return rt-code == INSN_LIST; @@ -474,7 +475,7 @@ public: template template -inline bool +ALWAYS_INLINE bool is_a_helper rtx_sequence *::test (rtx rt) { return rt-code == SEQUENCE; @@ -482,7 +483,7 @@ is_a_helper rtx_sequence *::test (rtx rt) template template -inline bool +ALWAYS_INLINE bool is_a_helper const rtx_sequence *::test (const_rtx rt) { return rt-code == SEQUENCE; @@ -778,7 +779,7 @@ struct GTY(()) rtvec_def { template template -inline bool +ALWAYS_INLINE bool is_a_helper rtx_insn *::test (rtx rt) { return (INSN_P (rt) @@ -790,7 +791,7 @@ is_a_helper rtx_insn *::test (rtx rt) template template -inline bool +ALWAYS_INLINE bool is_a_helper const rtx_insn *::test (const_rtx rt) { return (INSN_P (rt) @@ -802,7 +803,7 @@ is_a_helper const rtx_insn *::test (const_rtx rt) template template -inline bool +ALWAYS_INLINE bool is_a_helper rtx_debug_insn *::test (rtx rt) { return DEBUG_INSN_P (rt); @@ -810,7 +811,7 @@ is_a_helper rtx_debug_insn *::test (rtx rt) template template -inline bool +ALWAYS_INLINE bool is_a_helper rtx_nonjump_insn *::test (rtx rt) { return NONJUMP_INSN_P (rt); @@ -818,7 +819,7 @@ is_a_helper rtx_nonjump_insn *::test (rtx rt) template template -inline bool +ALWAYS_INLINE bool is_a_helper rtx_jump_insn *::test (rtx rt) { return JUMP_P (rt); @@ -826,7 +827,7 @@ is_a_helper rtx_jump_insn *::test (rtx rt) template template -inline bool +ALWAYS_INLINE bool is_a_helper rtx_call_insn *::test (rtx rt) { return CALL_P (rt); @@ -834,7 +835,7 @@ is_a_helper rtx_call_insn *::test (rtx rt) template template -inline bool +ALWAYS_INLINE bool is_a_helper rtx_call_insn *::test (rtx_insn *insn) { return CALL_P (insn); @@ -842,7 +843,7 @@ is_a_helper rtx_call_insn *::test (rtx_insn *insn) template template -inline bool +ALWAYS_INLINE bool is_a_helper rtx_jump_table_data *::test (rtx rt) { return JUMP_TABLE_DATA_P (rt); @@ -850,7 +851,7 @@ is_a_helper rtx_jump_table_data *::test (rtx rt) template template -inline bool +ALWAYS_INLINE bool is_a_helper rtx_jump_table_data *::test (rtx_insn *insn) { return JUMP_TABLE_DATA_P (insn); @@ -858,7 +859,7 @@ is_a_helper rtx_jump_table_data *::test (rtx_insn *insn) template template -inline bool +ALWAYS_INLINE bool is_a_helper rtx_barrier *::test (rtx rt) { return BARRIER_P (rt); @@ -866,7 +867,7 @@ is_a_helper rtx_barrier *::test (rtx rt) template template -inline bool +ALWAYS_INLINE bool is_a_helper rtx_code_label *::test (rtx rt) { return LABEL_P (rt); @@ -874,7 +875,7 @@ is_a_helper rtx_code_label *::test (rtx rt) template template -inline bool +ALWAYS_INLINE bool is_a_helper rtx_code_label *::test (rtx_insn *insn) { return LABEL_P (insn); @@ -882,7 +883,7 @@ is_a_helper rtx_code_label *::test (rtx_insn *insn) template template -inline bool +ALWAYS_INLINE bool is_a_helper rtx_note *::test (rtx rt) { return NOTE_P (rt); @@ -890,7 +891,7 @@ is_a_helper rtx_note *::test (rtx rt) template template -inline bool +ALWAYS_INLINE bool
Re: [PATCH] Force rtl templates to be inlined
On Tue, Sep 2, 2014 at 12:03 AM, Andi Kleen a...@firstfloor.org wrote: From: Andi Kleen a...@linux.intel.com I noticed that with the trunk compiler a range of the new rtl inlines show up as hot in a profiler during stage1. I think that happens because stage1 is not using optimization and does not inline plain inline. And these rtl inlines are very frequently called. Mark them all with __attribute__((always_inline)) which forces inlining even with -O0. I think this is wrong and should not be committed. stage1 is designed to be without optimization and there have been bugs in the past in the area of always_inline too. Thanks, Andrew Pinski Passes bootstrap and testing on x86_64-linux. Cc: dmalc...@redhat.com include/: 2014-09-01 Andi Kleen a...@linux.intel.com * ansidecl.h (ALWAYS_INLINE): Add. gcc/: 2014-09-01 Andi Kleen a...@linux.intel.com * rtl.h (is_a_helper): Change inline to ALWAYS_INLINE. (rhs_regno): Dito. (init_costs_to_max): Dito. (init_costs_to_zero): Dito. (costs_lt_p): Dito. (costs_add_n_insns): Dito. (wi::int_traits ::get_precision): Dito. (wi::shwi): Dito. (wi::min_value): Dito. (wi::max_value): Dito. (set_rtx_cost): Dito. (get_full_set_rtx_cost): Dito. (set_src_cost): Dito. (get_full_set_src_cost): Dito. (get_mem_attrs): Dito. --- gcc/rtl.h | 111 +++-- include/ansidecl.h | 6 +++ 2 files changed, 62 insertions(+), 55 deletions(-) diff --git a/gcc/rtl.h b/gcc/rtl.h index beeed2f..d711e43 100644 --- a/gcc/rtl.h +++ b/gcc/rtl.h @@ -21,6 +21,7 @@ along with GCC; see the file COPYING3. If not see #define GCC_RTL_H #include utility +#include ansidecl.h #include statistics.h #include machmode.h #include input.h @@ -418,7 +419,7 @@ public: template template -inline bool +ALWAYS_INLINE bool is_a_helper rtx_expr_list *::test (rtx rt) { return rt-code == EXPR_LIST; @@ -447,7 +448,7 @@ public: template template -inline bool +ALWAYS_INLINE bool is_a_helper rtx_insn_list *::test (rtx rt) { return rt-code == INSN_LIST; @@ -474,7 +475,7 @@ public: template template -inline bool +ALWAYS_INLINE bool is_a_helper rtx_sequence *::test (rtx rt) { return rt-code == SEQUENCE; @@ -482,7 +483,7 @@ is_a_helper rtx_sequence *::test (rtx rt) template template -inline bool +ALWAYS_INLINE bool is_a_helper const rtx_sequence *::test (const_rtx rt) { return rt-code == SEQUENCE; @@ -778,7 +779,7 @@ struct GTY(()) rtvec_def { template template -inline bool +ALWAYS_INLINE bool is_a_helper rtx_insn *::test (rtx rt) { return (INSN_P (rt) @@ -790,7 +791,7 @@ is_a_helper rtx_insn *::test (rtx rt) template template -inline bool +ALWAYS_INLINE bool is_a_helper const rtx_insn *::test (const_rtx rt) { return (INSN_P (rt) @@ -802,7 +803,7 @@ is_a_helper const rtx_insn *::test (const_rtx rt) template template -inline bool +ALWAYS_INLINE bool is_a_helper rtx_debug_insn *::test (rtx rt) { return DEBUG_INSN_P (rt); @@ -810,7 +811,7 @@ is_a_helper rtx_debug_insn *::test (rtx rt) template template -inline bool +ALWAYS_INLINE bool is_a_helper rtx_nonjump_insn *::test (rtx rt) { return NONJUMP_INSN_P (rt); @@ -818,7 +819,7 @@ is_a_helper rtx_nonjump_insn *::test (rtx rt) template template -inline bool +ALWAYS_INLINE bool is_a_helper rtx_jump_insn *::test (rtx rt) { return JUMP_P (rt); @@ -826,7 +827,7 @@ is_a_helper rtx_jump_insn *::test (rtx rt) template template -inline bool +ALWAYS_INLINE bool is_a_helper rtx_call_insn *::test (rtx rt) { return CALL_P (rt); @@ -834,7 +835,7 @@ is_a_helper rtx_call_insn *::test (rtx rt) template template -inline bool +ALWAYS_INLINE bool is_a_helper rtx_call_insn *::test (rtx_insn *insn) { return CALL_P (insn); @@ -842,7 +843,7 @@ is_a_helper rtx_call_insn *::test (rtx_insn *insn) template template -inline bool +ALWAYS_INLINE bool is_a_helper rtx_jump_table_data *::test (rtx rt) { return JUMP_TABLE_DATA_P (rt); @@ -850,7 +851,7 @@ is_a_helper rtx_jump_table_data *::test (rtx rt) template template -inline bool +ALWAYS_INLINE bool is_a_helper rtx_jump_table_data *::test (rtx_insn *insn) { return JUMP_TABLE_DATA_P (insn); @@ -858,7 +859,7 @@ is_a_helper rtx_jump_table_data *::test (rtx_insn *insn) template template -inline bool +ALWAYS_INLINE bool is_a_helper rtx_barrier *::test (rtx rt) { return BARRIER_P (rt); @@ -866,7 +867,7 @@ is_a_helper rtx_barrier *::test (rtx rt) template template -inline bool +ALWAYS_INLINE bool is_a_helper rtx_code_label *::test (rtx rt) { return LABEL_P (rt); @@ -874,7 +875,7 @@ is_a_helper rtx_code_label *::test (rtx rt) template template -inline
Re: [PATCH] Force rtl templates to be inlined
there have been bugs in the past in the area of always_inline too. You're arguing for my patch. It would find those bugs. -Andi
Re: [PATCH] Force rtl templates to be inlined
On Tue, Sep 2, 2014 at 12:20 AM, Andi Kleen a...@firstfloor.org wrote: there have been bugs in the past in the area of always_inline too. You're arguing for my patch. It would find those bugs. No I am arguing against it since the older versions of GCC we cannot change. Thanks, Andrew -Andi
Re: [PATCH] Force rtl templates to be inlined
On Tue, Sep 2, 2014 at 9:22 AM, Andrew Pinski wrote: On Tue, Sep 2, 2014 at 12:20 AM, Andi Kleen wrote: there have been bugs in the past in the area of always_inline too. You're arguing for my patch. It would find those bugs. No I am arguing against it since the older versions of GCC we cannot change. Should such bugs turn up, we can account for them in ansidecl.h. I think Andi's patch should go in. Ciao! Steven
Re: [PATCH] Force rtl templates to be inlined
On Sep 2, 2014, at 1:36 AM, Steven Bosscher stevenb@gmail.com wrote: On Tue, Sep 2, 2014 at 9:22 AM, Andrew Pinski wrote: On Tue, Sep 2, 2014 at 12:20 AM, Andi Kleen wrote: there have been bugs in the past in the area of always_inline too. You're arguing for my patch. It would find those bugs. No I am arguing against it since the older versions of GCC we cannot change. Should such bugs turn up, we can account for them in ansidecl.h. I think Andi's patch should go in. I does hurt debug ability with older compilers too. So if we need to figure out why stage is being miscompiled it is harder to figure how to work around it. I think stage should really be -O0 even with respect of inline. I think we should never force inline inside gcc even at -O0 as it is just a hack (we know it as we added the attribute in the first place). Thanks, Andrew Ciao! Steven
Re: [PATCH] Force rtl templates to be inlined
On Tue, Sep 2, 2014 at 10:36 AM, Steven Bosscher stevenb@gmail.com wrote: On Tue, Sep 2, 2014 at 9:22 AM, Andrew Pinski wrote: On Tue, Sep 2, 2014 at 12:20 AM, Andi Kleen wrote: there have been bugs in the past in the area of always_inline too. You're arguing for my patch. It would find those bugs. No I am arguing against it since the older versions of GCC we cannot change. Should such bugs turn up, we can account for them in ansidecl.h. I think Andi's patch should go in. I disagree. always-inline isn't an optimization attribute but a correctness one. Instead we should not build stage1 with -O0 if we detect a reasonably recent GCC host compiler (say one that is still maintained). Or we simply should make -finline work at -O0 (I suppose it might already work?) and use it. Richard. Ciao! Steven
Re: [PATCH] Force rtl templates to be inlined
On Tue, 2014-09-02 at 00:03 -0700, Andi Kleen wrote: From: Andi Kleen a...@linux.intel.com I noticed that with the trunk compiler a range of the new rtl inlines show up as hot in a profiler during stage1. I think that happens because stage1 is not using optimization and does not inline plain inline. And these rtl inlines are very frequently called. Sorry about that. FWIW I'm working on some followup patches for the rtx-classes work that ought to eliminate some of the is_a_helper calls; I hope to post them in the next few days. [1] I suspect the bulk of them currently are coming from the safe_as_a rtx_insn * calls within NEXT_INSN and PREV_INSN; do you happen to have information handy on that? Dave [1] (I have to take the rest of today off for a family matter).
Re: [PATCH] Force rtl templates to be inlined
Or we simply should make -finline work at -O0 (I suppose it might already work?) and use it. Yes that's probably better. There are more hot inlines in the stage 1 profile (like wi::storage_ref or vec::length) I suspect with the ongoing C++'ification that will get worse. -Andi -- a...@linux.intel.com -- Speaking for myself only
Re: [PATCH] Force rtl templates to be inlined
I suspect the bulk of them currently are coming from the safe_as_a rtx_insn * calls within NEXT_INSN and PREV_INSN; do you happen to have information handy on that? Yes that's right: - 1.03% lto1[.] bool is_a_helperrtx_insn*::testrtx_def(rtx_def*) ▒ - bool is_a_helperrtx_insn*::testrtx_def(rtx_def*) ▒ - 92.20% bool is_artx_insn*, rtx_def(rtx_def*) ▒ - 98.53% rtx_insn* safe_as_artx_insn*, rtx_def(rtx_def*) ▒ - 73.28% NEXT_INSN(rtx_insn const*) ▒