Hi,
this patch adds the tuning needed for the zen4 microarchitecture. I added two
new knobs. TARGET_AVX512_SPLIT_REGS is used to specify that internally 512-bit
vectors are split into 256-bit vectors. This affects vectorization costs and
reassociation width. It probably should also affect RTX costs however I
Hi
this patch updates the costs of znver4, mostly based on data measured by Agner Fog.
Compared to previous generations x87 became a bit slower, which is probably not
a big deal (and we have minimal benchmarking coverage for it). One interesting
improvement is the reduction of FMA cost. I also updated costs of
Hi,
this patch fixes streaming of resolution info when flag_incremental_link
== INCREMENTAL_LINK_NOLTO. Here we want to stream the info from WPA to
ltrans as usual.
Bootstrapped/regtested x86_64-linux, tested with kernel LTO builds.
Plan to commit it later today.
Honza
> On Fri, 25 Nov 2022, Jan Hubicka wrote:
>
> > > On 25.11.2022 at 11:05, Jan Hubicka via Gcc-patches wrote:
> IPA profile instrumentation tries to clear the pure and const
> flags of functions but that's quite hopeless in particular for
> const since that attribute prevails on the type and thus on each
> call to the function leading to inconsistencies in the IL and
> eventual checking ICEs. There's no
> Hi,
>
> PR 107661 shows that function push_agg_values_for_index_from_edge
> should not attempt to optimize self-recursive call graph edges when
> called from cgraph_edge_brings_all_agg_vals_for_node. Unlike when
> being called from find_aggregate_values_for_callers_subset, we cannot
> expect
> On Mon, 21 Nov 2022 20:02:49 +0100
> Jan Hubicka wrote:
>
> > > Hi Honza, Ping.
> > > Regtests cleanly for c,fortran,c++,ada,d,go,lto,objc,obj-c++
> > > Ok?
> > > I'd need this for attribute target_clones for the Fortran FE.
> > Sorry for
> Hi Honza, Ping.
> Regtests cleanly for c,fortran,c++,ada,d,go,lto,objc,obj-c++
> Ok?
> I'd need this for attribute target_clones for the Fortran FE.
Sorry for the delay here.
> > void
> > @@ -303,6 +301,10 @@ symbol_table::change_decl_assembler_name (tree decl,
> > tree name)
> > warning (0,
> Hi,
>
> this should have been part of r12-578-g717d278af93a4a. Call edge
> summaries provide information required for IPA-SRA transformations in
> the callees but are generated when analyzing callers and thus also
> callers which are not IPA-SRA candidates themselves. Therefore we
> analyze
> Hi,
>
> IPA-CP transformation summary streaming code currently won't stream
> out transformations necessary for clones which are only necessary for
> materialization of other clones (such as an IPA-CP clone which is then
> cloned again by IPA-SRA). However, a follow-up patch for better
>
> Hi,
>
> When building vectors of known aggregate values, there is no point in
> including those for parameters which are not used in any way
> whatsoever.
>
> Bootstrapped and tested on x86_64-linux. OK for master?
OK,
thanks!
Honza
>
> Thanks,
>
> Martin
>
>
> gcc/ChangeLog:
>
>
> Hi,
>
> I have noticed that the flag m_split_modifications_p of
> ipa_param_body_adjustments is not really necessary as it has to
> correspond to whether m_replacements is non-empty so this patch
> removes it. This also simplifies a bit some patches I work on.
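The idea in the quoted paragraph, deriving the flag from whether the replacement list is non-empty instead of storing it separately, can be sketched roughly as follows. This is only an illustration in plain C: the struct and function names are hypothetical simplifications, not the actual ipa_param_body_adjustments class.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for ipa_param_body_adjustments.
   Only the number of replacements is modelled here.  */
struct body_adjustments
{
  size_t n_replacements;  /* stands in for the length of m_replacements */
};

/* Instead of keeping a separate "split modifications" flag in sync,
   compute the answer from the replacement count on demand.  */
static bool
has_split_modifications (const struct body_adjustments *adj)
{
  assert (adj != NULL);
  return adj->n_replacements != 0;
}
```

Computing the predicate on demand means there is no second piece of state that could drift out of sync with the list.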
>
> Bootstrapped and tested on
> Ok for trunk if testing passes?
>
> gcc/ChangeLog:
>
> * cgraph.cc (cgraph_node::make_local): Remove redundant set_section.
> * multiple_target.cc (create_dispatcher_calls): Likewise.
OK (not sure how this slipped in)
The code in create_dispatcher_calls is clearly cut of
> On Fri, Oct 21, 2022 at 12:00 PM Kumar, Venkataramanan via Gcc-patches
> wrote:
> >
> > Hi all,
> >
> > > -Original Message-
> > > From: Joshi, Tejas Sanjay
> > > Sent: Monday, October 17, 2022 8:09 PM
> > > To: gcc-patches@gcc.gnu.org
> > > Cc: Kumar, Venkataramanan ;
> > >
>
> 2022-08-26 Martin Jambor
>
> * ipa-prop.h (ipa_agg_value): Remove type.
> (ipa_agg_value_set): Likewise.
> (ipa_copy_agg_values): Remove function.
> (ipa_release_agg_values): Likewise.
> (ipa_auto_call_arg_values): Add a forward declaration.
>
> gcc/ChangeLog:
>
> 2022-08-26 Martin Jambor
>
> * ipa-prop.h (IPA_PROP_ARG_INDEX_LIMIT_BITS): New.
> (ipcp_transformation): Added forward declaration.
> (ipa_argagg_value): New type.
> (ipa_argagg_value_list): New type.
> (ipa_agg_replacement_value): Removed
> This is implicitly mentioned in the docs, but there were some questions
> in a recent patch. This makes it more explicit that -falign-functions is
> meant to be ignored under -Os.
>
> gcc/doc/ChangeLog
>
> * invoke.texi (-falign-functions): Mention -Os
> ---
> gcc/doc/invoke.texi | 3
> >> Probably not hard, and the IPA pass adjusting visibility could as well
> >> mark the functions
> >> as not to be inlined with -flive-patching=inline-only-static.
> >>
>
> OTOH inline-only-static could disable WPA inlining and do all inlining
> early ...
> >>>
> >>>
> > WPA is Whole Program Analysis?
>
> Yes.
>
> > Okay, then It will promote all static function to extern functions. That’s
> > reasonable.
>
> No, all extern functions to static functions.
>
> > Is it hard to preserve the original “static” visibility in the IR?
>
> Probably not hard, and
> On Fri, Oct 7, 2022 at 6:04 AM Kito Cheng wrote:
> >
> > From: Monk Chiang
> >
> > Currently setting of -falign-functions=N will be ignored if the function
> > is optimized for size or marked as cold function.
> >
> > However function alignment requirement is needed even optimized for
> > size
> Hi Honza,
>
> This patch is to add attribute hot judgement for INLINE_HINT_known_hot hint.
>
> We set up INLINE_HINT_known_hot hint only when we have profile feedback,
> now add function attribute judgement for it, when both caller and callee
> have __attribute__((hot)), we will also set up
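To illustrate the situation the quoted patch describes, here is a caller and a callee that both carry __attribute__((hot)). The function names are hypothetical and the code is not taken from the patch; whether the inliner actually sets INLINE_HINT_known_hot for such a pair depends on the patch being applied.

```c
#include <assert.h>

/* Hypothetical callee marked hot.  */
__attribute__((hot)) static int
scale_and_offset (int x)
{
  return x * 2 + 1;
}

/* Hypothetical caller, also marked hot.  With both attributes present,
   the quoted patch would treat calls to scale_and_offset as known-hot
   even without profile feedback.  */
__attribute__((hot)) int
sum_two_samples (int x)
{
  /* Keep the arithmetic well away from signed overflow.  */
  assert (x > -1000000 && x < 1000000);
  return scale_and_offset (x) + scale_and_offset (x + 1);
}
```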
derived types as done by this patch.
Bootstrapped/regtested x86_64-linux, committed.
Honza
gcc/ChangeLog:
2022-08-10 Jan Hubicka
PR middle-end/106057
* ipa-devirt.cc (type_or_derived_type_possibly_instantiated_p): New
function.
(possible_polymorphic_call_targets
> On Tue, 2 Aug 2022, Aldy Hernandez wrote:
>
> > On Tue, Aug 2, 2022 at 1:45 PM Richard Biener wrote:
> > >
> > > On Tue, 2 Aug 2022, Aldy Hernandez wrote:
> > >
> > > > Unfortunately, this was before my time, so I don't know.
> > > >
> > > > That being said, thanks for tackling these issues
> gcc/ChangeLog:
>
> * profile.cc (compute_branch_probabilities): Dump details only
> if TDF_DETAILS.
> * symtab.cc (symtab_node::dump_base): Do not dump pointer unless
> TDF_ADDRESS is used, it makes comparison harder.
> ---
> gcc/profile.cc | 2 +-
> gcc/symtab.cc | 3
> Hi,
>
> with -fno-toplevel-reorder (and -fwhole-program), there apparently can
> be local functions without any callers. This is something that IPA-CP
If there is possibility to trigger a local function without callers, I
think one can also make two local functions calling each other but with
Hello,
> From: Lili
>
>
> Hi Hubicka,
>
> This patch is to add a heuristic inline hint to eliminate redundant load and
> store.
>
> Bootstrap and regtest pending on x86_64-unknown-linux-gnu.
> OK for trunk?
>
> Thanks,
> Lili.
>
> Add a INLINE_HINT_eliminate_load_and_store hint in to
> On Thu, Jun 23, 2022 at 4:03 AM Kewen.Lin wrote:
> >
> > Hi,
> >
> > Gentle ping https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596212.html
> >
> > BR,
> > Kewen
> >
> > on 2022/6/6 14:20, Kewen.Lin via Gcc-patches wrote:
> > > Hi,
> > >
> > > PR105459 exposes one issue in inline_call
edges.
Honza
gcc/cp/ChangeLog:
2022-06-23 Jan Hubicka
* except.cc (declare_library_fn_1): Add fnspec parameter.
(declare_library_fn): Add fnspec parameter.
(do_allocate_exception): Declare fnspecs.
(do_free_exception): Declare fnspecs.
(build_throw
Hi,
this patch adds a missing check to stmt_kills_ref_p for the case where the
function is terminated by EH before the call return value kills the ref. In the
PR I tried to construct a testcase but I don't know how to do that until I
annotate EH code with fnspec attributes, which I will do in a separate patch
and
> @Honza: PING
>
> On 5/20/22 09:46, Martin Liška wrote:
> > On 5/19/22 17:02, Jan Hubicka wrote:
> >>> Similarly to cgraph_nodes, it may happen that body_removed is set
> >>> during merging of symbols.
> >>>
> >>> PR ipa/105600
> PING^2
Sorry, I thought it was approved once we settled on the multiplication
datatype, but apparently I never sent the email.
Patch is OK.
Honza
>
> On 5/24/22 13:35, Martin Liška wrote:
> > PING^1
> >
> > On 5/5/22 20:15, Martin Liška wrote:
> >> On 5/5/2
> Hi,
>
> Function optimize_function_for_size_p returns OPTIMIZE_SIZE_NO
> if func->decl is not null but no cgraph node is available for it.
> As PR105818 shows, this could give unexpected result. For the
> case in PR105818, when parsing bar decl in function foo, the cfun
> is a function
Hi,
this patch prevents ipa-prop from propagating aggregates when the load is
volatile. Martin, does this look OK? It seems to me that ipa-prop may
need some additional volatile flag checks.
Bootstrapped/regtested x86_64-linux, OK?
Honza
gcc/ChangeLog:
2022-06-10 Jan Hubicka
PR ipa
> On Mon, 16 May 2022, Alexander Monakov wrote:
>
> > On Mon, 9 May 2022, Jan Hubicka wrote:
> >
> > > > On second thought, it might be better to keep the assert, and place the
> > > > loop
> > > > under 'if (optimize)'?
> > >
> Similarly to cgraph_nodes, it may happen that body_removed is set
> during merging of symbols.
>
> PR ipa/105600
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?
> Thanks,
> Martin
>
> gcc/ChangeLog:
>
> * ipa-icf.cc
> On 5/16/22 11:25, Jan Hubicka via Gcc-patches wrote:
> >>
> >> Sure having a 'plugin was compiled from sources of the GCC N.M compiler'
> >> is useful if bugs are discovered in old versions that you by definition
> >> cannot
> >> fix but can appl
>
> Sure having a 'plugin was compiled from sources of the GCC N.M compiler'
> is useful if bugs are discovered in old versions that you by definition cannot
> fix but can apply workarounds to. Note the actual compiler used might still
> differ. Note that still isn't clean API documentation /
> On Mon, 2 May 2022, Alexander Monakov wrote:
> > > > --- a/gcc/ipa-visibility.cc
> > > > +++ b/gcc/ipa-visibility.cc
> > > > @@ -872,6 +872,22 @@ function_and_variable_visibility (bool
> > > > whole_program)
> > > > }
> > > > }
> > > > }
> > > > + FOR_EACH_VARIABLE
Hi,
> The patch simplifies usage of the profile_{count,probability} types.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?
The reason I intentionally did not add * and / to the original API was
to detect situations where values that should
> On Thu, 5 May 2022, Jan Hubicka wrote:
>
> > Also note that visibility pass is run twice (once at compile time before
> > early optimizations and then again at LTO). Since LTO linking may
> > promote public symbols to local/hidden, perhaps we want to do this only
> > @@ -872,6 +872,22 @@ function_and_variable_visibility (bool whole_program)
> > }
> > }
> > }
> > + FOR_EACH_VARIABLE (vnode)
> > +{
> > + tree decl = vnode->decl;
> > +
> > + /* Optimize TLS model based on visibility (taking into account
> > +
> On Mon, May 2, 2022 at 10:51 AM Richard Biener
> wrote:
> >
> > On Mon, May 2, 2022 at 10:19 AM Martin Liška wrote:
> > >
> > > On 5/2/22 10:09, Richard Biener wrote:
> > > > On Mon, May 2, 2022 at 9:52 AM Martin Liška wrote:
> > > >>
> > > >> Hi.
> > > >>
> > > >> This in a new plug-in
> On Thu, Apr 28, 2022 at 01:54:51PM +0200, Jan Hubicka wrote:
> > > --- gcc/cgraph.cc.jj 2022-04-20 09:24:12.194579146 +0200
> > > +++ gcc/cgraph.cc 2022-04-27 11:53:52.102173154 +0200
> > > @@ -3488,7 +3488,9 @@ cgraph_node::verify_node (void)
Hello,
> Hi!
>
> The following testcase ICEs, because the ctors during cc1plus all have
> !opt_for_fn (decl, flag_semantic_interposition) - they have NULL
> DECL_FUNCTION_SPECIFIC_OPTIMIZATION (decl) and optimization_default_node
> is for -Ofast and so has flag_semantic_interposition cleared.
>
> Hi,
>
> In the PR, the verifier complains that we did not manage to remove the
> body of a node and it is right. The node is kept for materialization
> of two clones but after one is materialized, the other one is removed
> as unneeded (as a part of delete_unreachable_blocks_update_callgraph).
> On Wed, Apr 20, 2022 at 01:47:43PM +0200, Martin Jambor wrote:
> > Hi,
> >
> > On Wed, Apr 20 2022, Jan Hubicka via Gcc-patches wrote:
> > >> On Wed, 20 Apr 2022, Jakub Jelinek wrote:
> >
> > [...]
> >
>
>
> The cgraph.cc change was what I actually needed for the fix, the
> cgraphclones.cc was only because I've noticed that it constructs a new
> node (so is initialized to whatever random flag_semantic_interposition is
> right now) and initializing it to what it is cloned from made more sense.
> On Wed, Apr 20, 2022 at 10:45:53AM +0200, Jan Hubicka wrote:
> > So this change should be unnecessary unless there are nodes that are
> > missing finalization stage. It also is not good enough since frontends
> > may change opt_for_fn between node creation and finalizati
> From: Sergei Trofimovich
>
> TOPN metrics are histograms that contain overall count and per-bucket
> count. Overall count can be negative when two profiles merge and some
> of per-bucket metrics are dropped.
>
> Noticed as an ICE on python PGO build where gcc crashes as:
>
> during IPA
> The following makes sure that when we build the versioning condition
> for vectorization including the cost model check, we check for the
> cost model and branch over other versioning checks. That is what
> the cost modeling assumes, since the cost model check is the only
> one accounted for in
> On Wed, 20 Apr 2022, Jakub Jelinek wrote:
>
> > Hi!
> >
> > cgraph_node has a semantic_interposition flag which should mirror
> > opt_for_fn (decl, flag_semantic_interposition). But it actually is
> > initialized not from that, but from flag_semantic_interposition in the
> > explicit
ngeLog:
2022-04-11 Jan Hubicka
* ipa-modref-tree.cc (modref_access_node::closer_pair_p): Use
poly_offset_int to avoid overflow.
(modref_access_node::update2): Likewise.
gcc/testsuite/ChangeLog:
2022-04-11 Jan Hubicka
* gcc.c-torture/compile/103818.c: New test.
Hi,
This patch solves a problem with the FE first finalizing a function and then
adding the -fno-semantic-interposition flag (by parsing the optimization attribute).
Bootstrapped/regtested x86_64-linux, committed.
Honza
gcc/ChangeLog:
2022-04-09 Jan Hubicka
PR ipa/103376
* cgraphunit.cc
Hi,
this patch adds logic to propagate nondeterministic and side_effects
bits in modref when summary is updated after inlining.
Bootstrapped/regtested x86_64-linux, committed.
gcc/ChangeLog:
2022-04-09 Jan Hubicka
* ipa-modref.cc (ipa_merge_modref_summary_after_inlining): Propagate
> On Thu, 7 Apr 2022, Jan Hubicka wrote:
>
> > Hi,
> > this patch fixes miscompilation of gnatmake. Modref attempts to track
> > memory
> > accesses relative to the base pointers which are parameters of functions.
> > If it fails, it still makes diff
> On Thu, Apr 7, 2022 at 1:20 PM Jan Hubicka via Gcc-patches
> wrote:
> >
> > Hi,
> > this patch fixes ipa-modref propagation of pure/const functions. When we
> > inline
> > function, the modref summary is updated to represent the function after
.
Honza
gcc/ChangeLog:
2022-04-07 Jan Hubicka
PR 104303
* tree-ssa-alias.cc (ref_may_access_global_memory_p): Fix handling of
refs.
diff --git a/gcc/tree-ssa-alias.cc b/gcc/tree-ssa-alias.cc
index 50bd47b31f3..9e34f76c3cb 100644
--- a/gcc/tree-ssa-alias.cc
+++ b/gcc
.
Honza
gcc/ChangeLog:
2022-04-07 Jan Hubicka
PR ipa/105160
* ipa-modref.cc (ipa_merge_modref_summary_after_inlining):
gcc/testsuite/ChangeLog:
2022-04-07 Jan Hubicka
PR ipa/105160
* gcc.dg/ipa/pr105160.c: New test.
diff --git a/gcc/ipa-modref.cc b/gcc
> Hi,
>
> PR 102513 shows we emit bogus array access warnings when IPA-CP
> creates clones specialized for values which it deduces from arithmetic
> jump functions describing self-recursive calls. Those can however be
> avoided if we consult the IPA-VR information that the same pass also
> has.
> Hi,
>
> in r12-2523-g13586172d0b70c ipa-prop tracking of jump functions during
> inlining got the ability to remove ADDR references when inlining
> discovered that they were not necessary or turn them into LOAD
> references when we know that what was a function call argument passed
> by
> IPA_JF_ANCESTOR jump functions are constructed also when the formal
> parameter of the caller is first checked whether it is NULL and left
> as it is if it is NULL, to accommodate C++ casts to an ancestor class.
>
> The jump function type was invented for devirtualization and IPA-CP
>
tomorrow if there are no
complaints.
Honza
gcc/ChangeLog:
2022-03-28 Jan Hubicka
* config/i386/i386-builtins.cc (ix86_vectorize_builtin_gather): Test
TARGET_USE_GATHER_2PARTS and TARGET_USE_GATHER_4PARTS.
* config/i386/i386.h (TARGET_USE_GATHER_2PARTS): New macro
> +/* Returns whether the control parents of BB are preserved. */
> +
> +static bool
> +control_parents_preserved_p (basic_block bb)
> +{
> + /* If we marked the control parents from BB they are preserved. */
> + if (bitmap_bit_p (visited_control_parents, bb->index))
> +return true;
> +
>
> @@ -1272,7 +1275,7 @@ maybe_optimize_arith_overflow (gimple_stmt_iterator
> *gsi,
> contributes nothing to the program, and can be deleted. */
>
> static bool
> -eliminate_unnecessary_stmts (void)
> +eliminate_unnecessary_stmts (bool aggressive)
> {
>bool something_changed = false;
> That's follow up patch based on the discussion with Jakub.
>
> Ready to be installed?
> Thanks,
> Martin
>
> gcc/ChangeLog:
>
> * config/rs6000/host-darwin.cc (segv_crash_handler):
> Do not use leading capital letter.
> (segv_handler): Likewise.
> * ipa-sra.cc
> The following change avoids doing IPA inlining of small functions
> into functions compiled with -Og - those functions will see almost no
> followup scalar cleanups so that the benefit anticipated by the
> inliner will not be realized and instead the late diagnostic code
> will be confused by
> With -Og we are not prepared to do cleanup after IPA optimizations
> and dead code exposed by those confuses late diagnostic passes.
> This is a first patch removing unwanted IPA optimizations, namely
> both late modref and pure-const analysis.
>
> Bootstrap and regtest running on
> > > > > --- a/gcc/ipa-split.c
> > > > > +++ b/gcc/ipa-split.c
> > > > > @@ -873,7 +873,7 @@ visit_bb (basic_block bb, basic_block return_bb,
> > > > > gimple *stmt = gsi_stmt (bsi);
> > > > > tree op;
> > > > > ssa_op_iter iter;
> > > > > - tree decl;
> > > > > +
> The final index into (ira_)memory_move_cost is 1 for loads and
> 0 for stores. Thus the combination:
>
> entry_freq * memory_cost[1] + exit_freq * memory_cost[0]
>
> is the cost of loading a register on entry to a loop and
> storing it back on exit from the loop. This is the cost to
> use
> Hi!
>
> I'd like to ping the
> https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586553.html
> symtab: Fold == to 0 if folding_initializer [PR94716]
>
> patch. Thanks.
OK.
Note that with LTO partitioning it may happen that alias is defined in
one partition but used in another. We
> >
> > From: Xiong Hu Luo
> >
> > gcc/ChangeLog:
> >
> > * loop-invariant.c (find_invariants_bb): Check profile count
> > before motion.
> > (find_invariants_body): Add argument.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.dg/loop-invariant-2.c: New.
OK,
thanks!
Honza
> - /* Proportion second loop's bb counts except those dominated by false
> -branch to avoid drop 1s down. */
> - basic_block bbi_copy = get_bb_copy (false_edge->dest);
> - bbs2 = get_loop_body (loop2);
> - for (j = 0; j < loop2->num_nodes; j++)
> - if (bbs2[j] ==
Hi,
in the testcase we fail to analyze an SSA name because the flag do_dataflow is
set and thus triggers an early exit in analyze_ssa_name. Fixed by disabling
early exits when handling deferred names.
Bootstrapped/regtested x86_64-linux, committed.
gcc/ChangeLog:
2021-12-20 Jan Hubicka
PR
/regtested x86_64-linux, committed.
gcc/ChangeLog:
2021-12-19 Jan Hubicka
PR ipa/103766
* ipa-modref.c (modref_merge_call_site_flags): Fix early exit condition
diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
index d3590f0b62b..9c411a6297a 100644
--- a/gcc/ipa-modref.c
+++ b/gcc
>
> OK. Comments like?
>
> /* Don't move insn of cold BB out of loop to preheader to reduce calculations
>and register live range in hot loop with cold BB. */
Looks good.
>
>
> And maybe some dump log will help tracking in xxx.c.271r.loop2_invariant.
>
> --- a/gcc/loop-invariant.c
>
> >
> >
> > ./contrib/analyze_brprob.py ~/workspace/tests/spec2017/dump_file_all
> > HEURISTICS BRANCHES (REL) BR. HITRATE
> > HITRATE COVERAGE COVERAGE (REL) predict.def (REL) HOT
> > branches (>10%)
> > noreturn call
Hi,
this is a variant I committed (with updated documentation as Richard
requested).
Honza
gcc/ChangeLog:
2021-12-13 Jan Hubicka
* common.opt: Add -fipa-strict-aliasing.
* doc/invoke.texi: Document -fipa-strict-aliasing.
* ipa-modref.c (modref_access_analysis
> >>> + || (only_for_nonzero && !src_lats->bits_lattice.known_nonzero_p ()))
> >>> + {
> >>> + if (jfunc->bits)
> >>> + return dest_lattice->meet_with (jfunc->bits->value,
> >>> + jfunc->bits->mask, precision);
> >>> + else
> >>> + return
> > gcc/ChangeLog:
> >
> > * loop-invariant.c (find_invariants_bb): Check profile count
> > before motion.
> > (find_invariants_body): Add argument.
> > ---
> > gcc/loop-invariant.c | 10 +++---
> > 1 file changed, 7 insertions(+), 3 deletions(-)
> >
> > diff --git
> r12-4526 cancelled jump thread path rotates loop. It exposes an issue in
> profile-estimate when predict_extra_loop_exits, outer loop's exit edge
> is marked as inner loop's extra loop exit and set with incorrect
> prediction, then a hot inner loop will become cold loop finally through
>
> gcc/ChangeLog:
>
> * loop-invariant.c (find_invariants_bb): Check profile count
> before motion.
> (find_invariants_body): Add argument.
> ---
> gcc/loop-invariant.c | 10 +++---
> 1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/loop-invariant.c
it is such a rare case, I guess we can just punt and give up
on producing it.
Bootstrapped/regtested x86_64-linux, OK?
gcc/ChangeLog:
2021-12-12 Jan Hubicka
* ipa-fnsummary.c (evaluate_conditions_for_known_args): Do not ICE
on ternary expression.
gcc/testsuite/ChangeLog:
2021-12
>
> I think this is common pattern in C++ code originating from cast with
> multiple inheritance. I would vote towards optimizing out the conditional
> move in this case and I think it is correct.
I crafted a testcase and filed it as
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103674
Honza
>
>
>
>
> On 12/12/2021 3:49 AM, Jan Hubicka via Gcc-patches wrote:
> > Hi,
> > As discussed in the PR, we miss some optimization because
> > gimple-ssa-isolate-paths turns NULL memory accesses to volatile and adds
> > __builtin_trap after them. This is seen
> On December 12, 2021 1:22:09 PM GMT+01:00, Jan Hubicka via Gcc-patches
> wrote:
> >Hi,
> >ipa-modref is using TBAA to disambiguate memory accesses inter-procedurally.
> >This sometimes breaks programs with TBAA violations including clang with LTO.
> >To workaroun
> >+ /* NULL memory accesses terminates BB. These accesses are known
> >+ to trip undefined behaviour. gimple-ssa-isolate-paths turns them
> >+ to volatile accesses and adds builtin_trap call which would
> >+ confuse us otherwise. */
> >+ if
patch that
controls only the TBAA based analysis in ipa-modref while keeping all other
optimizations.
Bootstrapped/regtested x86_64-linux, will commit it shortly.
gcc/ChangeLog:
2021-12-12 Jan Hubicka
* common.opt: Add -fipa-strict-aliasing.
* doc/invoke.texi: Document -fipa
disambiguations, 27571176 queries
pt_solutions_intersect: 1594296 disambiguations, 15943975 queries
Bootstrapped/regtested x86_64-linux, committed.
gcc/ChangeLog:
2021-12-12 Jan Hubicka
PR ipa/103665
* ipa-modref.c (modref_access_analysis::analyze
but conceptually simple and handles a lot
of common cases).
gcc/ChangeLog:
2021-12-12 Jan Hubicka
PR ipa/103585
* ipa-modref-tree.c (modref_access_node::range_info_useful_p): Handle
MODREF_GLOBAL_MEMORY_PARM.
(modref_access_node::dump): Likewise.
(modref_access_node
BBs hot and then
unrolling or vectorization may reduce it to some fraction of the count that
makes it cold. We may want to add some buffer and divide the value by,
say 32, but that should be done independently of speculative calls.
gcc/ChangeLog:
2021-12-11 Jan Hubicka
* ipa-profile.c
, will commit it shortly.
gcc/ChangeLog:
2021-12-11 Jan Hubicka
* ipa-modref.c (get_modref_function_summary): Use ultimate_alias_target.
(ignore_edge): Likewise.
(compute_parm_map): Likewise.
(modref_propagate_in_scc): Likewise
> Fixes ICE spotted by Honza where we have a better place where
> to check for no_profile_instrument_function attribute.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?
> Thanks,
> Martin
>
> PR ipa/103636
>
> gcc/ChangeLog:
>
>
> On Fri, Dec 10, 2021 at 2:30 AM Roger Sayle
> wrote:
> >
> >
> > This patch fixes PR ipa/103061 which is P1 regression that shows up as
> > an ICE in ipa-modref-tree.c's insert_kill when compiling the CSiBE
> > benchmark. I believe the underlying cause is that the new kill tracking
> >
>
> Ah, indeed, good idea. FYI, clang++ seems to constant fold
> (x) != (y) already, so Jonathan could use it even for
> clang++ in the constexpr operator==. But it folds even
> extern int ,
> constexpr bool c = !=
> regardless of whether some other TU has
> int a;
> int b
> > I plan to reduce the value before Christmas after a bit more testing
> > since
> > it seems to be overall win even if we trade fatigue2 performance, but I
> > would
> > like to get more testing on larger C++ APPs first.
>
> Will this hurt -Os -finline-limit=0 ?
Why do you use
>
> Yeah, the fast summary array lookup itself seems fine. What slowed
> this down for me was instead:
>
> /* A single function body may be represented by multiple symbols with
> different visibility. For example, if FUNC is an interposable alias,
> we don't want to return
Hi,
as discussed in PR ipa/103454 there are several benchmarks that regress
for -finline-functions-called-once. Runtimes:
- tramp3d with -Ofast. 31%
- exchange2 with -Ofast 11-21%
- roms O2 9%-10%
- tonto 2.5-3.5% with LTO
Build times:
- specfp2006 41% (mostly wrf that builds 71% faster)
-
> > Notice the ??? comment. The code does not set clobbers here because it
> > assumes that tree-ssa-alias will do the right thing.
> > So one may make builtins handling first, PTA next and only if both say
> > "may alias" continue. Other option is to extend the code here to add
> > propert
> The attached patch fixes an ICE in lto1 at lto-partition.c:215 that
> was reported by a customer. Unfortunately I have no test case for
> this; the customer's application is a big C++ shared library with lots
> of dependencies and proprietary code under NDA. I did try reducing it
> with cvise