Re: [PATCH v2 2/3] doc: -falign-functions is ignored under -Os

2022-10-12 Thread Jan Hubicka via Gcc-patches
> This is implicitly mentioned in the docs, but there were some questions > in a recent patch. This makes it more explicit that -falign-functions is > meant to be ignored under -Os. > > gcc/doc/ChangeLog > > * invoke.texi (-falign-functions): Mention -Os > --- > gcc/doc/invoke.texi | 3

Re: [PATCH] IPA: support -flto + -flive-patching=inline-clone

2022-10-07 Thread Jan Hubicka via Gcc-patches
> >> Probably not hard, and the IPA pass adjusting visibility could as well > >> mark the functions > >> as not to be inlined with -flive-patching=inline-only-static. > >> > > OTOH inline-only-static could disable WPA inlining and do all inlining > early ... > >>> > >>>

Re: [PATCH] IPA: support -flto + -flive-patching=inline-clone

2022-10-07 Thread Jan Hubicka via Gcc-patches
> > WPA is Whole Program Analysis? > > Yes. > > > Okay, then it will promote all static functions to extern functions. That’s > > reasonable. > > No, all extern functions to static functions. > > > Is it hard to preserve the original “static” visibility in the IR? > > Probably not hard, and

Re: [PATCH] PR middle-end/88345: Honor -falign-functions=N even optimized for size.

2022-10-07 Thread Jan Hubicka via Gcc-patches
> On Fri, Oct 7, 2022 at 6:04 AM Kito Cheng wrote: > > > > From: Monk Chiang > > > > Currently setting of -falign-functions=N will be ignored if the function > > is optimized for size or marked as cold function. > > > > However function alignment requirement is needed even optimized for > > size

Re: [PATCH] Add attribute hot judgement for INLINE_HINT_known_hot hint.

2022-09-20 Thread Jan Hubicka via Gcc-patches
> Hi Honza, > > This patch is to add attribute hot judgement for INLINE_HINT_known_hot hint. > > We set up INLINE_HINT_known_hot hint only when we have profile feedback, > now add function attribute judgement for it, when both caller and callee > have __attribute__((hot)), we will also set up
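The situation the quoted patch describes, a caller and a callee that both carry `__attribute__((hot))` without any profile feedback, can be sketched at the source level as below. This is only an illustration: the function names are hypothetical, and whether the inliner actually sets INLINE_HINT_known_hot for this call edge depends on the patch under review, not on this code.

```c
#include <assert.h>
#include <stddef.h>

/* Callee marked hot by the programmer; no profile feedback is involved.
   The name sum_squares is hypothetical. */
__attribute__((hot))
static int sum_squares(const int *v, size_t n)
{
    assert(v != NULL || n == 0);
    int acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += v[i] * v[i];
    return acc;
}

/* The caller is marked hot as well, so both ends of the call edge
   carry the attribute, which is the case the snippet talks about. */
__attribute__((hot))
int hot_caller(const int *v, size_t n)
{
    return sum_squares(v, n) + 1;
}
```

Calling `hot_caller` on the array {1, 2, 3} returns the sum of squares plus one.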

Fix invalid devirtualization when combining final keyword and anonymous types

2022-08-12 Thread Jan Hubicka via Gcc-patches
Hi, this patch fixes a wrong code issue where we incorrectly devirtualize to __builtin_unreachable. The problem occurs in combination of anonymous namespaces and final keyword used on methods. We do two optimizations here 1) when reaching final method we cut the search for possible new targets

Re: [PATCH] Properly honor param_max_fsm_thread_path_insns in backwards threader

2022-08-02 Thread Jan Hubicka via Gcc-patches
> On Tue, 2 Aug 2022, Aldy Hernandez wrote: > > > On Tue, Aug 2, 2022 at 1:45 PM Richard Biener wrote: > > > > > > On Tue, 2 Aug 2022, Aldy Hernandez wrote: > > > > > > > Unfortunately, this was before my time, so I don't know. > > > > > > > > That being said, thanks for tackling these issues

Re: [PATCH] IPA: reduce what we dump in normal mode

2022-08-02 Thread Jan Hubicka via Gcc-patches
> gcc/ChangeLog: > > * profile.cc (compute_branch_probabilities): Dump details only > if TDF_DETAILS. > * symtab.cc (symtab_node::dump_base): Do not dump pointer unless > TDF_ADDRESS is used, it makes comparison harder. > --- > gcc/profile.cc | 2 +- > gcc/symtab.cc | 3

Re: [PATCH] ipa-cp: Fix assert triggering with -fno-toplevel-reorder (PR 106260)

2022-07-18 Thread Jan Hubicka via Gcc-patches
> Hi, > > with -fno-toplevel-reorder (and -fwhole-program), there apparently can > be local functions without any callers. This is something that IPA-CP If there is possibility to trigger a local function without callers, I think one can also make two local functions calling each other but with

Re: [PATCH] Add a heuristic for eliminate redundant load and store in inline pass.

2022-07-07 Thread Jan Hubicka via Gcc-patches
Hello, > From: Lili > > > Hi Hubicka, > > This patch is to add a heuristic inline hint to eliminate redundant load and > store. > > Bootstrap and regtest pending on x86_64-unknown-linux-gnu. > OK for trunk? > > Thanks, > Lili. > > Add a INLINE_HINT_eliminate_load_and_store hint in to

Re: PING^1 [PATCH] inline: Rebuild target option node for caller [PR105459]

2022-07-01 Thread Jan Hubicka via Gcc-patches
> On Thu, Jun 23, 2022 at 4:03 AM Kewen.Lin wrote: > > > > Hi, > > > > Gentle ping https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596212.html > > > > BR, > > Kewen > > > > on 2022/6/6 14:20, Kewen.Lin via Gcc-patches wrote: > > > Hi, > > > > > > PR105459 exposes one issue in inline_call

Add fnspec attributes to cxa_* functions

2022-06-23 Thread Jan Hubicka via Gcc-patches
Hi, this patch adds fnspecs for cxa_* functions in except.cc. Main goal is to make modref to see proper side-effects of functions which may throw. So in general we get - cxa_allocate_exception which gets the same annotations as malloc (since it is kind of same thing) - cxa_free_exception

Fix stmt_kills_ref_p wrt external throws

2022-06-23 Thread Jan Hubicka via Gcc-patches
Hi, this patch adds missing check to stmt_kills_ref_p for case that function is terminated by EH before call return value kills the ref. In the PR I tried to construct testcase but I don't know how to do that until I annotate EH code with fnspec attributes which I will do in separate patch and

Re: [PATCH] ipa-icf: skip variables with body_removed

2022-06-22 Thread Jan Hubicka via Gcc-patches
> @Honza: PING > > On 5/20/22 09:46, Martin Liška wrote: > > On 5/19/22 17:02, Jan Hubicka wrote: > >>> Similarly to cgraph_nodes, it may happen that body_removed is set > >>> during merging of symbols. > >>> > >>> PR ipa/105600 > >>> > >>> Patch can bootstrap on x86_64-linux-gnu and survives

Re: [PATCH] Add operators / and * for profile_{count,probability}.

2022-06-17 Thread Jan Hubicka via Gcc-patches
> PING^2 Sorry, I thought it is approved once we settled down the multiplication datatype, but apparently never sent the email. Patch is OK. Honza > > On 5/24/22 13:35, Martin Liška wrote: > > PING^1 > > > > On 5/5/22 20:15, Martin Liška wrote: > >> On 5/5/22 15:49, Jan Hubicka wrote: > >>> Hi,

Re: [PATCH] predict: Adjust optimize_function_for_size_p [PR105818]

2022-06-14 Thread Jan Hubicka via Gcc-patches
> Hi, > > Function optimize_function_for_size_p returns OPTIMIZE_SIZE_NO > if func->decl is not null but no cgraph node is available for it. > As PR105818 shows, this could give unexpected result. For the > case in PR105818, when parsing bar decl in function foo, the cfun > is a function

Fix ipa-prop wrt volatile memory accesses

2022-06-10 Thread Jan Hubicka via Gcc-patches
Hi, this patch prevents ipa-prop from propagating aggregates when load is volatile. Martin, does this look OK? It seems to me that ipa-prop may need some additional volatile flag checks. Bootstrapped/regtested x86_64-linux, OK? Honza gcc/ChangeLog: 2022-06-10 Jan Hubicka PR

Re: [PATCH] ipa-visibility: Optimize TLS access [PR99619]

2022-05-25 Thread Jan Hubicka via Gcc-patches
> On Mon, 16 May 2022, Alexander Monakov wrote: > > > On Mon, 9 May 2022, Jan Hubicka wrote: > > > > > > On second thought, it might be better to keep the assert, and place the > > > > loop > > > > under 'if (optimize)'? > > > > > > The problem is that at IPA level it does not make sense to

Re: [PATCH] ipa-icf: skip variables with body_removed

2022-05-19 Thread Jan Hubicka via Gcc-patches
> Similarly to cgraph_nodes, it may happen that body_removed is set > during merging of symbols. > > PR ipa/105600 > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > Ready to be installed? > Thanks, > Martin > > gcc/ChangeLog: > > * ipa-icf.cc

Re: [PATCH] lto-plugin: add support for feature detection

2022-05-16 Thread Jan Hubicka via Gcc-patches
> On 5/16/22 11:25, Jan Hubicka via Gcc-patches wrote: > >> > >> Sure having a 'plugin was compiled from sources of the GCC N.M compiler' > >> is useful if bugs are discovered in old versions that you by definition > >> cannot > >> fix but can appl

Re: [PATCH] lto-plugin: add support for feature detection

2022-05-16 Thread Jan Hubicka via Gcc-patches
> > Sure having a 'plugin was compiled from sources of the GCC N.M compiler' > is useful if bugs are discovered in old versions that you by definition cannot > fix but can apply workarounds to. Note the actual compiler used might still > differ. Note that still isn't clean API documentation /

Re: [PATCH] ipa-visibility: Optimize TLS access [PR99619]

2022-05-09 Thread Jan Hubicka via Gcc-patches
> On Mon, 2 May 2022, Alexander Monakov wrote: > > > > --- a/gcc/ipa-visibility.cc > > > > +++ b/gcc/ipa-visibility.cc > > > > @@ -872,6 +872,22 @@ function_and_variable_visibility (bool > > > > whole_program) > > > > } > > > > } > > > > } > > > > + FOR_EACH_VARIABLE

Re: [PATCH] Add operators / and * for profile_{count,probability}.

2022-05-05 Thread Jan Hubicka via Gcc-patches
Hi, > The patch simplifies usage of the profile_{count,probability} types. > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > Ready to be installed? The reason I intentionally did not add * and / to the original API was to detect situations where values that should

Re: [PATCH] ipa-visibility: Optimize TLS access [PR99619]

2022-05-05 Thread Jan Hubicka via Gcc-patches
> On Thu, 5 May 2022, Jan Hubicka wrote: > > > Also note that visibility pass is run twice (once at compile time before > > early optimizations and then again at LTO). Since LTO linking may > > promote public symbols to local/hidden, perhaps we want to do this only > > second time the pass is

Re: [PATCH] ipa-visibility: Optimize TLS access [PR99619]

2022-05-05 Thread Jan Hubicka via Gcc-patches
> > @@ -872,6 +872,22 @@ function_and_variable_visibility (bool whole_program) > > } > > } > > } > > + FOR_EACH_VARIABLE (vnode) > > +{ > > + tree decl = vnode->decl; > > + > > + /* Optimize TLS model based on visibility (taking into account > > +

Re: [PATCH] LTO plugin: add ld_plugin_version callback.

2022-05-02 Thread Jan Hubicka via Gcc-patches
> On Mon, May 2, 2022 at 10:51 AM Richard Biener > wrote: > > > > On Mon, May 2, 2022 at 10:19 AM Martin Liška wrote: > > > > > > On 5/2/22 10:09, Richard Biener wrote: > > > > On Mon, May 2, 2022 at 9:52 AM Martin Liška wrote: > > > >> > > > >> Hi. > > > >> > > > >> This in a new plug-in

Re: [PATCH] cgraph: Don't verify semantic_interposition flag for aliases [PR105399]

2022-04-28 Thread Jan Hubicka via Gcc-patches
> On Thu, Apr 28, 2022 at 01:54:51PM +0200, Jan Hubicka wrote: > > > --- gcc/cgraph.cc.jj 2022-04-20 09:24:12.194579146 +0200 > > > +++ gcc/cgraph.cc 2022-04-27 11:53:52.102173154 +0200 > > > @@ -3488,7 +3488,9 @@ cgraph_node::verify_node (void) > > >"returns a pointer"); > > >

Re: [PATCH] cgraph: Don't verify semantic_interposition flag for aliases [PR105399]

2022-04-28 Thread Jan Hubicka via Gcc-patches
Hello, > Hi! > > The following testcase ICEs, because the ctors during cc1plus all have > !opt_for_fn (decl, flag_semantic_interposition) - they have NULL > DECL_FUNCTION_SPECIFIC_OPTIMIZATION (decl) and optimization_default_node > is for -Ofast and so has flag_semantic_interposition cleared. >

Re: [PATCH] ipa: Release body of clone_of when removing its last clone (PR 100413)

2022-04-28 Thread Jan Hubicka via Gcc-patches
> Hi, > > In the PR, the verifier complains that we did not manage to remove the > body of a node and it is right. The node is kept for materialization > of two clones but after one is materialized, the other one is removed > as unneeded (as a part of delete_unreachable_blocks_update_callgraph).

Re: [PATCH] cgraph: Fix up semantic_interposition handling [PR105306]

2022-04-20 Thread Jan Hubicka via Gcc-patches
> On Wed, Apr 20, 2022 at 01:47:43PM +0200, Martin Jambor wrote: > > Hi, > > > > On Wed, Apr 20 2022, Jan Hubicka via Gcc-patches wrote: > > >> On Wed, 20 Apr 2022, Jakub Jelinek wrote: > > > > [...] > > >

Re: [PATCH] cgraph: Fix up semantic_interposition handling [PR105306]

2022-04-20 Thread Jan Hubicka via Gcc-patches
> > The cgraph.cc change was what I actually needed for the fix, the > cgraphclones.cc was only because I've noticed that it constructs a new > node (so is initialized to whatever random flag_semantic_interposition is > right now) and initializing it to what it is cloned from made more sense.

Re: [PATCH] cgraph: Fix up semantic_interposition handling [PR105306]

2022-04-20 Thread Jan Hubicka via Gcc-patches
> On Wed, Apr 20, 2022 at 10:45:53AM +0200, Jan Hubicka wrote: > > So this change should be unnecessary unless there are nodes that are > > missing finalization stage. It also is not good enough since frontends > > may change opt_for_fn between node creation and finalization of > > compilation

Re: [PATCH] gcov-profile: Allow negative counts of indirect calls [PR105282]

2022-04-20 Thread Jan Hubicka via Gcc-patches
> From: Sergei Trofimovich > > TOPN metrics are histograms that contain overall count and per-bucket > count. Overall count can be negative when two profiles merge and some > of per-bucket metrics are dropped. > > Noticed as an ICE on python PGO build where gcc crashes as: > > during IPA

Re: [PATCH][v2] tree-optimization/104912 - ensure cost model is checked first

2022-04-20 Thread Jan Hubicka via Gcc-patches
> The following makes sure that when we build the versioning condition > for vectorization including the cost model check, we check for the > cost model and branch over other versioning checks. That is what > the cost modeling assumes, since the cost model check is the only > one accounted for in

Re: [PATCH] cgraph: Fix up semantic_interposition handling [PR105306]

2022-04-20 Thread Jan Hubicka via Gcc-patches
> On Wed, 20 Apr 2022, Jakub Jelinek wrote: > > > Hi! > > > > cgraph_node has a semantic_interposition flag which should mirror > > opt_for_fn (decl, flag_semantic_interposition). But it actually is > > initialized not from that, but from flag_semantic_interposition in the > > explicit

Avoid overflow in ipa-modref-tree.cc

2022-04-10 Thread Jan Hubicka via Gcc-patches
Hi, the testcase triggers ICE since computation overflows on two accesses that are very far away d->b[-144115188075855873] and d->b[144678138029277184]. This patch makes the relevant part of modref to use poly_offset_int. It is kind of weird to store bit offsets into poly_int64 but it is what
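To see why two accesses this far apart can exceed 64-bit arithmetic once offsets are counted in bits, here is a minimal sketch using the GCC/Clang overflow builtins. The element size is an assumption (the snippet does not show the type of `b`), and the helper names are hypothetical; this is not the code from ipa-modref-tree.cc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Convert an array index into a bit offset held in int64_t.
   Returns true if the computation overflows.  elem_bytes is an
   assumed element size, chosen only for illustration. */
bool index_to_bit_offset(int64_t index, int64_t elem_bytes, int64_t *bits)
{
    assert(elem_bytes > 0);
    int64_t bytes;
    if (__builtin_mul_overflow(index, elem_bytes, &bytes))
        return true;
    return __builtin_mul_overflow(bytes, (int64_t)8, bits);
}

/* Distance in bits between two accesses; returns true on overflow. */
bool bit_distance(int64_t idx_a, int64_t idx_b, int64_t elem_bytes,
                  int64_t *dist)
{
    int64_t a, b;
    if (index_to_bit_offset(idx_a, elem_bytes, &a)
        || index_to_bit_offset(idx_b, elem_bytes, &b))
        return true;
    return __builtin_sub_overflow(b, a, dist);
}
```

With the two indices from the snippet and an assumed 4-byte element, each bit offset still fits in 64 bits but their difference does not, so `bit_distance` reports overflow; with 1-byte elements the difference fits. A wider type such as the poly_offset_int mentioned above avoids the problem.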

Fix ICE with -fno-semantic-interposition added via option attribut

2022-04-09 Thread Jan Hubicka via Gcc-patches
Hi, This patch solves problem with FE first finalizing function and then adding -fno-semantic-interposition flag (by parsing optimization attribute). Bootstrapped/regtested x86_64-linux, committed. Honza gcc/ChangeLog: 2022-04-09 Jan Hubicka PR ipa/103376 * cgraphunit.cc

Fix nondeterministic and side_effect propagation in ipa-modref

2022-04-09 Thread Jan Hubicka via Gcc-patches
Hi, this patch adds logic to propagate nondeterministic and side_effects bits in modref when summary is updated after inlining. Bootstrapped/regtested x86_64-linux, committed. gcc/ChangeLog: 2022-04-09 Jan Hubicka * ipa-modref.cc (ipa_merge_modref_summary_after_inlining): Propagate

Re: Fix wrong code in gnatmake

2022-04-07 Thread Jan Hubicka via Gcc-patches
> On Thu, 7 Apr 2022, Jan Hubicka wrote: > > > Hi, > > this patch fixes miscompilation of gnatmake. Modref attempts to track > > memory > > accesses relative to the base pointers which are parameters of functions. > > If it fails, it still makes difference between unknown memory access and > >

Re: Fix pure/const propagation in modref

2022-04-07 Thread Jan Hubicka via Gcc-patches
> On Thu, Apr 7, 2022 at 1:20 PM Jan Hubicka via Gcc-patches > wrote: > > > > Hi, > > this patch fixes ipa-modref propagation of pure/const functions. When we > > inline > > function, the modref summary is updated to represent the function after > &g

Fix wrong code in gnatmake

2022-04-07 Thread Jan Hubicka via Gcc-patches
Hi, this patch fixes miscompilation of gnatmake. Modref attempts to track memory accesses relative to the base pointers which are parameters of functions. If it fails, it still makes difference between unknown memory access and global memory access. The second makes it possible to disambiguate

Fix pure/const propagation in modref

2022-04-07 Thread Jan Hubicka via Gcc-patches
Hi, this patch fixes ipa-modref propagation of pure/const functions. When we inline function, the modref summary is updated to represent the function after inlining and there we need to propagate nondeterministic and side-effects flag. Bootstrapped/regtested x86_64-linux, will commit it shortly.

Re: [PATCH] ipa-cp: Do not create clones for values outside known value range (PR 102513)

2022-03-31 Thread Jan Hubicka via Gcc-patches
> Hi, > > PR 102513 shows we emit bogus array access warnings when IPA-CP > creates clones specialized for values which it deduces from arithmetic > jump functions describing self-recursive calls. Those can however be > avoided if we consult the IPA-VR information that the same pass also > has.

Re: [PATCH] ipa: Create LOAD references when necessary during inlining (PR 103171)

2022-03-31 Thread Jan Hubicka via Gcc-patches
> Hi, > > in r12-2523-g13586172d0b70c ipa-prop tracking of jump functions during > inlining got the ability to remove ADDR references when inlining > discovered that they were not necessary or turn them into LOAD > references when we know that what was a function call argument passed > by

Re: [PATCH] ipa: Careful processing ANCESTOR jump functions and NULL pointers (PR 103083)

2022-03-31 Thread Jan Hubicka via Gcc-patches
> IPA_JF_ANCESTOR jump functions are constructed also when the formal > parameter of the caller is first checked whether it is NULL and left > as it is if it is NULL, to accommodate C++ casts to an ancestor class. > > The jump function type was invented for devirtualization and IPA-CP >

Disable gathers on zen3 for vectors with few elements

2022-03-27 Thread Jan Hubicka via Gcc-patches
Hi, as seen on TSVC, Spec2017, the Zen3 gather instruction is a win only for vectors with 8 elements. At the time I was implementing the tuning vectorizer did not know how to open-code gather and thus it was still a win to enable it for shorter vector, but this has changed. The following are

Re: [PATCH] tree-optimization/96881 - CD-DCE and CLOBBERs

2022-02-17 Thread Jan Hubicka via Gcc-patches
> +/* Returns whether the control parents of BB are preserved. */ > + > +static bool > +control_parents_preserved_p (basic_block bb) > +{ > + /* If we marked the control parents from BB they are preserved. */ > + if (bitmap_bit_p (visited_control_parents, bb->index)) > +return true; > + >

Re: [PATCH] tree-optimization/96881 - CD-DCE and CLOBBERs

2022-02-15 Thread Jan Hubicka via Gcc-patches
> @@ -1272,7 +1275,7 @@ maybe_optimize_arith_overflow (gimple_stmt_iterator > *gsi, > contributes nothing to the program, and can be deleted. */ > > static bool > -eliminate_unnecessary_stmts (void) > +eliminate_unnecessary_stmts (bool aggressive) > { >bool something_changed = false;

Re: [PATCH] internal_error - do not use leading capital letter

2022-01-27 Thread Jan Hubicka via Gcc-patches
> That's follow up patch based on the discussion with Jakub. > > Ready to be installed? > Thanks, > Martin > > gcc/ChangeLog: > > * config/rs6000/host-darwin.cc (segv_crash_handler): > Do not use leading capital letter. > (segv_handler): Likewise. > * ipa-sra.cc

Re: [PATCH] ipa/103989 - avoid IPA inlining of small functions with -Og

2022-01-18 Thread Jan Hubicka via Gcc-patches
> The following change avoids doing IPA inlining of small functions > into functions compiled with -Og - those functions will see almost no > followup scalar cleanups so that the benefit anticipated by the > inliner will not be realized and instead the late diagnostic code > will be confused by

Re: [PATCH] ipa/103989 - tame IPA optimizations at -Og

2022-01-18 Thread Jan Hubicka via Gcc-patches
> With -Og we are not prepared to do cleanup after IPA optimizations > and dead code exposed by those confuses late diagnostic passes. > This is a first patch removing unwanted IPA optimizations, namely > both late modref and pure-const analysis. > > Bootstrap and regtest running on

Re: [PATCH] Fix tree-optimization/101941: IPA splitting out function with error attribute

2022-01-14 Thread Jan Hubicka via Gcc-patches
> > > > > --- a/gcc/ipa-split.c > > > > > +++ b/gcc/ipa-split.c > > > > > @@ -873,7 +873,7 @@ visit_bb (basic_block bb, basic_block return_bb, > > > > > gimple *stmt = gsi_stmt (bsi); > > > > > tree op; > > > > > ssa_op_iter iter; > > > > > - tree decl; > > > > > +

Re: [PATCH 1/6] ira: Add a ira_loop_border_costs class

2022-01-06 Thread Jan Hubicka via Gcc-patches
> The final index into (ira_)memory_move_cost is 1 for loads and > 0 for stores. Thus the combination: > > entry_freq * memory_cost[1] + exit_freq * memory_cost[0] > > is the cost of loading a register on entry to a loop and > storing it back on exit from the loop. This is the cost to > use
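The cost formula quoted above can be written out as a small function. This is a sketch with plain integers and a hypothetical function name; the real IRA code works on its own frequency and cost tables.

```c
#include <assert.h>

/* Cost of keeping a value in memory around a loop: load it into a
   register on every loop entry and store it back on every loop exit.
   As in the snippet, index 1 of memory_cost is the load cost and
   index 0 is the store cost. */
int loop_border_spill_cost(int entry_freq, int exit_freq,
                           const int memory_cost[2])
{
    assert(entry_freq >= 0 && exit_freq >= 0);
    return entry_freq * memory_cost[1] + exit_freq * memory_cost[0];
}
```

For instance, with a store cost of 4, a load cost of 6, an entry frequency of 10 and an exit frequency of 3, the result is 10 * 6 + 3 * 4.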

Re: Patch ping

2022-01-03 Thread Jan Hubicka via Gcc-patches
> Hi! > > I'd like to ping the > https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586553.html > symtab: Fold == to 0 if folding_initializer [PR94716] > > patch. Thanks. OK. Note that with LTO partitioning it may happen that alias is defined in one partition but used in another. We

Re: [PATCH 1/3] loop-invariant: Don't move cold bb instructions to preheader in RTL

2021-12-29 Thread Jan Hubicka via Gcc-patches
> > > > From: Xiong Hu Luo > > > > gcc/ChangeLog: > > > > * loop-invariant.c (find_invariants_bb): Check profile count > > before motion. > > (find_invariants_body): Add argument. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.dg/loop-invariant-2.c: New. OK, thanks! Honza

Re: [PATCH] Fix ICE in lsplit when built with -O3 -fno-guess-branch-probability [PR103793]

2021-12-28 Thread Jan Hubicka via Gcc-patches
> - /* Proportion second loop's bb counts except those dominated by false > -branch to avoid drop 1s down. */ > - basic_block bbi_copy = get_bb_copy (false_edge->dest); > - bbs2 = get_loop_body (loop2); > - for (j = 0; j < loop2->num_nodes; j++) > - if (bbs2[j] ==

Fix handling of deferred SSA names in modref

2021-12-19 Thread Jan Hubicka via Gcc-patches
Hi, in the testcase we fail to analyze SSA name because flag do_dataflow is set and thus triggers early exit in analyze_ssa_name. Fixed by disabling early exits when handling deferred names. Bootstrapped/regtested x86_64-linux, committed. gcc/ChangeLog: 2021-12-20 Jan Hubicka PR

Fix early exit in modref_merge_call_site_flags

2021-12-19 Thread Jan Hubicka via Gcc-patches
Hi, when adding support for static chain and return slot flags I forgot to update early exit condition in modref_merge_call_site_flags. This yields to wrong code as demonstrated by the Fortran testcase attached to PR (which I hope someone will help me to turn into testsuite one).

Re: [PATCH 1/3] loop-invariant: Don't move cold bb instructions to preheader in RTL

2021-12-16 Thread Jan Hubicka via Gcc-patches
> > OK. Comments like? > > /* Don't move insn of cold BB out of loop to preheader to reduce calculations >and register live range in hot loop with cold BB. */ Looks good. > > > And maybe some dump log will help tracking in xxx.c.271r.loop2_invariant. > > --- a/gcc/loop-invariant.c >

Re: [PATCH 2/3] Fix incorrect loop exit edge probability [PR103270]

2021-12-16 Thread Jan Hubicka via Gcc-patches
> > > > > > ./contrib/analyze_brprob.py ~/workspace/tests/spec2017/dump_file_all > > HEURISTICS BRANCHES (REL) BR. HITRATE > > HITRATE COVERAGE COVERAGE (REL) predict.def (REL) HOT > > branches (>10%) > > noreturn call

Re: Add -fipa-strict-aliasing

2021-12-13 Thread Jan Hubicka via Gcc-patches
Hi, this is a variant I committed (with updated documentation as Richard requested). Honza gcc/ChangeLog: 2021-12-13 Jan Hubicka * common.opt: Add -fipa-strict-aliasing. * doc/invoke.texi: Document -fipa-strict-aliasing. * ipa-modref.c

Re: [PATCH] ipa: Careful processing ANCESTOR jump functions and NULL pointers (PR 103083)

2021-12-13 Thread Jan Hubicka via Gcc-patches
> >>> + || (only_for_nonzero && !src_lats->bits_lattice.known_nonzero_p ())) > >>> + { > >>> + if (jfunc->bits) > >>> + return dest_lattice->meet_with (jfunc->bits->value, > >>> + jfunc->bits->mask, precision); > >>> + else > >>> + return

Re: [PATCH 1/3] loop-invariant: Don't move cold bb instructions to preheader in RTL

2021-12-13 Thread Jan Hubicka via Gcc-patches
> > gcc/ChangeLog: > > > > * loop-invariant.c (find_invariants_bb): Check profile count > > before motion. > > (find_invariants_body): Add argument. > > --- > > gcc/loop-invariant.c | 10 +++--- > > 1 file changed, 7 insertions(+), 3 deletions(-) > > > > diff --git

Re: [PATCH 2/3] Fix incorrect loop exit edge probability [PR103270]

2021-12-13 Thread Jan Hubicka via Gcc-patches
> r12-4526 cancelled jump thread path rotates loop. It exposes an issue in > profile-estimate when predict_extra_loop_exits, outer loop's exit edge > is marked as inner loop's extra loop exit and set with incorrect > prediction, then a hot inner loop will become cold loop finally through >

Re: [PATCH 1/3] loop-invariant: Don't move cold bb instructions to preheader in RTL

2021-12-13 Thread Jan Hubicka via Gcc-patches
> gcc/ChangeLog: > > * loop-invariant.c (find_invariants_bb): Check profile count > before motion. > (find_invariants_body): Add argument. > --- > gcc/loop-invariant.c | 10 +++--- > 1 file changed, 7 insertions(+), 3 deletions(-) > > diff --git a/gcc/loop-invariant.c

Do not ICE when computing value range of ternary expression

2021-12-12 Thread Jan Hubicka via Gcc-patches
Hi, In evaluate_conditions_for_known_args we use range_fold_unary_expr and range_fold_binary_expr to produce value ranges of the expression. However the expression also may contain ternary COND_EXPR on which we ICE. I did not find interface to do similar folding easily on ternary exprs and since

Re: Terminate BB analysis on NULL pointer access in ipa-pure-const and ipa-modref

2021-12-12 Thread Jan Hubicka via Gcc-patches
> > I think this is common pattern in C++ code originating from cast with > multiple inheritance. I would vote towards optimizing out the conditional > move in this case and I think it is correct. I crafted a testcase and filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103674 Honza > >

Re: Terminate BB analysis on NULL pointer access in ipa-pure-const and ipa-modref

2021-12-12 Thread Jan Hubicka via Gcc-patches
> > > On 12/12/2021 3:49 AM, Jan Hubicka via Gcc-patches wrote: > > Hi, > > As discussed in the PR, we miss some optimization because > > gimple-ssa-isolate-paths turns NULL memory accesses to volatile and adds > > __builtin_trap after them. This is seen

Re: Add -fipa-strict-aliasing

2021-12-12 Thread Jan Hubicka via Gcc-patches
> On December 12, 2021 1:22:09 PM GMT+01:00, Jan Hubicka via Gcc-patches > wrote: > >Hi, > >ipa-modref is using TBAA to disambiguate memory accesses inter-procedurally. > >This sometimes breaks programs with TBAA violations including clang with LTO. > >To workaroun

Re: Terminate BB analysis on NULL pointer access in ipa-pure-const and ipa-modref

2021-12-12 Thread Jan Hubicka via Gcc-patches
> >+ /* NULL memory accesses terminates BB. These accesses are known > >+ to trip undefined behaviour. gimple-ssa-isolate-paths turns them > >+ to volatile accesses and adds builtin_trap call which would > >+ confuse us otherwise. */ > >+ if

Add -fipa-strict-aliasing

2021-12-12 Thread Jan Hubicka via Gcc-patches
Hi, ipa-modref is using TBAA to disambiguate memory accesses inter-procedurally. This sometimes breaks programs with TBAA violations including clang with LTO. To workaround that one can use -fno-strict-aliasing or -fno-ipa-modref which are both quite big hammers. So I added -fipa-strict-aliasing
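For readers unfamiliar with what a TBAA violation looks like across a function boundary, here is a minimal sketch. All names are hypothetical, and it assumes a 32-bit IEEE 754 `float`. It only illustrates the kind of code that type-based, inter-procedural disambiguation can break; it makes no claim about what the new option does to it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t),
               "sketch assumes a 32-bit float");

/* Stores through a uint32_t pointer.  A type-based summary of this
   function may conclude that it cannot modify any float object. */
void set_bits(uint32_t *p, uint32_t v)
{
    assert(p != NULL);
    *p = v;
}

/* TBAA violation: a float object is modified through a uint32_t
   lvalue.  This is undefined behaviour under strict aliasing, so an
   optimizer that trusts the summary of set_bits may return 0.0f. */
float from_bits_violating(uint32_t bits)
{
    float f = 0.0f;
    set_bits((uint32_t *)&f, bits);
    return f;
}

/* Conforming alternative: copy the object representation. */
float from_bits_conforming(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Only the `memcpy` variant has a defined result; on an IEEE 754 target, the bit pattern 0x3f800000 yields 1.0f.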

Terminate BB analysis on NULL pointer access in ipa-pure-const and ipa-modref

2021-12-12 Thread Jan Hubicka via Gcc-patches
Hi, As discussed in the PR, we miss some optimization because gimple-ssa-isolate-paths turns NULL memory accesses to volatile and adds __builtin_trap after them. This is seen as a side-effect by IPA analysis and additionally the (fully unreachable) builtin_trap is believed to load all global

Distinguish global and unknown memory accesses in ipa-modref

2021-12-12 Thread Jan Hubicka via Gcc-patches
Hi, As discussed in PR103585, fatigue2 is now the only benchmark from my usual testing set (SPEC2k6, SPEC2k17, CPP benchmarks, polyhedron, Firefox, clang) which sees important regression when inlining functions called once is limited. This prevents us from solving runtime issues in roms benchmarks

Fix handling of histogram in ipa-profile

2021-12-11 Thread Jan Hubicka via Gcc-patches
Hi, this patch removes apparently forgotten debugging hack (which got in during the speculative call patchset) which reduces hot bb threshold. This does not make sense since it is set and reset randomly as the summaries are processed. One problem is that we set the BB threshold to make certain

Fix ipa-modref handling of thunks

2021-12-11 Thread Jan Hubicka via Gcc-patches
Hi, thunks are not transparent for ipa-modref summary since it cares about offsets from pointer parameters and also for virtual thunk about the read from memory in there. We however use function_or_virtual_thunk_symbol to get the summary that may lead to wrong code (and does in two testsuite

Re: [PATCH] inline: fix ICE with -fprofile-generate

2021-12-10 Thread Jan Hubicka via Gcc-patches
> Fixes ICE spotted by Honza where we have a better place where > to check for no_profile_instrument_function attribute. > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > Ready to be installed? > Thanks, > Martin > > PR ipa/103636 > > gcc/ChangeLog: > >

Re: [PATCH] PR ipa/103601: ICE compiling CSiBE in ipa-modref's insert_kill

2021-12-10 Thread Jan Hubicka via Gcc-patches
> On Fri, Dec 10, 2021 at 2:30 AM Roger Sayle > wrote: > > > > > > This patch fixes PR ipa/103601 which is P1 regression that shows up as > > an ICE in ipa-modref-tree.c's insert_kill when compiling the CSiBE > > benchmark. I believe the underlying cause is that the new kill tracking > >

Re: [PATCH] c++, symtab: Support (x) == (y) in constant evaluation [PR103600]

2021-12-09 Thread Jan Hubicka via Gcc-patches
> > Ah, indeed, good idea. FYI, clang++ seems to constant fold > (x) != (y) already, so Jonathan could use it even for > clang++ in the constexpr operator==. But it folds even > extern int , > constexpr bool c = != > regardless of whether some other TU has > int a; > int b

Re: Limit inlining functions called once

2021-12-09 Thread Jan Hubicka via Gcc-patches
> > I plan to reduce the value before Christmas after a bit more testing > > since > > it seems to be an overall win even if we trade fatigue2 performance, but I > > would > > like to get more testing on larger C++ apps first. > > Will this hurt -Os -finline-limit=0 ? Why do you use

Re: [PATCH] alias: Optimise call_may_clobber_ref_p

2021-12-07 Thread Jan Hubicka via Gcc-patches
> > Yeah, the fast summary array lookup itself seems fine. What slowed > this down for me was instead: > > /* A single function body may be represented by multiple symbols with > different visibility. For example, if FUNC is an interposable alias, > we don't want to return

Limit inlining functions called once

2021-12-07 Thread Jan Hubicka via Gcc-patches
Hi, as discussed in PR ipa/103454 there are several benchmarks that regress for -finline-functions-called-once. Runtimes: - tramp3d with -Ofast. 31% - exchange2 with -Ofast 11-21% - roms O2 9%-10% - tonto 2.5-3.5% with LTO Build times: - specfp2006 41% (mostly wrf that builds 71% faster) -

Re: [PATCH] alias: Optimise call_may_clobber_ref_p

2021-12-07 Thread Jan Hubicka via Gcc-patches
> > Notice the ??? comment. The code does not set clobbers here because it > > assumes that tree-ssa-alias will do the right thing. > > So one may make builtins handling first, PTA next and only if both say > > "may alias" continue. Other option is to extend the code here to add > > propert

Re: [patch] lto: Don't run ipa-comdats pass during LTO

2021-12-07 Thread Jan Hubicka via Gcc-patches
> The attached patch fixes an ICE in lto1 at lto-partition.c:215 that > was reported by a customer. Unfortunately I have no test case for > this; the customer's application is a big C++ shared library with lots > of dependencies and proprietary code under NDA. I did try reducing it > with cvise

Re: [PATCH] alias: Optimise call_may_clobber_ref_p

2021-12-06 Thread Jan Hubicka via Gcc-patches
> On Mon, Dec 6, 2021 at 4:03 PM Richard Biener > wrote: > > > > On Mon, Dec 6, 2021 at 11:10 AM Richard Sandiford > > wrote: > > > > > > Richard Biener writes: > > > > On Sun, Dec 5, 2021 at 10:59 PM Richard Sandiford via Gcc-patches > > > > wrote: > > > >> > > > >> When compiling an

Re: Improve -fprofile-report

2021-12-03 Thread Jan Hubicka via Gcc-patches
> On 11/27/21 16:56, Jan Hubicka via Gcc-patches wrote: > > Hi, > > Profile-report was never properly updated after switch to new profile > > representation. This patch fixes the way profile mismatches are > > calculated: we used to collect separately count and freq m

Re: [PATCH] ipa-param-manip: Be careful about a reallocating hash_map (PR 103449)

2021-11-29 Thread Jan Hubicka via Gcc-patches
> Hi, > > PR 103449 revealed that when I was storing result of one hash_map > lookup into another entry in the hash_map, I was still accessing the > entry in the table, which meanwhile could get reallocated, making the > accesses invalid-after-free. > > Fixed with the following, which also
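
The hazard described in this snippet (holding on to an entry while a later insertion moves the table's storage) can be sketched as follows. This is only an illustration under stated assumptions: `tiny_table`, `get`, `put` and `copy_entry` are hypothetical names, the table is a plain vector rather than GCC's hash_map, and the code is not taken from the patch. It shows the safe pattern of copying the looked-up value out of the table before inserting.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a table whose storage may move when it grows.
struct tiny_table {
  std::vector<std::pair<int, std::string>> slots;

  // Return a pointer to the value stored for KEY, or nullptr if absent.
  // The pointer is only valid until the next insertion.
  std::string *get(int key) {
    for (auto &slot : slots)
      if (slot.first == key)
        return &slot.second;
    return nullptr;
  }

  // Insert or overwrite; growing the vector may invalidate earlier pointers.
  void put(int key, const std::string &value) {
    if (std::string *existing = get(key)) {
      *existing = value;
      return;
    }
    slots.emplace_back(key, value);
  }
};

// Copy the value of FROM into TO without keeping a pointer into the
// table alive across the insertion.
bool copy_entry(tiny_table &table, int from, int to) {
  std::string *src = table.get(from);
  if (!src)
    return false;
  std::string copy = *src;  // take the copy before the table can grow
  table.put(to, copy);      // COPY lives outside the table, so this is safe
  assert(table.get(to) != nullptr);
  return true;
}
```

The design choice is simply to pay for one extra copy so that nothing read after the insertion points into storage the table may have released.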

Compare guessed profile frequencies to actual profile feedback in profile dump file

2021-11-28 Thread Jan Hubicka via Gcc-patches
Hi, this patch adds simple code to dump and compare frequencies of basic blocks read from the profile feedback and frequencies guessed statically. It dumps basic blocks in the order of decreasing frequencies from feedback along with guessed frequencies and histograms. It makes it possible to spot

Improve -fprofile-report

2021-11-27 Thread Jan Hubicka via Gcc-patches
Hi, Profile-report was never properly updated after switch to new profile representation. This patch fixes the way profile mismatches are calculated: we used to collect separately count and freq mismatches, while now we have only counts & probabilities. So we verify - in count: that total

Re: [PATCH] ipa: Careful processing ANCESTOR jump functions and NULL pointers (PR 103083)

2021-11-27 Thread Jan Hubicka via Gcc-patches
> Hi, > > IPA_JF_ANCESTOR jump functions are constructed also when the formal > parameter of the caller is first checked whether it is NULL and left > as it is if it is NULL, to accommodate C++ casts to an ancestor class. > > The jump function type was invented for devirtualization and IPA-CP >

Minor modref tweaks

2021-11-26 Thread Jan Hubicka via Gcc-patches
Hi, while working on analyzing the previous miscompile I made dumps easier to read by dumping the cgraph_node name rather than the cfun name in the function being analysed and I also fixed a minor issue with ECF flags merging when updating the inline summary. gcc/ChangeLog: 2021-11-26 Jan Hubicka *

Fix wrong code caused by min_flags update in update_summary

2021-11-26 Thread Jan Hubicka via Gcc-patches
Hi, update_escape_summary_1 has a thinko where it computes the proper min_flags but then stores the original value (ignoring the fact whether there was a dereference in the escape point). Bootstrapped/regtested and committed. PR ipa/103432 * ipa-modref.c (update_escape_summary_1): Fix handling

Fix fail in inline-9.c testcase

2021-11-26 Thread Jan Hubicka via Gcc-patches
Hi, it turns out that I made a testcase for value range propagation (which was disabled by an accidental return statement) but the testcase was confused by partial inlining. The right number of inlines is 2, since the function in question is first split and then both function and the split part

Re: [PATCH] ipa: Teach IPA-CP transformation about IPA-SRA modifications (PR 103227)

2021-11-25 Thread Jan Hubicka via Gcc-patches
> > > > In ipa-modref I precompute this to map so we do not need to walk all > > params, but the loop is probably not bad since functions do not have > > tens of thousands parameters :) > > The most I have seen is about 70 and those were big outliers. > > I was thinking of precomputing it

Re: [PATCH] [RFC] unreachable returns

2021-11-25 Thread Jan Hubicka via Gcc-patches
> We have quite a number of "default" returns that cannot be reached. > One is particularly interesting since it says (see patch below): > > default: >gcc_unreachable (); > } >/* We can get here with --disable-checking. */ >return false; > > which suggests that _maybe_
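
The pattern quoted in this snippet, a `default:` that asserts unreachability followed by a fallback `return`, can be sketched as below. This is a minimal sketch, not GCC code: `SKETCH_UNREACHABLE`, `SKETCH_CHECKING`, `kind` and `is_fancy` are hypothetical names, and the macro is assumed to compile to nothing when checking is off, which is what makes the trailing `return` reachable in that configuration.

```cpp
#include <cstdlib>

// Hypothetical checking macro: aborts when checking is enabled and
// compiles to nothing otherwise (an assumption made for this sketch).
#ifdef SKETCH_CHECKING
#define SKETCH_UNREACHABLE() std::abort()
#else
#define SKETCH_UNREACHABLE() ((void)0)
#endif

enum class kind { plain, fancy };

bool is_fancy(kind k) {
  switch (k) {
    case kind::plain:
      return false;
    case kind::fancy:
      return true;
    default:
      SKETCH_UNREACHABLE();
  }
  // Reached only when the check above compiles to nothing.
  return false;
}
```

With `SKETCH_CHECKING` defined, an out-of-range value aborts inside the switch; without it, control falls out of the switch and the final `return false` supplies the result.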

Do not check gimple_call_chain in tree-ssa-alias

2021-11-25 Thread Jan Hubicka via Gcc-patches
Hi, this patch removes the gimple_call_chain check in ref_maybe_used_by_call_p that disables the check for CONST functions. I suppose it was meant to allow consts to read variables from the static chain but this is not what other places do. The testcase: int main() { int a =0;

Re: [PATCH] Remove dead code and function

2021-11-25 Thread Jan Hubicka via Gcc-patches
> The only use of get_alias_symbol is gated by a gcc_unreachable (), > so the following patch gets rid of it. > > Bootstrapped and tested on x86_64-unknown-linux-gnu, OK? OK, thanks! Honza

Re: [PATCH] ipa: Teach IPA-CP transformation about IPA-SRA modifications (PR 103227)

2021-11-25 Thread Jan Hubicka via Gcc-patches
> > gcc/ChangeLog: > > 2021-11-23 Martin Jambor > > PR ipa/103227 > * ipa-prop.h (ipa_get_param): New overload. Move bits of the existing > one to the new one. > * ipa-param-manipulation.h (ipa_param_adjustments): New member > function

Fix handling of static chain in modref

2021-11-24 Thread Jan Hubicka via Gcc-patches
Hi, this patch fixes a wrong-code issue where modref did not propagate flags for static chain in ipa_merge_modref_summary_after_inlining. It is a place I missed to update in the original patch extending return slot tracking to static chain. Unlike return slot we need to propagate flags here (return

Re: [PATCH] tree-optimization/103168 - Improve VN of pure function calls

2021-11-24 Thread Jan Hubicka via Gcc-patches
> > Yes, note that we don't have callused unless IPA PTA is enabled, > but it might be salvageable from IPA reference info? What we're > missing is a stmt_clobbers_pt_solution_p, or rather a reasonably > cheap way to construct an ao_ref covering all of a points-to > solution. The not-so-cheap

Re: [PATCH] tree-optimization/103168 - Improve VN of pure function calls

2021-11-24 Thread Jan Hubicka via Gcc-patches
> This improves value-numbering of calls that read memory, calls > to const functions with aggregate arguments and calls to > pure functions where the latter include const functions we > demoted to pure for the fear of interposing with a less > optimized version. Note that for pure functions we
