Re: Basic kill analysis for modref

2021-11-12 Thread Jan Hubicka via Gcc-patches
> > I wonder why we bother producing summaries for things that do not > bind locally? The summary->kills.length () has an upper bound? Because of local aliases. The size of the array is capped by param_max_modref_accesses which is 16. > > > + && summary->kills.length ()) > > + { > > +

Fix ipa-modref pure/const discovery

2021-11-12 Thread Jan Hubicka via Gcc-patches
Hi, this patch fixes a bug I introduced while breaking up the bigger change. We currently cannot use pure/const to discover looping pures, since lack of global memory writes/stores does not imply we can CSE on the function. This is witnessed by the testsuite doing volatile asm, or also can happen if

Re: Use modref summary to DSE calls to non-pure functions

2021-11-12 Thread Jan Hubicka via Gcc-patches
Hi, this is an updated patch. It moves the summary walk checking if we can possibly succeed on DSE to the summary->finalize member function so it is done once per summary, and refactors dse_optimize_call to be called from dse_optimize_stmt after early checks. I did not try to handle the special case of

Use modref kills in tree-ssa-dse

2021-11-14 Thread Jan Hubicka via Gcc-patches
Hi, this patch extends tree-ssa-dse to use modref kill summary to clear live_bytes. This makes it possible to remove calls that are killed in parts. I noticed that DSE duplicates the logic of tree-ssa-alias that is matching bases of memory accesses. Here operands_equal_p (base1, base,

Track nondeterminism and interposable calls in ipa-modref

2021-11-14 Thread Jan Hubicka via Gcc-patches
Hi, This patch adds tracking of two new flags in ipa-modref: nondeterministic and calls_interposable. The first is set when a function does something that is not guaranteed to be the same if run again (volatile memory access, volatile asm or external function call). The second is set if the function calls

Re: Basic kill analysis for modref

2021-11-14 Thread Jan Hubicka via Gcc-patches
> > > > I think you want get_addr_base_and_unit_offset here. All > > variable indexed addresses are in separate stmts. That also means > > you can eventually work with just byte sizes/offsets? > > Will do. The access range in modref summary is bit based (since we want > to disambiguate

Re: [PATCH] tree-optimization/103168 - Improve VN of pure function calls

2021-11-24 Thread Jan Hubicka via Gcc-patches
> This improves value-numbering of calls that read memory, calls > to const functions with aggregate arguments and calls to > pure functions where the latter include const functions we > demoted to pure for the fear of interposing with a less > optimized version. Note that for pure functions we

Re: [PATCH] tree-optimization/103168 - Improve VN of pure function calls

2021-11-24 Thread Jan Hubicka via Gcc-patches
> > Yes, note that we don't have callused unless IPA PTA is enabled, > but it might be salvageable from IPA reference info? What we're > missing is a stmt_clobbers_pt_solution_p, or rather a reasonably > cheap way to construct an ao_ref covering all of a points-to > solution. The not-so-cheap

Improve -fprofile-report

2021-11-27 Thread Jan Hubicka via Gcc-patches
Hi, Profile-report was never properly updated after switch to new profile representation. This patch fixes the way profile mismatches are calculated: we used to collect separately count and freq mismatches, while now we have only counts & probabilities. So we verify - in count: that total

Compare guessed profile frequencies to actual profile feedback in profile dump file

2021-11-28 Thread Jan Hubicka via Gcc-patches
Hi, this patch adds simple code to dump and compare frequencies of basic blocks read from the profile feedback and frequencies guessed statically. It dumps basic blocks in the order of decreasing frequencies from feedback along with guessed frequencies and histograms. It makes it possible to spot

Re: [PATCH] ipa: Careful processing ANCESTOR jump functions and NULL pointers (PR 103083)

2021-11-27 Thread Jan Hubicka via Gcc-patches
> Hi, > > IPA_JF_ANCESTOR jump functions are constructed also when the formal > parameter of the caller is first checked whether it is NULL and left > as it is if it is NULL, to accommodate C++ casts to an ancestor class. > > The jump function type was invented for devirtualization and IPA-CP >

Fix handling of static chain in modref

2021-11-24 Thread Jan Hubicka via Gcc-patches
Hi, this patch fixes a wrong-code issue where modref did not propagate flags for static chain in ipa_merge_modref_summary_after_inlining. It is a place I missed updating in the original patch extending return slot tracking to static chain. Unlike return slot we need to propagate flags here (return

Fix fail in inline-9.c testcase

2021-11-26 Thread Jan Hubicka via Gcc-patches
Hi, it turns out that I made a testcase for value range propagation (which was disabled by an accidental return statement) but the testcase was confused by partial inlining. The right number of inlines is 2, since the function in question is first split and then both function and the split part

Fix wrong code caused by min_flags update in update_summary

2021-11-26 Thread Jan Hubicka via Gcc-patches
Hi, update_escape_summary_1 has a thinko where it computes proper min_flags but then stores the original value (ignoring the fact whether there was a dereference in the escape point). Bootstrapped/regtested and committed. PR ipa/103432 * ipa-modref.c (update_escape_summary_1): Fix handling

Minor modref tweaks

2021-11-26 Thread Jan Hubicka via Gcc-patches
Hi, while working on analyzing the previous miscompile I made dumps easier to read by dumping cgraph_node name rather than cfun name in function being analysed and I also fixed a minor issue with ECF flags merging when updating inline summary. gcc/ChangeLog: 2021-11-26 Jan Hubicka *

Re: [PATCH] ipa: Teach IPA-CP transformation about IPA-SRA modifications (PR 103227)

2021-11-25 Thread Jan Hubicka via Gcc-patches
> > gcc/ChangeLog: > > 2021-11-23 Martin Jambor > > PR ipa/103227 > * ipa-prop.h (ipa_get_param): New overload. Move bits of the existing > one to the new one. > * ipa-param-manipulation.h (ipa_param_adjustments): New member > function

Re: [PATCH] Remove dead code and function

2021-11-25 Thread Jan Hubicka via Gcc-patches
> The only use of get_alias_symbol is gated by a gcc_unreachable (), > so the following patch gets rid of it. > > Bootstrapped and tested on x86_64-unknown-linux-gnu, OK? OK, thanks! Honza

Do not check gimple_call_chain in tree-ssa-alias

2021-11-25 Thread Jan Hubicka via Gcc-patches
Hi, this patch removes the gimple_call_chain check in ref_maybe_used_by_call_p that disables the check for CONST functions. I suppose it was meant to allow consts to read variables from the static chain but this is not what other places do. The testcase: int main() { int a =0;

Re: [PATCH] [RFC] unreachable returns

2021-11-25 Thread Jan Hubicka via Gcc-patches
> We have quite a number of "default" returns that cannot be reached. > One is particularly interesting since it says (see patch below): > > default: >gcc_unreachable (); > } >/* We can get here with --disable-checking. */ >return false; > > which suggests that _maybe_

Improve modref tracking of base pointers

2021-11-22 Thread Jan Hubicka via Gcc-patches
Hi, on the exchange2 benchmark we miss some useful propagation because modref gives up very early on analyzing accesses through pointers. For example in int test (int *a) { int i; for (i=0; a[i];i++); return i+a[i]; } We are not able to determine that a[i] accesses are relative to a. This is

Fix crash in gamess

2021-11-13 Thread Jan Hubicka via Gcc-patches
Hi, this patch adds debug counters for pure/const discovery and fixes a somewhat embarrassing pasto I made while breaking out ipa_make_function_* helpers out of propagate_pure_const which led to the wrong function being marked as pure, which in turn leads to wrong code. My apologies for that.

Re: Enable pure/const discovery in modref

2021-11-12 Thread Jan Hubicka via Gcc-patches
> Hi Honza, > > On Thu, 11 Nov 2021 17:39:18 +0100 > Jan Hubicka via Gcc-patches wrote: > > > diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c > > index 422b52fba4b..550bdeded16 100644 > > --- a/gcc/ipa-pure-const.c > > +++ b/gcc/

Fix wrong code with pure functions

2021-11-12 Thread Jan Hubicka via Gcc-patches
Fix wrong code with pure functions. I introduced a bug into find_func_aliases_for_call in handling pure functions. Instead of reading global memory, pure functions are believed to write global memory. This results in misoptimization of the testcase at -O1. The change to pta-callused.c updates the

Re: [PATCH] vect: Remove vec_outside/inside_cost fields

2021-11-11 Thread Jan Hubicka via Gcc-patches
> > > > > > I think the patch causes the following on x86_64-linux-gnu: > > > FAIL: gfortran.dg/inline_matmul_17.f90 -O scan-tree-dump-times > > > optimized "matmul_r4" 2 > > > > I get that failure even with d70ef65692f (from before the patches > > I committed today). > > Sorry, you are

Fix optimization difference caused by -fdump-ipa-inline

2021-11-16 Thread Jan Hubicka via Gcc-patches
Hi, This patch fixes a bug that caused some optimizations to be dropped with -fdump-ipa-inline. gcc/ChangeLog: 2021-11-17 Jan Hubicka PR ipa/103246 * ipa-modref.c (ipa_merge_modref_summary_after_inlining): Fix clearing of to_info_lto diff --git a/gcc/ipa-modref.c

Re: [PATCH] Fix IPA modref ubsan.

2021-11-18 Thread Jan Hubicka via Gcc-patches
> modref_tree::merge(modref_tree*, vec va_heap, vl_ptr>*, modref_parm_map*, bool) > > is called with modref_parm_map chain_map; > > The variable has uninitialized m.parm_offset_known and it is accessed > here: > > gcc/ipa-modref-tree.h:572 a.parm_offset_known &= m.parm_offset_known; > > Ready

Fix modref and handling of some builtins

2021-11-12 Thread Jan Hubicka via Gcc-patches
Hi, ipa-modref gets confused by EAF flags of memcpy because parameter 1 is escaping but used only directly. In modref we do not track values saved to memory and thus we clear all other flags on each store. This needs to also happen when the called function escapes the parameter. gcc/ChangeLog:

Re: [PATCH] ipa: Teach IPA-CP transformation about IPA-SRA modifications (PR 103227)

2021-11-25 Thread Jan Hubicka via Gcc-patches
> > > > In ipa-modref I precompute this to map so we do not need to walk all > > params, but the loop is probably not bad since functions do not have > > tens of thousands of parameters :) > > The most I have seen is about 70 and those were big outliers. > > I was thinking of precomputing it

Re: Use modref summary to DSE calls to non-pure functions

2021-11-11 Thread Jan Hubicka via Gcc-patches
> > + /* Unlike alias oracle we can not skip subtrees based on TBAA check. > > + Count the size of the whole tree to verify that we will not need too > > many > > + tests. */ > > + FOR_EACH_VEC_SAFE_ELT (summary->stores->bases, i, base_node) > > +FOR_EACH_VEC_SAFE_ELT

Re: Use modref summary to DSE calls to non-pure functions

2021-11-11 Thread Jan Hubicka via Gcc-patches
Hi, > > No, I think if it turns out useful then we want a way to have such ref > represented by an ao_ref. Note that when we come from a > ref tree we know handled-components only will increase offset, > only the base MEM_REF can contain a pointer subtraction (but > the result of that is the

Basic kill analysis for modref

2021-11-11 Thread Jan Hubicka via Gcc-patches
Hi, This patch enables optimization of stores that are killed by calls. The modref summary is extended by an array containing a list of access ranges, relative to function parameters, that are known to be killed by the function. This array is collected during local analysis and optimized (so separate
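
A minimal C sketch of the kind of store the message describes, under the assumption that the callee unconditionally overwrites the memory its parameter points to. The function names are hypothetical and do not come from the patch. Whether a given GCC version actually removes the store depends on the options used and on what the analysis can prove.

```c
#include <assert.h>

/* Hypothetical callee: it always overwrites *p, so a kill summary could
   record that the int-sized range at offset 0 relative to parameter 0
   is killed by the call.  */
void set_zero(int *p)
{
    *p = 0;
}

/* The store "x = 5" is overwritten by the call before anything reads it,
   so it is a candidate for dead store elimination.  */
int kill_example(void)
{
    int x;
    x = 5;          /* dead: killed by the call below */
    set_zero(&x);
    return x;
}

/* Usage: the result is the same with or without the dead store.  */
void kill_example_check(void)
{
    assert(kill_example() == 0);
}
```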

Add EAF_NOT_RETURNED_DIRECTLY

2021-11-01 Thread Jan Hubicka via Gcc-patches
Hi, this patch adds EAF_NOT_RETURNED_DIRECTLY which works similarly to EAF_NODIRECTESCAPE. Values pointed to by a given argument may be returned but not the argument itself. This helps PTA quite noticeably because we mostly care about tracking points to which a given memory location can escape. I
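
To illustrate the distinction the message draws, here is a small C sketch with two hypothetical functions (the names are made up for this example). It only shows the source-level difference between returning the argument itself and returning a value loaded through it. It makes no claim about which flags GCC computes for them.

```c
#include <assert.h>

/* Returns the argument itself: the parameter is returned directly.  */
int *return_directly(int *p)
{
    return p;
}

/* Returns a value loaded through the argument, never the argument
   itself: the "may be returned but not the argument itself" case.  */
int *return_pointee(int **pp)
{
    return *pp;
}

/* Usage: both calls yield the address of v, but only the first one
   hands back its own argument.  Returns 1 when both checks hold.  */
int returned_example_ok(void)
{
    int v = 1;
    int *q = &v;
    assert(return_directly(q) == q);
    return return_pointee(&q) == &v;
}
```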

ipa-modref cleanup

2021-11-02 Thread Jan Hubicka via Gcc-patches
Hi, this patch is a small refactoring of ipa-modref to make it a bit more C++y by moving logic analyzing ssa name flags to a class and I also moved the anonymous namespace markers so we do not export unnecessary stuff. There are no functional changes. Bootstrapped/regtested x86_64-linux, will

Re: [Bug d/103040] [12 Regression] gdc.dg/torture/pr101273.d FAILs

2021-11-02 Thread Jan Hubicka via Gcc-bugs
> See above comments from Iain, even if that pre-initialization is removed it is > still miscompiled. And, the testcase fails not because of the padding bits > not > being zero, but because the address of self stored into one of the fields > isn't > there or modref thinks it can't be changed or

Re: [Bug d/103040] [12 Regression] gdc.dg/torture/pr101273.d FAILs

2021-11-02 Thread Jan Hubicka via Gcc-bugs
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103040 > > --- Comment #15 from Iain Buclaw --- > Got it. The difference between D and C++ is a matter of early inlining. > > The C++ example Jakub posted fails in the same way that D does if you compile > with: -O1 -fno-inline Great, I will take a

Re: ipa-modref cleanup

2021-11-02 Thread Jan Hubicka via Gcc-patches
> It broke GCC bootstrap: > > https://gcc.gnu.org/pipermail/gcc-regression/2021-November/075676.html > > In file included from ../../src-master/gcc/coretypes.h:474, > from ../../src-master/gcc/expmed.c:26: > In function ‘poly_uint16 mode_to_bytes(machine_mode)’, > inlined

Fix wrong code caused by ipa-modref retslot handling

2021-11-02 Thread Jan Hubicka via Gcc-patches
Hi, this patch fixes a (quite nasty) thinko in how I propagate EAF flags from callee to caller. In this case some flags need to be changed. In particular - EAF_NOT_RETURNED in callee does not really mean EAF_NOT_RETURNED in caller since we speak of different return values - if callee

Re: [Bug middle-end/102997] [12 Regression] 45% 454.calculix regression with LTO+PGO -march=native -Ofast between ce4d1f632ff3f680550d3b186b60176022f41190 and 6fca1761a16c68740f875fc487b98b6bde8e9be7

2021-10-29 Thread Jan Hubicka via Gcc-bugs
> Not seen on Haswell (but w/o PGO). Is this PGO specific? There's another > large jump visible end of 2019. It is between 2019-11-15 and 18 but the revisions do not exist in git - perhaps they refer to the old git mirror. Martin will know better. In that range there are many of Richard's

Re: [Bug ipa/102982] [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs 11.2.0)

2021-10-28 Thread Jan Hubicka via Gcc-bugs
> > fixup_cfg already removes write-only stores so that seems fit for that > purpose. > > Btw, > > static int x = 1; > > int main() > { > x = 1; > } > > should ideally be handled as well as maybe the more common(?) > > static int x[128]; > > int main() > { > memset (x, 0, 128*4); > } >

Re: [PATCH] Fix loop split incorrect count and probability

2021-10-26 Thread Jan Hubicka via Gcc-patches
> On Tue, 26 Oct 2021, Xionghu Luo wrote: > > > > > > > On 2021/10/21 18:55, Richard Biener wrote: > > > On Thu, 21 Oct 2021, Xionghu Luo wrote: > > > > > >> > > >> > > >> On 2021/10/15 13:51, Xionghu Luo via Gcc-patches wrote: > > >>> > > >>> > > >>> On 2021/9/23 20:17, Richard Biener wrote:

Re: [PATCH] Fix loop split incorrect count and probability

2021-10-26 Thread Jan Hubicka via Gcc-patches
> > > That said, likely the profile update cannot be done uniformly > for all blocks of a loop? For the loop: for (i = 0; i < n; i = inc (i)) { if (ga) ga = do_something (); } to: for (i = 0; i < x; i = inc (i)) { if (true) ga = do_something (); if

Re: [PATCH v2 1/4] Fix loop split incorrect count and probability

2021-10-27 Thread Jan Hubicka via Gcc-patches
> > gcc/ChangeLog: > > * tree-ssa-loop-split.c (split_loop): Fix incorrect probability. > (do_split_loop_on_cond): Likewise. > --- > gcc/tree-ssa-loop-split.c | 25 - > 1 file changed, 16 insertions(+), 9 deletions(-) > > diff --git

Re: [PATCH v2 1/4] Fix loop split incorrect count and probability

2021-10-27 Thread Jan Hubicka via Gcc-patches
> As discussed yesterday, for loop of form > > for (...) > if (cond) > cond = something(); > else > something2 > > Split as > Say "if (cond)" has probability p, then individual statements scale as follows: loop1: pfor (...) p if (true) 1cond = something(); 1

Re: [PATCH v2 1/4] Fix loop split incorrect count and probability

2021-10-27 Thread Jan Hubicka via Gcc-patches
> On Wed, 27 Oct 2021, Jan Hubicka wrote: > > > > > > > gcc/ChangeLog: > > > > > > * tree-ssa-loop-split.c (split_loop): Fix incorrect probability. > > > (do_split_loop_on_cond): Likewise. > > > --- > > > gcc/tree-ssa-loop-split.c | 25 - > > > 1 file changed, 16

Re: [PATCH] ipa: Unshare expresseions before putting them into debug statements (PR 103099, PR 103107)

2021-11-08 Thread Jan Hubicka via Gcc-patches
> Hi, > > my recent patch to improve debug experience when there are removed > parameters (by ipa-sra or ipa-split) was not careful to unshare the > expressions that were then put into debug statements, which manifests > itself as PR 103099. This patch adds unsharing them using >

Revert workaround allowing interposition on nested functions

2021-11-08 Thread Jan Hubicka via Gcc-patches
Hi, the workaround seems to be no longer necessary - it seems that all the issues were isolated to wrong behaviour of can_be_interposed wrt partitioned functions. Honza * gimple.c (gimple_call_static_chain_flags): Revert the workaround allowing interposition since issues with

Move uncprop after modref pass

2021-11-08 Thread Jan Hubicka via Gcc-patches
Hi, this patch moves uncprop after modref and pure/const pass and adds a comment that this pass should always be last since it is only supposed to help PHI lowering. The pass replaces constants by SSA names that are known to be constant at the place, which hardly helps other passes. Modref now

Merge IPA and late local modref flags

2021-11-09 Thread Jan Hubicka via Gcc-patches
Hi, since at the time we compute local solution during late modref the summaries from IPA are readily available (and I added logic to compare them), it is easy to intersect both solutions to get around cases where late optimization obfuscates code enough so flags are no longer analyzed correctly.

Re: Merge IPA and late local modref flags

2021-11-09 Thread Jan Hubicka via Gcc-patches
> > + } > > + if (!(flags & EAF_UNUSED)) > > + lags |= past; >    ^ > > > Broke bootstrap. Martin just fixed it. Sorry for that. Diff complained about 8 spaces instead of tab and I did not rebuild after replacing it a bit too overzealously. Honza > > jeff >

Make EAF flags more regular (and expressive)

2021-11-09 Thread Jan Hubicka via Gcc-patches
Hi, I hoped that I was done with EAF flags related changes, but while looking into the Fortran testcases I noticed that I have designed them in an unnecessarily restricted way. I followed the scheme of NOESCAPE and NODIRECTESCAPE which is however the only property that is naturally transitive. This

Re: Workaround ICE in gimple_static_chain_flags

2021-11-04 Thread Jan Hubicka via Gcc-patches
> On Thu, Nov 04, 2021 at 05:13:41PM +0100, Jan Hubicka via Gcc-patches wrote: > > this patch works around an ICE in gimple_static_chain_flags. I added a > > sanity check that the nested function is never considered interposable > > because such a situation makes no sense:

Re: [PATCH] gcov-profile: Fix -fcompare-debug with -fprofile-generate [PR100520]

2021-11-05 Thread Jan Hubicka via Gcc-patches
> Hello. > > This strips .gk from aux_base_name in coverage.c. > Do you like the implementation of endswith, or do we have the functionality > somewhere? > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > Ready to be installed? > Thanks, > Martin > > PR
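
For readers wondering what such a helper looks like, here is a minimal C sketch of a suffix test plus the stripping step the message mentions. The names ends_with and strip_suffix are hypothetical. This is not the code from the patch, only an illustration of how short the open-coded version is.

```c
#include <assert.h>
#include <string.h>

/* Return nonzero if STR ends with SUFFIX.  */
int ends_with(const char *str, const char *suffix)
{
    assert(str != NULL && suffix != NULL);
    size_t len = strlen(str);
    size_t suffix_len = strlen(suffix);
    return len >= suffix_len
           && strcmp(str + len - suffix_len, suffix) == 0;
}

/* Truncate STR in place if it ends with SUFFIX; otherwise leave it alone.  */
void strip_suffix(char *str, const char *suffix)
{
    if (ends_with(str, suffix))
        str[strlen(str) - strlen(suffix)] = '\0';
}
```

With these, calling strip_suffix on a buffer holding "foo.gk" with the suffix ".gk" leaves "foo", and a name without that suffix is left untouched.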

Avoid left shift of negative value in ipa-modref-tree.h

2021-11-05 Thread Jan Hubicka via Gcc-patches
Hi, ubsan is complaining about left shift of negative value which is undefined in C++11..C++17. Replaced by multiplication. Bootstrapped/regtested x86_64-linux, committed. Honza gcc/ChangeLog: PR ipa/103082 * ipa-modref-tree.h (struct modref_access_node): Avoid left shift
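
A minimal sketch of the same idea in C, where left-shifting a negative signed value is likewise undefined. The helper name and the constant are assumptions made for this example and are not taken from ipa-modref-tree.h.

```c
#include <assert.h>
#include <limits.h>

#define EXAMPLE_BITS_PER_BYTE 8   /* assumed value, for illustration only */

/* Convert a possibly negative byte offset to a bit offset.
   Writing this as byte_offset << 3 is undefined for negative values;
   multiplication is well defined as long as it does not overflow.  */
long long bytes_to_bits(long long byte_offset)
{
    assert(byte_offset <= LLONG_MAX / EXAMPLE_BITS_PER_BYTE);
    assert(byte_offset >= LLONG_MIN / EXAMPLE_BITS_PER_BYTE);
    return byte_offset * EXAMPLE_BITS_PER_BYTE;
}
```

A compiler is free to emit a shift for the multiplication, so the change is about the source being well defined, not about the generated code.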

Fix ICE in insert_access

2021-11-05 Thread Jan Hubicka via Gcc-patches
Hi, this patch makes insert_access ignore accesses that are paradoxical (i.e. their max_size is smaller than size) which can happen for example when VRP proves that the access happens past the end of array bounds. It also checks for zero sized accesses and verifies that max_size is never
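
The two checks named above can be sketched in C as follows. This is a toy model: the struct and function names are hypothetical, and both sizes are assumed to be known and non-negative, which is a simplification of how GCC represents access ranges.

```c
#include <assert.h>
#include <stddef.h>

/* Toy access descriptor; both fields are assumed known and >= 0.  */
struct toy_access {
    long long size;      /* size of the access */
    long long max_size;  /* upper bound on the size of the accessed range */
};

/* Return 1 if the access should be recorded, 0 if it should be ignored.  */
int toy_access_worth_recording(const struct toy_access *a)
{
    assert(a != NULL);
    if (a->size == 0)            /* zero sized access */
        return 0;
    if (a->max_size < a->size)   /* paradoxical access */
        return 0;
    return 1;
}
```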

Re: Implement intraprocedural dataflow for ipa-modref EAF analyser

2021-11-04 Thread Jan Hubicka via Gcc-patches
> On 11/4/21 15:12, Jan Hubicka via Gcc-patches wrote: > > |Bootstrapped/regtested x86_64-linux, plan to commit after bit more > > testing.| > > Can you please install the patch after the current MOD REF crashes are fixed? > It will help us with the future bisection.

Re: Workaround ICE in gimple_static_chain_flags

2021-11-04 Thread Jan Hubicka via Gcc-patches
> On Thu, Nov 04, 2021 at 05:13:41PM +0100, Jan Hubicka via Gcc-patches wrote: > > this patch works around an ICE in gimple_static_chain_flags. I added a > > sanity check that the nested function is never considered interposable > > because such a situation makes no sense:

Re: [Bug tree-optimization/102943] [12 Regression] Jump threader compile-time hog with 521.wrf_r

2021-11-04 Thread Jan Hubicka via Gcc-bugs
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102943 > > Aldy Hernandez changed: > >What|Removed |Added > > Depends on||103058 > > --- Comment

Re: [PATCH] Rename predicate class to ipa_predicate

2021-11-03 Thread Jan Hubicka via Gcc-patches
> Hello. > > The renaming patch fixes a -Wodr warning seen and reported in the PR. > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > PR bootstrap/102828 > > gcc/ChangeLog: > > * ipa-fnsummary.c (edge_predicate_pool): Rename predicate class to >

Implement intraprocedural dataflow for ipa-modref EAF analyser

2021-11-04 Thread Jan Hubicka via Gcc-patches
Hi, this patch implements the (long promised) intraprocedural dataflow for propagating EAF flags, so we can handle parameters that participate in loops in SSA graphs. A typical example is accessors that walk linked lists. I implemented dataflow using the standard iteration over BBs in

Workaround ICE in gimple_static_chain_flags

2021-11-04 Thread Jan Hubicka via Gcc-patches
Hi, this patch works around an ICE in gimple_static_chain_flags. I added a sanity check that the nested function is never considered interposable because such a situation makes no sense: nested functions have no static API and cannot be safely merged across translation units. It turns out however that

Re: [Bug tree-optimization/102943] [12 Regression] Jump threader compile-time hog with 521.wrf_r

2021-11-07 Thread Jan Hubicka via Gcc-bugs
> > This PR is still open, at least for slowdown in the threader with LTO. The > issue is ranger wide, so it may also cause slowdowns on non-LTO builds for > WRF, though I haven't checked. I just wanted to record the fact somewhere since I was looking up the revision range mostly to figure out

Fix can_be_discarded_p wrt partitioned functions

2021-11-06 Thread Jan Hubicka via Gcc-patches
Hi, can_be_discarded_p is testing the DECL_EXTERNAL flag to see if the symbol can be discarded by the linker if unreachable. This is meant to catch extern inline functions (which is a bit of a side case, and it is intended to keep GCC from producing new references to them if there were no references before) but it

Re: Implement intraprocedural dataflow for ipa-modref EAF analyser

2021-11-07 Thread Jan Hubicka via Gcc-patches
Hi, I have committed the patch now. On the current tree the patch causes new failures ./gcc/testsuite/gfortran/gfortran.sum:FAIL: gfortran.dg/vector_subscript_1.f90 -O1 execution test ./gcc/testsuite/gfortran/gfortran.sum:FAIL: gfortran.dg/vector_subscript_1.f90 -O2 execution test

Fix inter-procedural EAF flags propagation with respect to !binds_to_current_def_p

2021-11-07 Thread Jan Hubicka via Gcc-patches
Hi, while proofreading the code for handling EAF flags of !binds_to_current_def_p I noticed that the interprocedural dataflow actually ignores the flag, possibly introducing wrong code on interposable functions in non-trivial recursion cycles or at ltrans partition boundary. This patch unifies the

Re: [Bug middle-end/102997] [12 Regression] 45% 454.calculix regression with LTO+PGO -march=native -Ofast on Zen since r12-4526-gd8edfadfc7a9795b65177a50ce44fd348858e844

2021-11-08 Thread Jan Hubicka via Gcc-bugs
Note that it still seems to me that the crossed_loop_header handling is overly conservative. We have: @ -2771,6 +2771,7 @@ jt_path_registry::cancel_invalid_paths (vec ) bool seen_latch = false; int loops_crossed = 0; bool crossed_latch = false; + bool crossed_loop_header = false;

Improve optimization of some builtins

2021-11-07 Thread Jan Hubicka via Gcc-patches
Hi, for nested functions we output a call to builtin_dwarf_cfa which initializes a frame entry used only for debugging. This however prevents us from detecting functions containing nested functions as const/pure or analyzing side effects in modref. builtin_dwarf_cfa is not documented and I wonder if

Re: [PATCH] Dump static chain for cgraph_node.

2021-11-08 Thread Jan Hubicka via Gcc-patches
> > diff --git a/gcc/cgraph.c b/gcc/cgraph.c > > index de078653781..8299ee92946 100644 > > --- a/gcc/cgraph.c > > +++ b/gcc/cgraph.c > > @@ -2203,6 +2203,10 @@ cgraph_node::dump (FILE *f) > > fprintf (f, " %soperator_delete", > > DECL_IS_REPLACEABLE_OPERATOR (decl) ? "replaceable_"

Re: [PATCH] Dump static chain for cgraph_node.

2021-11-08 Thread Jan Hubicka via Gcc-patches
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > Ready to be installed? > Thanks, > Martin > > gcc/ChangeLog: > > * cgraph.c (cgraph_node::dump): Dump static_chain_decl. OK Honza > --- > gcc/cgraph.c | 4 > 1 file changed, 4 insertions(+) > > diff --git

Re: [PATCH] gcov-profile: Fix -fcompare-debug with -fprofile-generate [PR100520]

2021-11-08 Thread Jan Hubicka via Gcc-patches
> On 11/5/21 18:30, Jan Hubicka wrote: > > every gcc source looks like bit of overkill given that it can be open > > coded in 3 statements? > > Why? It's a static inline function with few statements. I don't want to > copy > the same code at every location. I bet there must be quite some open-coded

Add static_chain support to ipa-modref

2021-11-01 Thread Jan Hubicka via Gcc-patches
Hi, this patch teaches ipa-modref about the static chain that is, like retslot, a hidden argument. The patch is pretty much symmetric to what was done for retslot handling and I verified it does the intended job for Ada LTO bootstrap. Bootstrapped/regtested x86_64-linux, OK? Honza

Re: [RFC] Don't move cold code out of loop by checking bb count

2021-10-27 Thread Jan Hubicka via Gcc-patches
> Hi, > > On 2021/9/28 20:09, Richard Biener wrote: > > On Fri, Sep 24, 2021 at 8:29 AM Xionghu Luo wrote: > >> > >> Update the patch to v3, not sure whether you prefer the paste style > >> and continue to link the previous thread as Segher dislikes this... > >> > >> > >> [PATCH v3] Don't move

Handle retslot_flags in ipa-modref and PTA

2021-10-29 Thread Jan Hubicka via Gcc-patches
Hi, this patch extends modref and tree-ssa-structalias to handle retslot flags. Since retslot is essentially a hidden argument that is known to be write-only we can do pretty much the same stuff as we do for regular parameters. I plan to add static chain handling in a similar way. We do not handle IPA

Re: Limit inlining functions called once

2021-12-09 Thread Jan Hubicka via Gcc-patches
> > I plan to reduce the value before Christmas after a bit more testing > > since > > it seems to be an overall win even if we trade fatigue2 performance, but I > > would > > like to get more testing on larger C++ apps first. > > Will this hurt -Os -finline-limit=0 ? Why do you use

Re: [PATCH] c++, symtab: Support (x) == (y) in constant evaluation [PR103600]

2021-12-09 Thread Jan Hubicka via Gcc-patches
> > Ah, indeed, good idea. FYI, clang++ seems to constant fold > (x) != (y) already, so Jonathan could use it even for > clang++ in the constexpr operator==. But it folds even > extern int , > constexpr bool c = != > regardless of whether some other TU has > int a; > int b

Re: [PATCH] PR ipa/103601: ICE compiling CSiBE in ipa-modref's insert_kill

2021-12-10 Thread Jan Hubicka via Gcc-patches
> On Fri, Dec 10, 2021 at 2:30 AM Roger Sayle > wrote: > > > > > > This patch fixes PR ipa/103061 which is P1 regression that shows up as > > an ICE in ipa-modref-tree.c's insert_kill when compiling the CSiBE > > benchmark. I believe the underlying cause is that the new kill tracking > >

Re: [PATCH] alias: Optimise call_may_clobber_ref_p

2021-12-07 Thread Jan Hubicka via Gcc-patches
> > Yeah, the fast summary array lookup itself seems fine. What slowed > this down for me was instead: > > /* A single function body may be represented by multiple symbols with > different visibility. For example, if FUNC is an interposable alias, > we don't want to return

Limit inlining functions called once

2021-12-07 Thread Jan Hubicka via Gcc-patches
Hi, as discussed in PR ipa/103454 there are several benchmarks that regress with -finline-functions-called-once. Runtimes: - tramp3d with -Ofast. 31% - exchange2 with -Ofast 11-21% - roms O2 9%-10% - tonto 2.5-3.5% with LTO Build times: - specfp2006 41% (mostly wrf that builds 71% faster) -

Re: Terminate BB analysis on NULL pointer access in ipa-pure-const and ipa-modref

2021-12-12 Thread Jan Hubicka via Gcc-patches
> >+ /* NULL memory accesses terminates BB. These accesses are known > >+ to trip undefined behaviour. gimple-ssa-isolate-paths turns them > >+ to volatile accesses and adds builtin_trap call which would > >+ confuse us otherwise. */ > >+ if

Add -fipa-strict-aliasing

2021-12-12 Thread Jan Hubicka via Gcc-patches
Hi, ipa-modref is using TBAA to disambiguate memory accesses inter-procedurally. This sometimes breaks programs with TBAA violations including clang with LTO. To work around that, one can use -fno-strict-aliasing or -fno-ipa-modref, which are both quite big hammers. So I added -fipa-strict-aliasing

Distinguish global and unknown memory accesses in ipa-modref

2021-12-12 Thread Jan Hubicka via Gcc-patches
Hi, As discussed in PR103585, fatigue2 is now the only benchmark from my usual testing set (SPEC2k6, SPEC2k17, CPP benchmarks, polyhedron, Firefox, clang) which sees an important regression when inlining functions called once is limited. This prevents us from solving runtime issues in roms benchmarks

Terminate BB analysis on NULL pointer access in ipa-pure-const and ipa-modref

2021-12-12 Thread Jan Hubicka via Gcc-patches
Hi, As discussed in the PR, we miss some optimizations because gimple-ssa-isolate-paths turns NULL memory accesses to volatile and adds __builtin_trap after them. This is seen as a side-effect by IPA analysis and additionally the (fully unreachable) builtin_trap is believed to load all global

Re: Add -fipa-strict-aliasing

2021-12-12 Thread Jan Hubicka via Gcc-patches
> On December 12, 2021 1:22:09 PM GMT+01:00, Jan Hubicka via Gcc-patches > wrote: > >Hi, > >ipa-modref is using TBAA to disambiguate memory accesses inter-procedurally. > >This sometimes breaks programs with TBAA violations including clang with LTO. > >To workaroun

Do not ICE when computing value range of ternary expression

2021-12-12 Thread Jan Hubicka via Gcc-patches
Hi, In evaluate_conditions_for_known_args we use range_fold_unary_expr and range_fold_binary_expr to produce value ranges of the expression. However the expression also may contain ternary COND_EXPR on which we ICE. I did not find interface to do similar folding easily on ternary exprs and since

Re: Terminate BB analysis on NULL pointer access in ipa-pure-const and ipa-modref

2021-12-12 Thread Jan Hubicka via Gcc-patches
> > I think this is a common pattern in C++ code originating from cast with > multiple inheritance. I would vote towards optimizing out the conditional > move in this case and I think it is correct. I crafted a testcase and filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103674 Honza > >

Re: Terminate BB analysis on NULL pointer access in ipa-pure-const and ipa-modref

2021-12-12 Thread Jan Hubicka via Gcc-patches
> > > On 12/12/2021 3:49 AM, Jan Hubicka via Gcc-patches wrote: > > Hi, > > As discussed in the PR, we miss some optimization because > > gimple-ssa-isolate-paths turns NULL memory accesses to volatile and adds > > __builtin_trap after them. This is seen

Re: [PATCH 1/3] loop-invariant: Don't move cold bb instructions to preheader in RTL

2021-12-13 Thread Jan Hubicka via Gcc-patches
> gcc/ChangeLog: > > * loop-invariant.c (find_invariants_bb): Check profile count > before motion. > (find_invariants_body): Add argument. > --- > gcc/loop-invariant.c | 10 +++--- > 1 file changed, 7 insertions(+), 3 deletions(-) > > diff --git a/gcc/loop-invariant.c

Re: [PATCH 2/3] Fix incorrect loop exit edge probability [PR103270]

2021-12-13 Thread Jan Hubicka via Gcc-patches
> r12-4526 cancelled jump thread path rotates loop. It exposes an issue in > profile-estimate when predict_extra_loop_exits, outer loop's exit edge > is marked as inner loop's extra loop exit and set with incorrect > prediction, then a hot inner loop will become cold loop finally through >

Re: [PATCH 1/3] loop-invariant: Don't move cold bb instructions to preheader in RTL

2021-12-13 Thread Jan Hubicka via Gcc-patches
> > gcc/ChangeLog: > > > > * loop-invariant.c (find_invariants_bb): Check profile count > > before motion. > > (find_invariants_body): Add argument. > > --- > > gcc/loop-invariant.c | 10 +++--- > > 1 file changed, 7 insertions(+), 3 deletions(-) > > > > diff --git

Re: [Bug gcov-profile/103652] Producing profile with -O2 -flto and trying to consume it with -O3 -flto leads to ICEs on indirect call profiling

2021-12-13 Thread Jan Hubicka via Gcc-bugs
> > Well, I'm specifically speaking about: > error: the control flow of function ‘BZ2_compressBlock’ does not match its > profile data (counter ‘arcs’) > > this type of errors should not happen even in multi-threaded programs. There are some cases where I see even those on clang build - I am

Re: [PATCH] ipa: Careful processing ANCESTOR jump functions and NULL pointers (PR 103083)

2021-12-13 Thread Jan Hubicka via Gcc-patches
> >>> + || (only_for_nonzero && !src_lats->bits_lattice.known_nonzero_p ())) > >>> + { > >>> + if (jfunc->bits) > >>> + return dest_lattice->meet_with (jfunc->bits->value, > >>> + jfunc->bits->mask, precision); > >>> + else > >>> + return

Re: Add -fipa-strict-aliasing

2021-12-13 Thread Jan Hubicka via Gcc-patches
Hi, this is a variant I committed (with updated documentation as Richard requested). Honza gcc/ChangeLog: 2021-12-13 Jan Hubicka * common.opt: Add -fipa-strict-aliasing. * doc/invoke.texi: Document -fipa-strict-aliasing. * ipa-modref.c

Re: [PATCH 1/3] loop-invariant: Don't move cold bb instructions to preheader in RTL

2021-12-16 Thread Jan Hubicka via Gcc-patches
> > OK. Comments like? > > /* Don't move insn of cold BB out of loop to preheader to reduce calculations >and register live range in hot loop with cold BB. */ Looks good. > > > And maybe some dump log will help tracking in xxx.c.271r.loop2_invariant. > > --- a/gcc/loop-invariant.c >

Re: [PATCH 2/3] Fix incorrect loop exit edge probability [PR103270]

2021-12-16 Thread Jan Hubicka via Gcc-patches
> > > > > > ./contrib/analyze_brprob.py ~/workspace/tests/spec2017/dump_file_all > > HEURISTICS BRANCHES (REL) BR. HITRATE > > HITRATE COVERAGE COVERAGE (REL) predict.def (REL) HOT > > branches (>10%) > > noreturn call

Re: Improve -fprofile-report

2021-12-03 Thread Jan Hubicka via Gcc-patches
> On 11/27/21 16:56, Jan Hubicka via Gcc-patches wrote: > > Hi, > > Profile-report was never properly updated after switch to new profile > > representation. This patch fixes the way profile mismatches are > > calculated: we used to collect separately count and freq m

Fix handling of histogram in ipa-profile

2021-12-11 Thread Jan Hubicka via Gcc-patches
Hi, this patch removes an apparently forgotten debugging hack (which got in during the speculative call patchset) which reduces the hot bb threshold. This does not make sense since it is set and reset randomly as the summaries are processed. One problem is that we set the BB threshold to make certain

Fix ipa-modref handling of thunks

2021-12-11 Thread Jan Hubicka via Gcc-patches
Hi, thunks are not transparent for the ipa-modref summary since it cares about offsets from pointer parameters and, for virtual thunks, also about the read from memory in there. We however use function_or_virtual_thunk_symbol to get the summary, which may lead to wrong code (and does in two testsuite

Re: [PATCH] inline: fix ICE with -fprofile-generate

2021-12-10 Thread Jan Hubicka via Gcc-patches
> Fixes ICE spotted by Honza where we have a better place where > to check for no_profile_instrument_function attribute. > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > Ready to be installed? > Thanks, > Martin > > PR ipa/103636 > > gcc/ChangeLog: > >

Re: [patch] lto: Don't run ipa-comdats pass during LTO

2021-12-07 Thread Jan Hubicka via Gcc-patches
> The attached patch fixes an ICE in lto1 at lto-partition.c:215 that > was reported by a customer. Unfortunately I have no test case for > this; the customer's application is a big C++ shared library with lots > of dependencies and proprietary code under NDA. I did try reducing it > with cvise

Re: [PATCH] alias: Optimise call_may_clobber_ref_p

2021-12-07 Thread Jan Hubicka via Gcc-patches
> > Notice the ??? comment. The code does not set clobbers here because it > > assumes that tree-ssa-alias will do the right thing. > > So one may make builtins handling first, PTA next and only if both say > > "may alias" continue. Other option is to extend the code here to add > > propert

Re: [Bug tree-optimization/103989] [12 regression] std::optional and bogus -Wmaybe-uninitialized at -Og since r12-1992-g6feb628a706e86eb

2022-01-13 Thread Jan Hubicka via Gcc-bugs
> --- Comment #6 from Richard Biener --- > Honza, -Og was supposed to not do so much work, I intended to disable IPA > inlining but there's no knob for that. I wonder where to best put such > guard? I set flag_inline_small_functions to zero for -Og but we still > run inline_small_functions ().
