Re: [Bug tree-optimization/103989] [12 regression] std::optional and bogus -Wmaybe-unitialized at -Og since r12-1992-g6feb628a706e86eb

2022-01-13 Thread Jan Hubicka via Gcc-bugs
> > Sure - I just remember (falsely?) that we finally decided to do it :) I do not recall this, but I may have forgotten :)) > If we don't run IPA inline we don't figure we failed to inline the > always_inline either ;) And IPA inline can expose more indirect > always-inlines we only discover

Re: [Bug tree-optimization/103989] [12 regression] std::optional and bogus -Wmaybe-unitialized at -Og since r12-1992-g6feb628a706e86eb

2022-01-13 Thread Jan Hubicka via Gcc-bugs
> You cannot disable an IPA pass because then we will mishandle > optimize attributes. I think you simply want to set > > flag_inline_small_functions = 0 > flag_inline_functions_called_once = 0 Actually I forgot, we have flag_no_inline which makes tree_inlinable_function_p return false

Re: [PATCH] Fix tree-optimization/101941: IPA splitting out function with error attribute

2022-01-14 Thread Jan Hubicka via Gcc-patches
> > > > > --- a/gcc/ipa-split.c > > > > > +++ b/gcc/ipa-split.c > > > > > @@ -873,7 +873,7 @@ visit_bb (basic_block bb, basic_block return_bb, > > > > > gimple *stmt = gsi_stmt (bsi); > > > > > tree op; > > > > > ssa_op_iter iter; > > > > > - tree decl; > > > > > +

Re: [Bug rtl-optimization/98782] [11/12 Regression] Bad interaction between IPA frequences and IRA resulting in spills due to changes in BB frequencies

2022-01-11 Thread Jan Hubicka via Gcc-bugs
On zen2 and 3 with -flto the speedup seems to be about 12% for both -O2 and -Ofast -march=native, which is very nice! Zen1 for some reason sees less improvement, about 6%. With PGO it is 3.8%. Overall it seems a win, but there are a few noteworthy issues. I also see a 6.69% regression on x64 with

Re: [PATCH] tree-optimization/96881 - CD-DCE and CLOBBERs

2022-02-15 Thread Jan Hubicka via Gcc-patches
> @@ -1272,7 +1275,7 @@ maybe_optimize_arith_overflow (gimple_stmt_iterator > *gsi, > contributes nothing to the program, and can be deleted. */ > > static bool > -eliminate_unnecessary_stmts (void) > +eliminate_unnecessary_stmts (bool aggressive) > { >bool something_changed = false;

Re: [PATCH] tree-optimization/96881 - CD-DCE and CLOBBERs

2022-02-17 Thread Jan Hubicka via Gcc-patches
> +/* Returns whether the control parents of BB are preserved. */ > + > +static bool > +control_parents_preserved_p (basic_block bb) > +{ > + /* If we marked the control parents from BB they are preserved. */ > + if (bitmap_bit_p (visited_control_parents, bb->index)) > +return true; > + >

Disable gathers on zen3 for vectors with few elements

2022-03-27 Thread Jan Hubicka via Gcc-patches
Hi, as seen on TSVC, Spec2017, the Zen3 gather instruction is a win only for vectors with 8 elements. At the time I was implementing the tuning, the vectorizer did not know how to open-code gather and thus it was still a win to enable it for shorter vectors, but this has changed. The following are

Re: [PATCH] ipa: Careful processing ANCESTOR jump functions and NULL pointers (PR 103083)

2022-03-31 Thread Jan Hubicka via Gcc-patches
> IPA_JF_ANCESTOR jump functions are constructed also when the formal > parameter of the caller is first checked whether it is NULL and left > as it is if it is NULL, to accommodate C++ casts to an ancestor class. > > The jump function type was invented for devirtualization and IPA-CP >

Re: [PATCH] ipa: Create LOAD references when necessary during inlining (PR 103171)

2022-03-31 Thread Jan Hubicka via Gcc-patches
> Hi, > > in r12-2523-g13586172d0b70c ipa-prop tracking of jump functions during > inlining got the ability to remove ADDR references when inlining > discovered that they were not necessary or turn them into LOAD > references when we know that what was a function call argument passed > by

Re: [PATCH] ipa-cp: Do not create clones for values outside known value range (PR 102513)

2022-03-31 Thread Jan Hubicka via Gcc-patches
> Hi, > > PR 102513 shows we emit bogus array access warnings when IPA-CP > creates clones specialized for values which it deduces from arithmetic > jump functions describing self-recursive calls. Those can however be > avoided if we consult the IPA-VR information that the same pass also > has.

Re: [Bug rtl-optimization/102178] [12 Regression] SPECFP 2006 470.lbm regressions on AMD Zen CPUs after r12-897-gde56f95afaaa22

2022-01-27 Thread Jan Hubicka via Gcc-bugs
> > According to znver2_cost > > > > Cost of sse_to_integer is a little bit less than fp_store, maybe increase > > sse_to_integer cost(more than fp_store) can helps RA to choose memory > > instead of GPR. > > That sounds reasonable - GPR<->xmm is cheaper than GPR -> stack -> xmm > but GPR<->xmm

Re: [PATCH] internal_error - do not use leading capital letter

2022-01-27 Thread Jan Hubicka via Gcc-patches
> That's follow up patch based on the discussion with Jakub. > > Ready to be installed? > Thanks, > Martin > > gcc/ChangeLog: > > * config/rs6000/host-darwin.cc (segv_crash_handler): > Do not use leading capital letter. > (segv_handler): Likewise. > * ipa-sra.cc

Re: [Bug rtl-optimization/102178] [12 Regression] SPECFP 2006 470.lbm regressions on AMD Zen CPUs after r12-897-gde56f95afaaa22

2022-01-27 Thread Jan Hubicka via Gcc-bugs
> I would say so. It saves code size and also uop space unless the two > can magically fuse to a immediate to %xmm move (I doubt that). I made simple benchmark double a=10; int main() { long int i; double sum,val1,val2,val3,val4; for (i=0;i<10;i++) { #if

Re: [PATCH] ipa/103989 - tame IPA optimizations at -Og

2022-01-18 Thread Jan Hubicka via Gcc-patches
> With -Og we are not prepared to do cleanup after IPA optimizations > and dead code exposed by those confuses late diagnostic passes. > This is a first patch removing unwanted IPA optimizations, namely > both late modref and pure-const analysis. > > Bootstrap and regtest running on

Re: [PATCH] ipa/103989 - avoid IPA inlining of small functions with -Og

2022-01-18 Thread Jan Hubicka via Gcc-patches
> The following change avoids doing IPA inlining of small functions > into functions compiled with -Og - those functions will see almost no > followup scalar cleanups so that the benefit anticipated by the > inliner will not be realized and instead the late diagnostic code > will be confused by

Re: [Bug tree-optimization/103195] [12 Regression] tfft2 text grows by 70% with -Ofast since r12-5113-gd70ef65692fced7a

2022-01-18 Thread Jan Hubicka via Gcc-bugs
> So nothing to see? I guess our unit growth limit doesn't trigger because it's > a small (benchmark) unit? Yep, unit growth limits do not apply to very small units. The ipa-cp heuristics still IMO need work and should be based on relative speedups rather than absolute ones for the cutoffs.

Re: [PATCH 1/6] ira: Add a ira_loop_border_costs class

2022-01-06 Thread Jan Hubicka via Gcc-patches
> The final index into (ira_)memory_move_cost is 1 for loads and > 0 for stores. Thus the combination: > > entry_freq * memory_cost[1] + exit_freq * memory_cost[0] > > is the cost of loading a register on entry to a loop and > storing it back on exit from the loop. This is the cost to > use
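The formula quoted above can be made concrete with a minimal sketch. This is not the IRA code: the function name `loop_border_spill_cost` and the plain two-element array are assumptions made for illustration. Only the index convention (1 for loads, 0 for stores) and the formula itself come from the quoted text.

```cpp
// Hypothetical stand-in for one row of the memory move cost table:
// index 0 is the store cost, index 1 is the load cost, as in the quote.
// Cost of loading a register on entry to a loop and storing it back on
// exit, weighted by how often the entry and exit edges are executed.
long loop_border_spill_cost(long entry_freq, long exit_freq,
                            const int memory_cost[2])
{
  return entry_freq * memory_cost[1]   // loads happen on loop entry
       + exit_freq * memory_cost[0];   // stores happen on loop exit
}
```

With a store cost of 4 and a load cost of 6, an entry frequency of 10 and an exit frequency of 3 give 10 * 6 + 3 * 4 = 72.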

Re: [Bug ipa/104203] [12 Regressions] huge IPA compile-time regression since r12-6606-g9d6a0f388eb048f8

2022-01-24 Thread Jan Hubicka via Gcc-bugs
So I assume that this is due to the new pass_waccess, which was added into early optimizations. I think this is not really an ipa component bug but a tree-optimization one.

Re: [Bug tree-optimization/104203] [12 Regressions] huge compile-time regression since r12-6606-g9d6a0f388eb048f8

2022-01-24 Thread Jan Hubicka via Gcc-bugs
> > bool > Since the pass issues a bunch other warnings (e.g., -Wstringop-overflow, > -Wuse-after-free, etc.) the gate doesn't seem right. But since #pragma GCC > diagnostic can re-enable warnings disabled by -w (or turn them into errors) > any > gate that considers the global option setting

Re: Fix pure/const propagation in modref

2022-04-07 Thread Jan Hubicka via Gcc-patches
> On Thu, Apr 7, 2022 at 1:20 PM Jan Hubicka via Gcc-patches > wrote: > > > > Hi, > > this patch fixes ipa-modref propagation of pure/const functions. When we > > inline > > function, the modref summary is updated to represent the function after >

Fix wrong code in gnatmake

2022-04-07 Thread Jan Hubicka via Gcc-patches
Hi, this patch fixes a miscompilation of gnatmake. Modref attempts to track memory accesses relative to the base pointers which are parameters of functions. If it fails, it still distinguishes between unknown memory access and global memory access. The second makes it possible to disambiguate

Fix pure/const propagation in modref

2022-04-07 Thread Jan Hubicka via Gcc-patches
Hi, this patch fixes ipa-modref propagation of pure/const functions. When we inline a function, the modref summary is updated to represent the function after inlining and there we need to propagate the nondeterministic and side-effects flags. Bootstrapped/regtested x86_64-linux, will commit it shortly.

Re: Fix wrong code in gnatmake

2022-04-07 Thread Jan Hubicka via Gcc-patches
> On Thu, 7 Apr 2022, Jan Hubicka wrote: > > > Hi, > > this patch fixes miscompilation of gnatmake. Modref attempts to track > > memory > > accesses relative to the base pointers which are parameters of functions. > > If it fails, it still makes difference between unknown memory access and > >

Fix nondeterministic and side_effect propagation in ipa-modref

2022-04-09 Thread Jan Hubicka via Gcc-patches
Hi, this patch adds logic to propagate nondeterministic and side_effects bits in modref when the summary is updated after inlining. Bootstrapped/regtested x86_64-linux, committed. gcc/ChangeLog: 2022-04-09 Jan Hubicka * ipa-modref.cc (ipa_merge_modref_summary_after_inlining): Propagate

Fix ICE with -fno-semantic-interposition added via option attribut

2022-04-09 Thread Jan Hubicka via Gcc-patches
Hi, This patch solves a problem with the FE first finalizing a function and then adding the -fno-semantic-interposition flag (by parsing the optimization attribute). Bootstrapped/regtested x86_64-linux, committed. Honza gcc/ChangeLog: 2022-04-09 Jan Hubicka PR ipa/103376 * cgraphunit.cc

Re: Check that passes do not forget to define profile

2023-08-24 Thread Jan Hubicka via Gcc-patches
> On Thu, Aug 24, 2023 at 3:15 PM Jan Hubicka via Gcc-patches > wrote: > > > > Hi, > > this patch extends verifier to check that all probabilities and counts are > > initialized if profile is supposed to be present. This is a bit complicated >

Re: [Bug middle-end/111088] useless 'xor eax,eax' inserted when a value is not returned and icf

2023-08-21 Thread Jan Hubicka via Gcc-bugs
> But adds a return with a value. And then the inliner inlines foo into foo2 but > we still have the return with a value around ... I guess ICF can special-case an unused return value, but why is this not taken care of by ipa-sra?

Re: Loop-ch improvements, part 3

2023-08-23 Thread Jan Hubicka via Gcc-patches
> We seem to peel one iteration for no good reason. The loop is > a do-while loop already. The key is we see the first iteration > exit condition is known not taken and then: Hi, this is a patch fixing a wrong return value in should_duplicate_loop_header_p. Doing so uncovered suboptimal decisions on

Fix profile update in tree-ssa-reassoc

2023-08-23 Thread Jan Hubicka via Gcc-patches
Hi, this patch adds a missing profile update to maybe_optimize_range_tests. Jakub, I hope I got the code right: I think it basically analyzes the chain of conditionals, finds some basic blocks involved in the range testing and then puts all the tests into the first BB. The patch fixes

Avoid division by zero in fold_loop_internal_call

2023-08-14 Thread Jan Hubicka via Gcc-patches
Hi, My patch to fix the profile after folding an internal call is missing a check for the case where the profile was already zero before if-conversion. Bootstrapped/regtested x86_64-linux, committed. gcc/ChangeLog: PR gcov-profile/110988 * tree-cfg.cc (fold_loop_internal_call): Avoid division by
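The kind of guard described here can be sketched as follows. This is only an illustration under stated assumptions, not the code from tree-cfg.cc: `scale_count` is a hypothetical helper working on plain integers, whereas GCC uses its own profile types, and leaving the count unchanged for a zero denominator is an assumed policy.

```cpp
#include <cstdint>

// Hypothetical helper: rescale an execution count by the ratio num/den.
// If the denominator is zero (the profile was already zero before the
// transformation), there is nothing meaningful to scale by, so the count
// is returned unchanged instead of dividing by zero.
// Assumes count * num does not overflow 64 bits.
std::uint64_t scale_count(std::uint64_t count, std::uint64_t num,
                          std::uint64_t den)
{
  if (den == 0)
    return count;
  return count * num / den;
}
```

For example, `scale_count(100, 1, 4)` yields 25, while `scale_count(0, 5, 0)` returns 0 without performing a division.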

Check that passes do not forget to define profile

2023-08-24 Thread Jan Hubicka via Gcc-patches
Hi, this patch extends the verifier to check that all probabilities and counts are initialized if the profile is supposed to be present. This is a bit complicated by the possibility that we inline a !flag_guess_branch_probability function into a function with a profile defined and in this case we need to stop

Re: Loop-ch improvements, part 3

2023-08-22 Thread Jan Hubicka via Gcc-patches
> > We seem to peel one iteration for no good reason. The loop is > a do-while loop already. The key is we see the first iteration > exit condition is known not taken and then: > > Registering value_relation (path_oracle) (iter.24_6 > iter.24_5) (root: > bb2) > Stmt is static (constant

Re: [PATCH] cgraph: Fix up semantic_interposition handling [PR105306]

2022-04-20 Thread Jan Hubicka via Gcc-patches
> On Wed, 20 Apr 2022, Jakub Jelinek wrote: > > > Hi! > > > > cgraph_node has a semantic_interposition flag which should mirror > > opt_for_fn (decl, flag_semantic_interposition). But it actually is > > initialized not from that, but from flag_semantic_interposition in the > > explicit

Re: [PATCH] cgraph: Fix up semantic_interposition handling [PR105306]

2022-04-20 Thread Jan Hubicka via Gcc-patches
> On Wed, Apr 20, 2022 at 10:45:53AM +0200, Jan Hubicka wrote: > > So this change should be unnecessary unless there are nodes that are > > missing finalization stage. It also is not good enough since frontends > > may change opt_for_fn between node creation and finalization of > > compilation

Re: [PATCH] cgraph: Fix up semantic_interposition handling [PR105306]

2022-04-20 Thread Jan Hubicka via Gcc-patches
> On Wed, Apr 20, 2022 at 01:47:43PM +0200, Martin Jambor wrote: > > Hi, > > > > On Wed, Apr 20 2022, Jan Hubicka via Gcc-patches wrote: > > >> On Wed, 20 Apr 2022, Jakub Jelinek wrote: > > > > [...] > > >

Re: [PATCH] gcov-profile: Allow negavive counts of indirect calls [PR105282]

2022-04-20 Thread Jan Hubicka via Gcc-patches
> From: Sergei Trofimovich > > TOPN metrics are histograms that contain overall count and per-bucket > count. Overall count can be negative when two profiles merge and some > of per-bucket metrics are dropped. > > Noticed as an ICE on python PGO build where gcc crashes as: > > during IPA

Re: [PATCH][v2] tree-optimization/104912 - ensure cost model is checked first

2022-04-20 Thread Jan Hubicka via Gcc-patches
> The following makes sure that when we build the versioning condition > for vectorization including the cost model check, we check for the > cost model and branch over other versioning checks. That is what > the cost modeling assumes, since the cost model check is the only > one accounted for in

Re: [PATCH] cgraph: Fix up semantic_interposition handling [PR105306]

2022-04-20 Thread Jan Hubicka via Gcc-patches
> > The cgraph.cc change was what I actually needed for the fix, the > cgraphclones.cc was only because I've noticed that it constructs a new > node (so is initialized to whatever random flag_semantic_interposition is > right now) and initializing it to what it is cloned from made more sense.

Avoid overflow in ipa-modref-tree.cc

2022-04-10 Thread Jan Hubicka via Gcc-patches
Hi, the testcase triggers an ICE since the computation overflows on two accesses that are very far away, d->b[-144115188075855873] and d->b[144678138029277184]. This patch makes the relevant part of modref use poly_offset_int. It is kind of weird to store bit offsets into poly_int64 but it is what
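To illustrate why bit offsets for such distant accesses may not fit in 64 bits, here is a minimal sketch. It is not the modref code: `index_to_bit_offset` is a hypothetical helper, and the 8-byte element size used in the example is an assumption, since the message does not state the element type of `b`.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: convert an array index into a bit offset, refusing
// to do so when the product would not fit into a signed 64-bit integer.
// elem_bytes is assumed to be small and positive.
bool index_to_bit_offset(std::int64_t index, std::int64_t elem_bytes,
                         std::int64_t *bit_offset)
{
  constexpr std::int64_t max = std::numeric_limits<std::int64_t>::max();
  constexpr std::int64_t min = std::numeric_limits<std::int64_t>::min();
  const std::int64_t bits_per_elem = elem_bytes * 8;

  // Compare against the bounds divided by the scale so that the check
  // itself cannot overflow.
  if (index > max / bits_per_elem || index < min / bits_per_elem)
    return false;

  *bit_offset = index * bits_per_elem;
  return true;
}
```

Under the assumed 8-byte element size, both indices from the message are rejected by this check, which is the situation where a wider offset type is needed.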

Re: [PATCH] lto-plugin: add support for feature detection

2022-05-16 Thread Jan Hubicka via Gcc-patches
> > Sure having a 'plugin was compiled from sources of the GCC N.M compiler' > is useful if bugs are discovered in old versions that you by definition cannot > fix but can apply workarounds to. Note the actual compiler used might still > differ. Note that still isn't clean API documentation /

Re: [PATCH] ipa-icf: skip variables with body_removed

2022-05-19 Thread Jan Hubicka via Gcc-patches
> Similarly to cgraph_nodes, it may happen that body_removed is set > during merging of symbols. > > PR ipa/105600 > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > Ready to be installed? > Thanks, > Martin > > gcc/ChangeLog: > > * ipa-icf.cc

Re: [Bug lto/105727] __builtin_constant_p expansion in LTO

2022-05-25 Thread Jan Hubicka via Gcc-bugs
> > My guess is that the > > BUILD_BUG(); > > line is the sole thing that is wrong, it should be just break; > > as the memory_is_poisoned_n(addr, size); will handle all the sizes, > > regardless if they are constants or not. > > Sure, I'm going to suggest such a change. To me it looked like a

Re: [Bug c/105728] New: dead store to static var not optimized out

2022-05-25 Thread Jan Hubicka via Gcc-bugs
> To me, all of these do the same thing and should generate the same code. > As nobody else can see removeme, and we aren't leaking its address, shouldn't > the compiler be able to deduce that all accesses to removeme are > inconsequential and can be removed? > > My gcc 11.3 generates a condition

Re: [PATCH] ipa-visibility: Optimize TLS access [PR99619]

2022-05-25 Thread Jan Hubicka via Gcc-patches
> On Mon, 16 May 2022, Alexander Monakov wrote: > > > On Mon, 9 May 2022, Jan Hubicka wrote: > > > > > > On second thought, it might be better to keep the assert, and place the > > > > loop > > > > under 'if (optimize)'? > > > > > > The problem is that at IPA level it does not make sense to

Re: [PATCH] lto-plugin: add support for feature detection

2022-05-16 Thread Jan Hubicka via Gcc-patches
> On 5/16/22 11:25, Jan Hubicka via Gcc-patches wrote: > >> > >> Sure having a 'plugin was compiled from sources of the GCC N.M compiler' > >> is useful if bugs are discovered in old versions that you by definition > >> cannot > >> fix but can appl

Re: [Bug middle-end/106078] Invalid loop invariant motion with non-call-exceptions

2022-06-25 Thread Jan Hubicka via Gcc-bugs
> > For this one it's PRE hoisting *b across the endless loop (PRE handles > > calls as possibly not returning but not loops as possibly not > > terminating...) > > So it's a different bug. > > Btw, C++ requiring forward progress makes the testcase undefined. In my understanding access to

Fix stmt_kills_ref_p wrt external throws

2022-06-23 Thread Jan Hubicka via Gcc-patches
Hi, this patch adds a missing check to stmt_kills_ref_p for the case where the function is terminated by EH before the call return value kills the ref. In the PR I tried to construct a testcase but I don't know how to do that until I annotate EH code with fnspec attributes, which I will do in a separate patch and

Add fnspec attributes to cxa_* functions

2022-06-23 Thread Jan Hubicka via Gcc-patches
Hi, this patch adds fnspecs for cxa_* functions in except.cc. The main goal is to make modref see the proper side-effects of functions which may throw. So in general we get - cxa_allocate_exception which gets the same annotations as malloc (since it is kind of the same thing) - cxa_free_exception

Re: [PATCH] ipa-icf: skip variables with body_removed

2022-06-22 Thread Jan Hubicka via Gcc-patches
> @Honza: PING > > On 5/20/22 09:46, Martin Liška wrote: > > On 5/19/22 17:02, Jan Hubicka wrote: > >>> Similarly to cgraph_nodes, it may happen that body_removed is set > >>> during merging of symbols. > >>> > >>> PR ipa/105600 > >>> > >>> Patch can bootstrap on x86_64-linux-gnu and survives

Re: [PATCH] Add a heuristic for eliminate redundant load and store in inline pass.

2022-07-07 Thread Jan Hubicka via Gcc-patches
Hello, > From: Lili > > > Hi Hubicka, > > This patch is to add a heuristic inline hint to eliminate redundant load and > store. > > Bootstrap and regtest pending on x86_64-unknown-linux-gnu. > OK for trunk? > > Thanks, > Lili. > > Add a INLINE_HINT_eliminate_load_and_store hint in to

Re: PING^1 [PATCH] inline: Rebuild target option node for caller [PR105459]

2022-07-01 Thread Jan Hubicka via Gcc-patches
> On Thu, Jun 23, 2022 at 4:03 AM Kewen.Lin wrote: > > > > Hi, > > > > Gentle ping https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596212.html > > > > BR, > > Kewen > > > > on 2022/6/6 14:20, Kewen.Lin via Gcc-patches wrote: > > > Hi, > > > > > > PR105459 exposes one issue in inline_call

Fix ipa-prop wrt volatile memory accesses

2022-06-10 Thread Jan Hubicka via Gcc-patches
Hi, this patch prevents ipa-prop from propagating aggregates when the load is volatile. Martin, does this look OK? It seems to me that ipa-prop may need some additional volatile flag checks. Bootstrapped/regtested x86_64-linux, OK? Honza gcc/ChangeLog: 2022-06-10 Jan Hubicka PR

Re: [PATCH] Add operators / and * for profile_{count,probability}.

2022-06-17 Thread Jan Hubicka via Gcc-patches
> PING^2 Sorry, I thought it was approved once we settled on the multiplication datatype, but apparently I never sent the email. Patch is OK. Honza > > On 5/24/22 13:35, Martin Liška wrote: > > PING^1 > > > > On 5/5/22 20:15, Martin Liška wrote: > >> On 5/5/22 15:49, Jan Hubicka wrote: > >>> Hi,

Re: [PATCH] predict: Adjust optimize_function_for_size_p [PR105818]

2022-06-14 Thread Jan Hubicka via Gcc-patches
> Hi, > > Function optimize_function_for_size_p returns OPTIMIZE_SIZE_NO > if func->decl is not null but no cgraph node is available for it. > As PR105818 shows, this could give unexpected result. For the > case in PR105818, when parsing bar decl in function foo, the cfun > is a function

Re: [PATCH] ipa-cp: Fix assert triggering with -fno-toplevel-reorder (PR 106260)

2022-07-18 Thread Jan Hubicka via Gcc-patches
> Hi, > > with -fno-toplevel-reorder (and -fwhole-program), there apparently can > be local functions without any callers. This is something that IPA-CP If there is a possibility to trigger a local function without callers, I think one can also make two local functions calling each other but with

Re: [PATCH] LTO plugin: add ld_plugin_version callback.

2022-05-02 Thread Jan Hubicka via Gcc-patches
> On Mon, May 2, 2022 at 10:51 AM Richard Biener > wrote: > > > > On Mon, May 2, 2022 at 10:19 AM Martin Liška wrote: > > > > > > On 5/2/22 10:09, Richard Biener wrote: > > > > On Mon, May 2, 2022 at 9:52 AM Martin Liška wrote: > > > >> > > > >> Hi. > > > >> > > > >> This in a new plug-in

Re: [PATCH] ipa: Release body of clone_of when removing its last clone (PR 100413)

2022-04-28 Thread Jan Hubicka via Gcc-patches
> Hi, > > In the PR, the verifier complains that we did not manage to remove the > body of a node and it is right. The node is kept for materialization > of two clones but after one is materialized, the other one is removed > as unneeded (as a part of delete_unreachable_blocks_update_callgraph).

Re: [PATCH] cgraph: Don't verify semantic_interposition flag for aliases [PR105399]

2022-04-28 Thread Jan Hubicka via Gcc-patches
> On Thu, Apr 28, 2022 at 01:54:51PM +0200, Jan Hubicka wrote: > > > --- gcc/cgraph.cc.jj 2022-04-20 09:24:12.194579146 +0200 > > > +++ gcc/cgraph.cc 2022-04-27 11:53:52.102173154 +0200 > > > @@ -3488,7 +3488,9 @@ cgraph_node::verify_node (void) > > >"returns a pointer"); > > >

Re: [PATCH] cgraph: Don't verify semantic_interposition flag for aliases [PR105399]

2022-04-28 Thread Jan Hubicka via Gcc-patches
Hello, > Hi! > > The following testcase ICEs, because the ctors during cc1plus all have > !opt_for_fn (decl, flag_semantic_interposition) - they have NULL > DECL_FUNCTION_SPECIFIC_OPTIMIZATION (decl) and optimization_default_node > is for -Ofast and so has flag_semantic_interposition cleared. >

Re: [PATCH] ipa-visibility: Optimize TLS access [PR99619]

2022-05-09 Thread Jan Hubicka via Gcc-patches
> On Mon, 2 May 2022, Alexander Monakov wrote: > > > > --- a/gcc/ipa-visibility.cc > > > > +++ b/gcc/ipa-visibility.cc > > > > @@ -872,6 +872,22 @@ function_and_variable_visibility (bool > > > > whole_program) > > > > } > > > > } > > > > } > > > > + FOR_EACH_VARIABLE

Re: [PATCH] ipa-visibility: Optimize TLS access [PR99619]

2022-05-05 Thread Jan Hubicka via Gcc-patches
> > @@ -872,6 +872,22 @@ function_and_variable_visibility (bool whole_program) > > } > > } > > } > > + FOR_EACH_VARIABLE (vnode) > > +{ > > + tree decl = vnode->decl; > > + > > + /* Optimize TLS model based on visibility (taking into account > > +

Re: [PATCH] ipa-visibility: Optimize TLS access [PR99619]

2022-05-05 Thread Jan Hubicka via Gcc-patches
> On Thu, 5 May 2022, Jan Hubicka wrote: > > > Also note that visibility pass is run twice (once at compile time before > > early optimizations and then again at LTO). Since LTO linking may > > promote public symbols to local/hidden, perhaps we want to do this only > > second time the pass is

Re: [PATCH] Add operators / and * for profile_{count,probability}.

2022-05-05 Thread Jan Hubicka via Gcc-patches
Hi, > The patch simplifies usage of the profile_{count,probability} types. > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > Ready to be installed? The reason I intentionally did not add * and / to the original API was to detect situations where values that should

Re: [PATCH] [X86_64]: Enable support for next generation AMD Zen4 CPU

2022-10-21 Thread Jan Hubicka via Gcc-patches
> On Fri, Oct 21, 2022 at 12:00 PM Kumar, Venkataramanan via Gcc-patches > wrote: > > > > Hi all, > > > > > -Original Message- > > > From: Joshi, Tejas Sanjay > > > Sent: Monday, October 17, 2022 8:09 PM > > > To: gcc-patches@gcc.gnu.org > > > Cc: Kumar, Venkataramanan ; > > >

Re: [PATCH] Add attribute hot judgement for INLINE_HINT_known_hot hint.

2022-09-20 Thread Jan Hubicka via Gcc-patches
> Hi Honza, > > This patch is to add attribute hot judgement for INLINE_HINT_known_hot hint. > > We set up INLINE_HINT_known_hot hint only when we have profile feedback, > now add function attribute judgement for it, when both caller and callee > have __attribute__((hot)), we will also set up

Re: [PATCH] IPA: support -flto + -flive-patching=inline-clone

2022-10-07 Thread Jan Hubicka via Gcc-patches
> >> Probably not hard, and the IPA pass adjusting visbility could as well > >> mark the functions > >> as not to be inlined with -flive-patching=inline-only-static. > >> > > OTOH inline-only-static could disable WPA inlining and do all inlining > early ... > >>> > >>>

Re: [PATCH v2 2/3] doc: -falign-functions is ignored under -Os

2022-10-12 Thread Jan Hubicka via Gcc-patches
> This is implicitly mentioned in the docs, but there were some questions > in a recent patch. This makes it more explicit that -falign-functions is > meant to be ignored under -Os. > > gcc/doc/ChangeLog > > * invoke.texi (-falign-functions): Mention -Os > --- > gcc/doc/invoke.texi | 3

Re: [PATCH 2/2] ipa-cp: Better representation of aggregate values in call contexts

2022-10-14 Thread Jan Hubicka via Gcc-patches
> > 2022-08-26 Martin Jambor > > * ipa-prop.h (ipa_agg_value): Remove type. > (ipa_agg_value_set): Likewise. > (ipa_copy_agg_values): Remove function. > (ipa_release_agg_values): Likewise. > (ipa_auto_call_arg_values) Add a forward declaration. >

Re: [PATCH 1/2] ipa-cp: Better representation of aggregate values we clone for

2022-10-14 Thread Jan Hubicka via Gcc-patches
> gcc/ChangeLog: > > 2022-08-26 Martin Jambor > > * ipa-prop.h (IPA_PROP_ARG_INDEX_LIMIT_BITS): New. > (ipcp_transformation): Added forward declaration. > (ipa_argagg_value): New type. > (ipa_argagg_value_list): New type. > (ipa_agg_replacement_value): Removed

Re: [PATCH] PR middle-end/88345: Honor -falign-functions=N even optimized for size.

2022-10-07 Thread Jan Hubicka via Gcc-patches
> On Fri, Oct 7, 2022 at 6:04 AM Kito Cheng wrote: > > > > From: Monk Chiang > > > > Currently setting of -falign-functions=N will be ignored if the function > > is optimized for size or marked as cold function. > > > > However function alignment requirement is needed even optimized for > > size

Re: [PATCH] IPA: support -flto + -flive-patching=inline-clone

2022-10-07 Thread Jan Hubicka via Gcc-patches
> > WPA is Whole Program Analysis? > > Yes. > > > Okay, then It will promote all static function to extern functions. That’s > > reasonable. > > No, all extern functions to static functions. > > > Is it hard to preserve the original “static” visibility in the IR? > > Probably not hard, and

Fix invalid devirtualization when combining final keyword and anonymous types

2022-08-12 Thread Jan Hubicka via Gcc-patches
Hi, this patch fixes a wrong code issue where we incorrectly devirtualize to __builtin_unreachable. The problem occurs with a combination of anonymous namespaces and the final keyword used on methods. We do two optimizations here 1) when reaching a final method we cut the search for possible new targets

Re: [PATCH] IPA: reduce what we dump in normal mode

2022-08-02 Thread Jan Hubicka via Gcc-patches
> gcc/ChangeLog: > > * profile.cc (compute_branch_probabilities): Dump details only > if TDF_DETAILS. > * symtab.cc (symtab_node::dump_base): Do not dump pointer unless > TDF_ADDRESS is used, it makes comparison harder. > --- > gcc/profile.cc | 2 +- > gcc/symtab.cc | 3

Re: [PATCH] Properly honor param_max_fsm_thread_path_insns in backwards threader

2022-08-02 Thread Jan Hubicka via Gcc-patches
> On Tue, 2 Aug 2022, Aldy Hernandez wrote: > > > On Tue, Aug 2, 2022 at 1:45 PM Richard Biener wrote: > > > > > > On Tue, 2 Aug 2022, Aldy Hernandez wrote: > > > > > > > Unfortunately, this was before my time, so I don't know. > > > > > > > > That being said, thanks for tackling these issues

More znver4 x86-tune flags

2023-01-09 Thread Jan Hubicka via Gcc-patches
Hi, this patch adds more tunes for zen4: - new tunes for avx512 scatter instructions. In micro benchmarks these seem to be a consistent loss compared to open-coded code - disable use of gather for zen4. While these are a win for micro benchmarks (based on TSVC), enabling gather is a loss for

Avoid quadratic behaviour of symbol renaming

2023-01-04 Thread Jan Hubicka via Gcc-patches
Hi, LTO partitioning renames symbols that end up in the same partition and clash in assembler name. This is done for "ordinary" symbols (such as static functions) but also for symbols that are kept only as master clones holding bodies of functions to be specialized later. This is done

Re: [PATCH][X86_64] Separate znver4 insn reservations from older znvers

2023-01-03 Thread Jan Hubicka via Gcc-patches
> [Public] > > Hello, > > I have addressed all your comments in this revision of the patch, please find > attached and inlined. > > * I have updated all the latencies with Agner's measurements. > * Incorrect pipelines, loads/stores are addressed. > * The double pumped avx512 insns take one

Re: [PATCH][X86_64] Separate znver4 insn reservations from older znvers

2023-01-03 Thread Jan Hubicka via Gcc-patches
> > On Tue, 3 Jan 2023, Jan Hubicka wrote: > > > > * gcc/common/config/i386/i386-common.cc (processor_alias_table): > > > Use CPU_ZNVER4 for znver4. > > > * config/i386/i386.md: Add znver4.md. > > > * config/i386/znver4.md: New. > > OK, > > thanks! > > Honza, I'm curious what are your

Re: [PATCH 2/9] ipa: Better way of applying both IPA-CP and IPA-SRA (PR 103227)

2022-12-12 Thread Jan Hubicka via Gcc-patches
> 2022-11-11 Martin Jambor > > PR ipa/103227 > * ipa-param-manipulation.h (class ipa_param_adjustments): Removed > member function get_updated_index_or_split. > (class ipa_param_body_adjustments): New overload of > register_replacement, new member function

Re: [PATCH 9/9] ipa: Avoid looking for IPA-SRA replacements where there are none

2022-12-12 Thread Jan Hubicka via Gcc-patches
> Hi, > > I'm re-posting patches which I have posted at the end of stage 1 but > which have not passed review yet. > > 8< > > While modifying the code, I realized that we do look into statements > even when there are no

Re: [PATCH 3/9] ipa-cp: Leave removal of unused parameters to IPA-SRA

2022-12-12 Thread Jan Hubicka via Gcc-patches
> Hi, > > I'm re-posting patches which I have posted at the end of stage 1 but > which have not passed review yet. > > 8< > > Looking at some benchmarks I have noticed many cases when IPA-CP > cloned a function for all contexts

Re: [PATCH 5/9] ipa-sra: Move caller->callee propagation before callee->caller one

2022-12-12 Thread Jan Hubicka via Gcc-patches
> gcc/ChangeLog: > > 2022-11-11 Martin Jambor > > * ipa-sra.c (ipa_sra_analysis): Move top-down analysis before > bottom-up analysis. Replace FOR_EACH_VEC_ELT with C++11 iteration. > > gcc/testsuite/ChangeLog: > > 2021-12-14 Martin Jambor > > *

Re: [PATCH 6/9] ipa-sra: Be optimistic about Fortran descriptors

2022-12-12 Thread Jan Hubicka via Gcc-patches
> Hi, > > I'm re-posting patches which I have posted at the end of stage 1 but > which have not passed review yet. > > 8< > > Fortran descriptors are structures which are often constructed just > for a particular argument of a

Re: [PATCH 1/9] ipa-cp: Write transformation summaries of all functions

2022-12-12 Thread Jan Hubicka via Gcc-patches
> -void > -write_ipcp_transformation_info (output_block *ob, cgraph_node *node) > +/* Return true if the IPA-CP transformation summary TS is non-NULL and > contains > + useful info. */ > +static bool > +useful_ipcp_transformation_info_p (ipcp_transformation *ts) > { > - int node_ref; > -

Re: [PATCH 4/9] ipa-sra: Treat REFERENCE_TYPES as always dereferencable

2022-12-12 Thread Jan Hubicka via Gcc-patches
> Hi, > > I'm re-posting patches which I have posted at the end of stage 1 but > which have not passed review yet. > > 8< > > C++ and especially Fortran pass data by references which are not > pointers potentially pointing

Re: [PATCH 7/9] ipa-sra: Forward propagation of sizes which are safe to dereference

2022-12-12 Thread Jan Hubicka via Gcc-patches
> gcc/ChangeLog: > > 2022-11-11 Martin Jambor > > * ipa-sra.cc (isra_param_desc): New fields safe_size, > conditionally_dereferenceable and safe_size_set. > (struct gensum_param_desc): New field conditionally_dereferenceable. > (struct isra_param_flow): Updated comment

Re: [PATCH 8/9] ipa-sra: Make scan_expr_access bail out on uninteresting expressions

2022-12-12 Thread Jan Hubicka via Gcc-patches
> > Hi, > > > > I'm re-posting patches which I have posted at the end of stage 1 but > > which have not passed review yet. > > > > 8< > > > > I have noticed that scan_expr_access passes all the expressions it > > gets to

Re: [PATCH 1/9] ipa-cp: Write transformation summaries of all functions

2022-12-12 Thread Jan Hubicka via Gcc-patches
> gcc/ChangeLog: > > 2022-11-25 Martin Jambor > > * ipa-prop.cc (useful_ipcp_transformation_info_p): New function. > (write_ipcp_transformation_info): Added a parameter, simplified > given that is known not to be NULL. > (ipcp_write_transformation_summaries): Write out

Re: [PATCH 8/9] ipa-sra: Make scan_expr_access bail out on uninteresting expressions

2022-12-12 Thread Jan Hubicka via Gcc-patches
> Hi, > > I'm re-posting patches which I have posted at the end of stage 1 but > which have not passed review yet. > > 8< > > I have noticed that scan_expr_access passes all the expressions it > gets to get_ref_base_and_extent

Zen4 tuning part 1 - cost tables

2022-12-06 Thread Jan Hubicka via Gcc-patches
Hi, this patch updates the costs of znver4, mostly based on data measured by Agner Fog. Compared to previous generations, x87 became a bit slower, which is probably not a big deal (and we have minimal benchmarking coverage for it). One interesting improvement is the reduction of FMA cost. I also updated costs of

Re: Zen4 tuning part 1 - cost tables

2022-12-06 Thread Jan Hubicka via Gcc-patches
> > - COSTS_N_INSNS (5), /* cost of FADD and FSUB insns. */ > > - COSTS_N_INSNS (5), /* cost of FMUL instruction. */ > > + COSTS_N_INSNS (7), /* cost of FADD and FSUB insns. */ > > + COSTS_N_INSNS (7), /* cost of FMUL

Zen4 tuning part 2 - tuning flags

2022-12-06 Thread Jan Hubicka via Gcc-patches
Hi, this patch adds the tuning flags needed for the zen4 microarchitecture. I added two new knobs. TARGET_AVX512_SPLIT_REGS is used to specify that 512-bit vectors are internally split into 256-bit vectors. This affects vectorization costs and reassociation width. It probably should also affect RTX costs, however I

Re: [PATCH] ipa-sra: Consider the first parameter of methods safe to dereference

2022-12-14 Thread Jan Hubicka via Gcc-patches
> Hi, > > Honza requested this after reviewing the patch that taught IPA-SRA > that REFERENCE_TYPEs are always non-NULL that the pass also handles > the first parameters of methods, this pointers, in the same way. So > this patch does that. > > The patch is undergoing bootstrap and testing on

Re: [PATCH] ipa-sra: Fix address escape case when detecting Fortran descriptors

2022-12-14 Thread Jan Hubicka via Gcc-patches
> Hi, > > The discussion about scan_expr_access in ipa-sra.cc brought my > attention to a missing case of handling an ADDR_EXPR. As the added > testcase shows, the heuristics which looks for parameters which are > local variables that are only written to and passed by reference in > calls can

Re: PING^1 [PATCH v2] predict: Adjust optimize_function_for_size_p [PR105818]

2022-12-14 Thread Jan Hubicka via Gcc-patches
> > PR middle-end/105818 > > > > gcc/ChangeLog: > > > > * predict.cc (optimize_function_for_size_p): Further check > > optimize_size of fun->decl when it is valid but no cgraph node. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.target/powerpc/pr105818.c: New test. > > *

Re: [PATCH] ipa-sra: Consider the first parameter of methods safe to dereference

2022-12-15 Thread Jan Hubicka via Gcc-patches
> On Wed, Dec 14, 2022 at 4:20 PM Jan Hubicka via Gcc-patches > wrote: > > > > > Hi, > > > > > > Honza requested this after reviewing the patch that taught IPA-SRA > > > that REFERENCE_TYPEs are always non-NULL that the pass also handles >

Make -fwhole-program to work with incremental LTO linking

2022-12-21 Thread Jan Hubicka via Gcc-patches
Hi, this patch updates the documentation of -fwhole-program, which wrongly claimed that it is useless with LTO while it is useful for LTO without the plugin, and extends -fwhole-program to also work with incremental linking when non-LTO code is produced. This is useful when building the kernel where the

Re: [PATCH] ipa-cp: Do not be too optimistic about self-recursive edges (PR 107661)

2022-11-22 Thread Jan Hubicka via Gcc-patches
> Hi, > > PR 107661 shows that function push_agg_values_for_index_from_edge > should not attempt to optimize self-recursive call graph edges when > called from cgraph_edge_brings_all_agg_vals_for_node. Unlike when > being called from find_aggregate_values_for_callers_subset, we cannot > expect

Re: [PATCH 03/12] ipa-cp: Write transformation summaries of all functions

2022-11-16 Thread Jan Hubicka via Gcc-patches
> Hi, > > IPA-CP transformation summary streaming code currently won't stream > out transformations necessary for clones which are only necessary for > materialization of other clones (such as an IPA-CP clone which is then > cloned again by IPA-SRA). However, a follow-up patch for better >

Re: [Bug target/87832] AMD pipeline models are very costly size-wise

2022-11-16 Thread Jan Hubicka via Gcc-bugs
> > Do you mean we should fix modeling of divisions there as well? I don't have > latency/throughput measurements for those CPUs, nor access so I can run > experiments myself, unfortunately. > > I guess you mean just making a patch to model division units separately, > leaving latency/throughput
