Zen4 tuning part 2 - tuning flags

2022-12-06 Thread Jan Hubicka via Gcc-patches
Hi, this patch adds the tunes needed for the zen4 microarchitecture. I added two new knobs. TARGET_AVX512_SPLIT_REGS is used to specify that internally 512-bit vectors are split into 256-bit vectors. This affects vectorization costs and reassociation width. It probably should also affect RTX costs, however I

Zen4 tuning part 1 - cost tables

2022-12-06 Thread Jan Hubicka via Gcc-patches
Hi, this patch updates the costs of znver4, mostly based on data measured by Agner Fog. Compared to previous generations, x87 became a bit slower, which is probably not a big deal (and we have minimal benchmarking coverage for it). One interesting improvement is the reduction of FMA cost. I also updated costs of

Fix resolution streaming with incremental linking

2022-11-25 Thread Jan Hubicka via Gcc-patches
Hi, this patch fixes streaming of resolution info when flag_incremental_link == INCREMENTAL_LINK_NOLTO. Here we want to stream the info from WPA to ltrans as usual. Bootstrapped/regtested x86_64-linux, tested with kernel LTO builds. Plan to commit it later today. Honza *

Re: [PATCH] tree-optimization/106912 - IPA profile and pure/const

2022-11-25 Thread Jan Hubicka via Gcc-patches
> On Fri, 25 Nov 2022, Jan Hubicka wrote: > > > > > > > > > > > On 25.11.2022 at 11:05, Jan Hubicka via Gcc-patches wrote > > > > : > > > > > > > > > > > >> > > > >> IPA profile i

Re: [PATCH] tree-optimization/106912 - IPA profile and pure/const

2022-11-25 Thread Jan Hubicka via Gcc-patches
> > > > On 25.11.2022 at 11:05, Jan Hubicka via Gcc-patches wrote > > : > > > > > >> > >> IPA profile instrumentation tries to clear the pure and const > >> flags of functions but that's quite hopeless in particular for > >>

Re: [PATCH] tree-optimization/106912 - IPA profile and pure/const

2022-11-25 Thread Jan Hubicka via Gcc-patches
> IPA profile instrumentation tries to clear the pure and const > flags of functions but that's quite hopeless in particular for > const since that attribute prevails on the type and thus on each > call to the function leading to inconsistencies in the IL and > eventual checking ICEs. There's no

Re: [PATCH] ipa-cp: Do not be too optimistic about self-recursive edges (PR 107661)

2022-11-22 Thread Jan Hubicka via Gcc-patches
> Hi, > > PR 107661 shows that function push_agg_values_for_index_from_edge > should not attempt to optimize self-recursive call graph edges when > called from cgraph_edge_brings_all_agg_vals_for_node. Unlike when > being called from find_aggregate_values_for_callers_subset, we cannot > expect

Re: [PATCH 1/2] symtab: also change RTL decl name

2022-11-22 Thread Jan Hubicka via Gcc-patches
> On Mon, 21 Nov 2022 20:02:49 +0100 > Jan Hubicka wrote: > > > > Hi Honza, Ping. > > > Regtests cleanly for c,fortran,c++,ada,d,go,lto,objc,obj-c++ > > > Ok? > > > I'd need this for attribute target_clones for the Fortran FE. > > Sorry for

Re: [PATCH 1/2] symtab: also change RTL decl name

2022-11-21 Thread Jan Hubicka via Gcc-patches
> Hi Honza, Ping. > Regtests cleanly for c,fortran,c++,ada,d,go,lto,objc,obj-c++ > Ok? > I'd need this for attribute target_clones for the Fortran FE. Sorry for the delay here. > > void > > @@ -303,6 +301,10 @@ symbol_table::change_decl_assembler_name (tree decl, > > tree name) > > warning (0,

Re: [PATCH 05/12] ipa-sra: Dump edge summaries also for non-candidates

2022-11-16 Thread Jan Hubicka via Gcc-patches
> Hi, > > this should have been part of r12-578-g717d278af93a4a. Call edge > summaries provide information required for IPA-SRA transformations in > the callees but are generated when analyzing callers and thus also > callers which are not IPA-SRA candidates themselves. Therefore we > analyze

Re: [PATCH 03/12] ipa-cp: Write transformation summaries of all functions

2022-11-16 Thread Jan Hubicka via Gcc-patches
> Hi, > > IPA-CP transformation summary streaming code currently won't stream > out transformations necessary for clones which are only necessary for > materialization of other clones (such as an IPA-CP clone which is then > cloned again by IPA-SRA). However, a follow-up patch for better >

Re: [PATCH 02/12] ipa-cp: Do not consider useless aggregate constants

2022-11-16 Thread Jan Hubicka via Gcc-patches
> Hi, > > When building vectors of known aggregate values, there is no point in > including those for parameters which are not used in any way > whatsoever. > > Bootstrapped and tested on x86_64-linux. OK for master? OK, thanks! Honza > > Thanks, > > Martin > > > gcc/ChangeLog: > >

Re: [PATCH 01/12] ipa: IPA-SRA split detection simplification

2022-11-16 Thread Jan Hubicka via Gcc-patches
> Hi, > > I have noticed that the flag m_split_modifications_p of > ipa_param_body_adjustments is not really necessary as it has to > correspond to whether m_replacements is non-empty so this patch > removes it. This also simplifies a bit some patches I work on. > > Bootstrapped and tested on

Re: [PATCH] cgraph_node: Remove redundant section clearing

2022-11-04 Thread Jan Hubicka via Gcc-patches
> Ok for trunk if testing passes? > > gcc/ChangeLog: > > * cgraph.cc (cgraph_node::make_local): Remove redundant set_section. > * multiple_target.cc (create_dispatcher_calls): Likewise. OK (not sure how this slipped in) The code in create_dispatcher_calls is clearly cut of

Re: [PATCH] [X86_64]: Enable support for next generation AMD Zen4 CPU

2022-10-21 Thread Jan Hubicka via Gcc-patches
> On Fri, Oct 21, 2022 at 12:00 PM Kumar, Venkataramanan via Gcc-patches > wrote: > > > > Hi all, > > > > > -Original Message- > > > From: Joshi, Tejas Sanjay > > > Sent: Monday, October 17, 2022 8:09 PM > > > To: gcc-patches@gcc.gnu.org > > > Cc: Kumar, Venkataramanan ; > > >

Re: [PATCH 2/2] ipa-cp: Better representation of aggregate values in call contexts

2022-10-14 Thread Jan Hubicka via Gcc-patches
> > 2022-08-26 Martin Jambor > > * ipa-prop.h (ipa_agg_value): Remove type. > (ipa_agg_value_set): Likewise. > (ipa_copy_agg_values): Remove function. > (ipa_release_agg_values): Likewise. > (ipa_auto_call_arg_values) Add a forward declaration. >

Re: [PATCH 1/2] ipa-cp: Better representation of aggregate values we clone for

2022-10-14 Thread Jan Hubicka via Gcc-patches
> gcc/ChangeLog: > > 2022-08-26 Martin Jambor > > * ipa-prop.h (IPA_PROP_ARG_INDEX_LIMIT_BITS): New. > (ipcp_transformation): Added forward declaration. > (ipa_argagg_value): New type. > (ipa_argagg_value_list): New type. > (ipa_agg_replacement_value): Removed

Re: [PATCH v2 2/3] doc: -falign-functions is ignored under -Os

2022-10-12 Thread Jan Hubicka via Gcc-patches
> This is implicitly mentioned in the docs, but there were some questions > in a recent patch. This makes it more explicit that -falign-functions is > meant to be ignored under -Os. > > gcc/doc/ChangeLog > > * invoke.texi (-falign-functions): Mention -Os > --- > gcc/doc/invoke.texi | 3

Re: [PATCH] IPA: support -flto + -flive-patching=inline-clone

2022-10-07 Thread Jan Hubicka via Gcc-patches
> >> Probably not hard, and the IPA pass adjusting visibility could as well > >> mark the functions > >> as not to be inlined with -flive-patching=inline-only-static. > >> > > OTOH inline-only-static could disable WPA inlining and do all inlining > early ... > >>> > >>>

Re: [PATCH] IPA: support -flto + -flive-patching=inline-clone

2022-10-07 Thread Jan Hubicka via Gcc-patches
> > WPA is Whole Program Analysis? > > Yes. > > > Okay, then it will promote all static functions to extern functions. That’s > > reasonable. > > No, all extern functions to static functions. > > > Is it hard to preserve the original “static” visibility in the IR? > > Probably not hard, and

Re: [PATCH] PR middle-end/88345: Honor -falign-functions=N even optimized for size.

2022-10-07 Thread Jan Hubicka via Gcc-patches
> On Fri, Oct 7, 2022 at 6:04 AM Kito Cheng wrote: > > > > From: Monk Chiang > > > > Currently setting of -falign-functions=N will be ignored if the function > > is optimized for size or marked as cold function. > > > > However function alignment requirement is needed even optimized for > > size

Re: [PATCH] Add attribute hot judgement for INLINE_HINT_known_hot hint.

2022-09-20 Thread Jan Hubicka via Gcc-patches
> Hi Honza, > > This patch is to add attribute hot judgement for INLINE_HINT_known_hot hint. > > We set up INLINE_HINT_known_hot hint only when we have profile feedback, > now add function attribute judgement for it, when both caller and callee > have __attribute__((hot)), we will also set up

Fix invalid devirtualization when combining final keyword and anonymous types

2022-08-12 Thread Jan Hubicka via Gcc-patches
derived types as done by this patch. Bootstrapped/regtested x86_64-linux, committed. Honza gcc/ChangeLog: 2022-08-10 Jan Hubicka PR middle-end/106057 * ipa-devirt.cc (type_or_derived_type_possibly_instantiated_p): New function. (possible_polymorphic_call_targets

Re: [PATCH] Properly honor param_max_fsm_thread_path_insns in backwards threader

2022-08-02 Thread Jan Hubicka via Gcc-patches
> On Tue, 2 Aug 2022, Aldy Hernandez wrote: > > > On Tue, Aug 2, 2022 at 1:45 PM Richard Biener wrote: > > > > > > On Tue, 2 Aug 2022, Aldy Hernandez wrote: > > > > > > > Unfortunately, this was before my time, so I don't know. > > > > > > > > That being said, thanks for tackling these issues

Re: [PATCH] IPA: reduce what we dump in normal mode

2022-08-02 Thread Jan Hubicka via Gcc-patches
> gcc/ChangeLog: > > * profile.cc (compute_branch_probabilities): Dump details only > if TDF_DETAILS. > * symtab.cc (symtab_node::dump_base): Do not dump pointer unless > TDF_ADDRESS is used, it makes comparison harder. > --- > gcc/profile.cc | 2 +- > gcc/symtab.cc | 3

Re: [PATCH] ipa-cp: Fix assert triggering with -fno-toplevel-reorder (PR 106260)

2022-07-18 Thread Jan Hubicka via Gcc-patches
> Hi, > > with -fno-toplevel-reorder (and -fwhole-program), there apparently can > be local functions without any callers. This is something that IPA-CP If there is a possibility to trigger a local function without callers, I think one can also make two local functions calling each other but with

Re: [PATCH] Add a heuristic for eliminate redundant load and store in inline pass.

2022-07-07 Thread Jan Hubicka via Gcc-patches
Hello, > From: Lili > > > Hi Hubicka, > > This patch is to add a heuristic inline hint to eliminate redundant load and > store. > > Bootstrap and regtest pending on x86_64-unknown-linux-gnu. > OK for trunk? > > Thanks, > Lili. > > Add a INLINE_HINT_eliminate_load_and_store hint in to

Re: PING^1 [PATCH] inline: Rebuild target option node for caller [PR105459]

2022-07-01 Thread Jan Hubicka via Gcc-patches
> On Thu, Jun 23, 2022 at 4:03 AM Kewen.Lin wrote: > > > > Hi, > > > > Gentle ping https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596212.html > > > > BR, > > Kewen > > > > on 2022/6/6 14:20, Kewen.Lin via Gcc-patches wrote: > > > Hi, > > > > > > PR105459 exposes one issue in inline_call

Add fnspec attributes to cxa_* functions

2022-06-23 Thread Jan Hubicka via Gcc-patches
edges. Honza gcc/cp/ChangeLog: 2022-06-23 Jan Hubicka * except.cc (declare_library_fn_1): Add fnspec parameter. (declare_library_fn): Add fnspec parameter. (do_allocate_exception): Declare fnspecs. (do_free_exception): Declare fnspecs. (build_throw

Fix stmt_kills_ref_p wrt external throws

2022-06-23 Thread Jan Hubicka via Gcc-patches
Hi, this patch adds a missing check to stmt_kills_ref_p for the case where the function is terminated by EH before the call's return value kills the ref. In the PR I tried to construct a testcase, but I don't know how to do that until I annotate EH code with fnspec attributes, which I will do in a separate patch and

Re: [PATCH] ipa-icf: skip variables with body_removed

2022-06-22 Thread Jan Hubicka via Gcc-patches
> @Honza: PING > > On 5/20/22 09:46, Martin Liška wrote: > > On 5/19/22 17:02, Jan Hubicka wrote: > >>> Similarly to cgraph_nodes, it may happen that body_removed is set > >>> during merging of symbols. > >>> > >>> PR ipa/105600

Re: [PATCH] Add operators / and * for profile_{count,probability}.

2022-06-17 Thread Jan Hubicka via Gcc-patches
> PING^2 Sorry, I thought it was approved once we settled on the multiplication datatype, but apparently I never sent the email. Patch is OK. Honza > > On 5/24/22 13:35, Martin Liška wrote: > > PING^1 > > > > On 5/5/22 20:15, Martin Liška wrote: > >> On 5/5/2

Re: [PATCH] predict: Adjust optimize_function_for_size_p [PR105818]

2022-06-14 Thread Jan Hubicka via Gcc-patches
> Hi, > > Function optimize_function_for_size_p returns OPTIMIZE_SIZE_NO > if func->decl is not null but no cgraph node is available for it. > As PR105818 shows, this could give unexpected result. For the > case in PR105818, when parsing bar decl in function foo, the cfun > is a function

Fix ipa-prop wrt volatile memory accesses

2022-06-10 Thread Jan Hubicka via Gcc-patches
Hi, this patch prevents ipa-prop from propagating aggregates when the load is volatile. Martin, does this look OK? It seems to me that ipa-prop may need some additional volatile flag checks. Bootstrapped/regtested x86_64-linux, OK? Honza gcc/ChangeLog: 2022-06-10 Jan Hubicka PR ipa

Re: [PATCH] ipa-visibility: Optimize TLS access [PR99619]

2022-05-25 Thread Jan Hubicka via Gcc-patches
> On Mon, 16 May 2022, Alexander Monakov wrote: > > > On Mon, 9 May 2022, Jan Hubicka wrote: > > > > > > On second thought, it might be better to keep the assert, and place the > > > > loop > > > > under 'if (optimize)'? > > >

Re: [PATCH] ipa-icf: skip variables with body_removed

2022-05-19 Thread Jan Hubicka via Gcc-patches
> Similarly to cgraph_nodes, it may happen that body_removed is set > during merging of symbols. > > PR ipa/105600 > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > Ready to be installed? > Thanks, > Martin > > gcc/ChangeLog: > > * ipa-icf.cc

Re: [PATCH] lto-plugin: add support for feature detection

2022-05-16 Thread Jan Hubicka via Gcc-patches
> On 5/16/22 11:25, Jan Hubicka via Gcc-patches wrote: > >> > >> Sure having a 'plugin was compiled from sources of the GCC N.M compiler' > >> is useful if bugs are discovered in old versions that you by definition > >> cannot > >> fix but can appl

Re: [PATCH] lto-plugin: add support for feature detection

2022-05-16 Thread Jan Hubicka via Gcc-patches
> > Sure having a 'plugin was compiled from sources of the GCC N.M compiler' > is useful if bugs are discovered in old versions that you by definition cannot > fix but can apply workarounds to. Note the actual compiler used might still > differ. Note that still isn't clean API documentation /

Re: [PATCH] ipa-visibility: Optimize TLS access [PR99619]

2022-05-09 Thread Jan Hubicka via Gcc-patches
> On Mon, 2 May 2022, Alexander Monakov wrote: > > > > --- a/gcc/ipa-visibility.cc > > > > +++ b/gcc/ipa-visibility.cc > > > > @@ -872,6 +872,22 @@ function_and_variable_visibility (bool > > > > whole_program) > > > > } > > > > } > > > > } > > > > + FOR_EACH_VARIABLE

Re: [PATCH] Add operators / and * for profile_{count,probability}.

2022-05-05 Thread Jan Hubicka via Gcc-patches
Hi, > The patch simplifies usage of the profile_{count,probability} types. > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > Ready to be installed? The reason I intentionally did not add * and / to the original API was to detect situations where values that should

Re: [PATCH] ipa-visibility: Optimize TLS access [PR99619]

2022-05-05 Thread Jan Hubicka via Gcc-patches
> On Thu, 5 May 2022, Jan Hubicka wrote: > > > Also note that visibility pass is run twice (once at compile time before > > early optimizations and then again at LTO). Since LTO linking may > > promote public symbols to local/hidden, perhaps we want to do this only

Re: [PATCH] ipa-visibility: Optimize TLS access [PR99619]

2022-05-05 Thread Jan Hubicka via Gcc-patches
> > @@ -872,6 +872,22 @@ function_and_variable_visibility (bool whole_program) > > } > > } > > } > > + FOR_EACH_VARIABLE (vnode) > > +{ > > + tree decl = vnode->decl; > > + > > + /* Optimize TLS model based on visibility (taking into account > > +

Re: [PATCH] LTO plugin: add ld_plugin_version callback.

2022-05-02 Thread Jan Hubicka via Gcc-patches
> On Mon, May 2, 2022 at 10:51 AM Richard Biener > wrote: > > > > On Mon, May 2, 2022 at 10:19 AM Martin Liška wrote: > > > > > > On 5/2/22 10:09, Richard Biener wrote: > > > > On Mon, May 2, 2022 at 9:52 AM Martin Liška wrote: > > > >> > > > >> Hi. > > > >> > > > >> This in a new plug-in

Re: [PATCH] cgraph: Don't verify semantic_interposition flag for aliases [PR105399]

2022-04-28 Thread Jan Hubicka via Gcc-patches
> On Thu, Apr 28, 2022 at 01:54:51PM +0200, Jan Hubicka wrote: > > > --- gcc/cgraph.cc.jj 2022-04-20 09:24:12.194579146 +0200 > > > +++ gcc/cgraph.cc 2022-04-27 11:53:52.102173154 +0200 > > > @@ -3488,7 +3488,9 @@ cgraph_node::verify_node (void)

Re: [PATCH] cgraph: Don't verify semantic_interposition flag for aliases [PR105399]

2022-04-28 Thread Jan Hubicka via Gcc-patches
Hello, > Hi! > > The following testcase ICEs, because the ctors during cc1plus all have > !opt_for_fn (decl, flag_semantic_interposition) - they have NULL > DECL_FUNCTION_SPECIFIC_OPTIMIZATION (decl) and optimization_default_node > is for -Ofast and so has flag_semantic_interposition cleared. >

Re: [PATCH] ipa: Release body of clone_of when removing its last clone (PR 100413)

2022-04-28 Thread Jan Hubicka via Gcc-patches
> Hi, > > In the PR, the verifier complains that we did not manage to remove the > body of a node and it is right. The node is kept for materialization > of two clones but after one is materialized, the other one is removed > as unneeded (as a part of delete_unreachable_blocks_update_callgraph).

Re: [PATCH] cgraph: Fix up semantic_interposition handling [PR105306]

2022-04-20 Thread Jan Hubicka via Gcc-patches
> On Wed, Apr 20, 2022 at 01:47:43PM +0200, Martin Jambor wrote: > > Hi, > > > > On Wed, Apr 20 2022, Jan Hubicka via Gcc-patches wrote: > > >> On Wed, 20 Apr 2022, Jakub Jelinek wrote: > > > > [...] > > >

Re: [PATCH] cgraph: Fix up semantic_interposition handling [PR105306]

2022-04-20 Thread Jan Hubicka via Gcc-patches
> > The cgraph.cc change was what I actually needed for the fix, the > cgraphclones.cc was only because I've noticed that it constructs a new > node (so is initialized to whatever random flag_semantic_interposition is > right now) and initializing it to what it is cloned from made more sense.

Re: [PATCH] cgraph: Fix up semantic_interposition handling [PR105306]

2022-04-20 Thread Jan Hubicka via Gcc-patches
> On Wed, Apr 20, 2022 at 10:45:53AM +0200, Jan Hubicka wrote: > > So this change should be unnecessary unless there are nodes that are > > missing finalization stage. It also is not good enough since frontends > > may change opt_for_fn between node creation and finalizati

Re: [PATCH] gcov-profile: Allow negative counts of indirect calls [PR105282]

2022-04-20 Thread Jan Hubicka via Gcc-patches
> From: Sergei Trofimovich > > TOPN metrics are histograms that contain overall count and per-bucket > count. Overall count can be negative when two profiles merge and some > of per-bucket metrics are dropped. > > Noticed as an ICE on python PGO build where gcc crashes as: > > during IPA

Re: [PATCH][v2] tree-optimization/104912 - ensure cost model is checked first

2022-04-20 Thread Jan Hubicka via Gcc-patches
> The following makes sure that when we build the versioning condition > for vectorization including the cost model check, we check for the > cost model and branch over other versioning checks. That is what > the cost modeling assumes, since the cost model check is the only > one accounted for in

Re: [PATCH] cgraph: Fix up semantic_interposition handling [PR105306]

2022-04-20 Thread Jan Hubicka via Gcc-patches
> On Wed, 20 Apr 2022, Jakub Jelinek wrote: > > > Hi! > > > > cgraph_node has a semantic_interposition flag which should mirror > > opt_for_fn (decl, flag_semantic_interposition). But it actually is > > initialized not from that, but from flag_semantic_interposition in the > > explicit

Avoid overflow in ipa-modref-tree.cc

2022-04-10 Thread Jan Hubicka via Gcc-patches
ngeLog: 2022-04-11 Jan Hubicka * ipa-modref-tree.cc (modref_access_node::closer_pair_p): Use poly_offset_int to avoid overflow. (modref_access_node::update2): Likewise. gcc/testsuite/ChangeLog: 2022-04-11 Jan Hubicka * gcc.c-torture/compile/103818.c: New test.

Fix ICE with -fno-semantic-interposition added via option attribut

2022-04-09 Thread Jan Hubicka via Gcc-patches
Hi, this patch solves a problem with the FE first finalizing a function and then adding the -fno-semantic-interposition flag (by parsing the optimization attribute). Bootstrapped/regtested x86_64-linux, committed. Honza gcc/ChangeLog: 2022-04-09 Jan Hubicka PR ipa/103376 * cgraphunit.cc

Fix nondeterministic and side_effect propagation in ipa-modref

2022-04-09 Thread Jan Hubicka via Gcc-patches
Hi, this patch adds logic to propagate nondeterministic and side_effects bits in modref when the summary is updated after inlining. Bootstrapped/regtested x86_64-linux, committed. gcc/ChangeLog: 2022-04-09 Jan Hubicka * ipa-modref.cc (ipa_merge_modref_summary_after_inlining): Propagate

Re: Fix wrong code in gnatmake

2022-04-07 Thread Jan Hubicka via Gcc-patches
> On Thu, 7 Apr 2022, Jan Hubicka wrote: > > > Hi, > > this patch fixes miscompilation of gnatmake. Modref attempts to track > > memory > > accesses relative to the base pointers which are parameters of functions. > > If it fails, it still makes diff

Re: Fix pure/const propagation in modref

2022-04-07 Thread Jan Hubicka via Gcc-patches
> On Thu, Apr 7, 2022 at 1:20 PM Jan Hubicka via Gcc-patches > wrote: > > > > Hi, > > this patch fixes ipa-modref propagation of pure/const functions. When we > > inline > > function, the modref summary is updated to represent the function after >

Fix wrong code in gnatmake

2022-04-07 Thread Jan Hubicka via Gcc-patches
. Honza gcc/ChangeLog: 2022-04-07 Jan Hubicka PR 104303 * tree-ssa-alias.cc (ref_may_access_global_memory_p): Fix handling of refs. diff --git a/gcc/tree-ssa-alias.cc b/gcc/tree-ssa-alias.cc index 50bd47b31f3..9e34f76c3cb 100644 --- a/gcc/tree-ssa-alias.cc +++ b/gcc

Fix pure/const propagation in modref

2022-04-07 Thread Jan Hubicka via Gcc-patches
. Honza gcc/ChangeLog: 2022-04-07 Jan Hubicka PR ipa/105160 * ipa-modref.cc (ipa_merge_modref_summary_after_inlining): gcc/testsuite/ChangeLog: 2022-04-07 Jan Hubicka PR ipa/105160 * gcc.dg/ipa/pr105160.c: New test. diff --git a/gcc/ipa-modref.cc b/gcc

Re: [PATCH] ipa-cp: Do not create clones for values outside known value range (PR 102513)

2022-03-31 Thread Jan Hubicka via Gcc-patches
> Hi, > > PR 102513 shows we emit bogus array access warnings when IPA-CP > creates clones specialized for values which it deduces from arithmetic > jump functions describing self-recursive calls. Those can however be > avoided if we consult the IPA-VR information that the same pass also > has.

Re: [PATCH] ipa: Create LOAD references when necessary during inlining (PR 103171)

2022-03-31 Thread Jan Hubicka via Gcc-patches
> Hi, > > in r12-2523-g13586172d0b70c ipa-prop tracking of jump functions during > inlining got the ability to remove ADDR references when inlining > discovered that they were not necessary or turn them into LOAD > references when we know that what was a function call argument passed > by

Re: [PATCH] ipa: Careful processing ANCESTOR jump functions and NULL pointers (PR 103083)

2022-03-31 Thread Jan Hubicka via Gcc-patches
> IPA_JF_ANCESTOR jump functions are constructed also when the formal > parameter of the caller is first checked whether it is NULL and left > as it is if it is NULL, to accommodate C++ casts to an ancestor class. > > The jump function type was invented for devirtualization and IPA-CP >

Disable gathers on zen3 for vectors with few elements

2022-03-27 Thread Jan Hubicka via Gcc-patches
tomorrow if there are no complaints. Honza gcc/ChangeLog: 2022-03-28 Jan Hubicka * config/i386/i386-builtins.cc (ix86_vectorize_builtin_gather): Test TARGET_USE_GATHER_2PARTS and TARGET_USE_GATHER_4PARTS. * config/i386/i386.h (TARGET_USE_GATHER_2PARTS): New macro

Re: [PATCH] tree-optimization/96881 - CD-DCE and CLOBBERs

2022-02-17 Thread Jan Hubicka via Gcc-patches
> +/* Returns whether the control parents of BB are preserved. */ > + > +static bool > +control_parents_preserved_p (basic_block bb) > +{ > + /* If we marked the control parents from BB they are preserved. */ > + if (bitmap_bit_p (visited_control_parents, bb->index)) > +return true; > + >

Re: [PATCH] tree-optimization/96881 - CD-DCE and CLOBBERs

2022-02-15 Thread Jan Hubicka via Gcc-patches
> @@ -1272,7 +1275,7 @@ maybe_optimize_arith_overflow (gimple_stmt_iterator > *gsi, > contributes nothing to the program, and can be deleted. */ > > static bool > -eliminate_unnecessary_stmts (void) > +eliminate_unnecessary_stmts (bool aggressive) > { >bool something_changed = false;

Re: [PATCH] internal_error - do not use leading capital letter

2022-01-27 Thread Jan Hubicka via Gcc-patches
> That's follow up patch based on the discussion with Jakub. > > Ready to be installed? > Thanks, > Martin > > gcc/ChangeLog: > > * config/rs6000/host-darwin.cc (segv_crash_handler): > Do not use leading capital letter. > (segv_handler): Likewise. > * ipa-sra.cc

Re: [PATCH] ipa/103989 - avoid IPA inlining of small functions with -Og

2022-01-18 Thread Jan Hubicka via Gcc-patches
> The following change avoids doing IPA inlining of small functions > into functions compiled with -Og - those functions will see almost no > followup scalar cleanups so that the benefit anticipated by the > inliner will not be realized and instead the late diagnostic code > will be confused by

Re: [PATCH] ipa/103989 - tame IPA optimizations at -Og

2022-01-18 Thread Jan Hubicka via Gcc-patches
> With -Og we are not prepared to do cleanup after IPA optimizations > and dead code exposed by those confuses late diagnostic passes. > This is a first patch removing unwanted IPA optimizations, namely > both late modref and pure-const analysis. > > Bootstrap and regtest running on

Re: [PATCH] Fix tree-optimization/101941: IPA splitting out function with error attribute

2022-01-14 Thread Jan Hubicka via Gcc-patches
> > > > > --- a/gcc/ipa-split.c > > > > > +++ b/gcc/ipa-split.c > > > > > @@ -873,7 +873,7 @@ visit_bb (basic_block bb, basic_block return_bb, > > > > > gimple *stmt = gsi_stmt (bsi); > > > > > tree op; > > > > > ssa_op_iter iter; > > > > > - tree decl; > > > > > +

Re: [PATCH 1/6] ira: Add a ira_loop_border_costs class

2022-01-06 Thread Jan Hubicka via Gcc-patches
> The final index into (ira_)memory_move_cost is 1 for loads and > 0 for stores. Thus the combination: > > entry_freq * memory_cost[1] + exit_freq * memory_cost[0] > > is the cost of loading a register on entry to a loop and > storing it back on exit from the loop. This is the cost to > use
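The cost combination quoted in this snippet can be sketched as follows. This is a minimal illustration, not the code from the patch: the struct name `loop_border_cost_sketch`, its fields, and the sample numbers are hypothetical; only the index convention (1 for loads, 0 for stores) and the formula itself come from the quoted message.

```cpp
// Hypothetical stand-in for the quantities named in the quoted message;
// this is NOT GCC's ira_loop_border_costs class.
struct loop_border_cost_sketch
{
  int entry_freq;      // how often the loop is entered
  int exit_freq;       // how often the loop is exited
  int memory_cost[2];  // index 0 = store cost, index 1 = load cost

  // Cost of loading a register on loop entry and storing it back on exit:
  //   entry_freq * memory_cost[1] + exit_freq * memory_cost[0]
  constexpr int spill_around_loop_cost () const
  {
    return entry_freq * memory_cost[1] + exit_freq * memory_cost[0];
  }
};

// Sample values, assumed purely for illustration.
constexpr loop_border_cost_sketch example = {10, 3, {4, 6}};
```

With these assumed values the load cost is weighted by the entry frequency and the store cost by the exit frequency, so swapping the two indices would give a different result.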

Re: Patch ping

2022-01-03 Thread Jan Hubicka via Gcc-patches
> Hi! > > I'd like to ping the > https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586553.html > symtab: Fold == to 0 if folding_initializer [PR94716] > > patch. Thanks. OK. Note that with LTO partitioning it may happen that alias is defined in one partition but used in another. We

Re: [PATCH 1/3] loop-invariant: Don't move cold bb instructions to preheader in RTL

2021-12-29 Thread Jan Hubicka via Gcc-patches
> > > > From: Xiong Hu Luo > > > > gcc/ChangeLog: > > > > * loop-invariant.c (find_invariants_bb): Check profile count > > before motion. > > (find_invariants_body): Add argument. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.dg/loop-invariant-2.c: New. OK, thanks! Honza

Re: [PATCH] Fix ICE in lsplit when built with -O3 -fno-guess-branch-probability [PR103793]

2021-12-28 Thread Jan Hubicka via Gcc-patches
> - /* Proportion second loop's bb counts except those dominated by false > -branch to avoid drop 1s down. */ > - basic_block bbi_copy = get_bb_copy (false_edge->dest); > - bbs2 = get_loop_body (loop2); > - for (j = 0; j < loop2->num_nodes; j++) > - if (bbs2[j] ==

Fix handling of deferred SSA names in modref

2021-12-19 Thread Jan Hubicka via Gcc-patches
Hi, in the testcase we fail to analyze an SSA name because the flag do_dataflow is set and thus triggers an early exit in analyze_ssa_name. Fixed by disabling early exits when handling deferred names. Bootstrapped/regtested x86_64-linux, committed. gcc/ChangeLog: 2021-12-20 Jan Hubicka PR

Fix early exit in modref_merge_call_site_flags

2021-12-19 Thread Jan Hubicka via Gcc-patches
/regtested x86_64-linux, committed. gcc/ChangeLog: 2021-12-19 Jan Hubicka PR ipa/103766 * ipa-modref.c (modref_merge_call_site_flags): Fix early exit condition diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c index d3590f0b62b..9c411a6297a 100644 --- a/gcc/ipa-modref.c +++ b/gcc

Re: [PATCH 1/3] loop-invariant: Don't move cold bb instructions to preheader in RTL

2021-12-16 Thread Jan Hubicka via Gcc-patches
> > OK. Comments like? > > /* Don't move insn of cold BB out of loop to preheader to reduce calculations >and register live range in hot loop with cold BB. */ Looks good. > > > And maybe some dump log will help tracking in xxx.c.271r.loop2_invariant. > > --- a/gcc/loop-invariant.c >

Re: [PATCH 2/3] Fix incorrect loop exit edge probability [PR103270]

2021-12-16 Thread Jan Hubicka via Gcc-patches
> > > > > > ./contrib/analyze_brprob.py ~/workspace/tests/spec2017/dump_file_all > > HEURISTICS BRANCHES (REL) BR. HITRATE > > HITRATE COVERAGE COVERAGE (REL) predict.def (REL) HOT > > branches (>10%) > > noreturn call

Re: Add -fipa-strict-aliasing

2021-12-13 Thread Jan Hubicka via Gcc-patches
Hi, this is a variant I committed (with updated documentation as Richard requested). Honza gcc/ChangeLog: 2021-12-13 Jan Hubicka * common.opt: Add -fipa-strict-aliasing. * doc/invoke.texi: Document -fipa-strict-aliasing. * ipa-modref.c (modref_access_analysis

Re: [PATCH] ipa: Careful processing ANCESTOR jump functions and NULL pointers (PR 103083)

2021-12-13 Thread Jan Hubicka via Gcc-patches
> >>> + || (only_for_nonzero && !src_lats->bits_lattice.known_nonzero_p ())) > >>> + { > >>> + if (jfunc->bits) > >>> + return dest_lattice->meet_with (jfunc->bits->value, > >>> + jfunc->bits->mask, precision); > >>> + else > >>> + return

Re: [PATCH 1/3] loop-invariant: Don't move cold bb instructions to preheader in RTL

2021-12-13 Thread Jan Hubicka via Gcc-patches
> > gcc/ChangeLog: > > > > * loop-invariant.c (find_invariants_bb): Check profile count > > before motion. > > (find_invariants_body): Add argument. > > --- > > gcc/loop-invariant.c | 10 +++--- > > 1 file changed, 7 insertions(+), 3 deletions(-) > > > > diff --git

Re: [PATCH 2/3] Fix incorrect loop exit edge probability [PR103270]

2021-12-13 Thread Jan Hubicka via Gcc-patches
> r12-4526 cancelled jump thread path rotates loop. It exposes an issue in > profile-estimate when predict_extra_loop_exits, outer loop's exit edge > is marked as inner loop's extra loop exit and set with incorrect > prediction, then a hot inner loop will become cold loop finally through >

Re: [PATCH 1/3] loop-invariant: Don't move cold bb instructions to preheader in RTL

2021-12-13 Thread Jan Hubicka via Gcc-patches
> gcc/ChangeLog: > > * loop-invariant.c (find_invariants_bb): Check profile count > before motion. > (find_invariants_body): Add argument. > --- > gcc/loop-invariant.c | 10 +++--- > 1 file changed, 7 insertions(+), 3 deletions(-) > > diff --git a/gcc/loop-invariant.c

Do not ICE when computing value range of ternary expression

2021-12-12 Thread Jan Hubicka via Gcc-patches
it is such a rare case, I guess we can just punt and give up on producing it. Bootstrapped/regtested x86_64-linux, OK? gcc/ChangeLog: 2021-12-12 Jan Hubicka * ipa-fnsummary.c (evaluate_conditions_for_known_args): Do not ICE on ternary expression. gcc/testsuite/ChangeLog: 2021-12

Re: Terminate BB analysis on NULL pointer access in ipa-pure-const and ipa-modref

2021-12-12 Thread Jan Hubicka via Gcc-patches
> > I think this is a common pattern in C++ code originating from casts with > multiple inheritance. I would vote towards optimizing out the conditional > move in this case and I think it is correct. I crafted a testcase and filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103674 Honza > >

Re: Terminate BB analysis on NULL pointer access in ipa-pure-const and ipa-modref

2021-12-12 Thread Jan Hubicka via Gcc-patches
> > > On 12/12/2021 3:49 AM, Jan Hubicka via Gcc-patches wrote: > > Hi, > > As discussed in the PR, we miss some optimization because > > gimple-ssa-isolate-paths turns NULL memory accesses to volatile and adds > > __builtin_trap after them. This is seen

Re: Add -fipa-strict-aliasing

2021-12-12 Thread Jan Hubicka via Gcc-patches
> On December 12, 2021 1:22:09 PM GMT+01:00, Jan Hubicka via Gcc-patches > wrote: > >Hi, > >ipa-modref is using TBAA to disambiguate memory accesses inter-procedurally. > >This sometimes breaks programs with TBAA violations including clang with LTO. > >To workaroun

Re: Terminate BB analysis on NULL pointer access in ipa-pure-const and ipa-modref

2021-12-12 Thread Jan Hubicka via Gcc-patches
> >+ /* NULL memory accesses terminates BB. These accesses are known > >+ to trip undefined behaviour. gimple-ssa-isolate-paths turns them > >+ to volatile accesses and adds builtin_trap call which would > >+ confuse us otherwise. */ > >+ if

Add -fipa-strict-aliasing

2021-12-12 Thread Jan Hubicka via Gcc-patches
patch that controls only the TBAA based analysis in ipa-modref while keeping all other optimizations. Bootstrapped/regtested x86_64-linux, will commit it shortly. gcc/ChangeLog: 2021-12-12 Jan Hubicka * common.opt: Add -fipa-strict-aliasing. * doc/invoke.texi: Document -fipa

Terminate BB analysis on NULL pointer access in ipa-pure-const and ipa-modref

2021-12-12 Thread Jan Hubicka via Gcc-patches
disambiguations, 27571176 queries pt_solutions_intersect: 1594296 disambiguations, 15943975 queries Bootstrapped/regtested x86_64-linux, committed. gcc/ChangeLog: 2021-12-12 Jan Hubicka PR ipa/103665 * ipa-modref.c (modref_access_analysis::analyze

Distinguish global and unknown memory accesses in ipa-modref

2021-12-12 Thread Jan Hubicka via Gcc-patches
but conceptually simple and handles a lot of common cases). gcc/ChangeLog: 2021-12-12 Jan Hubicka PR ipa/103585 * ipa-modref-tree.c (modref_access_node::range_info_useful_p): Handle MODREF_GLOBAL_MEMORY_PARM. (modref_access_node::dump): Likewise. (modref_access_node

Fix handling of histogram in ipa-profile

2021-12-11 Thread Jan Hubicka via Gcc-patches
BBs hot and then unrolling or vectorization may reduce it to some fraction of the count that makes it cold. We may want to add some buffer and divide the value by, say, 32, but that should be done independently of speculative calls. gcc/ChangeLog: 2021-12-11 Jan Hubicka * ipa-profile.c

Fix ipa-modref handling of thunks

2021-12-11 Thread Jan Hubicka via Gcc-patches
, will commit it shortly. gcc/ChangeLog: 2021-12-11 Jan Hubicka * ipa-modref.c (get_modref_function_summary): Use ultimate_alias_target. (ignore_edge): Likewise. (compute_parm_map): Likewise. (modref_propagate_in_scc): Likewise

Re: [PATCH] inline: fix ICE with -fprofile-generate

2021-12-10 Thread Jan Hubicka via Gcc-patches
> Fixes ICE spotted by Honza where we have a better place where > to check for no_profile_instrument_function attribute. > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > Ready to be installed? > Thanks, > Martin > > PR ipa/103636 > > gcc/ChangeLog: > >

Re: [PATCH] PR ipa/103601: ICE compiling CSiBE in ipa-modref's insert_kill

2021-12-10 Thread Jan Hubicka via Gcc-patches
> On Fri, Dec 10, 2021 at 2:30 AM Roger Sayle > wrote: > > > > > > This patch fixes PR ipa/103601 which is a P1 regression that shows up as > > an ICE in ipa-modref-tree.c's insert_kill when compiling the CSiBE > > benchmark. I believe the underlying cause is that the new kill tracking > >

Re: [PATCH] c++, symtab: Support (x) == (y) in constant evaluation [PR103600]

2021-12-09 Thread Jan Hubicka via Gcc-patches
> > Ah, indeed, good idea. FYI, clang++ seems to constant fold > (x) != (y) already, so Jonathan could use it even for > clang++ in the constexpr operator==. But it folds even > extern int , > constexpr bool c = != > regardless of whether some other TU has > int a; > int b

Re: Limit inlining functions called once

2021-12-09 Thread Jan Hubicka via Gcc-patches
> > I plan to reduce the value before Christmas after a bit more testing > > since > > it seems to be an overall win even if we trade fatigue2 performance, but I > > would > > like to get more testing on larger C++ apps first. > > Will this hurt -Os -finline-limit=0 ? Why do you use

Re: [PATCH] alias: Optimise call_may_clobber_ref_p

2021-12-07 Thread Jan Hubicka via Gcc-patches
> > Yeah, the fast summary array lookup itself seems fine. What slowed > this down for me was instead: > > /* A single function body may be represented by multiple symbols with > different visibility. For example, if FUNC is an interposable alias, > we don't want to return

Limit inlining functions called once

2021-12-07 Thread Jan Hubicka via Gcc-patches
Hi, as discussed in PR ipa/103454 there are several benchmarks that regress for -finline-functions-called-once. Runtimes: - tramp3d with -Ofast. 31% - exchange2 with -Ofast 11-21% - roms O2 9%-10% - tonto 2.5-3.5% with LTO Build times: - specfp2006 41% (mostly wrf that builds 71% faster) -

Re: [PATCH] alias: Optimise call_may_clobber_ref_p

2021-12-07 Thread Jan Hubicka via Gcc-patches
> > Notice the ??? comment. The code does not set clobbers here because it > > assumes that tree-ssa-alias will do the right thing. > > So one may make builtins handling first, PTA next and only if both say > > "may alias" continue. Other option is to extend the code here to add > > propert

Re: [patch] lto: Don't run ipa-comdats pass during LTO

2021-12-07 Thread Jan Hubicka via Gcc-patches
> The attached patch fixes an ICE in lto1 at lto-partition.c:215 that > was reported by a customer. Unfortunately I have no test case for > this; the customer's application is a big C++ shared library with lots > of dependencies and proprietary code under NDA. I did try reducing it > with cvise
