Workaround ICE in gimple_static_chain_flags

2021-11-04 Thread Jan Hubicka via Gcc-patches
periodic testers I am silencing the ICE for now (at expense of missed optimization) Honza gcc/ChangeLog: 2021-11-04 Jan Hubicka PR ipa/103058 * gimple.c (gimple_call_static_chain_flags): Handle case when nested function does not bind locally. diff --git a/gcc/gimple.c b

Implement intraprocedural dataflow for ipa-modref EAF analyser

2021-11-04 Thread Jan Hubicka via Gcc-patches
Hi, this patch implements the (long promised) intraprocedural dataflow for propagating eaf flags, so we can handle parameters that participate in loops in SSA graphs. Typical examples are accessors that walk linked lists. I implemented dataflow using the standard iteration over BBs in

Re: [PATCH] Rename predicate class to ipa_predicate

2021-11-03 Thread Jan Hubicka via Gcc-patches
> Hello. > > The renaming patch fixes a -Wodr warning seen and reported in the PR. > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > PR bootstrap/102828 > > gcc/ChangeLog: > > * ipa-fnsummary.c (edge_predicate_pool): Rename predicate class to >

Fix wrong code caused by ipa-modref retslot handling

2021-11-02 Thread Jan Hubicka via Gcc-patches
Hi, this patch fixes a (quite nasty) thinko in how I propagate EAF flags from callee to caller. In this case some flags need to be changed. In particular - EAF_NOT_RETURNED in callee does not really mean EAF_NOT_RETURNED in caller since we speak of different return values - if callee

Re: ipa-modref cleanup

2021-11-02 Thread Jan Hubicka via Gcc-patches
> It broke GCC bootstrap: > > https://gcc.gnu.org/pipermail/gcc-regression/2021-November/075676.html > > In file included from ../../src-master/gcc/coretypes.h:474, > from ../../src-master/gcc/expmed.c:26: > In function ‘poly_uint16 mode_to_bytes(machine_mode)’, > inlined

ipa-modref cleanup

2021-11-02 Thread Jan Hubicka via Gcc-patches
Hi, this patch is a small refactoring of ipa-modref to make it a bit more C++y by moving logic analyzing ssa name flags to a class and I also moved the anonymous namespace markers so we do not export unnecessary stuff. There are no functional changes. Bootstrapped/regtested x86_64-linux, will

Add EAF_NOT_RETURNED_DIRECTLY

2021-11-01 Thread Jan Hubicka via Gcc-patches
Hi, this patch adds EAF_NOT_RETURNED_DIRECTLY which works similarly to EAF_NODIRECTESCAPE. Values pointed to by a given argument may be returned but not the argument itself. This helps PTA quite noticeably because we mostly care about tracking points to which a given memory location can escape. I

Add static_chain support to ipa-modref

2021-11-01 Thread Jan Hubicka via Gcc-patches
Hi, this patch teaches ipa-modref about the static chain that is, like retslot, a hidden argument. The patch is pretty much symmetric to what was done for retslot handling and I verified it does the intended job for Ada LTO bootstrap. Bootstrapped/regtested x86_64-linux, OK? Honza

Handle retslot_flags in ipa-modref and PTA

2021-10-29 Thread Jan Hubicka via Gcc-patches
Hi, this patch extends modref and tree-ssa-structalias to handle retslot flags. Since retslot is essentially a hidden argument that is known to be write-only we can do pretty much the same stuff as we do for regular parameters. I plan to add static chain handling in a similar way. We do not handle IPA

Re: [RFC] Don't move cold code out of loop by checking bb count

2021-10-27 Thread Jan Hubicka via Gcc-patches
> Hi, > > On 2021/9/28 20:09, Richard Biener wrote: > > On Fri, Sep 24, 2021 at 8:29 AM Xionghu Luo wrote: > >> > >> Update the patch to v3, not sure whether you prefer the paste style > >> and continue to link the previous thread as Segher dislikes this... > >> > >> > >> [PATCH v3] Don't move

Re: [PATCH v2 1/4] Fix loop split incorrect count and probability

2021-10-27 Thread Jan Hubicka via Gcc-patches
> On Wed, 27 Oct 2021, Jan Hubicka wrote: > > > > > > > gcc/ChangeLog: > > > > > > * tree-ssa-loop-split.c (split_loop): Fix incorrect probability. > > > (do_split_loop_on_cond): Likewise. > > > --- > > > gcc/tree-ssa-

Re: [PATCH v2 1/4] Fix loop split incorrect count and probability

2021-10-27 Thread Jan Hubicka via Gcc-patches
> As discussed yesterday, for loop of form > > for (...) > if (cond) > cond = something(); > else > something2 > > Split as > Say "if (cond)" has probability p, then individual statements scale as follows: loop1: pfor (...) p if (true) 1cond = something(); 1

Re: [PATCH v2 1/4] Fix loop split incorrect count and probability

2021-10-27 Thread Jan Hubicka via Gcc-patches
> > gcc/ChangeLog: > > * tree-ssa-loop-split.c (split_loop): Fix incorrect probability. > (do_split_loop_on_cond): Likewise. > --- > gcc/tree-ssa-loop-split.c | 25 - > 1 file changed, 16 insertions(+), 9 deletions(-) > > diff --git

Re: [PATCH] Fix loop split incorrect count and probability

2021-10-26 Thread Jan Hubicka via Gcc-patches
> > > That said, likely the profile update cannot be done uniformly > for all blocks of a loop? For the loop: for (i = 0; i < n; i = inc (i)) { if (ga) ga = do_something (); } to: for (i = 0; i < x; i = inc (i)) { if (true) ga = do_something (); if

Re: [PATCH] Fix loop split incorrect count and probability

2021-10-26 Thread Jan Hubicka via Gcc-patches
> On Tue, 26 Oct 2021, Xionghu Luo wrote: > > > > > > > On 2021/10/21 18:55, Richard Biener wrote: > > > On Thu, 21 Oct 2021, Xionghu Luo wrote: > > > > > >> > > >> > > >> On 2021/10/15 13:51, Xionghu Luo via Gcc-patches wrote: > > >>> > > >>> > > >>> On 2021/9/23 20:17, Richard Biener wrote:

Cleanup compute_points_to_sets

2021-10-19 Thread Jan Hubicka via Gcc-patches
Hi, this patch fixes two issues I noticed while proofreading the code. First is that I have added conditional around setting of nonlocal and escaped flags (since they may be set from solver) while keeping the variable in assignment that is confusing. Second is that we still do not set pt in the

Re: [Patch] (was: Re: [r12-4457 Regression] FAIL: gfortran.dg/deferred_type_param_6.f90 -Os execution test on Linux/x86_64)

2021-10-16 Thread Jan Hubicka via Gcc-patches
> > Fortran has for a long time "character(len=5), allocatable" or > "character(len=*)". In the first case, the "5" can be ignored as both > caller and callee know the length. In the second case, the length is > determined by the argument, but it cannot be changed. > > Since a not-that-short

Re: [r12-4457 Regression] FAIL: gfortran.dg/deferred_type_param_6.f90 -Os execution test on Linux/x86_64

2021-10-16 Thread Jan Hubicka via Gcc-patches
Hi, > > FAIL: gfortran.dg/deferred_type_param_6.f90 -O1 execution test > FAIL: gfortran.dg/deferred_type_param_6.f90 -Os execution test Sorry for the breakage. This time it seems like bug in Fortran FE which was previously latent: __attribute__((fn spec (". . R "))) void subfunc

Fix wrong code in ldist-strlen-1.c

2021-10-16 Thread Jan Hubicka via Gcc-patches
Hi, while updating compute_points_to_sets I missed that the code not only sets the nonlocal/escaped flags but also initializes pt. With my previous change, if uses_global_memory is false, pt is not updated correctly, which may lead to wrong code. This is fixed by the following patch I committed to

Revert accidental change in ipa-modref-tree.h

2021-10-11 Thread Jan Hubicka via Gcc-patches
Hi, I managed to commit an unrelated change that was sitting in my tree that breaks bootstrap. I have reverted it now and checked bootstrap gets past the failing point (still waiting for full bootstrap to finish at x86_64-linux). Honza gcc/ChangeLog: * ipa-modref-tree.h (struct

Re: [PATCH 3/4] ipa-cp: Fix updating of profile counts and self-gen value evaluation

2021-10-08 Thread Jan Hubicka
> For non-local nodes which can have unknown callers, the algorithm just > takes half of the counts - we may decide that taking just a third or > some other portion is more reasonable, but I do not think we can > attempt anything more clever. Can't you just sum the calling edges and subtract it

Rewrite PTA constraint generation for function calls

2021-10-08 Thread Jan Hubicka
Hi, this patch commonizes the three paths to produce constraints for function call and makes it more flexible, so we can implement new features more easily. Main idea is to not special case pure and const since we can now describe all of pure/const via their EAF flags (implicit_const_eaf_flags

Re: [PATCH 2/4] ipa-cp: Propagation boost for recursion generated values

2021-10-07 Thread Jan Hubicka
> Hi, > > > > If you boost every self fed value by factor of 6, I wonder how quickly > > we run into exponential explosion of the cost (since the frequency > > should be close to 1 and 6^9=10077696 > > The factor of six is applied once for an entire SCC, so we'd reach this > huge number only

Fix ipa-modref ICE

2021-10-07 Thread Jan Hubicka
Hi, this patch fixes an omitted case in contains_p which later triggers a sanity check since merging is not symmetric. Bootstrapped/regtested x86_64-linux, committed. Honza gcc/ChangeLog: 2021-10-07 Jan Hubicka PR ipa/102581 * ipa-modref-tree.h (modref_access_node::contains_p

Re: [PATCH 2/4] ipa-cp: Propagation boost for recursion generated values

2021-10-06 Thread Jan Hubicka
> Recursive call graph edges, even when they are hot and important for > the compiled program, can never have frequency bigger than one, even > when the actual time savings in the next recursion call are not > realized just once but depend on the depth of recursion. The current > IPA-CP effect

Re: [PATCH 4/4] ipa-cp: Select saner profile count to base heuristics on

2021-10-06 Thread Jan Hubicka
> 2021-08-23 Martin Jambor > > * params.opt (param_ipa_cp_profile_count_base): New parameter. > * ipa-cp.c (max_count): Replace with base_count, replace all > occurrences too, unless otherwise stated. > (ipcp_cloning_candidate_p): identify mostly-directly called >

Improve merging of modref_access_node

2021-08-28 Thread Jan Hubicka
Hi, this should be the final bit of the fancy access merging. We limit number of accesses to 16 and on the overflow we currently just throw away the whole table. This patch instead looks for the closest pair of entries in the table and merges them (losing some precision). This is not very often during
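
The overflow strategy described in this snippet can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from ipa-modref-tree.h: `Range`, `merge_cost` and `insert_bounded` are hypothetical names, an access is reduced to a half-open byte range, and "closest" is taken to mean the pair whose merge adds the fewest bytes that neither entry covered.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical stand-in for an access entry: a half-open byte range.
struct Range {
  int64_t begin;
  int64_t end;  // exclusive
};

// Bytes the merged range would cover that neither input covered
// (zero when the two ranges touch or overlap).
static int64_t merge_cost(const Range &a, const Range &b) {
  int64_t gap = std::max(a.begin, b.begin) - std::min(a.end, b.end);
  return std::max<int64_t>(gap, 0);
}

// Insert r into the table. If that pushes the table past max_entries,
// merge the cheapest pair instead of discarding the whole table.
static void insert_bounded(std::vector<Range> &table, Range r,
                           std::size_t max_entries) {
  assert(max_entries >= 1);
  table.push_back(r);
  if (table.size() <= max_entries)
    return;

  // Overflow: the table holds at least two entries here.
  std::size_t best_i = 0, best_j = 1;
  int64_t best = std::numeric_limits<int64_t>::max();
  for (std::size_t i = 0; i < table.size(); ++i)
    for (std::size_t j = i + 1; j < table.size(); ++j) {
      int64_t cost = merge_cost(table[i], table[j]);
      if (cost < best) {
        best = cost;
        best_i = i;
        best_j = j;
      }
    }

  // Replace the pair by its bounding range; this loses some precision.
  table[best_i] = Range{std::min(table[best_i].begin, table[best_j].begin),
                        std::max(table[best_i].end, table[best_j].end)};
  table.erase(table.begin() + static_cast<std::ptrdiff_t>(best_j));
}
```

With a limit of 3 and entries [0,4), [100,104), [200,204), inserting [6,8) merges it with [0,4) into [0,8) and leaves the other two entries untouched. The quadratic pair search is acceptable only because the table is small (the snippet mentions a limit of 16).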

Re: fix latent bootstrap-debug issue (modref, tree-inline, tree jump-threading)

2021-08-28 Thread Jan Hubicka
> On Aug 22, 2021, Jan Hubicka wrote: > > > OK, thanks for looking into this issue! > > Thanks, I've finally installed it in the trunk. > > > It seems that analyze_stmt indeed does not skip debug stmts. It is very > > odd we got so far without hitting build d

Re: Merge stores/loads in modref summaries

2021-08-26 Thread Jan Hubicka
> > commit f075b8c5adcf9cb6336563c472c8d624c54184db > Author: Jan Hubicka > Date: Thu Aug 26 15:33:56 2021 +0200 > > Fix off-by-one error in try_merge_with > > gcc/ChangeLog: > > * ipa-modref-tree.h (modref_ref_node::verify): New

Improve handling of modref --params

2021-08-26 Thread Jan Hubicka
Hi, this patch makes insertion to modref access tree smarter when --param modref-max-bases and modref-max-refs are hit. Instead of giving up we either give up on base alias set (make it equal to ref) or turn the alias set to 0. This lets us track useful info on quite large functions, such

Re: Merge stores/loads in modref summaries

2021-08-26 Thread Jan Hubicka
> On 8/26/21 10:33, Christophe Lyon via Gcc-patches wrote: > > Can you have a look? > > Please create a PR for it. I have a fix, so perhaps there is no need for PR :) I am testing the following - the problem was that try_merge_with missed some merges because of how unordered_remove handles the

Re: Merge stores/loads in modref summaries

2021-08-26 Thread Jan Hubicka
> > This patch is causing ICEs on arm: > FAIL: g++.dg/torture/pr89303.C -O1 (internal compiler error) > FAIL: g++.dg/torture/pr89303.C -O1 (test for excess errors) It happens on 32bit arches only it seems. For some reason we end up merging access: Parm 0 param offset:12 offset:0

Merge stores/loads in modref summaries

2021-08-25 Thread Jan Hubicka
Hi, this patch adds logic needed to merge neighbouring accesses in ipa-modref summaries. This helps analyzing array initializers and similar code. It is a bit of work, since it breaks the fact that modref tree makes a good lattice for dataflow: the access ranges can be extended indefinitely. For

Re: [PATCH][v2] Remove --param vect-inner-loop-cost-factor

2021-08-24 Thread Jan Hubicka
> > > > I noticed loop-doloop.c use _int version and likely_max, maybe you want > > that here? > > > > est_niter = get_estimated_loop_iterations_int (loop); > > if (est_niter == -1) > > est_niter = get_likely_max_loop_iterations_int (loop) > > I think that are two different things -

Avoid redundant entries in modref's access lists

2021-08-23 Thread Jan Hubicka
(release checking), however code that does a lot of array/fields initialization may hit the limit easily. gcc/ChangeLog: 2021-08-23 Jan Hubicka * ipa-modref-tree.h (modref_access_node::range_info_useful_p): Improve range compare. (modref_access_node::contains): New member

Re: [PATCH][v2] Remove --param vect-inner-loop-cost-factor

2021-08-23 Thread Jan Hubicka
> > Any strong opinions? > > Richard. > > 2021-08-23 Richard Biener > > * doc/invoke.texi (vect-inner-loop-cost-factor): Remove > documentation. > * params.opt (--param vect-inner-loop-cost-factor): Remove. > * tree-vect-loop.c (_loop_vec_info::_loop_vec_info): >

Re: [PATCH] ipa/97565 - fix IPA PTA body availability check

2021-08-23 Thread Jan Hubicka
> Looks like the existing check using has_gimple_body_p isn't enough > at LTRANS time but I need to check in_other_partition as well. > > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. > > OK? > > Thanks, > Richard. > > 2021-08-23 Richard Biener > > PR ipa/97565 >

Re: [PATCH] IPA: MODREF should skip EAF_* flags for indirect calls

2021-08-23 Thread Jan Hubicka
Hi, > > Why does it "punish" -fno-ipa-pta? It merely "punishes" modref of > no longer claiming that we do not alter the instruction stream pointed > to by a->foo, sth that shouldn't be very common. For example struct a { void (*foo)(); void *bar; } fn(struct a *a) { a->foo(); } With

Re: [PATCH] IPA: MODREF should skip EAF_* flags for indirect calls

2021-08-23 Thread Jan Hubicka
> Hello. > > Thanks for working on that. But have really run the test-cases as the newly > added test still aborts as it used to before you installed this patch? Eh, sorry, I had earlier version of patch that did if (gimple_call_fn (use_stmt) == name) lattice[index].merge

Improve handling of return slots in ipa-modref

2021-08-23 Thread Jan Hubicka
Hi, while looking at Martin's patch I also noticed that return slots are handled a bit overactively. We only care if the SSA name we analyze is base of return slot. Bootstrapped/regtested x86_64-linux, committed. Honza gcc/ChangeLog: * ipa-modref.c (analyze_ssa_name_flags): Improve

Re: [PATCH] IPA: MODREF should skip EAF_* flags for indirect calls

2021-08-22 Thread Jan Hubicka
ortant (or can be implemented by special casing in unified code). Honza gcc/ChangeLog: 2021-08-22 Jan Hubicka Martin Liska * ipa-modref.c (analyze_ssa_name_flags): Indirect call implies ~EAF_NOCLOBBER. gcc/testsuite/ChangeLog: 2021-08-22 Jan Hubicka

Re: fix latent bootstrap-debug issue (modref, tree-inline, tree jump-threading)

2021-08-22 Thread Jan Hubicka
> > for gcc/ChangeLog > > * ipa-modref.c (analyze_function): Skip debug stmts. > * tree-inline.c (estimate_num_insn): Consider builtins even > without a cgraph_node. OK, thanks for looking into this issue! (for mainline and release branches bit later) > --- > gcc/ipa-modref.c

Re: [PATCH] Try LTO partial linking. (Was: Speed of compiling gimple-match.c)

2021-08-22 Thread Jan Hubicka
> Good hint. I added hash based on object file name (I don't want to handle > proper string escaping) and -frandom-seed. > > What do you think about the patch? Sorry for taking so long - I remember I was sending reply earlier but it seems I only wrote it and never sent. > Thanks, > Martin > From

Re: [PATCH] IPA: MODREF should skip EAF_* flags for indirect calls

2021-08-22 Thread Jan Hubicka
> Hello. > > As showed in the PR, returning (EAF_NOCLOBBER | EAF_NOESCAPE) for an argument > that is a function pointer is problematic. Doing such a function call is a > clobber. > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > Ready to be installed? > Thanks, >

Re: [PATCH] ipa: add debug counter for IPA MODREF PTA

2021-08-22 Thread Jan Hubicka
> Hi. > > We already have a IPA modref debug counter, but it's only used in > tree-ssa-alias, > which is only a part of what IPA modref does. I used the dbg counter in > isolation > of PR101949. > > Ready for master? OK, thanks! Honza > > gcc/ChangeLog: > > * dbgcnt.def

Re: [PATCH] ipa: "naked" attribute implies "noipa" attribute

2021-08-13 Thread Jan Hubicka
> Hi. > > This is a first part fixing the PR. It makes sense making "naked" functions > "noipa". > What's missing is IPA MOD pass support where the pass should not optimize fns > with "noipa" attributes. > > @Honza: Can you please implement that? Hmm, I had patch for that somewhere, will do

Re: [PATCH] ipa: do not make localaliases for target_clones [PR101261]

2021-08-13 Thread Jan Hubicka
> Hello. > > Right now, target_clone pass complains when a target_clone function is an > alias. > That happens when localalias is created by callgraph. I think we should not > create > such aliases as we won't benefit much from it in case of target_clones. > > Patch can bootstrap on

Introduce EAF_NOREAD and cleanup EAF_UNUSED + ipa-modref

2021-08-12 Thread Jan Hubicka
02 queries pt_solutions_intersect: 1391917 disambiguations, 14665265 queries I think it is mostly due to better handling of EAF_NODIRECTESCAPE. Honza gcc/ChangeLog: 2021-08-12 Jan Hubicka * ipa-modref.c (dump_eaf_flags): Dump EAF_NOREAD. (implicit_const_eaf_flags, implicit_pure

Fix condition testing void functions in ipa-split

2021-08-12 Thread Jan Hubicka
x86_64-linux. Committed. gcc/ChangeLog: 2021-08-12 Jan Hubicka * ipa-split.c (consider_split): Fix condition testing void functions. diff --git a/gcc/ipa-split.c b/gcc/ipa-split.c index 5e918ee3fbf..c68577d04a9 100644 --- a/gcc/ipa-split.c +++ b/gcc/ipa-split.c @@ -546,8 +546,9

Re: ipa-modref: merge flags when adding escape

2021-08-11 Thread Jan Hubicka
atch I have bootstrapped/regtested x86_64-linux and I am collecting stats for (it should have minimal effect on overall effectiveness of modref). Honza gcc/ChangeLog: 2021-08-11 Jan Hubicka Alexandre Oliva * ipa-modref.c (modref_lattice::dump): Fix escape_point's min_fla

Add EAF_NOT_RETURNED flag

2021-07-16 Thread Jan Hubicka
Jan Hubicka * ipa-modref.c (struct escape_entry): Use eaf_flags_t. (dump_eaf_flags): Dump EAF_NOT_RETURNED. (eaf_flags_useful_p): Use eaf_flags_t; handle const functions and EAF_NOT_RETURNED. (modref_summary::useful_p): Likewise. (modref_summary_lto

Re: [RFC] ipa: Adjust references to identify read-only globals

2021-07-15 Thread Jan Hubicka
> Hi, > > gcc/ChangeLog: > > 2021-06-29 Martin Jambor > > * cgraph.h (ipa_replace_map): New field force_load_ref. > * ipa-prop.h (ipa_param_descriptor): Reduce precision of move_cost, > added new flag load_dereferenced, adjusted comments. >

Re: [PATCH] ipa-sra: Fix thinko when overriding safe_to_import_accesses (PR 101066)

2021-07-08 Thread Jan Hubicka
Hi, > 2021-06-16 Martin Jambor > > PR ipa/101066 > * ipa-sra.c (class isra_call_summary): New member > m_before_any_store, initialize it in the constructor. > (isra_call_summary::dump): Dump the new field. > (ipa_sra_call_summaries::duplicate): Copy it. >

Re: [PATCH 0.5/2] ipa-sra: Restructure how cloning and call redirection communicate (PR 93385)

2021-06-27 Thread Jan Hubicka
> > I was asked by Richi to split my fix for PR 93385 for easier review > into IPA-SRA materialization refactoring and the actual DCE addition. > Fortunately it was mostly natural except for a temporary weird > condition in ipa_param_body_adjustments::modify_call_stmt. > Additionally. In

Re: [PATCH] inline: do not inline when no_profile_instrument_function is different

2021-06-23 Thread Jan Hubicka
> Hello. > > Similarly to e.g. sanitizer attributes, we should prevent inlining when one > function > is marked as not instrumented. We should do that with -fprofile-generate only. > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > Ready to be installed? > Thanks, >

Re: Ping: [PATCH 1/2] correct BB frequencies after loop changed

2021-06-14 Thread Jan Hubicka
> > > On 5/6/2021 8:36 PM, guojiufu via Gcc-patches wrote: > > Gentle ping. > > > > Original message: > > https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555871.html > I think you need a more aggressive ping  :-) > > OK for the trunk.  Sorry for the long delay.  I kept hoping someone

Re: [PATCH] Try LTO partial linking. (Was: Speed of compiling gimple-match.c)

2021-05-20 Thread Jan Hubicka
> On Thu, May 20, 2021 at 3:16 PM Richard Biener > wrote: > > > > On Thu, May 20, 2021 at 3:06 PM Martin Liška wrote: > > > > > > On 5/20/21 2:54 PM, Richard Biener wrote: > > > > So why did you go from applying this per-file to multiple files? > > > > > > When I did per-file for

Re: [PATCH] ipa: Get rid of IPA_NODE_REF and IPA_EDGE_REF

2021-05-10 Thread Jan Hubicka
> Hi, > > the node and edge summaries defined in ipa-prop.h are probably the > oldest in GCC and so it happened that they are the only ones using > macros to look them up and create them. With Honza and Martin we > agreed it is ugly and the macros should be removed and the ipa-prop > summaries

Re: [PATCH] Avoid DSE/DCE of pure call that throws

2021-05-03 Thread Jan Hubicka
> note that if you wrap foo () into another noinline > wrap_foo () { foo (); return 1; } function then we need to make > sure to not DCE this call either even though it only throws > externally. Now the question is whether this testcase is valid > (it should not abort). The documentation of

[wwwdocs] Move some misplaced entries in gcc11 changes.html

2021-04-27 Thread Jan Hubicka
Hi, I have noticed that some entries were incorrectly added to C family while they are general improvements and I also think the option renaming should go to caveats since renaming is hardly an improvement per se. Since the change is rather obvious I plan to commit it after lunch so we do not

Re: [PATCH] Bump LTO_major_version to 11.

2021-04-23 Thread Jan Hubicka
> > That needs to be combined with the generated auto-host.h header file. > > From which locations do you want to build the hash? Any other $objdir > > files except auto-host.h? > > In fact for PCH just summing the gengtype generated files would be > good enough I guess ... I think one can, for

[wwwdocs] IPA/LTO/profile-feedback changes

2021-04-22 Thread Jan Hubicka
Hi, this patch adds changes entry for IPA/LTO and FDO. Honza diff --git a/htdocs/gcc-11/changes.html b/htdocs/gcc-11/changes.html index 6f58cfe8..bba16ead 100644 --- a/htdocs/gcc-11/changes.html +++ b/htdocs/gcc-11/changes.html @@ -170,6 +170,37 @@ a work-in-progress. use -g together with

Re: [committed] gimple UIDs, LTO and -fanalyzer [PR98599]

2021-04-15 Thread Jan Hubicka
prefer it over your fix - I think both are fine in general for release branches. lto-bootstrapped/regtested x86_64-linux. Honza 2021-04-15 Jan Hubicka PR lto/98599 * lto.c (lto_wpa_write_files): Fix handling of clones. diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c index ceb61bb300b

Re: [committed] gimple UIDs, LTO and -fanalyzer [PR98599]

2021-04-13 Thread Jan Hubicka
Hi, stepping through the streaming process it turns out to be funny difference between gimple_has_body and node->has_gimple_body_p. While the first tests whether gimple body really exists in memory (by looking for DECL_STRUCT_FUNCTION) the second tests if gimple body can be made available via

Re: [committed] gimple UIDs, LTO and -fanalyzer [PR98599]

2021-04-12 Thread Jan Hubicka
Hello > > Thanks. > > > > I think my earlier analysis was wrong. Sorry for late reply. I was looking into it again yesterday but was a bit confused about what is going on here. > > > > With the caveat that I'm not as familiar with the IPA code as other > > parts of the compiler, what I think is

Do not release body of declare_variant_alt

2021-04-10 Thread Jan Hubicka
declare_variant_alt has references. It would be more systematic to make this also a definition. I plan to clean this up next stage1 and also add a verifier that non-definitions do not have references. Honza gcc/ChangeLog: 2021-04-10 Jan Hubicka PR lto/99857 * tree.c (free_lang_data_in_decl

Re: [PATCH 2/3] x86: Update memcpy/memset inline strategies for Skylake family CPUs

2021-04-06 Thread Jan Hubicka
> > Do you know what of the three changes (preferring reps/stosb, > > CLEAR_RATIO and algorithm choice changes) cause the two speedups > > on eebmc? > > A extracted testcase from nnet_test in https://godbolt.org/z/c8KdsohTP > > This loop is transformed to builtin_memcpy and builtin_memset with

Re: [PATCH 2/3] x86: Update memcpy/memset inline strategies for Skylake family CPUs

2021-04-05 Thread Jan Hubicka
> > /* skylake_cost should produce code tuned for Skylake familly of CPUs. */ > > static stringop_algs skylake_memcpy[2] = { > > - {libcall, {{1024, rep_prefix_4_byte, true}, {-1, libcall, false}}}, > > - {libcall, {{16, loop, false}, {512, unrolled_loop, false}, > > - {-1,

Re: [gcc r11-7940] Make USES_COMDAT_LOCAL CIF_FINAL_NORMAL

2021-04-02 Thread Jan Hubicka
> This breaks bootstrap on riscv64: > > In function ‘alloca_type_and_limit alloca_call_type(range_query&, gimple*, > bool ’, > inlined from ‘virtual unsigned int pass_walloca::execute(function*)’ at > ../../gcc/gimple-ssa-warn-alloca.c:295:25: > ../../gcc/gimple-ssa-warn-alloca.c:206:13:

Re: [PATCH] x86: Don't generate uiret with -mcmodel=kernel

2021-04-01 Thread Jan Hubicka
> On Thu, Apr 1, 2021 at 6:54 PM H.J. Lu wrote: > > > > Since uiret should be used only for user interrupt handler return, don't > > generate uiret in interrupt handler return with -mcmodel=kernel even if > > UINTR is enabled. > > NAK, -mcmodel should not affect ISAs, in the same way it doesn't

Re: Small refactoring of cgraph_node::release_body

2021-04-01 Thread Jan Hubicka
> This patch is causing ICEs on arm and aarch64, and others according to > gcc-testresults: > on aarch64: > g++.dg/ipa/devirt-7.C -std=gnu++14 (internal compiler error) > g++.dg/ipa/devirt-7.C -std=gnu++17 (internal compiler error) > g++.dg/ipa/devirt-7.C -std=gnu++2a (internal

Re: [r11-7926 Regression] FAIL: libgomp.c/declare-variant-1.c (test for excess errors) on Linux/x86_64

2021-03-31 Thread Jan Hubicka
> On Linux/x86_64, > > d7145b4bb6c8729a1e782373cb6256c06ed60465 is the first bad commit > commit d7145b4bb6c8729a1e782373cb6256c06ed60465 > Author: Jan Hubicka > Date: Wed Mar 31 11:35:29 2021 +0200 > > Small refactoring of cgraph_node::release_body > >

Re: [PATCH v2 1/3] x86: Update memcpy/memset inline strategies for Ice Lake

2021-03-31 Thread Jan Hubicka
> > Reading through the optimization manual it seems that movsb is fast for > > small block no matter if the size is hard wired. In that case you > > probably want to check whether max_size or expected_size is known to be > > small rather than max_size == min_size and both being small. > > > > But

Re: [PATCH v2 1/3] x86: Update memcpy/memset inline strategies for Ice Lake

2021-03-31 Thread Jan Hubicka
> > > > > > Patch is OK now. I was wondering about using avx256 for moves of known > > > > Done. X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB is in now. Can > > you take a look at the patch for Skylake: > > > > https://gcc.gnu.org/pipermail/gcc-patches/2021-March/567096.html > > I was wondering,

Re: [PATCH v2 1/3] x86: Update memcpy/memset inline strategies for Ice Lake

2021-03-31 Thread Jan Hubicka
> > > > Patch is OK now. I was wondering about using avx256 for moves of known > > Done. X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB is in now. Can > you take a look at the patch for Skylake: > > https://gcc.gnu.org/pipermail/gcc-patches/2021-March/567096.html I was wondering, if CPU prefers

Re: znver3 tuning part 1

2021-03-31 Thread Jan Hubicka
> On 3/31/21 1:08 PM, Jan Hubicka wrote: > > > > > > 2021-03-15 Jan Hubicka > > > > > > * config/i386/i386-options.c (processor_cost_table): Add znver3_cost. > > > * config/i386/x86-tune-costs.h (znver3_cost): New global variable; copy >

Re: znver3 tuning part 1

2021-03-31 Thread Jan Hubicka
> > 2021-03-15 Jan Hubicka > > * config/i386/i386-options.c (processor_cost_table): Add znver3_cost. > * config/i386/x86-tune-costs.h (znver3_cost): New global variable; copy > of znver2_cost. I have backported the pat

Small refactoring of cgraph_node::release_body

2021-03-31 Thread Jan Hubicka
Hi, in the discussion on PR 99447 there was some confusion about release_body being used in context where call edges/references survive. This is not a valid use because it would leave stale pointers to ggc_freed memory location. By auditing code I did not find any however this patch moves the

Re: [PATCH v2 1/3] x86: Update memcpy/memset inline strategies for Ice Lake

2021-03-31 Thread Jan Hubicka
> > It looks like X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB is quite obviously > > beneficial and independent of the rest of changes. I think we will need > > to discuss bit more the move ratio and the code size/uop cache pollution > > issues - one option would be to use increased limits for -O3 only.

Re: [PATCH] [X86_64]: Enable support for next generation AMD Zen3 CPU

2021-03-31 Thread Jan Hubicka
> [AMD Public Use] > > Hi Honza, > > > -Original Message----- > > From: Jan Hubicka > > Sent: Wednesday, March 31, 2021 1:15 AM > > To: Kumar, Venkataramanan > > Cc: Uros Bizjak ; gcc-patches@gcc.gnu.org > > Subject: Re: [PATCH] [X86_64]: Ena

Re: [PATCH] [X86_64]: Enable support for next generation AMD Zen3 CPU

2021-03-30 Thread Jan Hubicka
. Atop of that I plan to backport the tuning patches with exception of gather which seems a bit controversial and can wait for gcc11. Honza 2021-03-30 Jan Hubicka Backport Venkataramanan Kumar Sharavan Kumar * common/config/i386/cpuinfo.h (get_amd_cpu) recognize

Re: Fix typo in ipa-modref

2021-03-29 Thread Jan Hubicka
> > >gcc/testsuite/ChangeLog: > > > > > >2021-03-29 Jan Hubicka > > > > > > * gcc.c-torture/compile/pr99751.c: New test. > > > > Why compile torture? > > Ah, sorry, it was meant to be execute. I

Re: Fix typo in ipa-modref

2021-03-29 Thread Jan Hubicka
> >gcc/testsuite/ChangeLog: > > > >2021-03-29 Jan Hubicka > > > > * gcc.c-torture/compile/pr99751.c: New test. > > Why compile torture? Ah, sorry, it was meant to be execute. I will move the test. Honza

Fix typo in ipa-modref

2021-03-29 Thread Jan Hubicka
disambiguations, 13770732 queries gcc/ChangeLog: 2021-03-29 Jan Hubicka * ipa-modref.c (merge_call_lhs_flags): Correct handling of deref. (analyze_ssa_name_flags): Fix typo in comment. gcc/testsuite/ChangeLog: 2021-03-29 Jan Hubicka * gcc.c-torture/compile

Re: [PATCH] i386: fix -march=amd crash

2021-03-24 Thread Jan Hubicka
> It started with g:3e2ae3ee285a57455d5a23bd352a68c289130186 where > new entry was added to processor_alias_table after generic node: > > + {"amdfam19h", PROCESSOR_GENERIC, CPU_GENERIC, 0, > +M_CPU_TYPE (AMDFAM19H), P_NONE}, > > and then the following is violated: > > /* NB:

Re: [PATCH 1/3] x86: Update memcpy/memset inline strategies for Ice Lake

2021-03-22 Thread Jan Hubicka
> > gcc/ > > * config/i386/i386-expand.c (expand_set_or_cpymem_via_rep): > For TARGET_PREFER_KNOWN_REP_MOVSB_STOSB, don't convert QImode > to SImode. > (decide_alg): For TARGET_PREFER_KNOWN_REP_MOVSB_STOSB, use > "rep movsb/stosb" only for known sizes. > *

Re: znver3 tuning part 1

2021-03-22 Thread Jan Hubicka
> > Hi, > > I plan to commit some retuning of znver3 codegen that is based on real > > hardware benchmarks. It turns out that there are not too many changes > > necessary since Zen3 is quite smooth upgrade to Zen2. In summary: > > > > - some instructions (like idiv) have shorter latencies.

znver3 tuning part 3

2021-03-18 Thread Jan Hubicka
such examples. However in general it is better to have actual latencies than random numbers. Bootstrapped/regtested x86_64-linux, committed. Honza gcc/ChangeLog: 2021-03-18 Jan Hubicka * config/i386/x86-tune-costs.h (struct processor_costs): Fix costs of integer divides1. diff

znver3 tuning part 2

2021-03-17 Thread Jan Hubicka
Hi, this patch enables gather on zen3 hardware. For TSVC it gets used by 6 benchmarks with the following runtime improvements: s4114: 1.424 -> 1.209 (84.9017%) s4115: 2.021 -> 1.065 (52.6967%) s4116: 1.549 -> 0.854 (55.1323%) s4117: 1.386 -> 1.193 (86.075%) vag: 2.741 -> 1.940 (70.7771%) and

znver3 tuning part 1

2021-03-15 Thread Jan Hubicka
. Honza 2021-03-15 Jan Hubicka * config/i386/i386-options.c (processor_cost_table): Add znver3_cost. * config/i386/x86-tune-costs.h (znver3_cost): New global variable; copy of znver2_cost. diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c

Re: [patch] Fix PR C++/90448

2021-03-09 Thread Jan Hubicka
> On Mon, Mar 8, 2021 at 6:19 PM Eric Botcazou wrote: > > > > Hi, > > > > this is a regression present on the mainline and 10 branch for architectures > > that pass all structure types by reference, e.g. 32-bit PowerPC or SPARC. > > > > Jakub posted a detailed analysis in the audit trail and this

Re: [PATCH] profiling: fix streaming of TOPN counters

2021-03-04 Thread Jan Hubicka
> .../gcc.dg/tree-prof/indir-call-prof-malloc.c | 2 +- > gcc/testsuite/gcc.dg/tree-prof/pr97461.c | 2 +- > libgcc/libgcov-driver.c | 56 --- > 3 files changed, 50 insertions(+), 10 deletions(-) > > diff --git

Re: [PATCH] profiling: fix streaming of TOPN counters

2021-03-03 Thread Jan Hubicka
> > libgcc/ChangeLog: > > PR gcov-profile/99105 > * libgcov-driver.c (write_top_counters): Rename to ... > (write_topn_counters): ... this. > (write_one_data): Pre-allocate buffer for number of items > in the corresponding linked lists. > * libgcov-merge.c

Re: [PATCH] gcov: use mmap pools for KVP.

2021-03-03 Thread Jan Hubicka
> Hello. > > As mentioned here, https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97461#c25, I > like > what Richard suggested. So instead of usage of malloc, we should use mmap > memory > chunks that serve as a memory pool for struct gcov_kvp. > > Malloc is used as a fallback when mmap is not

Re: [PATCH] ipa/97346 - fix leak of reference_vars_to_consider

2021-02-14 Thread Jan Hubicka
reference_vars_to_consider before re-allocating it. > (ipa_reference_write_optimization_summary): Use vec_free > and NULL reference_vars_to_consider. Hi, this is the version I committed after discussion on the PR (it makes it more explicit that reference_vars_to_consider are used durin

Re: [PATCH] tree-optimization/98499 - fix modref analysis on RVO statements

2021-02-01 Thread Jan Hubicka
> From: Sergei Trofimovich > > Before the change RVO gimple statements were treated as local > stores by modref analysis. But in practice RVO escapes target. > > 2021-01-30 Sergei Trofimovich > > gcc/ChangeLog: > > PR tree-optimization/98499 > * ipa-modref.c: treat RVO

Re: [PATCH] rtl-optimization/98863 - tame i386 specific RPAD pass

2021-01-29 Thread Jan Hubicka
> On Fri, 29 Jan 2021, Jan Hubicka wrote: > > > > This removes adding very expensive DF problems which we do not > > > use and which somehow cause 5GB of memory to leak. Reading through the logs, isn't the leak just caused by things going to memory pool that we

Re: [PATCH] rtl-optimization/98863 - tame i386 specific RPAD pass

2021-01-29 Thread Jan Hubicka
> This removes adding very expensive DF problems which we do not > use and which somehow cause 5GB of memory to leak. Impressive :) > > Bootstrap & regtest running on x86_64-unknown-linux-gnu. > > 2021-01-29 Richard Biener > > PR rtl-optimization/98863 > *

Re: [PATCH] varpool: Restore GENERIC TREE_READONLY automatic var optimization [PR7260]

2021-01-26 Thread Jan Hubicka
> On Tue, Jan 26, 2021 at 10:55:35AM +0100, Jan Hubicka wrote: > > > On Tue, Jan 26, 2021 at 10:03:16AM +0100, Richard Biener wrote: > > > > > In 4.8 and earlier we used to fold the following to 0 during GENERIC > > > > > folding, > > > > &

Re: [PATCH] varpool: Restore GENERIC TREE_READONLY automatic var optimization [PR7260]

2021-01-26 Thread Jan Hubicka
> On Tue, Jan 26, 2021 at 10:03:16AM +0100, Richard Biener wrote: > > > In 4.8 and earlier we used to fold the following to 0 during GENERIC > > > folding, > > > but we don't do that anymore because ctor_for_folding etc. has been > > > turned into a > > > GIMPLE centric API, but as the testcase
