Re: Enable ranger for ipa-prop

2023-06-28 Thread Jan Hubicka via Gcc-patches
> > On 6/27/23 12:24, Jan Hubicka wrote: > > > On 6/27/23 09:19, Jan Hubicka wrote: > > > > Hi, > > > > as shown in the testcase (which would eventually be useful for > > > > optimizing std::vector's push_back), ipa-prop can use context depende

Re: Enable ranger for ipa-prop

2023-06-27 Thread Jan Hubicka via Gcc-patches
> > On 6/27/23 09:19, Jan Hubicka wrote: > > Hi, > > as shown in the testcase (which would eventually be useful for > > optimizing std::vector's push_back), ipa-prop can use context dependent > > ranger > > queries for better value range info. > >

Enable ranger for ipa-prop

2023-06-27 Thread Jan Hubicka via Gcc-patches
Hi, as shown in the testcase (which would eventually be useful for optimizing std::vector's push_back), ipa-prop can use context dependent ranger queries for better value range info. Bootstrapped/regtested x86_64-linux, OK? Honza gcc/ChangeLog: PR middle-end/110377 *

Re: [PATCH] Improve DSE to handle stores before __builtin_unreachable ()

2023-06-26 Thread Jan Hubicka via Gcc-patches
Hi, playing with testcases for path isolation and const function, I noticed that we do not seem to even try to isolate out of range array accesses: int a[3]={0,1,2}; test(int i) { if (i > 3) return test2(a[i]); return a[i]; } Here call to test2 is dead, since a[i] will
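
The testcase quoted above can be written out as a self-contained sketch. The body of test2 is a hypothetical stand-in (the snippet does not show it); the point is only that the guarded call reads a[i] with i > 3, which is out of range for a three-element array, so that path cannot be taken by a valid program.

```cpp
// Array with valid indices 0..2, as in the quoted testcase.
int a[3] = {0, 1, 2};

// Hypothetical stand-in; the original snippet does not show test2.
int test2(int v) { return v + 100; }

int test(int i) {
  // For i > 3 the argument a[i] is an out-of-range read (undefined
  // behaviour), so a compiler may treat the call to test2 as dead.
  if (i > 3)
    return test2(a[i]);
  return a[i];
}
```

Only in-range indices should be passed when running this; any other value reaches an out-of-bounds read.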

Fix profile of forwardes produced by cd-dce

2023-06-26 Thread Jan Hubicka via Gcc-patches
Hi, compiling the testcase from PR109849 (which uses std::vector based stack to drive a loop) with profile feedback leads to profile mismatches introduced by tree-ssa-dce. This is the new code to produce unified forwarder blocks for PHIs. I am not including the testcase itself since checking it

Re: [PATCH] Improve DSE to handle stores before __builtin_unreachable ()

2023-06-25 Thread Jan Hubicka via Gcc-patches
> > Also as discussed some time ago, the volatile loads between traps has > > effect of turning previously pure/const functions into non-const which > > is somewhat sad, so it is still on my todo list to change it this stage1 > > to something more careful. We discussed internal functions

Re: [PATCH] libstdc++: Use RAII in std::vector::_M_realloc_insert

2023-06-23 Thread Jan Hubicka via Gcc-patches
> I intend to push this to trunk once testing finishes. > > I generated the diff with -b so the whitespace changes aren't shown, > because there was some re-indenting that makes the diff look larger than > it really is. > > Honza, I don't think this is likely to make much difference for the PR >

Re: Tiny phiprop compile time optimization

2023-06-23 Thread Jan Hubicka via Gcc-patches
Hi, here is updated version with TODO_update_ssa_only_virtuals. bootstrapped/regtested x86_64-linux. OK? gcc/ChangeLog: * tree-ssa-phiprop.cc (propagate_with_phi): Compute post dominators on demand. (pass_phiprop::execute): Do not compute it here; return

Re: Do not account __builtin_unreachable guards in inliner

2023-06-23 Thread Jan Hubicka via Gcc-patches
> > So you need to feed it with extra info on the optimized out stmts because > as-is it will not remove __builtin_unreachable (). That means you're My plan was to add an entry point to tree-ssa-dce that will take a set of stmts declared dead by external force and will do the usual mark stage

Re: Ping [PATCH v4] Add condition coverage profiling

2023-06-23 Thread Jan Hubicka via Gcc-patches
> > > > gcc/ChangeLog: > > > > * builtins.cc (expand_builtin_fork_or_exec): Check > > profile_condition_flag. > > * collect2.cc (main): Add -fno-profile-conditions to OBSTACK. > > * common.opt: Add new options -fprofile-conditions and > > * doc/gcov.texi: Add --conditions

Re: Do not account __builtin_unreachable guards in inliner

2023-06-23 Thread Jan Hubicka via Gcc-patches
> On Mon, Jun 19, 2023 at 12:15 PM Jan Hubicka wrote: > > > > > On Mon, Jun 19, 2023 at 9:52 AM Jan Hubicka via Gcc-patches > > > wrote: > > > > > > > > Hi, > > > > this was suggested earlier somewhere, but I can not fin

Re: [PATCH] Improve DSE to handle stores before __builtin_unreachable ()

2023-06-22 Thread Jan Hubicka via Gcc-patches
> > > On 6/22/23 00:31, Richard Biener wrote: > > I think there's a difference in that __builtin_trap () is observable > > while __builtin_unreachable () is not and reaching __builtin_unreachable > > () invokes undefined behavior while reaching __builtin_trap () does not. > > > > So the

Re: [PATCH] Improve DSE to handle stores before __builtin_unreachable ()

2023-06-21 Thread Jan Hubicka via Gcc-patches
> > If I manually add a __builtin_unreachable () to the above case > I see the *(int *)0 = 0; store DSEd. Maybe we should avoid > removing stores that might trap here? POSIX wise such a trap > could be a way to jump out of the path leading to unreachable () > via siglongjmp ... I am not sure

Re: [libstdc++] Improve M_check_len

2023-06-20 Thread Jan Hubicka via Gcc-patches
> > > > > > size_type > > > _M_check_len(size_type __n, const char* __s) const > > > { > > > const size_type __size = size(); > > > const size_type __max_size = max_size(); > > > > > > if (__is_same(allocator_type, allocator<_Tp>) > > > &&

Re: [libstdc++] Improve M_check_len

2023-06-20 Thread Jan Hubicka via Gcc-patches
> > > > size_type > > _M_check_len(size_type __n, const char* __s) const > > { > > const size_type __size = size(); > > const size_type __max_size = max_size(); > > > > if (__is_same(allocator_type, allocator<_Tp>) > > && __size > __max_size

Re: [libstdc++] Improve M_check_len

2023-06-19 Thread Jan Hubicka via Gcc-patches
> On Mon, 19 Jun 2023 at 12:20, Jakub Jelinek wrote: > > > On Mon, Jun 19, 2023 at 01:05:36PM +0200, Jan Hubicka via Gcc-patches > > wrote: > > > - if (max_size() - size() < __n) > > > - __throw_length_error(__N(__s)); > >

Re: [libstdc++] Improve M_check_len

2023-06-19 Thread Jan Hubicka via Gcc-patches
> > - if (max_size() - size() < __n) > > - __throw_length_error(__N(__s)); > > + // On 64bit systems vectors of small sizes can not > > + // reach overflow by growing by small sizes; before > > + // this happens, we will run out of memory. > > + if

Re: Do not account __builtin_unreachable guards in inliner

2023-06-19 Thread Jan Hubicka via Gcc-patches
> On Mon, Jun 19, 2023 at 9:52 AM Jan Hubicka via Gcc-patches > wrote: > > > > Hi, > > this was suggested earlier somewhere, but I cannot find the thread. > > C++ has assume attribute that expands into > > if (conditional) > > __builtin_un

Do not account __builtin_unreachable guards in inliner

2023-06-19 Thread Jan Hubicka via Gcc-patches
Hi, this was suggested earlier somewhere, but I cannot find the thread. C++ has an assume attribute that expands into if (conditional) __builtin_unreachable (). We do not want to account for the conditional in inline heuristics since we know that it is going to be optimized out.
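
A minimal sketch of the guard pattern described above, assuming GCC or Clang (which provide __builtin_unreachable). The function name half_nonneg is hypothetical. The conditional exists only to pass information to the optimizer and is expected to disappear from the generated code, which is why counting it in inline size estimates would overstate the cost of the function.

```cpp
// Hypothetical helper: the caller promises x is never negative.
inline int half_nonneg(int x) {
  // Guard of the kind discussed above: it tells the compiler that
  // x < 0 cannot happen, and is expected to be optimized out.
  if (x < 0)
    __builtin_unreachable();
  // Knowing x >= 0, the compiler may lower this to a plain shift.
  return x / 2;
}
```

Calling the helper with a negative argument is undefined behaviour, so the sketch is only meaningful for non-negative inputs.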

Tiny phiprop compile time optimization

2023-06-19 Thread Jan Hubicka via Gcc-patches
Hi, this patch avoids unnecessary post dominator and update_ssa in phiprop. Bootstrapped/regtested x86_64-linux, OK? gcc/ChangeLog: * tree-ssa-phiprop.cc (propagate_with_phi): Add post_dominators_computed; compute post dominators lazily. (const pass_data

Re: Extend fnsummary to predict SRA oppurtunities

2023-06-18 Thread Jan Hubicka via Gcc-patches
Hi, as noticed by Jeff, this patch also triggers a warning in one of the LTO testcases. The testcase is reduced and the warning seems legit, triggered by extra inlining. So I have just silenced it. Honza gcc/testsuite/ChangeLog: * gcc.dg/lto/20091013-1_0.c: Disable stringop-overread warning.

Extend fnsummary to predict SRA oppurtunities

2023-06-18 Thread Jan Hubicka via Gcc-patches
Hi, this patch extends ipa-fnsummary to anticipate statements that will be removed by SRA. This is done by looking for calls passing addresses of automatic variables. In function body we look for dereferences from pointers of such variables and mark them with new not_sra_candidate condition.

[libstdc++] Improve M_check_len

2023-06-18 Thread Jan Hubicka via Gcc-patches
Hi, _M_check_len is used in vector reallocations. It computes __n + __s but does checking for case that (__n + __s) * sizeof (Tp) would overflow ptrdiff_t. Since we know that __s is a size of already allocated memory block if __n is not too large, this will never happen on 64bit systems since
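
For orientation, here is a simplified free-function model of the kind of length check being discussed. It is a sketch, not the libstdc++ source: the name check_len and its parameters are hypothetical, with size, max_size and n standing in for size(), max_size() and __n, and it is built around the max_size() - size() < __n test quoted in the replies on this thread. It assumes size <= max_size.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>

// Hypothetical model of a vector growth check; not the library code.
std::size_t check_len(std::size_t size, std::size_t max_size,
                      std::size_t n, const char* msg) {
  // Reject requests that cannot fit at all.
  if (max_size - size < n)
    throw std::length_error(msg);
  // Grow geometrically: at least double, or enough for n new elements.
  const std::size_t len = size + std::max(size, n);
  // Clamp if the addition wrapped around or exceeded the maximum.
  return (len < size || len > max_size) ? max_size : len;
}
```

The overflow test and the throwing path are the parts that make such a function costly to inline into push_back, which is the motivation given in the message above.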

Optimize std::max early

2023-06-18 Thread Jan Hubicka via Gcc-patches
Hi, we currently produce very bad code on loops using std::vector as a stack, since we fail to inline push_back which in turn prevents SRA and we fail to optimize out some store-to-load pairs (PR109849). I looked into why this function is not inlined and it is inlined by clang. We currently

Re: [PATCH] ipa-sra: Disable candidates with no known callers (PR 110276)

2023-06-16 Thread Jan Hubicka via Gcc-patches
> Hi, > > In IPA-SRA we use can_be_local_p () predicate rather than just plain > local call graph flag in order to figure out whether the node is a > part of an external API that we cannot change. Although there are > cases where this can allow more transformations, it also means we can >

Re: [PATCH] inline: improve internal function costs

2023-06-04 Thread Jan Hubicka via Gcc-patches
> On Thu, 1 Jun 2023, Andre Vieira (lists) wrote: > > > Hi, > > > > This is a follow-up of the internal function patch to add widening and > > narrowing patterns. This patch improves the inliner cost estimation for > > internal functions. > > I have no idea why calls are special in IPA

Re: [PATCH 1/2] ipa-cp: Avoid long linear searches through DECL_ARGUMENTS

2023-05-30 Thread Jan Hubicka via Gcc-patches
> On Mon, May 29, 2023 at 6:20 PM Martin Jambor wrote: > > > > Hi, > > > > there have been concerns that linear searches through DECL_ARGUMENTS > > that are often necessary to compute the index of a particular > > PARM_DECL which is the key to results of IPA-CP can happen often > > enough to be a

Re: Question on patch -fprofile-partial-training

2023-05-10 Thread Jan Hubicka via Gcc-patches
> Honza, > > Main motivation for this was profiling programs that contain specific > > code paths for different CPUs (such as graphics library in Firefox or Linux > > kernel). In the situation training machine differs from the machine > > program is run later, we end up optimizing for size all

Re: Question on patch -fprofile-partial-training

2023-05-09 Thread Jan Hubicka via Gcc-patches
> > > > > > > > From my understanding, -fprofile-partial-training is one important > > > > option for PGO performance. > > > > > > I don't think so, speed benefit would be rather small I guess. > > I saw some articles online to introduce this option for gcc10, > >

Re: Unloop no longer looping loops in loop-ch

2023-04-26 Thread Jan Hubicka via Gcc-patches
> > - if (precise) > > + if (precise > > + && get_max_loop_iterations_int (loop) == 1) > > + { > > + if (dump_file && (dump_flags & TDF_DETAILS)) > > + fprintf (dump_file, "Loop %d no longer loops.\n", loop->num); > > but max loop iterations is 1 ...? I first check for

Re: Unloop no longer looping loops in loop-ch

2023-04-25 Thread Jan Hubicka via Gcc-patches
> On 25 April 2023 17:12:50 CEST, Jan Hubicka via Gcc-patches > wrote: > > + fprintf (stderr, "Bingo\n"); > > You forgot to remove that.. > Do we prune Bingo in the testsuite? ;-) Ah, thanks :) I was curious how much I win with unloo

Unloop no longer looping loops in loop-ch

2023-04-25 Thread Jan Hubicka via Gcc-patches
Hi, I noticed this after adding a sanity check that the upper bound on the number of iterations never drops to -1. It seems to be a relatively common case (happening a few hundred times in the testsuite and also during bootstrap) that loop-ch duplicates enough so the loop itself no longer loops. This is later

Re: [PATCH] tree-optimization/109609 - correctly interpret arg size in fnspec

2023-04-25 Thread Jan Hubicka via Gcc-patches
> By majority vote and a hint from the API name which is > arg_max_access_size_given_by_arg_p this interprets a memory access > size specified as given as other argument such as for strncpy > in the testcase which has "1cO313" as specifying the _maximum_ > size read/written rather than the exact

Re: [PATCH] rtl-optimization/109585 - alias analysis typo

2023-04-25 Thread Jan Hubicka via Gcc-patches
> When r10-514-gc6b84edb6110dd2b4fb improved access path analysis > it introduced a typo that triggers when there's an access to a > trailing array in the first access path leading to false > disambiguation. > > Bootstrapped and tested on x86_64-unknown-linux-gnu. > > Honza, does this look OK?

Re: Fix loop-ch

2023-04-21 Thread Jan Hubicka via Gcc-patches
> Hi, > Ondrej Kubanek implemented profiling of loop histograms which should be useful > to improve > e.g. quality of loop peeling or vectorization. However it turns out that > most of histograms > are lost on the way from profiling to loop peeling pass (about 90%). One > common case is the >

Fix loop-ch

2023-04-21 Thread Jan Hubicka via Gcc-patches
Hi, Ondrej Kubanek implemented profiling of loop histograms which should be useful to improve e.g. quality of loop peeling or vectorization. However it turns out that most of histograms are lost on the way from profiling to loop peeling pass (about 90%). One common case is the following

Stabilize inliner Fibonacci heap

2023-04-21 Thread Jan Hubicka via Gcc-patches
Hi, This fixes another problem Michal noticed while working on incremental WHOPR. The Fibonacci heap can change its behaviour quite significantly for no good reason when multiple edges with the same key occur. This is quite common for small functions. This patch stabilizes the order by adding

Stabilize temporary variable names

2023-04-21 Thread Jan Hubicka via Gcc-patches
Hi, Michal Jires implemented a quite well working prototype of a cache for WPA which makes it re-use partitions from an earlier build when a package is rebuilt with smaller changes. It should be useful to improve edit/compile/debug cycles when one is forced to debug with LTO enabled but

Remove dead handling of label_decl in tree merging

2023-04-21 Thread Jan Hubicka via Gcc-patches
Hi, while working on incremental WHOPR with Michal Jires, we noticed that there is code hashing LABEL_DECL_UID in lto-streamer-out which would break the hash table, since label decls are not streamed and get re-initialized later. The whole conditional is dead since LABEL_DECLs are not merged

Re: [PATCH] tree-optimization/109304 - properly handle instrumented aliases

2023-04-18 Thread Jan Hubicka via Gcc-patches
> > > > I do not think LTO is of any help here. You can always call non-LTO > > const function from outer-world and that function will end up > > calling back to instrumented const function in your unit which > > effectively makes the external const function non-const. > > Hmm, true. > >

Re: [PATCH] tree-optimization/109304 - properly handle instrumented aliases

2023-04-14 Thread Jan Hubicka via Gcc-patches
> On Tue, 4 Apr 2023, Jan Hubicka wrote: > > > > On Tue, 28 Mar 2023, Richard Biener wrote: > > > > > > > When adjusting calls to reflect instrumentation we failed to handle > > > > calls to aliases since they appear to have no body. In

Disable X86_TUNE_AVX256_MOVE_BY_PIECES and STORE_BY_PIECES for znver1-3

2023-04-14 Thread Jan Hubicka via Gcc-patches
/regtested x86_64-linux, will commit it shortly. Honza gcc/ChangeLog: 2023-04-14 Jan Hubicka PR target/109137 * config/i386/x86-tune.def (X86_TUNE_AVX256_MOVE_BY_PIECES): Remove znver1-3. (X86_TUNE_AVX256_STORE_BY_PIECES): Remove znver1-3. diff --git a/gcc/config/i386

Re: [PATCH] gcov: add info about "calls" to JSON output format

2023-04-14 Thread Jan Hubicka via Gcc-patches
> On 4/11/23 11:23, Richard Biener wrote: > > On Thu, Apr 6, 2023 at 3:58 PM Martin Liška wrote: > >> > >> Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > >> > >> Ready to be installed after stage1 opens? > > > > Did we release a compiler with version 1? If not we might

Re: Patch ping Re: [PATCH] ipa: Avoid another ICE when dealing with type-incompatibilities (PR 108959)

2023-04-04 Thread Jan Hubicka via Gcc-patches
> On Thu, Mar 23, 2023 at 11:09:19AM +0100, Martin Jambor wrote: > > Hi, > > > > PR 108959 shows one more example where undefined code with type > > incompatible accesses to stuff passed in parameters can cause an ICE > > because we try to create a VIEW_CONVERT_EXPR of mismatching sizes: > > > >

Re: [PATCH] tree-optimization/109304 - properly handle instrumented aliases

2023-04-04 Thread Jan Hubicka via Gcc-patches
> On Tue, Apr 04, 2023 at 01:21:40AM +0200, Jan Hubicka via Gcc-patches wrote: > > It is however really side case and I am worried about dropping > > pure/const from builtin declarations... > > Yeah, that can certainly break stuff. See e.g. the recently fixed > ICE

Re: [PATCH] tree-optimization/109304 - properly handle instrumented aliases

2023-04-03 Thread Jan Hubicka via Gcc-patches
> On Tue, 28 Mar 2023, Richard Biener wrote: > > > When adjusting calls to reflect instrumentation we failed to handle > > calls to aliases since they appear to have no body. Instead resort > > to symtab node availability. The patch also avoids touching > > internal function calls in a more

Re: [PATCH RFC] c++: lambda mangling alias issues [PR107897]

2023-03-30 Thread Jan Hubicka via Gcc-patches
> > How about moving it to symtab_node and using dyn_cast for the cgraph bits, > like this: > From 1d869ceb04573727e59be6518903133c8654069a Mon Sep 17 00:00:00 2001 > From: Jason Merrill > Date: Mon, 6 Mar 2023 15:33:45 -0500 > Subject: [PATCH] c++: lambda mangling alias issues [PR107897] > To:

Re: [PATCH] predict: Don't emit -Wsuggest-attribute=cold warning for functions which already have that attribute [PR105685]

2023-03-26 Thread Jan Hubicka via Gcc-patches
> Hi! > > In the following testcase, we predict baz to have cold > entry regardless of the user supplied attribute (as it call > unconditionally a cold function), but still issue > a -Wsuggest-attribute=cold warning despite it having that attribute > already. > > The following patch avoids that.

Re: [PATCH] tree-optimization/106912 - IPA profile and pure/const

2023-03-24 Thread Jan Hubicka via Gcc-patches
> > > > Actually on second thought, I think I can break this either by making > > the wraping function to be thunk or alias or by moving it to different > > compilation unit. > > Also with LTO we will get body later. > > > > So I think we need to drop this optimization. > > It's the same

Re: [PATCH 2/2] [i386] Adjust costing of emulated vectorized gather/scatter

2023-03-24 Thread Jan Hubicka via Gcc-patches
> Emulated gather/scatter behave similar to strided elementwise > accesses in that they need to decompose the offset vector > and construct or decompose the data vector so handle them > the same way, pessimizing the cases with many elements. > > For pr88531-2c.c instead of > > .L4: > leaq

Re: [PATCH] tree-optimization/106912 - IPA profile and pure/const

2023-03-24 Thread Jan Hubicka via Gcc-patches
> From d438a0d84cafced85c90204cba81de0f60ad0073 Mon Sep 17 00:00:00 2001 > From: Richard Biener > Date: Thu, 16 Mar 2023 13:51:19 +0100 > Subject: [PATCH] tree-optimization/106912 - clear const attribute from fntype > To: gcc-patches@gcc.gnu.org > > The following makes sure that after clearing

Re: [PATCH] tree-optimization/106912 - IPA profile and pure/const

2023-03-24 Thread Jan Hubicka via Gcc-patches
> On Fri, 17 Mar 2023, Jakub Jelinek wrote: > > > On Fri, Mar 17, 2023 at 08:40:34PM +0100, Jan Hubicka wrote: > > > > + /* Drop the const attribute from the call type (the pure > > > > + attribute is not available on types). */

Re: [PATCH] tree-optimization/106912 - IPA profile and pure/const

2023-03-17 Thread Jan Hubicka via Gcc-patches
> > The following is what I profile-bootstrapped and tested on > x86_64-unknown-linux-gnu. > > Richard. > > From d438a0d84cafced85c90204cba81de0f60ad0073 Mon Sep 17 00:00:00 2001 > From: Richard Biener > Date: Thu, 16 Mar 2023 13:51:19 +0100 > Subject: [PATCH] tree-optimization/106912 - clear

Re: [PATCH] tree-optimization/106912 - IPA profile and pure/const

2023-03-17 Thread Jan Hubicka via Gcc-patches
> > I have in the meantime briefly tested following. > > But if you want to the above way, then at least the testcase could be > useful. Though, not sure if the above is all that is needed. Shouldn't > set_const_flag_1 upon TREE_READONLY (node->decl) = 0; also adjust > TREE_TYPE on the

Fix ICE in profile_count::to_sreal_frequency

2023-03-14 Thread Jan Hubicka via Gcc-patches
its implementation with probability_in which does similar job but to determine relative probability. Bootstrapped/regtested x86_64-linux, and also profilebootstrapped. Committed. gcc/ChangeLog: 2023-03-14 Jan Hubicka PR tree-optimization/106896 * profile-count.cc (profile_count

Re: [PATCH 2/2] ipa-cp: Improve updating behavior when profile counts have gone bad

2023-03-10 Thread Jan Hubicka via Gcc-patches
> Hi, > > Looking into the behavior of profile count updating in PR 107925, I > noticed that an option not considered possible was actually happening, > and - with the guesswork in place to distribute unexplained counts - > it simply can happen. Currently it is handled by dropping the counts >

Re: [PATCH 1/2] ipa-cp: Fix various issues in update_specialized_profile (PR 107925)

2023-03-10 Thread Jan Hubicka via Gcc-patches
> Hi, > > the patch below fixes various issues in function > update_specialized_profile. The main is removal of the assert which > is bogus in the case of recursive cloning. The division of > unexplained counts is guesswork, which then leads to updates of counts > of recursive edges, which then

Re: [PATCH RFC] c++: lambda mangling alias issues [PR107897]

2023-03-08 Thread Jan Hubicka via Gcc-patches
> Tested x86_64-pc-linux-gnu. Does this look good, or do we want to factor the > flag clearing into a symtab_node counterpart to cgraph_node::reset? > > -- 8< -- > > In 107897, by the time we are looking at the mangling clash, the > alias has already been removed from the symbol table by

Re: HELP: Questions on multiple PROGRAM_SUMMARY sections in a profiling data file

2023-03-08 Thread Jan Hubicka via Gcc-patches
> Hi, Jan, > > I am studying one profiling feedback ICE bug with GCC8 recently. > It’s an assertion failure inside the routine “compute_working_sets”of > gcov-io.c: > > gcov_nonruntime_assert (ws_ix == NUM_GCOV_WORKING_SETS); > > After some debugging and study, I found that the corresponding

Enable scatter for generic

2023-03-06 Thread Jan Hubicka via Gcc-patches
Hi, while adding tunes to disable scatters on znver4 I mistakenly also disabled them on generic. This patch fixes it. Bootstrapped/regtested x86_64, committed. Honza gcc/ChangeLog: 2023-03-06 Jan Hubicka * config/i386/x86-tune.def (X86_TUNE_USE_SCATTER_2PARTS): Enable

Re: [PATCH] cgraphclones: Don't share DECL_ARGUMENTS between thunk and its artificial thunk [PR108854]

2023-02-24 Thread Jan Hubicka via Gcc-patches
> Hi! > > The following testcase ICEs on x86_64-linux with -m32. The problem is > we create an artificial thunk and because of -fPIC, ia32 and thunk > destination which doesn't bind locally can't use a mi thunk. > The ICE is because during expansion to RTL we see SSA_NAME for a PARM_DECL, > but

Re: [PATCH] tree-optimization/106722 - fix CD-DCE edge marking

2023-02-10 Thread Jan Hubicka via Gcc-patches
> The following fixes a latent issue when we mark control edges but > end up with marking a block with no stmts necessary. In this case > we fail to mark dependent control edges of that block. > > Bootstrapped and tested on x86_64-unknown-linux-gnu. > > Does this look OK? > > Thanks, >

Re: [PATCH] ipa: silent -Wodr notes with -w

2023-02-08 Thread Jan Hubicka via Gcc-patches
> > On 2/1/23 15:26, Martin Jambor wrote: > > > Hi, > > > > > > On Fri, Dec 02 2022, Martin Liška wrote: > > > > If -w is used, warn_odr properly sets *warned = false and > > > > so it should be preserved when calling warn_types_mismatch. > > > > > > > > Noticed that during a LTO reduction where

Re: [PATCH] cgraph: Handle simd clones in cgraph_node::set_{const,pure}_flag [PR106433]

2023-02-08 Thread Jan Hubicka via Gcc-patches
> On Wed, Feb 08, 2023 at 06:10:08PM +0100, Jan Hubicka wrote: > > My understanding of simd clones is bit limited, but I think you are > > right that they should have the same semantics as their caller. > > > > I think const may be one that makes compiler to ICE, b

Re: [PATCH] cgraph: Handle simd clones in cgraph_node::set_{const,pure}_flag [PR106433]

2023-02-08 Thread Jan Hubicka via Gcc-patches
> Hi! > > The following testcase ICEs, because we determine only in late pure const > pass that bar is const (the content of the function loses a store to a > global var during dse3 and read from it during cddce2) and local-pure-const2 > makes it const. The cgraph ordering is that post IPA (in

Re: [PATCH] ipa-split: Don't split returns_twice functions [PR106923]

2023-02-08 Thread Jan Hubicka via Gcc-patches
> Hi! > > As discussed in the PR, returns_twice functions are rare/special beasts > that need special treatment in the cfg, and inside of their bodies > we don't know which part actually works the weird returns twice way > (either in the fork/vfork sense, or in the setjmp) and aren't updating >

Re: [PATCH] ipa: silent -Wodr notes with -w

2023-02-06 Thread Jan Hubicka via Gcc-patches
> On 2/1/23 15:26, Martin Jambor wrote: > > Hi, > > > > On Fri, Dec 02 2022, Martin Liška wrote: > > > If -w is used, warn_odr properly sets *warned = false and > > > so it should be preserved when calling warn_types_mismatch. > > > > > > Noticed that during a LTO reduction where I used -w. > >

Enable AVX512 512bit vectors by default on Zen4

2023-02-04 Thread Jan Hubicka via Gcc-patches
Hi, this patch enables AVX512 by default on Zen4. While internally 512-bit registers are split into two 256-bit halves, 512-bit vectors reduce the number of instructions to retire and have a chance to improve parallelism. There are a few tsvc benchmarks that improve significantly: runtime benchmark

Re: Fix wrong code issues with ipa-sra

2023-01-30 Thread Jan Hubicka via Gcc-patches
> Hello, > > Coverity flagged a real issue in this patch: > > On Mon, 16 Jan 2023, Jan Hubicka via Gcc-patches wrote: > > --- a/gcc/ipa-utils.cc > > +++ b/gcc/ipa-utils.cc > [...] > > +bitmap > > +find_always_executed_bbs (function *fun, bool assume_

Re: [PATCH] tree-optimization/108449 - keep maybe_special_function_p behavior

2023-01-21 Thread Jan Hubicka via Gcc-patches
> When we have a static declaration without definition we diagnose > that and turn it into an extern declaration. That can alter > the outcome of maybe_special_function_p here and there's really > no point in doing that, so don't. > > Bootstrapped and tested on x86_64-unknown-linux-gnu, OK? > >

Re: [PATCH] lto: pass through -funwind-tables and -fasynchronous-unwind-tables

2023-01-18 Thread Jan Hubicka via Gcc-patches
> On Jan 18 2023, Michael Matz wrote: > > > The purest solution is to emit unwind tables for all functions that > > request it into .eh_frame and for those that don't request it put > > into .debug_frame (if also -g is on). > > The assembler does not allow switching back to .eh_frame once a >

Re: [PATCH] IPA: do not release body if still needed

2023-01-18 Thread Jan Hubicka via Gcc-patches
> The code removing function bodies when the last call graph clone of a > node is removed is too aggressive when there are nodes up the > clone_of chain which still need them. Fixed by expanding the check. > > gcc/ChangeLog: > > 2023-01-18 Martin Jambor > > PR ipa/107944 > *

Re: [PATCH] middle-end/106075 - non-call EH and DSE

2023-01-18 Thread Jan Hubicka via Gcc-patches
> On Tue, 17 Jan 2023, Jan Hubicka wrote: > > > > > We don't use same argumentation about other control flow statements. > > > > The following: > > > > > > > > fn() > > > > { > > > > try { > > > >

Re: [PATCH] lto: pass through -funwind-tables and -fasynchronous-unwind-tables

2023-01-18 Thread Jan Hubicka via Gcc-patches
> No unwind tables are generated, as if -funwind-tables is ignored. If > LTO is disabled, everything works as expected. I think it is because dwarf2out_do_eh_frame is called out of function context at the end of compilation. At that time cfun is NULL and the flag is read from global settings that

Re: [PATCH] middle-end/106075 - non-call EH and DSE

2023-01-17 Thread Jan Hubicka via Gcc-patches
> > We don't use same argumentation about other control flow statements. > > The following: > > > > fn() > > { > > try { > > i_read_no_global_memory (); > > } catch (...) > > { > > return 1; > > } > > return 0; > > } > > > > should be detected as const. Marking throw pure

Re: [PATCH] middle-end/106075 - non-call EH and DSE

2023-01-17 Thread Jan Hubicka via Gcc-patches
> On Tue, 17 Jan 2023, Jan Hubicka wrote: > > > > The following fixes a long-standing bug with DSE removing stores as > > > dead even though they are live across non-call exceptional flow. > > > This affects both GIMPLE and RTL DSE and the fix is similar in

Re: [PATCH] middle-end/106075 - non-call EH and DSE

2023-01-17 Thread Jan Hubicka via Gcc-patches
> The following fixes a long-standing bug with DSE removing stores as > dead even though they are live across non-call exceptional flow. > This affects both GIMPLE and RTL DSE and the fix is similar in > making externally throwing statements uses of non-local stores. > Note this doesn't fix the

Fix wrong code issues with ipa-sra

2023-01-16 Thread Jan Hubicka via Gcc-patches
. * ipa-utils.h (stmt_may_terminate_function_p): Declare. (find_always_executed_bbs): Declare. gcc/testsuite/ChangeLog: 2023-01-16 Jan Hubicka * g++.dg/tree-ssa/pr106077.C: New test. diff --git a/gcc/ipa-modref.cc b/gcc/ipa-modref.cc index 16e8bfb97a6..e3196df8aa9

Re: [PATCH] IPA: do not release body if still needed

2023-01-14 Thread Jan Hubicka via Gcc-patches
> Hi. > > Noticed during building of libbackend.a with the LTO partial linking. > > The function release_body is called even if clone_of is a clone > of a another function and thus it shares tree declaration. We should > preserve it in that situation. > > Patch can bootstrap on x86_64-linux-gnu

More znver4 x86-tune flags

2023-01-09 Thread Jan Hubicka via Gcc-patches
Hi, this patch adds more tunes for zen4: - new tunes for avx512 scatter instructions. In micro benchmarks these seem to be a consistent loss compared to open-coded code - disable use of gather for zen4 While these are a win for micro benchmarks (based on TSVC), enabling gather is a loss for

Avoid quadratic behaviour of symbol renaming

2023-01-04 Thread Jan Hubicka via Gcc-patches
of inline clones that is very slow and unnecessary since their bodies are never streamed. Bootstrapped/regtested x86_64-linux, committed. gcc/lto/ChangeLog: 2023-01-04 Jan Hubicka * lto-partition.cc (may_need_named_section_p): Clones with no body need no renaming. diff --git a/g

Re: [PATCH][X86_64] Separate znver4 insn reservations from older znvers

2023-01-03 Thread Jan Hubicka via Gcc-patches
> > On Tue, 3 Jan 2023, Jan Hubicka wrote: > > > > * gcc/common/config/i386/i386-common.cc (processor_alias_table): > > > Use CPU_ZNVER4 for znver4. > > > * config/i386/i386.md: Add znver4.md. > > > * config/i386/znver4.md: New. > > OK

Re: [PATCH][X86_64] Separate znver4 insn reservations from older znvers

2023-01-03 Thread Jan Hubicka via Gcc-patches
> [Public] > > Hello, > > I have addressed all your comments in this revision of the patch, please find > attached and inlined. > > * I have updated all the latencies with Agner's measurements. > * Incorrect pipelines, loads/stores are addressed. > * The double pumped avx512 insns take one

Make -fwhole-program to work with incremental LTO linking

2022-12-21 Thread Jan Hubicka via Gcc-patches
the incremental link is de-facto final binary and only some explicitly marked symbols need to remain. Bootstrapped/regtested x86_64-linux, committed. gcc/ChangeLog: 2022-12-21 Jan Hubicka * doc/invoke.texi: Fix documentation of -fwhole-program with LTO and document behaviour

Re: [PATCH] ipa-sra: Consider the first parameter of methods safe to dereference

2022-12-15 Thread Jan Hubicka via Gcc-patches
> On Wed, Dec 14, 2022 at 4:20 PM Jan Hubicka via Gcc-patches > wrote: > > > > > Hi, > > > > > > Honza requested this after reviewing the patch that taught IPA-SRA > > > that REFERENCE_TYPEs are always non-NULL that the pass also handles >

Re: [PATCH] ipa-sra: Fix address escape case when detecting Fortran descriptors

2022-12-14 Thread Jan Hubicka via Gcc-patches
> Hi, > > The discussion about scan_expr_access in ipa-sra.cc brought my > attention to a missing case of handling an ADDR_EXPR. As the added > testcase shows, the heuristics which looks for parameters which are > local variables that are only written to and passed by reference in > calls can

Re: [PATCH] ipa-sra: Consider the first parameter of methods safe to dereference

2022-12-14 Thread Jan Hubicka via Gcc-patches
> Hi, > > Honza requested this after reviewing the patch that taught IPA-SRA > that REFERENCE_TYPEs are always non-NULL that the pass also handles > the first parameters of methods, this pointers, in the same way. So > this patch does that. > > The patch is undergoing bootstrap and testing on

Re: PING^1 [PATCH v2] predict: Adjust optimize_function_for_size_p [PR105818]

2022-12-14 Thread Jan Hubicka via Gcc-patches
> > PR middle-end/105818 > > > > gcc/ChangeLog: > > > > * predict.cc (optimize_function_for_size_p): Further check > > optimize_size of fun->decl when it is valid but no cgraph node. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.target/powerpc/pr105818.c: New test. > > *

Re: [PATCH 8/9] ipa-sra: Make scan_expr_access bail out on uninteresting expressions

2022-12-12 Thread Jan Hubicka via Gcc-patches
> > Hi, > > > > I'm re-posting patches which I have posted at the end of stage 1 but > > which have not passed review yet. > > > > 8< > > > > I have noticed that scan_expr_access passes all the expressions it > > gets to

Re: [PATCH 9/9] ipa: Avoid looking for IPA-SRA replacements where there are none

2022-12-12 Thread Jan Hubicka via Gcc-patches
> Hi, > > I'm re-posting patches which I have posted at the end of stage 1 but > which have not passed review yet. > > 8< > > While modifying the code, I realized that we do look into statements > even when there are no

Re: [PATCH 8/9] ipa-sra: Make scan_expr_access bail out on uninteresting expressions

2022-12-12 Thread Jan Hubicka via Gcc-patches
> Hi, > > I'm re-posting patches which I have posted at the end of stage 1 but > which have not passed review yet. > > 8< > > I have noticed that scan_expr_access passes all the expressions it > gets to get_ref_base_and_extent

Re: [PATCH 7/9] ipa-sra: Forward propagation of sizes which are safe to dereference

2022-12-12 Thread Jan Hubicka via Gcc-patches
> gcc/ChangeLog: > > 2022-11-11 Martin Jambor > > * ipa-sra.cc (isra_param_desc): New fields safe_size, > conditionally_dereferenceable and safe_size_set. > (struct gensum_param_desc): New field conditionally_dereferenceable. > (struct isra_param_flow): Updated comment

Re: [PATCH 6/9] ipa-sra: Be optimistic about Fortran descriptors

2022-12-12 Thread Jan Hubicka via Gcc-patches
> Hi, > > I'm re-posting patches which I have posted at the end of stage 1 but > which have not passed review yet. > > 8< > > Fortran descriptors are structures which are often constructed just > for a particular argument of a

Re: [PATCH 5/9] ipa-sra: Move caller->callee propagation before callee->caller one

2022-12-12 Thread Jan Hubicka via Gcc-patches
> gcc/ChangeLog: > > 2022-11-11 Martin Jambor > > * ipa-sra.c (ipa_sra_analysis): Move top-down analysis before > bottom-up analysis. Replace FOR_EACH_VEC_ELT with C++11 iteration. > > gcc/testsuite/ChangeLog: > > 2021-12-14 Martin Jambor > > *

Re: [PATCH 4/9] ipa-sra: Treat REFERENCE_TYPES as always dereferencable

2022-12-12 Thread Jan Hubicka via Gcc-patches
> Hi, > > I'm re-posting patches which I have posted at the end of stage 1 but > which have not passed review yet. > > 8< > > C++ and especially Fortran pass data by references which are not > pointers potentially pointing

Re: [PATCH 3/9] ipa-cp: Leave removal of unused parameters to IPA-SRA

2022-12-12 Thread Jan Hubicka via Gcc-patches
> Hi, > > I'm re-posting patches which I have posted at the end of stage 1 but > which have not passed review yet. > > 8< > > Looking at some benchmarks I have noticed many cases when IPA-CP > cloned a function for all contexts

Re: [PATCH 1/9] ipa-cp: Write transformation summaries of all functions

2022-12-12 Thread Jan Hubicka via Gcc-patches
> gcc/ChangeLog: > > 2022-11-25 Martin Jambor > > * ipa-prop.cc (useful_ipcp_transformation_info_p): New function. > (write_ipcp_transformation_info): Added a parameter, simplified > given that is known not to be NULL. > (ipcp_write_transformation_summaries): Write out

Re: [PATCH 2/9] ipa: Better way of applying both IPA-CP and IPA-SRA (PR 103227)

2022-12-12 Thread Jan Hubicka via Gcc-patches
> 2022-11-11 Martin Jambor > > PR ipa/103227 > * ipa-param-manipulation.h (class ipa_param_adjustments): Removed > member function get_updated_index_or_split. > (class ipa_param_body_adjustments): New overload of > register_replacement, new member function

Re: [PATCH 1/9] ipa-cp: Write transformation summaries of all functions

2022-12-12 Thread Jan Hubicka via Gcc-patches
> -void > -write_ipcp_transformation_info (output_block *ob, cgraph_node *node) > +/* Return true if the IPA-CP transformation summary TS is non-NULL and > contains > + useful info. */ > +static bool > +useful_ipcp_transformation_info_p (ipcp_transformation *ts) > { > - int node_ref; > -

Re: Zen4 tuning part 1 - cost tables

2022-12-06 Thread Jan Hubicka via Gcc-patches
> > - COSTS_N_INSNS (5), /* cost of FADD and FSUB insns. */ > > - COSTS_N_INSNS (5), /* cost of FMUL instruction. */ > > + COSTS_N_INSNS (7), /* cost of FADD and FSUB insns. */ > > + COSTS_N_INSNS (7), /* cost of FMUL
