Fix points_to_local_or_readonly_memory_p wrt TARGET_MEM_REF

2024-05-16 Thread Jan Hubicka
Hi, TARGET_MEM_REF can be used to offset a constant base into a memory object (to produce an lea instruction). This confuses points_to_local_or_readonly_memory_p, which treats the constant address as a base of the access. Bootstrapped/regtested x86_64-linux, committed. Honza gcc/ChangeLog: PR

[gcc r15-581] Fix points_to_local_or_readonly_memory_p wrt TARGET_MEM_REF

2024-05-16 Thread Jan Hubicka via Gcc-cvs
https://gcc.gnu.org/g:96d53252aefcbc2fe419c4c3b4bcd3fc03d4d187 commit r15-581-g96d53252aefcbc2fe419c4c3b4bcd3fc03d4d187 Author: Jan Hubicka Date: Thu May 16 15:33:55 2024 +0200 Fix points_to_local_or_readonly_memory_p wrt TARGET_MEM_REF TARGET_MEM_REF can be used to offset

[gcc r15-512] Avoid pointer compares on TYPE_MAIN_VARIANT in TBAA

2024-05-15 Thread Jan Hubicka via Gcc-cvs
https://gcc.gnu.org/g:9b7cad5884f21cc5783075be0043777448db3fab commit r15-512-g9b7cad5884f21cc5783075be0043777448db3fab Author: Jan Hubicka Date: Wed May 15 14:14:27 2024 +0200 Avoid pointer compares on TYPE_MAIN_VARIANT in TBAA while building more testcases for ipa-icf I noticed

Re: [Bug libstdc++/109442] Dead local copy of std::vector not removed from function

2024-05-14 Thread Jan Hubicka via Gcc-bugs
This patch attempts to add __builtin_operator_new/delete. So far they are not optimized, which will need to be done by extra flag of BUILT_IN_ code. Also the decl.cc code can be refactored to be less of cut and paste, and I guess the has_builtin hack to return proper value needs to be moved to C++ FE. However

Re: [PATCH 6/7] lto: squash order of symbols in partitions

2024-05-14 Thread Jan Hubicka
> This patch squashes order of symbols in individual partitions, so that > their relative order is conserved, but is not influenced by symbols in > other partitions. > Order of cloned symbols is set to 0. This should be fine because order > specifies order of symbols in input files, which cloned

Re: [PATCH 7/7] lto: partition specific lto_clone_numbers

2024-05-14 Thread Jan Hubicka
> Replaces "lto_priv.$clone_number" by > "lto_priv.$partition_hash.$partition_specific_clone_number". > To reduce divergence for incremental LTO. > > Bootstrapped/regtested on x86_64-pc-linux-gnu OK, thanks! Honza > > gcc/lto/ChangeLog: > > * lto-partition.cc
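A rough illustration of the naming scheme quoted above: the sketch below builds names of the form `lto_priv.$partition_hash.$partition_specific_clone_number`. The struct, the function name, and the hexadecimal rendering of the hash are assumptions made for this example only; they are not taken from lto-partition.cc.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-partition state; GCC's real data structures differ.  */
struct partition_state
{
  unsigned hash;        /* stable hash identifying the partition */
  unsigned next_clone;  /* counter private to this partition */
};

/* Write "<base>.lto_priv.<hash>.<n>" into BUF and bump the partition's
   own counter.  Returns the number of characters written, excluding the
   terminating NUL.  */
int
make_private_name (struct partition_state *p, const char *base,
                   char *buf, size_t size)
{
  assert (p && base && buf && size > 0);
  assert (strlen (base) > 0);
  int len = snprintf (buf, size, "%s.lto_priv.%x.%u",
                      base, p->hash, p->next_clone);
  assert (len >= 0 && (size_t) len < size);
  p->next_clone++;
  return len;
}
```

Because each partition carries its own counter, adding a clone to one partition does not renumber names in another, which is the property the quoted patch description relies on to reduce divergence for incremental LTO.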

Re: [PATCH 5/7] lto: Implement cache partitioning

2024-05-14 Thread Jan Hubicka
> gcc/ChangeLog: > > * common.opt: Add cache partitioning. > * flag-types.h (enum lto_partition_model): Likewise. > > gcc/lto/ChangeLog: > > * lto-partition.cc (new_partition): Use new_partition_no_push. > (new_partition_no_push): New. > (free_ltrans_partition):

Re: [PATCH 4/7] lto: Implement ltrans cache

2024-05-14 Thread Jan Hubicka
> This patch implements Incremental LTO as ltrans cache. > > The cache is active when directory $GCC_LTRANS_CACHE is specified and exists. > Stored are pairs of ltrans input/output files and input file hash. > File locking is used to allow multiple GCC instances to use the same cache. > >

Re: Fwd: [PATCH 2/7 v2] lto: Remove random_seed from section name.

2024-05-14 Thread Jan Hubicka
> This patch removes suffixes from section names during LTO linking. > > These suffixes were originally added for ld -r to work (PR lto/44992). > They were added to all LTO object files, but are only useful before WPA. > After that they waste space, and if kept random, make LTO caching >

[gcc r15-482] Reduce recursive inlining of always_inline functions

2024-05-14 Thread Jan Hubicka via Gcc-cvs
https://gcc.gnu.org/g:1ec49897253e093e1ef6261eb104ac0c111bac83 commit r15-482-g1ec49897253e093e1ef6261eb104ac0c111bac83 Author: Jan Hubicka Date: Tue May 14 12:58:56 2024 +0200 Reduce recursive inlining of always_inline functions this patch tames down inliner on (multiply) self

Avoid TYPE_MAIN_VARIANT compares in TBAA

2024-05-14 Thread Jan Hubicka
Hi, while building more testcases for ipa-icf I noticed that there are two places in aliasing code where we still compare TYPE_MAIN_VARIANT for pointer equality. This is not a good idea for LTO since type merging may not happen, for example when in one unit the pointed-to type is forward declared while

GNU Tools Cauldron 2024

2024-05-09 Thread Jan Hubicka via Gcc
Hello, we are pleased to invite you all to the next GNU Tools Cauldron, taking place in Prague, Czech Republic, on September 14-16, 2024. As for the previous instances, we have set up a wiki page for details: https://gcc.gnu.org/wiki/cauldron2024 Like last year, we are having to charge

Re: [PATCH] [RFC] Add function filtering to gcov

2024-05-08 Thread Jan Hubicka
> > > > For JSON output I suppose there's a way to "grep" without the line oriented > > issue? I suppose we could make the JSON more hierarchical by adding > > an outer function object? > > Absolutely, yes, this is much less useful for JSON. The filtering works, > which may be occasionally

gcc-wwwdocs branch master updated. 3aee4b0adcb86280ecdaec41447e7ff4f8d8c0a7

2024-05-07 Thread Jan Hubicka via Gcc-cvs-wwwdocs
--- commit 3aee4b0adcb86280ecdaec41447e7ff4f8d8c0a7 Author: Jan Hubicka Date: Tue May 7 15:16:35 2024 +0200 Add GNU Tools Cauldron 2024 diff --git a/htdocs/index.html b/htdocs/index.html index aa4683da..de5cca7b 100644 --- a/htdocs/index.html +++ b/htdocs/index.html @@ -54,6 +54,9 @@ mission statement. New

[wwwdocs] Add some more stuff into GCC14 changes.html

2024-05-07 Thread Jan Hubicka
Hi, I realize that I am late for the release (sorry for that). But here are few things which I think may be added to changes.html at least for those who will look later. OK? Honza diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html index ca5174de..b390db51 100644 ---

[wwwdocs] Add Cauldron2024

2024-05-07 Thread Jan Hubicka
Hi, this adds Cauldron2024 to main page. OK? diff --git a/htdocs/index.html b/htdocs/index.html index aa4683da..de5cca7b 100644 --- a/htdocs/index.html +++ b/htdocs/index.html @@ -54,6 +54,9 @@ mission statement. News +https://gcc.gnu.org/wiki/cauldron2024;>GNU Tools Cauldron 2024 +

Re: How stable is the CFG and basic block IDs?

2024-04-30 Thread Jan Hubicka via Gcc
> > The problem is testing. If gcc would re-number the basic blocks then > > tests comparing hard-coded test paths would break, even though the path > > coverage itself would be just fine (and presumably the change to the > > basic block indices), which would add an unreasonable maintenance > >

[gcc r14-10093] Remove repeated information in -ftree-loop-distribute-patterns doc

2024-04-23 Thread Jan Hubicka via Gcc-cvs
https://gcc.gnu.org/g:6f0a646dd2fc59e9c9cde63718b36085f84a19ba commit r14-10093-g6f0a646dd2fc59e9c9cde63718b36085f84a19ba Author: Jan Hubicka Date: Tue Apr 23 15:51:42 2024 +0200 Remove repeated information in -ftree-loop-distribute-patterns doc We have: -ftree

Fix documentation of -ftree-loop-distibutive-patterns

2024-04-23 Thread Jan Hubicka
Hi, we have: -ftree-loop-distribute-patterns Perform loop distribution of patterns that can be code generated with calls to a library. This flag is enabled by default at -O2 and higher, and by -fprofile-use and -fauto-profile. This pass distributes the

Re: [PATCH] c++: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208]

2024-04-17 Thread Jan Hubicka
> > Ah, you're right. > If I compile (the one line modified) pr113208_0.C with > -O -fno-early-inlining -fdisable-ipa-inline -std=c++20 > it does have just _ZN6vectorI12QualityValueEC2ERKS1_ in > _ZN6vectorI12QualityValueEC2ERKS1_ > comdat and no _ZN6vectorI12QualityValueEC1ERKS1_ > and

Re: [PATCH] c++: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208]

2024-04-17 Thread Jan Hubicka
> > I've tried to see what actually happens during linking without LTO, so > compiled > pr113208_0.C with -O1 -fkeep-inline-functions -std=c++20 with vanilla trunk > (so it has those 2 separate comdats, one for C2 and one for C1), though I've > changed the > void m(k); > line to >

Re: [PATCH] c++: Retry the aliasing of base/complete cdtor optimization at import_export_decl time [PR113208]

2024-04-17 Thread Jan Hubicka
> Hi! Hello, > The reason the optimization doesn't trigger when the constructor is > constexpr is that expand_or_defer_fn is called in that case much earlier > than when it is not constexpr; in the former case it is called when we > try to constant evaluate that constructor. But

Re: [PATCH] lto/114655 - -flto=4 at link time doesn't override -flto=auto at compile time

2024-04-09 Thread Jan Hubicka
> The following adjusts -flto option processing in lto-wrapper to have > link-time -flto override any compile time setting. > > LTO-bootstrapped on x86_64-unknown-linux-gnu, testing in progress. > > OK for trunk and branches? GCC 11 seems to be unaffected by this. > > Thanks, > Richard. > >

Re: [Bug ipa/113907] [11/12/13/14 regression] ICU miscompiled on x86 since r14-5109-ga291237b628f41

2024-04-09 Thread Jan Hubicka via Gcc-bugs
There is still a problem with loop bounds. I am testing a patch for that and then we should be (finally) safe.

Re: [PATCH 3/3] tree-optimization/114052 - niter analysis from undefined behavior

2024-04-05 Thread Jan Hubicka
> + /* When there's a call that might not return the last iteration > + is possibly partial. This matches what we check in invariant > + motion. > + ??? For the call argument evaluation it would be still OK. */ > + if (!may_have_exited > + &&

Re: [PATCH] ipa: Force args obtined through pass-through maps to the expected type (PR 113964)

2024-04-05 Thread Jan Hubicka
> Hi, > > interactions of IPA-CP and IPA-SRA on the same data is a rather big > source of issues, I'm afraid. PR 113964 is a situation where IPA-CP > propagates an unsigned short in a union parameter into a function > which itself calls a different function which has a same union > parameter and

Re: [PATCH] ipa: Avoid duplicate replacements in IPA-SRA transformation phase

2024-04-04 Thread Jan Hubicka
> > gcc/ChangeLog: > > > > 2024-03-15 Martin Jambor > > > > PR ipa/111571 > > * ipa-param-manipulation.cc > > (ipa_param_body_adjustments::common_initialization): Avoid creating > > duplicate replacement entries. > > > > gcc/testsuite/ChangeLog: > > > > 2024-03-15 Martin Jambor

Re: [PATCH v10 1/2] Add condition coverage (MC/DC)

2024-04-04 Thread Jan Hubicka
> > > diff --git a/gcc/ipa-inline.cc b/gcc/ipa-inline.cc > > > index dc120e6da5a..9502a21c741 100644 > > > --- a/gcc/ipa-inline.cc > > > +++ b/gcc/ipa-inline.cc > > > @@ -682,7 +682,7 @@ can_early_inline_edge_p (struct cgraph_edge *e) > > > } > > > gcc_assert (gimple_in_ssa_p

Re: [PATCH v10 1/2] Add condition coverage (MC/DC)

2024-04-04 Thread Jan Hubicka
> gcc/ChangeLog: > > * builtins.cc (expand_builtin_fork_or_exec): Check > condition_coverage_flag. > * collect2.cc (main): Add -fno-condition-coverage to OBSTACK. > * common.opt: Add new options -fcondition-coverage and > -Wcoverage-too-many-conditions. > *

Re: [PATCH v3] tree-profile: Disable indirect call profiling for IFUNC resolvers

2024-04-03 Thread Jan Hubicka
> We can't profile indirect calls to IFUNC resolvers nor their callees as > it requires TLS which hasn't been set up yet when the dynamic linker is > resolving IFUNC symbols. > > Add an IFUNC resolver caller marker to cgraph_node and set it if the > function is called by an IFUNC resolver.

Re: PING: [PATCH v2] tree-profile: Don't instrument an IFUNC resolver nor its callees

2024-04-02 Thread Jan Hubicka
> > I am bit worried about commonly used functions getting "infected" by > > being called once from ifunc resolver. I think we only use thread local > > storage for indirect call profiling, so we may just disable indirect > > call profiling for these functions. > > Will change it. > > > Also

Re: PING: [PATCH v2] tree-profile: Don't instrument an IFUNC resolver nor its callees

2024-04-02 Thread Jan Hubicka
> On Tue, Mar 5, 2024 at 1:45 PM H.J. Lu wrote: > > > > We can't instrument an IFUNC resolver nor its callees as it may require > > TLS which hasn't been set up yet when the dynamic linker is resolving > > IFUNC symbols. > > > > Add an IFUNC resolver caller marker to cgraph_node and set it if the

Re: [PATCH] profile-count: Avoid overflows into uninitialized [PR112303]

2024-03-28 Thread Jan Hubicka
Hi, so what goes wrong with the testcase is the fact that after recursive inlining we have a large loop nest and consequently an invalid profile, since every loop is predicted to iterate quite a lot. rebuild_frequencies should take care of the problem, but it doesn't since there is: if (freq_max <

Re: [PATCH] profile-count: Avoid overflows into uninitialized [PR112303]

2024-03-28 Thread Jan Hubicka
> __attribute__((noipa)) void > bar (void) > { > __builtin_exit (0); > } > > __attribute__((noipa)) void > foo (void) > { > for (int i = 0; i < 1000; ++i) > for (int j = 0; j < 1000; ++j) > for (int k = 0; k < 1000; ++k) > for (int l = 0; l < 1000; ++l) > for (int m = 0; m < 1000;

[gcc r14-9705] Hash operands of PHI in ipa-icf

2024-03-28 Thread Jan Hubicka via Gcc-cvs
https://gcc.gnu.org/g:0923fe2d4808c16b72c1d1bfe28220dd326d8b76 commit r14-9705-g0923fe2d4808c16b72c1d1bfe28220dd326d8b76 Author: Jan Hubicka Date: Thu Mar 28 13:24:54 2024 +0100 Hash operands of PHI in ipa-icf This patch fixes cache collision on function whose body differs only

[gcc r14-9516] Add missing config/i386/zn4zn5.md file

2024-03-18 Thread Jan Hubicka via Gcc-cvs
https://gcc.gnu.org/g:dfc9d1cc8353bdd7fbc37bc10bb3fd40f49fa4af commit r14-9516-gdfc9d1cc8353bdd7fbc37bc10bb3fd40f49fa4af Author: Jan Hubicka Date: Mon Mar 18 14:24:10 2024 +0100 Add missing config/i386/zn4zn5.md file gcc/ChangeLog: * config/i386/zn4zn5.md: Add

Re: RE: [PATCH] [X86_64]: Enable support for next generation AMD Zen5 CPU with znver5 scheduler Model

2024-03-18 Thread Jan Hubicka
> Hello, > > On 22/02/2024 at 19:29, Anbazhagan, Karthiban wrote: > (...) > > gcc/config/i386/{znver4.md => zn4zn5.md} | 858 +- > > looks like the patch pushed to master lost the file rename. > I get a bootstrap failure caused by the missing zn4zn5.md file. > > Can you

[gcc r14-9515] Add AMD znver5 processor enablement with scheduler model

2024-03-18 Thread Jan Hubicka via Gcc-cvs
https://gcc.gnu.org/g:d0aa0af9a9b7dd709a8c7ff6604ed6b7da0fc23a commit r14-9515-gd0aa0af9a9b7dd709a8c7ff6604ed6b7da0fc23a Author: Jan Hubicka Date: Mon Mar 18 10:22:44 2024 +0100 Add AMD znver5 processor enablement with scheduler model 2024-02-14 Jan Hubicka

Re: [PATCH] ipa: Fix C++ member ptr indirect inlining (PR 114254, PR 108802)

2024-03-18 Thread Jan Hubicka
> gcc/ChangeLog: > > 2024-03-06 Martin Jambor > > PR ipa/108802 > PR ipa/114254 > * ipa-prop.cc (ipa_get_stmt_member_ptr_load_param): Fix case looking > at COMPONENT_REFs directly from a PARM_DECL, also recognize loads from > a pointer parameter. >

Re: Patch ping Re: [PATCH] icf: Reset SSA_NAME_{PTR,RANGE}_INFO in successfully merged functions [PR113907]

2024-03-15 Thread Jan Hubicka
> > + if (POINTER_TYPE_P (TREE_TYPE (t1))) > > +{ > > + if (SSA_NAME_PTR_INFO (t1)) > > + { > > + if (!SSA_NAME_PTR_INFO (t2) > > + || SSA_NAME_PTR_INFO (t1)->align != SSA_NAME_PTR_INFO (t2)->align > > + || SSA_NAME_PTR_INFO (t1)->misalign != SSA_NAME_PTR_INFO > >

Re: [PATCH] Fix PR ipa/113996

2024-03-14 Thread Jan Hubicka
> > Patch is still OK, but ipa-ICF will only identify the functions if > > static chain is unused. Perhaps just picking the winning candidate to be > > version without static chain and making ipa-inline to not ICE when calls > > with static chain lands to function with no static chain would help

Re: Patch ping Re: [PATCH] icf: Reset SSA_NAME_{PTR,RANGE}_INFO in successfully merged functions [PR113907]

2024-03-14 Thread Jan Hubicka
> > Otherwise > > I will add your testcase for this patch and commit this one. > > Statistically we almost never merge functions with different value > > ranges (three in testsuite, 0 during bootstrap, 1 during LTO bootstrap > > and probably few in LLVM build - there are 15 cases reported, but

Re: Patch ping Re: [PATCH] icf: Reset SSA_NAME_{PTR,RANGE}_INFO in successfully merged functions [PR113907]

2024-03-14 Thread Jan Hubicka
> > We have wrong code with LTO, too. > > I know. > > > The problem is that IPA passes (and > > not only that, loop analysis too) does analysis at compile time (with > > value numbers in) and streams the info separately. > > And that is desirable, because otherwise it simply couldn't derive any

Re: Patch ping Re: [PATCH] icf: Reset SSA_NAME_{PTR,RANGE}_INFO in successfully merged functions [PR113907]

2024-03-14 Thread Jan Hubicka
> > int test (int a) > > { > > return a>0 ? CST1: CST2; > > } > > > > gets same hash value no matter what CST1/CST2 is. I added hasher and I > > am re-running stats. > > The hash should be commutative here at least. It needs to match what comparator is doing later, and sadly it does not try
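The collision described above can be pictured with a toy model. The sketch below is not the ipa-icf hasher: `struct toy_phi`, both hash functions, and the mixing routine are hypothetical, and the order-independent combine is only one way to read the "commutative" remark in the quoted reply. It shows a hash that ignores the PHI operands, so every `CST1`/`CST2` pair collides, next to one that folds the operands in.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of "return a > 0 ? CST1 : CST2;": only the two PHI
   operands vary between otherwise identical function bodies.  */
struct toy_phi
{
  int64_t arg[2];
};

/* Invertible 64-bit mixer (the splitmix64 finalizer).  */
uint64_t
mix64 (uint64_t x)
{
  x ^= x >> 30;
  x *= 0xbf58476d1ce4e5b9ULL;
  x ^= x >> 27;
  x *= 0x94d049bb133111ebULL;
  x ^= x >> 31;
  return x;
}

/* Hash that looks only at the operand count: all toy_phi values collide.  */
uint64_t
hash_ignoring_operands (const struct toy_phi *p)
{
  assert (p);
  return mix64 (2);
}

/* Hash that also folds in the operands.  Addition makes the result
   independent of operand order.  */
uint64_t
hash_with_operands (const struct toy_phi *p)
{
  assert (p);
  uint64_t h = mix64 (2);
  h += mix64 ((uint64_t) p->arg[0]) + mix64 ((uint64_t) p->arg[1]);
  return h;
}
```

Whatever combine is chosen, the hash has to agree with the equality test applied afterwards: bodies the comparator treats as equal must hash equally, which is the constraint the reply points at.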

Re: Patch ping Re: [PATCH] icf: Reset SSA_NAME_{PTR,RANGE}_INFO in successfully merged functions [PR113907]

2024-03-13 Thread Jan Hubicka
> On Wed, Mar 13, 2024 at 10:55:07AM +0100, Jan Hubicka wrote: > > > > So the ipa_jump_func are I think the only thing that actually can differ > > > > on the ICF merging candidates from value range POV. > > > > > > I agree. Btw, I would have app

Re: Patch ping Re: [PATCH] icf: Reset SSA_NAME_{PTR,RANGE}_INFO in successfully merged functions [PR113907]

2024-03-13 Thread Jan Hubicka
> On Tue, 12 Mar 2024, Jakub Jelinek wrote: > > > On Tue, Mar 12, 2024 at 05:21:58PM +0100, Jakub Jelinek wrote: > > > On Tue, Mar 12, 2024 at 10:46:42AM +0100, Jan Hubicka wrote: > > > > I am sorry for delaying this. I made the variant that simply compares

Re: [PATCH] Fix PR ipa/113996

2024-03-12 Thread Jan Hubicka
> > > On 3/11/24 4:38 PM, Eric Botcazou wrote: > > Hi, > > > > this is a regression present on all active branches: the attached Ada > > testcase > > triggers an assertion failure when compiled with -O2 -gnatp -flto: > > > >/* Initialize the static chain. */ > >p =

Re: Patch ping Re: [PATCH] icf: Reset SSA_NAME_{PTR,RANGE}_INFO in successfully merged functions [PR113907]

2024-03-12 Thread Jan Hubicka
> Hi! Hi, > > On Thu, Feb 15, 2024 at 08:29:24AM +0100, Jakub Jelinek wrote: > > 2024-02-15 Jakub Jelinek > > > > PR middle-end/113907 > > * ipa-icf.cc (sem_item_optimizer::merge_classes): Reset > > SSA_NAME_RANGE_INFO and SSA_NAME_PTR_INFO on successfully ICF merged > >

Re: [PATCH] [X86_64]: Enable support for next generation AMD Zen5 CPU with znver5 scheduler Model

2024-03-11 Thread Jan Hubicka
> [Public] > > > Hi all, > > > > PFA, the patch that enables support for the next generation AMD Zen5 CPU via > -march=znver5 with basic znver5 scheduler Model. > > We may update the scheduler model going forward. > > > > Good for trunk? > > Thanks and Regards > Karthiban > > > Patch

Re: [PATCH] ipa: Avoid excessive removing of SSAs (PR 113757)

2024-03-07 Thread Jan Hubicka
> On Thu, Feb 08 2024, Martin Jambor wrote: > > Hi, > > > > PR 113757 shows that the code which was meant to debug-reset and > > remove SSAs defined by LHSs of calls redirected to > > __builtin_unreachable can trigger also when speculative > > devirtualization creates a call to a noreturn function

Re: [Bug ipa/114262] Over-inlining when optimizing for size with gnu_inline function

2024-03-07 Thread Jan Hubicka via Gcc-bugs
> Note GCC has not retuned its -Os heuristics for a long time because it has been > decent enough for most folks and corner cases like this almost never come > up. There were quite a few changes to -Os heuristics :) One of the bigger challenges is that we do see more and more C++ code built with -Os

Re: [Bug target/114232] [14 regression] ICE when building rr-5.7.0 with LTO on x86

2024-03-05 Thread Jan Hubicka via Gcc-bugs
Looking at the prototype patch, why is it necessary to also change the splitters? My original goal was to use splitters to expand to faster code sequences while having patterns necessary for both variants. This makes it possible to use optimize_insn_for_size/speed and make decisions using BB profile, since

Re: [PATCH] i386: Guard noreturn no-callee-saved-registers optimization with -mnoreturn-no-callee-saved-registers [PR38534]

2024-02-29 Thread Jan Hubicka
> On Thu, Feb 29, 2024 at 03:15:30PM +0100, Jan Hubicka wrote: > > I am not wed to the idea (just it appeared to me as an option to > > disabling this optimization by default). I still think it may make sense. > > Maybe I misunderstood your idea. > So, you are ba

Re: [PATCH] tree-profile: Don't instrument an IFUNC resolver nor its callees

2024-02-29 Thread Jan Hubicka
> > I am worried about scenario where ifunc selector calls function foo > > defined locally and foo is also used from other places possibly in hot > > loops. > > > > > > > So it is not really reliable fix (though I guess it will work a lot of > > > > common code). I wonder what would be

Re: [PATCH] tree-profile: Don't instrument an IFUNC resolver nor its callees

2024-02-29 Thread Jan Hubicka
> On Thu, Feb 29, 2024 at 5:39 AM Jan Hubicka wrote: > > > > > We can't instrument an IFUNC resolver nor its callees as it may require > > > TLS which hasn't been set up yet when the dynamic linker is resolving > > > IFUNC symbols. Add an IFUNC r

Re: [PATCH] i386: Guard noreturn no-callee-saved-registers optimization with -mnoreturn-no-callee-saved-registers [PR38534]

2024-02-29 Thread Jan Hubicka
> On Thu, Feb 29, 2024 at 02:31:05PM +0100, Jan Hubicka wrote: > > I agree that debugability of user core dumps is important here. > > > > I guess an ideal solution would be to change codegen of noreturn functions > > to callee save all registers. Performance of prolog

Re: [PATCH] tree-profile: Don't instrument an IFUNC resolver nor its callees

2024-02-29 Thread Jan Hubicka
> We can't instrument an IFUNC resolver nor its callees as it may require > TLS which hasn't been set up yet when the dynamic linker is resolving > IFUNC symbols. Add an IFUNC resolver caller marker to symtab_node to > avoid recursive checking. > > gcc/ChangeLog: > > PR

Re: [PATCH] i386: Guard noreturn no-callee-saved-registers optimization with -mnoreturn-no-callee-saved-registers [PR38534]

2024-02-29 Thread Jan Hubicka
> > > > The problem is that it doesn't help in this case. > > If some optimization makes debugging of some function harder, normally it is > > enough to recompile the translation unit that defines it with -O0/-Og, or > > add optimize attribute on the function. > > While in this case, the

Re: Patch ping^2

2024-02-26 Thread Jan Hubicka
> Hi! > > I'd like to ping 2 patches: > > https://gcc.gnu.org/pipermail/gcc-patches/2024-January/644580.html > > > PR113617 P1 - Handle private COMDAT function symbol reference

Re: [PATCH v9 2/2] Add gcov MC/DC tests for GDC

2024-02-22 Thread Jan Hubicka
> This is a mostly straight port from the gcov-19.c tests from the C test > suite. The only notable differences from C to D are that D flips the > true/false outcomes for loop headers, and the D front end ties loop and > ternary conditions to slightly different locus. > > The test for >64

Re: [PATCH v9 1/2] Add condition coverage (MC/DC)

2024-02-22 Thread Jan Hubicka
Hello, > This patch adds support in gcc+gcov for modified condition/decision > coverage (MC/DC) with the -fcondition-coverage flag. MC/DC is a type of > test/code coverage and it is particularly important for safety-critical > applications in industries like aviation and automotive. Notably, MC/DC

Re: [PATCH] profile-count: Don't dump through a temporary buffer [PR111960]

2024-02-22 Thread Jan Hubicka
> Hi! > > The profile_count::dump (char *, struct function * = NULL) const; > method has a single caller, the > profile_count::dump (FILE *f, struct function *fun) const; > method and for that going through a temporary buffer is just slower > and opens doors for buffer overflows, which is exactly

Re: [PATCH] ipa: Convert lattices from pure array to vector (PR 113476)

2024-02-20 Thread Jan Hubicka
> On Tue, Feb 13 2024, Martin Jambor wrote: > > On Mon, Feb 12 2024, Jan Hubicka wrote: > >>> Believe it or not, even though I have re-worked the internals of the > >>> lattices completely, the array itself is older than my involvement with > >>> GCC (o

Fix ICE in loop splitting

2024-02-14 Thread Jan Hubicka
Hi, as demonstrated in the testcase, I forgot to check that profile is present in tree-ssa-loop-split. Bootstrapped and regtested x86_64-linux, committed. PR tree-optimization/111054 gcc/ChangeLog: * tree-ssa-loop-split.cc (split_loop): Check for profile being present.

Re: [Bug tree-optimization/113787] [12/13/14 Regression] Wrong code at -O with ipa-modref on aarch64

2024-02-14 Thread Jan Hubicka via Gcc-bugs
> > I guess PTA gets around by tracking points-to set also for non-pointer > > types and consequently it also gives up on any such addition. > > It does. But note it does _not_ for POINTER_PLUS where it treats > the offset operand as non-pointer. > > > I think it is

Re: [PATCH] [X86_64]: Enable support for next generation AMD Zen5 CPU with znver5 scheduler Model

2024-02-14 Thread Jan Hubicka
znver5? > We will combine znver4 and znver5 scheduler descriptions into one Thanks! Honza > > Thanks and Regards > Karthiban > > -Original Message- > From: Jan Hubicka > Sent: Monday, February 12, 2024 9:30 PM > To: Anbazhagan, Karthiban > Cc

Re: [PATCH] ipa: call destructors on lattices before freeing them (PR 113476)

2024-02-12 Thread Jan Hubicka
> Believe it or not, even though I have re-worked the internals of the > lattices completely, the array itself is older than my involvement with > GCC (or at least with ipa-cp.c ;-). > > So it being an array and not a vector is historical coincidence, as far > as I am concerned :-). But that may

Re: [PATCH] ipa: call destructors on lattices before freeing them (PR 113476)

2024-02-12 Thread Jan Hubicka
> Hi, > > In PR 113476 we have discovered that ipcp_param_lattices is no longer > a POD and should be destructed. This patch does that, calling > destructor on each element of the array containing them when the > corresponding summary of a node is freed. An alternative would be to > change the

Re: [PATCH] [X86_64]: Enable support for next generation AMD Zen5 CPU with znver5 scheduler Model

2024-02-12 Thread Jan Hubicka
Hi, > gcc/ChangeLog: > * common/config/i386/cpuinfo.h (get_amd_cpu): Recognize znver5. > * common/config/i386/i386-common.cc (processor_names): Add znver5. > (processor_alias_table): Likewise. > * common/config/i386/i386-cpuinfo.h (processor_types): Add new zen >

Re: [PATCH] ipa: Self-DCE of uses of removed call LHSs (PR 108007)

2024-01-22 Thread Jan Hubicka
> Hi, > > PR 108007 is another manifestation where we rely on DCE to clean-up > after IPA-SRA and if the user explicitly switches DCE off, IPA-SRA > can leave behind statements which are fed uninitialized values and > trap, even though their results are themselves never used. > > I have already

Re: [PATCH] ipa-cp: Fix check for exceeding param_ipa_cp_value_list_size (PR 113490)

2024-01-22 Thread Jan Hubicka
> Hi, > > When the check for exceeding param_ipa_cp_value_list_size limit was > modified to be ignored for generating values from self-recursive > calls, it should have been changed from equal to, to equals to or is > greater than. This omission manifests itself as PR 113490. > > When I examined
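The effect of the comparison operator can be shown with a small model. This is a sketch under stated assumptions, not the ipa-cp.cc code: the limit value, the function names, and the notion of "exempt" additions standing in for values generated from self-recursive calls are all hypothetical. Once some additions bypass the check, the count can step past the limit, and an equality test never fires again.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the param_ipa_cp_value_list_size limit.  */
enum { VALUE_LIST_LIMIT = 8 };

/* Equality-only check: misses any count already above the limit.  */
bool
limit_hit_eq (int count)
{
  assert (count >= 0);
  return count == VALUE_LIST_LIMIT;
}

/* Greater-or-equal check: still fires after the limit was overshot.  */
bool
limit_hit_ge (int count)
{
  assert (count >= 0);
  return count >= VALUE_LIST_LIMIT;
}

/* Try to add N values.  The first EXEMPT attempts bypass the limit check
   entirely; the rest are added only while LIMIT_HIT says there is room.
   Returns the final number of stored values.  */
int
add_values (int n, int exempt, bool (*limit_hit) (int))
{
  int count = 0;
  for (int i = 0; i < n; i++)
    {
      bool bypass = i < exempt;
      if (bypass || !limit_hit (count))
        count++;
    }
  return count;
}
```

With nine exempt additions the list already holds nine values when the first checked addition arrives; the equality test then lets every later value through, while the greater-or-equal test stops growth at nine.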

Re: [PATCH v2 2/2] x86: Don't save callee-saved registers in noreturn functions

2024-01-22 Thread Jan Hubicka
> I compared GCC master branch bootstrap and test times on a slow machine > with 6.6 Linux kernels compiled with the original GCC 13 and the GCC 13 > with the backported patch. The performance data isn't precise since the > measurements were done on different days with different GCC sources under

Remove accidental hack in ipa_polymorphic_call_context::set_by_invariant

2024-01-17 Thread Jan Hubicka
Hi, I managed to commit a hack setting offset to 0 in ipa_polymorphic_call_context::set_by_invariant. This makes it give up on multiple inheritance, but most likely won't give bad code since the other base will be of different type. Bootstrapped/regtested x86_64-linux, committed.

Fix handling of X86_TUNE_AVOID_512FMA_CHAINS

2024-01-17 Thread Jan Hubicka
Hi, I have noticed quite bad pasto in handling of X86_TUNE_AVOID_512FMA_CHAINS. At the moment it is ignored, but X86_TUNE_AVOID_256FMA_CHAINS controls 512FMA too. This patch fixes it, we may want to re-check how that works on AVX512 machines. Bootstrapped/regtested x86_64-linux, will commit it

Re: Disable FMADD in chains for Zen4 and generic

2024-01-17 Thread Jan Hubicka
> Can we backport the patch(at least the generic part) to > GCC11/GCC12/GCC13 release branch? Yes, the periodic testers have picked up the change and as far as I can tell, there are no surprises. Thanks, Honza > > > > > > > > /* X86_TUNE_AVOID_512FMA_CHAINS: Avoid creating loops with tight > > > >

Re: Add -falign-all-functions

2024-01-17 Thread Jan Hubicka
> On Wed, 17 Jan 2024, Jan Hubicka wrote: > > > > > > > I meant the new option might be named -fmin-function-alignment= > > > rather than -falign-all-functions because of how it should > > > override all other options. > > > > I was also

Re: Add -falign-all-functions

2024-01-17 Thread Jan Hubicka
> > I meant the new option might be named -fmin-function-alignment= > rather than -falign-all-functions because of how it should > override all other options. I was also pondering about both names. -falign-all-functions has the advantage that it is similar to all the other alignment flags that

Re: Fix merging of value predictors

2024-01-17 Thread Jan Hubicka
value. */ > > s/determinging/determining/ Fixed. I am re-testing the following and will commit if it succeeds (on x86_64-linux) 2024-01-17 Jan Hubicka Jakub Jelinek PR tree-optimization/110852 gcc/ChangeLog: * predict.cc (expr_expected_

Re: Add -falign-all-functions

2024-01-17 Thread Jan Hubicka
> > +falign-all-functions > > +Common Var(flag_align_all_functions) Optimization > > +Align the start of functions. > > all functions > > or maybe "of every function."? Fixed, thanks! > > +@opindex falign-all-functions=@var{n} > > +@item -falign-all-functions > > +Specify minimal alignment for

Fix merging of value predictors

2024-01-17 Thread Jan Hubicka
testcases). Bootstrapped/regtested x86_64-linux, will commit it tomorrow if there are no complaints. 2024-01-17 Jan Hubicka Jakub Jelinek PR tree-optimization/110852 gcc/ChangeLog: * predict.cc (expr_expected_value_1): (get_predictor_value

Add -falign-all-functions

2024-01-04 Thread Jan Hubicka
Hi, this patch adds new option -falign-all-functions which works like -falign-functions, but applies to all functions including those in cold regions. As discussed in the PR log, this is needed for atomically patching function entries in the kernel. An option would be to make -falign-function

Re: [Bug target/113233] LoongArch: target options from LTO objects not respected during linking

2024-01-04 Thread Jan Hubicka via Gcc-bugs
> Confirm. But option save/restore has been always implemented: > > .section.gnu.lto_.opts,"",@progbits > .ascii "'-fno-openmp' '-fno-openacc' '-fno-pie' '-fcf-protection" > .ascii "=none' '-mabi=lp64d' '-march=loongarch64' '-mfpu=64' '-m" > .ascii "simd=lasx'

Re: [PATCH v7] Add condition coverage (MC/DC)

2023-12-31 Thread Jan Hubicka
> > This seems good. Profile-arcs is rarely used by itself - most of time it > > is implied by -fprofile-generate and -ftest-coverage and since > > condition coverage is more associated to the second, I guess > > -fcondition-coverage is better name. > > > > Since -fcondition-coverage now affects

Re: skip vector profiles multiple exits

2023-12-29 Thread Jan Hubicka
> Hi Honza, Hi, > > I wasn't sure what to do here so I figured I'd ask. > > In adding support for multiple exits to the vectorizer I didn't know how to > update this bit: > > https://github.com/gcc-mirror/gcc/blob/master/gcc/tree-vect-loop-manip.cc#L3363 > > Essentially, if skip_vector (i.e.

Re: [PATCH 3/7] Lockfile.

2023-12-29 Thread Jan Hubicka
Hi, > This patch implements lockfile used for incremental LTO. > > Bootstrapped/regtested on x86_64-pc-linux-gnu > > gcc/ChangeLog: > > * Makefile.in: Add lockfile.o. > * lockfile.cc: New file. > * lockfile.h: New file. I can't approve it, but overall it looks good to me. We

Re: [PATCH 2/7] lto: Remove random_seed from section name.

2023-12-29 Thread Jan Hubicka
> Bootstrapped/regtested on x86_64-pc-linux-gnu > > gcc/ChangeLog: > > * lto-streamer.cc (lto_get_section_name): Remove random_seed in WPA. This is also OK. (since it lacks explanation - the random suffixes are added for ld -r to work. This never happens between WPA and ltrans, so they

Re: [PATCH 1/7] lto: Skip flag OPT_fltrans_output_list_.

2023-12-29 Thread Jan Hubicka
Hi, > Bootstrapped/regtested on x86_64-pc-linux-gnu > > gcc/ChangeLog: > > * lto-opts.cc (lto_write_options): Skip OPT_fltrans_output_list_. OK, thanks, Honza > --- > gcc/lto-opts.cc | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/gcc/lto-opts.cc b/gcc/lto-opts.cc > index

Re: [PATCH v7] Add condition coverage (MC/DC)

2023-12-29 Thread Jan Hubicka
> gcc/ChangeLog: > > * builtins.cc (expand_builtin_fork_or_exec): Check > condition_coverage_flag. > * collect2.cc (main): Add -fno-condition-coverage to OBSTACK. > * common.opt: Add new options -fcondition-coverage and > -Wcoverage-too-many-conditions. > *

Re: Disable FMADD in chains for Zen4 and generic

2023-12-13 Thread Jan Hubicka
> > The difference is that Cores understand the fact that fmadd does not need > > all three parameters to start computation, while Zen cores don't. > > > > Since this seems a noticeable win on Zen and not a loss on Core, it seems like > > a good > > default for generic. > > > > I plan to commit the

Re: Disable FMADD in chains for Zen4 and generic

2023-12-12 Thread Jan Hubicka
> > This came up in a separate thread as well, but when doing reassoc of a > chain with > multiple dependent FMAs. > > I can't understand how this uarch detail can affect performance when > as in the testcase > the longest input latency is on the multiplication from a memory load. > Do we

Disable FMADD in chains for Zen4 and generic

2023-12-12 Thread Jan Hubicka
Hi, this patch disables use of FMA in the matrix multiplication loop for generic (for x86-64-v3) and zen4. I tested this on zen4 and Xeon Gold 6212U. For Intel this is neutral both on the matrix multiplication microbenchmark (attached) and spec2k17 where the difference was within noise for

Re: [PATCH] strub: add note on attribute access

2023-12-12 Thread Jan Hubicka
> On Dec 7, 2023, Alexandre Oliva wrote: > > > Thanks for raising the issue. Maybe there should be at least a comment > > there, and perhaps some asserts to check that pointer and reference > > types don't make it to indirect_parms. > > Document why attribute access doesn't need the same

Re: [PATCH] ipa/92606 - properly handle no_icf attribute for variables

2023-12-12 Thread Jan Hubicka
> The following adds no_icf handling for variables where the attribute > was rejected. It also fixes the check for no_icf by checking both > the source and the targets decl. > > Bootstrap / regtest running on x86_64-unknown-linux-gnu. > > This would solve the AVR issue with merging of "progmem"

Re: [PATCH v5] Introduce strub: machine-independent stack scrubbing

2023-12-06 Thread Jan Hubicka
Hi, I am sorry for sending this late. I think the ipa changes are generally fine. There are a few things which were not clear to me. > for gcc/ChangeLog > > * Makefile.in (OBJS): Add ipa-strub.o. > (GTFILES): Add ipa-strub.cc. > * builtins.def (BUILT_IN_STACK_ADDRESS): New. >

Re: [PATCH v7] Introduce attribute sym_alias

2023-12-06 Thread Jan Hubicka
> On Nov 30, 2023, Jan Hubicka wrote: > > >> + if (VAR_P (replaced)) > >> + varpool_node::create_alias (sym_node->decl, replacement); > >> + else > >> + cgraph_node::create_alias (sym_node->decl, replacement); > > Unfortunate

Re: libgcov, fork, and mingw (and other targets without the full POSIX set)

2023-12-01 Thread Jan Hubicka via Gcc
> On Dec 01 2023, Richard Biener via Gcc wrote: > > > Hmm, so why's it then referenced and not "GCed"? > > This has nothing to do with garbage collection. It's just the way > libgcc avoids having too many source files. It would be exactly the > same if every function were in its own file. The

Re: [PATCH v6] Introduce attribute sym_alias

2023-11-30 Thread Jan Hubicka
> On Nov 22, 2023, Jan Hubicka wrote: > > > I wonder why you use same body aliases, which are kind of special to C++ > > frontend (and come with fixup code working around its quirks you had to > > disable above). > > TBH, I don't recall whether I had any reason

Re: [PATCH] tree-sra: Avoid returns of references to SRA candidates

2023-11-29 Thread Jan Hubicka
> Hi, > > On Tue, Nov 28 2023, Jan Hubicka wrote: > >> On Tue, 28 Nov 2023, Martin Jambor wrote: > >> > >> > On Tue, Nov 28 2023, Richard Biener wrote: > >> > > On Mon, 27 Nov 2023, Martin Jambor wrote: > >> > > > >>

Re: [PATCH] tree-sra: Avoid returns of references to SRA candidates

2023-11-28 Thread Jan Hubicka
> > > > On 28.11.2023 at 17:59, Jan Hubicka wrote: > > > > > >> > >>> On Tue, 28 Nov 2023, Martin Jambor wrote: > >>> > >>> On Tue, Nov 28 2023, Richard Biener wrote: > >>>> On Mon, 27 Nov 2023, Martin Jam

Re: [PATCH] tree-sra: Avoid returns of references to SRA candidates

2023-11-28 Thread Jan Hubicka
> On Tue, 28 Nov 2023, Martin Jambor wrote: > > > On Tue, Nov 28 2023, Richard Biener wrote: > > > On Mon, 27 Nov 2023, Martin Jambor wrote: > > > > > >> Hi, > > >> > > >> The enhancement to address PR 109849 contained an important thinko, > > >> and that any reference that is passed to a
