[Bug c++/101480] [11 Regression] Miscompiled code involving operator new

2021-10-11 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101480 --- Comment #21 from hubicka at kam dot mff.cuni.cz --- Hi, note that also tree-ssa-structalias has: /* If the call is to a replaceable operator delete and results from a delete expression as opposed to a direct call to

[Bug tree-optimization/102646] large performance changes between 1932e1169a236849f5e7f1cd386da100d9af470f and 9cfb95f9b92326e86e99b50350ebf04fa9cd2477 (probably jump threading)

2021-10-11 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102646 --- Comment #2 from hubicka at kam dot mff.cuni.cz --- > I think most of the regressions are fixed, we get even better numbers now. Because we enabled vectorization. I would say they should still reproduce with -fno-tree-vectorize, right? Honza

[Bug tree-optimization/103592] fatigue2 benchmarks on zen runs 43% faster with -fno-tree-vectorize -fno-tree-slp-vectorize

2021-12-06 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103592 --- Comment #1 from hubicka at kam dot mff.cuni.cz --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163 > [Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95) note that fatigue2 is polyhedron, not spec...

[Bug ipa/103766] [12 Regression] Initialization of variable passed via static chain is lost.

2021-12-19 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103766 --- Comment #12 from hubicka at kam dot mff.cuni.cz --- > Even trying to find a Fortran testsuite friendly version is hard because the > issue can only happen with print, I tried even doing internal write to a > string > it is passing without

[Bug ipa/103766] [12 Regression] Initialization of variable passed via static chain is lost.

2021-12-19 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103766 --- Comment #8 from hubicka at kam dot mff.cuni.cz --- > > I would welcome a testsuite-friendly version of the Fortran testcase > > Both Andrew and I failed to make a C reproducer - what about just taking the > -fdump-tree-gimple, as input would

[Bug tree-optimization/103797] Clang vectorized LightPixel while GCC does not

2021-12-22 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103797 --- Comment #9 from hubicka at kam dot mff.cuni.cz --- > recip pass happens after vectorization > I don't know/understand why though. Yep, I suppose we want to either special case this in vectorizer or make it earlier... I also wonder why

[Bug tree-optimization/102943] [12 Regression] Jump threader compile-time hog with 521.wrf_r

2022-01-03 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102943 --- Comment #33 from hubicka at kam dot mff.cuni.cz --- With the inliner tweaks (which I hope to get bit more aggressive this week) we "solved" the wrf compile time with LTO by simply not building the gigantic functions. However we still have

[Bug tree-optimization/103797] Clang vectorized LightPixel while GCC does not

2021-12-23 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103797 --- Comment #16 from hubicka at kam dot mff.cuni.cz --- > > > > It could be done, but I was under impression that the sequence to load 1.0f > > into topmost elements nullifies the benefit of operation to divide two > > Sure, so perhaps we

[Bug tree-optimization/103797] Clang vectorized LightPixel while GCC does not

2021-12-22 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103797 --- Comment #2 from hubicka at kam dot mff.cuni.cz --- > Can you please attach a reduced test-case? Do you know how to produce one with a reasonable effort? The declarations are quite convoluted, but the function is well isolated and easy to

[Bug tree-optimization/103797] Clang vectorized LightPixel while GCC does not

2021-12-22 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103797 --- Comment #4 from hubicka at kam dot mff.cuni.cz --- > -E and remove not needed code. > > > The > > declarations are quite convoluted, but the function is well isolated and > > easy to inspect from full one... > > Do we speak about: >

[Bug tree-optimization/103797] Clang vectorized LightPixel while GCC does not

2021-12-22 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103797 --- Comment #5 from hubicka at kam dot mff.cuni.cz --- Created attachment 52042 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52042&action=edit b.slp1

[Bug tree-optimization/103168] Value numbering for PRE of pure functions can be improved

2021-11-22 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103168 --- Comment #14 from hubicka at kam dot mff.cuni.cz --- This is bit modified patch I am testing. I added pre-computation of the number of accesses, enabled the path for const functions (in case they have memory operand), initialized alias sets

[Bug driver/100937] configure: Add --enable-default-semantic-interposition

2021-11-22 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100937 --- Comment #12 from hubicka at kam dot mff.cuni.cz --- > (The -fno-semantic-interposition thing is probably the biggest performance gap > between gcc -fpic and clang -fpic.) Yep, it is often confusing to users (who do not understand what ELF

[Bug tree-optimization/103168] Value numbering for PRE of pure functions can be improved

2021-11-22 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103168 --- Comment #15 from hubicka at kam dot mff.cuni.cz --- The patch passed testing on x86_64-linux.

[Bug tree-optimization/103300] [12 Regression] wrong code at -O3 on x86_64-linux-gnu

2021-11-17 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103300 --- Comment #2 from hubicka at kam dot mff.cuni.cz --- Needs -O2 -floop-unroll-and-jam --param early-inlining-insns=14 to fail, so I guess it may be an issue with unroll-and-jam.

[Bug ipa/103246] [12 Regression] 416.gamess miscompare with -O2 -g -flto=auto since r12-5223-gecdf414bd89e6ba251f6b3f494407139b4dbae0e

2021-11-17 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103246 --- Comment #14 from hubicka at kam dot mff.cuni.cz --- > Thanks! Great you found it so quickly. It is bit stupid code since everything is duplicated twice (for LTO and non-LTO). I have to refactor it: we could have common base of the two

[Bug ipa/103227] [12 Regression] 58% exchange2 regression with -Ofast -march=native on zen3 since r12-5223-gecdf414bd89e6ba251f6b3f494407139b4dbae0e

2021-11-19 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103227 --- Comment #5 from hubicka at kam dot mff.cuni.cz --- > I like the idea of transformation phases better than putting > everything into tree-inline (and by extension ipa-param-manipulation) > but perhaps we have to do aggregate constant

[Bug ipa/97403] Ancestor jump function should be generalized

2021-11-10 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97403 --- Comment #4 from hubicka at kam dot mff.cuni.cz --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97403 > > --- Comment #3 from Martin Jambor --- > (In reply to Jan Hubicka from comment #2) > > Martin, > > I think we can close this

[Bug tree-optimization/103175] [12 Regression] internal compiler error: in handle_call_arg, at tree-ssa-structalias.c:4139

2021-11-11 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103175 --- Comment #5 from hubicka at kam dot mff.cuni.cz --- The sanity check verifies that functions accessing a parameter indirectly also read the parameter (otherwise the indirect reference cannot happen). This patch moves the check earlier and

[Bug ipa/103164] -fipa-pta degrades aliasing oracle for tramp3d

2021-11-10 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103164 --- Comment #2 from hubicka at kam dot mff.cuni.cz --- Yep, it only shows that we want to run ipa-pta and local oracle in parallel since ipa-pta can not be realistically assumed to subsume all of local PTA (for example due to being necessarily

[Bug middle-end/102997] [12 Regression] 45% 454.calculix regression with LTO+PGO -march=native -Ofast on Zen since r12-4526-gd8edfadfc7a9795b65177a50ce44fd348858e844

2021-11-12 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102997 --- Comment #31 from hubicka at kam dot mff.cuni.cz --- > It likely was the loop header copying missing on cold loops then. Yep. It is good we worked that out.

[Bug tree-optimization/103223] [12 regression] Access attribute dropped when ipa-sra is applied

2021-11-15 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103223 --- Comment #8 from hubicka at kam dot mff.cuni.cz --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103223 > > --- Comment #5 from Martin Sebor --- > (In reply to Martin Jambor from comment #4) > > (In reply to Jan Hubicka from comment #0) >

[Bug testsuite/103264] [12 regression] gcc.dg/tree-prof/merge_block.c fails after r12-5236

2021-11-15 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103264 --- Comment #1 from hubicka at kam dot mff.cuni.cz --- What breaks in the testcase is updating the profile after complete loop unrolling. I suspect the unrolling is enabled by the extra DSE.

[Bug ipa/103267] Wrong code with ipa-sra

2021-11-16 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103267 --- Comment #5 from hubicka at kam dot mff.cuni.cz --- Works for me even with the 3 warnings. hubicka@lomikamen:/aux/hubicka/trunk/build-lto2/gcc$ cat >tt.c __attribute__ ((noinline,const)) infinite (int p) { if (p) while (1); return p;

[Bug ipa/103267] Wrong code with ipa-sra

2021-11-16 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103267 --- Comment #6 from hubicka at kam dot mff.cuni.cz --- Aha, but here is better example (reproduces same way). In the former one I forgot const attribute which makes it invalid. The testcase tests that ipa-sra is missing ECF_LOOPING_CONST_OR_PURE

[Bug ipa/103267] Wrong code with ipa-sra

2021-11-16 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103267 --- Comment #9 from hubicka at kam dot mff.cuni.cz --- > @@ -1,4 +1,3 @@ > -static int > __attribute__ ((noinline,const)) > infinite (int p) > { Just for the record, it crashes with or without static int here for me :) I ran across it because

[Bug ipa/103230] ipa-modref-tree.h:550:33: runtime error: load of value 255, which is not a valid value for type 'bool'

2021-11-14 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103230 --- Comment #1 from hubicka at kam dot mff.cuni.cz --- > Happens with UBSAN compiler for: > > $ gcc gcc/testsuite/gcc.c-torture/execute/pr71494.c -O1 -flto > ... > /home/marxin/Programming/gcc/gcc/ipa-modref-tree.h:550:33: runtime error: load

[Bug ipa/103230] ipa-modref-tree.h:550:33: runtime error: load of value 255, which is not a valid value for type 'bool'

2021-11-14 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103230 --- Comment #3 from hubicka at kam dot mff.cuni.cz --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103230 > > --- Comment #2 from Martin Liška --- > > How do you build ubsan compiler? > > F="-O0 -g -fsanitize=undefined" ; make -j16

[Bug tree-optimization/103231] ICE (nondeterministic) on valid code at -O1 on x86_64-linux-gnu: Segmentation fault

2021-11-14 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103231 --- Comment #1 from hubicka at kam dot mff.cuni.cz --- > [659] % > [659] % gcctk -O0 -w small.c > [660] % > [660] % gcctk -O1 -w small.c > [661] % gcctk -O1 -w small.c > [662] % gcctk -O1 -w small.c > gcctk: internal compiler error:

[Bug ipa/101941] [12 Regression] Linux kernel build failure due to retaining fnsplit fragment with __attribute__((__error__))

2021-11-16 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101941 --- Comment #19 from hubicka at kam dot mff.cuni.cz --- > > * special case function splitting such that a BB that contains a function > > call which has either warning or error attribute on it; not to split out to > > a different function. > >

[Bug tree-optimization/103266] [12 regression] llvm-13 miscompilation: __builtin_assume_aligned causes over-aggressive dce

2021-11-16 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103266 --- Comment #5 from hubicka at kam dot mff.cuni.cz --- > I think 'X' means simply not dereferenced or escaping since this was all > PTA based. 'S' would still eventually allow escaping. But yes, PTA > simply takes '1' literally. So the patch

[Bug tree-optimization/103195] [12 Regression] tfft2 text grows by 70% with -Ofast since r12-5113-gd70ef65692fced7a

2021-11-12 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103195 --- Comment #3 from hubicka at kam dot mff.cuni.cz --- > > threader stuff would be my bet, but we need to bisect this (tfft2 is also > > quite small) > > Bad bet ;) It's caused by r12-5113-gd70ef65692fced7a. Hehe, that was my guess yesterday.

[Bug tree-optimization/103423] [12 Regression] 19% cpu2006 wrf compile time regression with -flto since r12-2353-g8da8ed435e9f01b3

2021-11-26 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103423 --- Comment #3 from hubicka at kam dot mff.cuni.cz --- > Oh, you are right, then it started with r12-2353-g8da8ed435e9f01b3. OK so mine, (as I sort of suspected :) If it is easy for you to get -ftime-report of before and after build, it would be

[Bug ipa/103432] [12 regression] libjxl-0.5 is miscompiled, works fine with -fno-ipa-modref

2021-11-26 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103432 --- Comment #3 from hubicka at kam dot mff.cuni.cz --- Caused by stupid thinko (also present in gcc11). I compute right min_flags but then use wrong value (without dereference applied). I am testing the following. diff --git a/gcc/ipa-modref.c

[Bug ipa/103441] [12 Regression] ICE in cgraph_node::verify_node() building libgo on powerpc64le-linux-gnu (--with-cpu=power9)

2021-11-26 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103441 --- Comment #3 from hubicka at kam dot mff.cuni.cz --- > #0 gimple_set_bb (stmt=0x3fffb01a2be0, bb=0x0) at ../../gcc/gimple.c:1772 > #1 0x107209b0 in gsi_remove (i=0x3fffd7c8, > remove_permanently=) at

[Bug tree-optimization/103409] [12 Regression] 18% WRF compile-time regression with -O2 -flto between g:264f061997c0a534 and g:3e09331f6aeaf595

2021-11-25 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103409 --- Comment #2 from hubicka at kam dot mff.cuni.cz --- > The two main changes during that time period was jump threading and modref. > modref seems might be more likely with wrf being fortran code and even using > nested functions and such.

[Bug tree-optimization/103223] [12 regression] Access attribute dropped when ipa-sra is applied

2021-11-21 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103223 --- Comment #11 from hubicka at kam dot mff.cuni.cz --- > Xeon(R) Platinum 8358 (IceLake) (64C 128T 512G): > BenchMarks Copies RunTime1 RunTime2 Rate1 Rate2 > Compare > 548.exchange2_r 128 479 913 700 367

[Bug tree-optimization/103168] Value numbering for PRE of pure functions can be improved

2021-11-22 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103168 --- Comment #9 from hubicka at kam dot mff.cuni.cz --- > so indeed that's an issue. So it's a bug fixed, not an optimization > regression. I know, but the bug was fixed in an unnecessarily generous way, preventing a lot of valid transforms

[Bug tree-optimization/103168] Value numbering for PRE of pure functions can be improved

2021-11-22 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103168 --- Comment #12 from hubicka at kam dot mff.cuni.cz --- > unsigned p; > unsigned __attribute__((noinline)) test (void) > { > return p; > } > > modref analyzing 'test' (ipa=0) (pure) > - Analyzing load: p >- Recording base_set=0 ref_set=0

[Bug ipa/103227] [12 Regression] 58% exchange2 regression with -Ofast -march=native on zen3 since r12-5223-gecdf414bd89e6ba251f6b3f494407139b4dbae0e

2021-11-20 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103227 --- Comment #9 from hubicka at kam dot mff.cuni.cz --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103227 > ... fixing this problem properly. > I just looked into this again and we already have code that preserves > propagates bits on pointer

[Bug ipa/103227] [12 Regression] 58% exchange2 regression with -Ofast -march=native on zen3 since r12-5223-gecdf414bd89e6ba251f6b3f494407139b4dbae0e

2021-11-20 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103227 --- Comment #8 from hubicka at kam dot mff.cuni.cz --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103227 > > --- Comment #7 from Martin Jambor --- > (In reply to hubicka from comment #5) > > > I like the idea of transformation phases

[Bug ipa/103211] [12 Regression] 416.gamess crashes after r12-5177-g494bdadf28d0fb35

2021-11-12 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103211 --- Comment #3 from hubicka at kam dot mff.cuni.cz --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103211 > > --- Comment #2 from Martin Liška --- > Optimized dump differs for couple of functions in the same way: > > diff -u good bad > ---

[Bug ipa/103277] [12 Regression] ICE in branch_prob with -O1 -fbranch-probabilities -fno-ipa-pure-const since r12-5236-g5aa91072e24c1e16

2021-11-18 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103277 --- Comment #4 from hubicka at kam dot mff.cuni.cz --- > Btw. started with r12-5236-g5aa91072e24c1e16. Yep, I know - it is modref based DSE that lets us to enable that call as dead. So the bug is technically mine if Richi decides to pass it to

[Bug tree-optimization/103168] Value numbering for PRE of pure functions can be improved

2021-11-18 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103168 --- Comment #5 from hubicka at kam dot mff.cuni.cz --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103168 > > --- Comment #4 from Richard Biener --- > (In reply to Jan Hubicka from comment #3) > > This is simple (and fairly common) case we

[Bug tree-optimization/103168] Value numbering for PRE of pure functions can be improved

2021-11-18 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103168 --- Comment #6 from hubicka at kam dot mff.cuni.cz --- > bool unknown_memory_access = false; > if (summary = get_modref_function_summary (stmt, NULL)) > { > /* First search if we can do someting useful. > Like for dse it

[Bug tree-optimization/103423] [12 Regression] 19% cpu2006 wrf compile time regression with -flto since r12-3903-g0288527f47cec669

2021-11-25 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103423 --- Comment #1 from hubicka at kam dot mff.cuni.cz --- Martin, My original report here was on regression at July 17 2021 (range g:0b7a11874d4eb428 and g:704e8a825c78b9a8) which seems unrelated to g:r12-3903-g0288527f47cec669 which is in Sep 21

[Bug tree-optimization/103409] [12 Regression] 18% SPEC2017 WRF compile-time regression with -O2 -flto since r12-3903-g0288527f47cec669

2021-11-25 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103409 --- Comment #6 from hubicka at kam dot mff.cuni.cz --- > Started with r12-3903-g0288527f47cec669. This is September change (for which we have PR102943) however the regression range was g:1ae8edf5f73ca5c3 (or g:264f061997c0a534 on second plot)

[Bug d/103040] [12 Regression] gdc.dg/torture/pr101273.d FAILs

2021-11-02 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103040 --- Comment #13 from hubicka at kam dot mff.cuni.cz --- > See above comments from Iain, even if that pre-initialization is removed it is > still miscompiled. And, the testcase fails not because of the padding bits > not > being zero, but

[Bug d/103040] [12 Regression] gdc.dg/torture/pr101273.d FAILs

2021-11-02 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103040 --- Comment #16 from hubicka at kam dot mff.cuni.cz --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103040 > > --- Comment #15 from Iain Buclaw --- > Got it. The difference between D and C++ is a matter of early inlining. > > The C++

[Bug d/103040] [12 Regression] gdc.dg/torture/pr101273.d FAILs

2021-11-02 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103040 --- Comment #17 from hubicka at kam dot mff.cuni.cz --- > Great, I will take a look now (I was travelling that is why i did not > started earlier) Found it - there is a thinko in way NOT_RETURNED flag is handled in the call statement analysis.

[Bug middle-end/102997] [12 Regression] 45% 454.calculix regression with LTO+PGO -march=native -Ofast between ce4d1f632ff3f680550d3b186b60176022f41190 and 6fca1761a16c68740f875fc487b98b6bde8e9be7 on Z

2021-10-29 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102997 --- Comment #4 from hubicka at kam dot mff.cuni.cz --- > Not seen on Haswell (but w/o PGO). Is this PGO specific? There's another > large jump visible end of 2019. This is kabylake LTO+PGO+march=native

[Bug middle-end/102997] [12 Regression] 45% 454.calculix regression with LTO+PGO -march=native -Ofast between ce4d1f632ff3f680550d3b186b60176022f41190 and 6fca1761a16c68740f875fc487b98b6bde8e9be7 on Z

2021-10-29 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102997 --- Comment #5 from hubicka at kam dot mff.cuni.cz --- > Not seen on Haswell (but w/o PGO). Is this PGO specific? There's another > large jump visible end of 2019. It is between 2019-11-15 and 18 but the revisions do not exist in git -

[Bug tree-optimization/101908] [12 regression] cray regression with -O2 -ftree-slp-vectorize compared to -O2

2021-10-29 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908 --- Comment #16 from hubicka at kam dot mff.cuni.cz --- > It will only help for V2DF I think, so no, not really. But an IPA idea of > whether there's cross-call STLF issues might be nice. > > Generally doing wider stores is fine but of course

[Bug tree-optimization/101908] [12 regression] cray regression with -O2 -ftree-slp-vectorize compared to -O2

2021-10-28 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908 --- Comment #12 from hubicka at kam dot mff.cuni.cz --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908 > > --- Comment #11 from Richard Biener --- > -mtune-ctrl=^sse_unaligned_load_optimal fixes the observed regression. Interesting. I

[Bug tree-optimization/101908] [12 regression] cray regression with -O2 -ftree-slp-vectorize compared to -O2

2021-10-28 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908 --- Comment #9 from hubicka at kam dot mff.cuni.cz --- > Not inlining ray_sphere at -O2 is of course what makes it overall slow. ray_sphere is not all that small a function. We already play tricks at -O3 to inline it by detecting that some

[Bug ipa/102982] [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs 11.2.0)

2021-10-28 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102982 --- Comment #4 from hubicka at kam dot mff.cuni.cz --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102982 > > Richard Biener changed: > >What|Removed |Added >

[Bug ipa/102982] [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs 11.2.0)

2021-10-28 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102982 --- Comment #6 from hubicka at kam dot mff.cuni.cz --- > > fixup_cfg already removes write-only stores so that seems fit for that > purpose. > > Btw, > > static int x = 1; > > int main() > { > x = 1; > } > > should ideally be handled as

[Bug tree-optimization/101908] [12 regression] cray regression with -O2 -ftree-slp-vectorize compared to -O2

2021-10-28 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908 --- Comment #10 from hubicka at kam dot mff.cuni.cz --- >| b = 2.0 * ray.dir.x * (ray.orig.x - sph->pos.x) + > # >| movupd (%rdi),%xmm5 > # >| 2.0

[Bug ipa/103073] [12 Regression] ICE in insert_access, at ipa-modref-tree.h:578 since r12-4401-gfecd145359fc981b

2021-11-05 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103073 --- Comment #8 from hubicka at kam dot mff.cuni.cz --- > Well, the usual thing to do is to check max_size_known_p () and > if maybe_ne (max_size, size) then use [offset, max_size] for > disambiguation. I think for modref you can do the same -

[Bug ipa/103073] [12 Regression] ICE in insert_access, at ipa-modref-tree.h:578 since r12-4401-gfecd145359fc981b

2021-11-05 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103073 --- Comment #12 from hubicka at kam dot mff.cuni.cz --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103073 > > --- Comment #10 from Martin Liška --- > > This bootstraps/regtests and fixes the testcase. Does it look sane to > > you? > >

[Bug ipa/103073] [12 Regression] ICE in insert_access, at ipa-modref-tree.h:578 since r12-4401-gfecd145359fc981b

2021-11-05 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103073 --- Comment #13 from hubicka at kam dot mff.cuni.cz --- > > diff --git a/gcc/ipa-modref-tree.h b/gcc/ipa-modref-tree.h > > index 9976e489697..1b51323175b 100644 > > --- a/gcc/ipa-modref-tree.h > > +++ b/gcc/ipa-modref-tree.h > > @@ -813,6

[Bug fortran/103058] [12 Regression] ICE in gimple_call_static_chain_flags, at gimple.c:1669 when building 527.cam4_r

2021-11-05 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103058 --- Comment #10 from hubicka at kam dot mff.cuni.cz --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103058 > > --- Comment #9 from Martin Liška --- > And WPA cgraph dump tells: > > quick_sort_1.1/213 (quick_sort_1) @0x774c2550 >

[Bug tree-optimization/102943] [12 Regression] Jump threader compile-time hog with 521.wrf_r

2021-11-04 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102943 --- Comment #19 from hubicka at kam dot mff.cuni.cz --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102943 > > Aldy Hernandez changed: > >What|Removed |Added >

[Bug ipa/103058] [12 Regression] ICE in gimple_call_static_chain_flags, at gimple.c:1669 when building 527.cam4_r

2021-11-03 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103058 --- Comment #1 from hubicka at kam dot mff.cuni.cz --- > One can see it with -O2 -flto=auto -march=znver2: > > radsw.fppized.f90:39:19: internal compiler error: in > gimple_call_static_chain_flags, at gimple.c:1669 >39 | subroutine

[Bug ipa/103055] [12 Regression] ICE: in get_ssa_name_flags, at ipa-modref.c:1660 with --param=modref-max-depth=0 since r12-4852-g18f0873d1e595dc2

2021-11-03 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103055 --- Comment #2 from hubicka at kam dot mff.cuni.cz --- > Confirmed, started with r12-4852-g18f0873d1e595dc2. Depth=0 means that we do no analysis at all and the assert test that some analysis was done. I suppose we could ignore depth 0 and

[Bug ipa/103058] [12 Regression] ICE in gimple_call_static_chain_flags, at gimple.c:1669 when building 527.cam4_r

2021-11-04 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103058 --- Comment #4 from hubicka at kam dot mff.cuni.cz --- Hi, I am testing the following to unbreak fortran. However the real bug is that binds_to_current_def should work on whole WPA and be independent of partitioning. I remember I had patch

[Bug ipa/103080] LTO alters the ordering of static constructors/destructors in pass_ipa_cdtor_merge.

2021-11-04 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103080 --- Comment #1 from hubicka at kam dot mff.cuni.cz --- The cdtor merging code predates LTO - it is also used for the collect2 path on targets w/o cdtor sections. I guess the DECL_UID compare is not a very safe thing to do since it depends on the

[Bug tree-optimization/102943] [12 Regression] Jump threader compile-time hog with 521.wrf_r

2021-11-07 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102943 --- Comment #27 from hubicka at kam dot mff.cuni.cz --- > > This PR is still open, at least for slowdown in the threader with LTO. The > issue is ranger wide, so it may also cause slowdowns on non-LTO builds for > WRF, though I haven't

[Bug middle-end/102997] [12 Regression] 45% 454.calculix regression with LTO+PGO -march=native -Ofast on Zen since r12-4526-gd8edfadfc7a9795b65177a50ce44fd348858e844

2021-11-08 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102997 --- Comment #19 from hubicka at kam dot mff.cuni.cz --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102997 > > --- Comment #18 from Aldy Hernandez --- > > > If I read it correctly, for a path that enters the loop and later leaves > > it

[Bug middle-end/102997] [12 Regression] 45% 454.calculix regression with LTO+PGO -march=native -Ofast on Zen since r12-4526-gd8edfadfc7a9795b65177a50ce44fd348858e844

2021-11-08 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102997 --- Comment #16 from hubicka at kam dot mff.cuni.cz --- Note that it still seems to me that the crossed_loop_header handling is overly conservative. We have: @ -2771,6 +2771,7 @@ jt_path_registry::cancel_invalid_paths (vec ) bool seen_latch

[Bug tree-optimization/103117] uncprop produces harder to analyze but not better code

2021-11-08 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103117 --- Comment #2 from hubicka at kam dot mff.cuni.cz --- > I suppose modref could (for pointer returns) use ranger to query its range > and see if it ever is non-NULL? I'm not sure if we reliably propagate > null pointer constants everywhere. I

[Bug tree-optimization/103117] uncprop produces harder to analyze but not better code

2021-11-08 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103117 --- Comment #4 from hubicka at kam dot mff.cuni.cz --- > > I don't know - this way we have separate dumps etc. I think mistake was > > scheduling pure-const and later modref too late. > > Maybe. If you move them please put a comment before

[Bug middle-end/102997] [12 Regression] 45% 454.calculix regression with LTO+PGO -march=native -Ofast on Zen since r12-4526-gd8edfadfc7a9795b65177a50ce44fd348858e844

2021-11-08 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102997 --- Comment #21 from hubicka at kam dot mff.cuni.cz --- > to also allow to thread through a loop path not crossing the latch but > at least for the issue of "breaking loops" the loops_crossed stuff shouldn't > be necessary. It might still

[Bug middle-end/102997] [12 Regression] 45% 454.calculix regression with LTO+PGO -march=native -Ofast on Zen since r12-4526-gd8edfadfc7a9795b65177a50ce44fd348858e844

2021-11-08 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102997 --- Comment #23 from hubicka at kam dot mff.cuni.cz --- > We verify that by simply looking at the loop depth relation of > the entry and exit of the path. Which seems wrong for a path leaving one loop and entering another... > > > It seems to me

[Bug middle-end/102997] [12 Regression] 45% 454.calculix regression with LTO+PGO -march=native -Ofast on Zen since r12-4526-gd8edfadfc7a9795b65177a50ce44fd348858e844

2021-11-01 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102997 --- Comment #10 from hubicka at kam dot mff.cuni.cz --- > > Hmmm, this commit disables problematic threads we've agreed are detrimental to > loop form. So it's not something the threader did, but something it's not > allowed to do. This PR

[Bug tree-optimization/103409] [12 Regression] 18% SPEC2017 WRF compile-time regression with -O2 -flto since r12-5228-gb7a23949b0dcc4205fcc2be6b84b91441faa384d

2021-12-01 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103409 --- Comment #13 from hubicka at kam dot mff.cuni.cz --- > I've fixed the threading slowdown. Can someone verify and close this PR if > all > the slowdown has been accounted for? If not, then someone needs to explore > any > slowdown

[Bug ipa/103636] [12 Regression] Clang build fails with -flto -fno-strict-aliaisng -flifetime-dse=1 -fprofile-generate

2021-12-10 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103636 --- Comment #7 from hubicka at kam dot mff.cuni.cz --- I use cmake -G "Unix Makefiles" /home/jh/llvm-project/llvm -DCLANG_TABLEGEN=/home/jh/llvm-project/llvm/out/stage1/bin/clang-tblgen -DCMAKE_BUILD_TYPE=Release

[Bug ipa/103601] [12 Regression] ICE in insert_kill, at ipa-modref-tree.c:84 since r12-5244-g64f3e71c302b4a13

2021-12-10 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103601 --- Comment #5 from hubicka at kam dot mff.cuni.cz --- Thanks Roger and Andrew! It was on my TODO for weekend and I am very happy you beat me :)

[Bug gcov-profile/103652] Producing profile with -O2 -flto and trying to consume it with -O3 -flto leads to ICEs on indirect call profiling

2021-12-13 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103652 --- Comment #2 from hubicka at kam dot mff.cuni.cz --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103652 > > --- Comment #1 from Martin Liška --- > (In reply to Jan Hubicka from comment #0) > > Building clang in the funny way (training

[Bug gcov-profile/103652] Producing profile with -O2 -flto and trying to consume it with -O3 -flto leads to ICEs on indirect call profiling

2021-12-13 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103652 --- Comment #4 from hubicka at kam dot mff.cuni.cz --- > > Well, I'm specifically speaking about: > error: the control flow of function ‘BZ2_compressBlock’ does not match its > profile data (counter ‘arcs’) > > this type of errors should not

[Bug fortran/103662] TBAA problem in Fortran FE triggering in gfortran.dg/unlimited_polymorphic_3.f03

2021-12-14 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103662 --- Comment #4 from hubicka at kam dot mff.cuni.cz --- > Can you explain in simple words why adding > > if (ptr1%k .ne. 42) print * > > before the line > > if (ptr1%k .ne. 42) STOP 2 > > makes the test succeed, but adding it after that

[Bug rtl-optimization/98782] [11/12 Regression] Bad interaction between IPA frequences and IRA resulting in spills due to changes in BB frequencies

2021-12-14 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98782 --- Comment #24 from hubicka at kam dot mff.cuni.cz --- > Awesome! thanks! > > I wonder if we can get rid of the final magic parameter too, we run with > --param ipa-cp-unit-growth=80 too which seems to have no more effect on > exchange, though

[Bug rtl-optimization/98782] [11/12 Regression] Bad interaction between IPA frequences and IRA resulting in spills due to changes in BB frequencies

2021-12-14 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98782 --- Comment #26 from hubicka at kam dot mff.cuni.cz --- > It's with LTO, I'll see if non-LTO has the same benefit. In terms of > code-size > it looks like it accounts for a 20% increase for binary size, but the hot > function shrinks approx 6x.

[Bug ipa/103734] IPA-CP opportunity for imagick in SPECCPU 2017

2021-12-15 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103734 --- Comment #1 from hubicka at kam dot mff.cuni.cz --- I think the ipa-cp heuristics still need some work. It is nice that we got it to do something, but I just checked and with LTO+PGO build of clang it produces cca 30 clones that are not "for

[Bug ipa/103830] [12 Regression] null pointer access optimized away by removing function call at -Og

2022-01-04 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103830 --- Comment #5 from hubicka at kam dot mff.cuni.cz --- > I think the recent modref change made the function const. > > And no, we shouldn't DSE any volatile store and generally we don't. It's > probably some side-effect of modref that we do.

[Bug tree-optimization/103989] [12 regression] std::optional and bogus -Wmaybe-unitialized at -Og since r12-1992-g6feb628a706e86eb

2022-01-13 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103989 --- Comment #10 from hubicka at kam dot mff.cuni.cz --- > And I'm intentionally not doing this because -Og should still remove > abstraction during early inlining (for functions marked 'inline'), we > just don't want to spend the extra compile

[Bug tree-optimization/103989] [12 regression] std::optional and bogus -Wmaybe-unitialized at -Og since r12-1992-g6feb628a706e86eb

2022-01-13 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103989 --- Comment #7 from hubicka at kam dot mff.cuni.cz --- > --- Comment #6 from Richard Biener --- > Honza, -Og was supposed to not do so much work, I intended to disable IPA > inlining but there's no knob for that. I wonder where to best put

[Bug tree-optimization/103989] [12 regression] std::optional and bogus -Wmaybe-unitialized at -Og since r12-1992-g6feb628a706e86eb

2022-01-13 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103989 --- Comment #8 from hubicka at kam dot mff.cuni.cz --- > You cannot disable an IPA pass because then we will mishandle > optimize attributes. I think you simply want to set > > flag_inline_small_functions = 0 >

[Bug tree-optimization/103989] [12 regression] std::optional and bogus -Wmaybe-unitialized at -Og since r12-1992-g6feb628a706e86eb

2022-01-13 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103989 --- Comment #12 from hubicka at kam dot mff.cuni.cz --- > Yeah, and since we inline all always inline and also flatten during > early inline the IPA inliner should really do nothing. OK, can_inline_edge_p will do that but we will still walk the

[Bug tree-optimization/103989] [12 regression] std::optional and bogus -Wmaybe-unitialized at -Og since r12-1992-g6feb628a706e86eb

2022-01-13 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103989 --- Comment #14 from hubicka at kam dot mff.cuni.cz --- > > Sure - I just remember (falsely?) that we finally decided to do it :) I do not recall this, but I may have forgotten :)) > If we don't run IPA inline we don't figure we failed to

[Bug rtl-optimization/98782] [11 Regression] Bad interaction between IPA frequences and IRA resulting in spills due to changes in BB frequencies

2022-01-11 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98782 --- Comment #42 from hubicka at kam dot mff.cuni.cz --- on zen2 and 3 with -flto the speedup seems to be cca 12% for both -O2 and -Ofast -march=native which is both very nice! Zen1 for some reason sees less improvement, about 6%. With PGO it is

[Bug fortran/103662] [12 Regression] TBAA problem in Fortran FE triggering in gfortran.dg/unlimited_polymorphic_3.f03

2022-01-17 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103662 --- Comment #9 from hubicka at kam dot mff.cuni.cz --- > I'm inclined to make this P1 even though it is gfortran only. As a last > resort > it should work to make the receiver side a ref-all pointer. Yes, I also think this is an important bug

[Bug ipa/95558] [9/10/11/12 Regression] Invalid IPA optimizations based on weak definition

2022-01-17 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95558 --- Comment #8 from hubicka at kam dot mff.cuni.cz --- > > Do weak aliases fall under some implicit ODR here? > > The whole definition of "weak" is that it entitles you to make a definition > that will be exempt from ODR, where a non-weak

[Bug ipa/95558] [9/10/11/12 Regression] Invalid IPA optimizations based on weak definition

2022-01-17 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95558 --- Comment #13 from hubicka at kam dot mff.cuni.cz --- > Result pure looping 0 > Function found to be pure: foo/4 This is good - we are supposed to find it to be pure and walk all aliases and update noninterposable ones > Declaration updated to

[Bug tree-optimization/102943] [12 Regression] Jump threader compile-time hog with 521.wrf_r

2022-03-17 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102943 --- Comment #50 from hubicka at kam dot mff.cuni.cz --- > It helps quite a bit, the worst case is now > > tree VRP : 5.14 ( 7%) 0.02 ( 3%) 5.15 ( 7%) 29M ( 3%) > backwards jump threading

[Bug rtl-optimization/102178] [12 Regression] SPECFP 2006 470.lbm regressions on AMD Zen CPUs after r12-897-gde56f95afaaa22

2022-01-27 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102178 --- Comment #13 from hubicka at kam dot mff.cuni.cz --- > > According to znver2_cost > > > > Cost of sse_to_integer is a little bit less than fp_store, maybe increase > > sse_to_integer cost(more than fp_store) can helps RA to choose memory > >

[Bug rtl-optimization/102178] [12 Regression] SPECFP 2006 470.lbm regressions on AMD Zen CPUs after r12-897-gde56f95afaaa22

2022-01-27 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102178 --- Comment #16 from hubicka at kam dot mff.cuni.cz --- > > Yep, we also have code like > > - movabsq $0x3ff03db8fde2ef4e, %r8 > ... > - vmovq %r8, %xmm11 It is loading a random constant to xmm11. Since reg<->xmm moves are

[Bug rtl-optimization/102178] [12 Regression] SPECFP 2006 470.lbm regressions on AMD Zen CPUs after r12-897-gde56f95afaaa22

2022-01-27 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102178 --- Comment #21 from hubicka at kam dot mff.cuni.cz --- > I would say so. It saves code size and also uop space unless the two > can magically fuse to an immediate to %xmm move (I doubt that). I made a simple benchmark double a=10; int main() {

[Bug tree-optimization/103423] [12 Regression] 19% cpu2006 wrf compile time regression with -flto since r12-2353-g8da8ed435e9f01b3

2022-01-18 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103423 --- Comment #6 from hubicka at kam dot mff.cuni.cz --- > Fixed, the links now show better than ever numbers. It is only fixed by not inlining enough (since I added --param max-inline-functions-called-once). Without LTO we still have quite

[Bug tree-optimization/103195] [12 Regression] tfft2 text grows by 70% with -Ofast since r12-5113-gd70ef65692fced7a

2022-01-18 Thread hubicka at kam dot mff.cuni.cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103195 --- Comment #6 from hubicka at kam dot mff.cuni.cz --- > So nothing to see? I guess our unit growth limit doesn't trigger because it's > a small (benchmark) unit? Yep, unit growths do not apply for very small units. ipa-cp heuristics still IMO
