[Bug tree-optimization/114774] Missed DSE in simple code due to interleaving stores

2024-04-19 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114774 --- Comment #5 from Jan Hubicka --- > > Looking into it, instead of having a simple outer loop it needs to > > maintain a worklist of defs to process, each annotated with a live bitmap, > > right? > > Yeah, I have some patch on some branch somewhere

[Bug tree-optimization/114774] Missed DSE in simple code due to interleaving stores

2024-04-19 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114774 --- Comment #3 from Jan Hubicka --- > Yes, DSE walking doesn't "branch" but goes to some length handling some > trivial > branches only. Mainly to avoid compile-time issues. It needs larger > re-structuring to fix that, but in principle it

[Bug ipa/114703] Missed devirtualization in rather simple case

2024-04-15 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114703 --- Comment #3 from Jan Hubicka --- > Yep, 'new' memory escapes. Yep, this is blocking a lot of propagation in common C++ code. Here it may help to do speculative devirtualization during IPA stage that will let the late optimization to get rid

[Bug ipa/113907] [11/12/13/14 regression] ICU miscompiled on x86 since r14-5109-ga291237b628f41

2024-04-09 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907 --- Comment #76 from Jan Hubicka --- There is still problem with loop bounds. I am testing patch on that and then we should be (finally) finally safe.

[Bug gcov-profile/114115] xz-utils segfaults when built with -fprofile-generate (bad interaction between IFUNC and binding?)

2024-04-03 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114115 --- Comment #15 from Jan Hubicka --- > Fixed for GCC 14 so far It is simple patch, so backporting is OK after a week in mainline.

[Bug ipa/113907] [11/12/13/14 regression] ICU miscompiled on x86 since r14-5109-ga291237b628f41

2024-04-02 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907 --- Comment #70 from Jan Hubicka --- Hello, over easter I did some analysis of the cases where ICF is now disabled due to jump function miscompare. Most common case (seen also on GCC) is the situation where function is originally static inline

[Bug tree-optimization/112303] [14 Regression] ICE on valid code at -O3 on x86_64-linux-gnu: verify_flow_info failed since r14-3459-g0c78240fd7d519

2024-03-27 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112303 --- Comment #14 from Jan Hubicka --- > This patch fixes the ICE for me. > Seems we already did something like that in other spots (e.g. in apply_scale). In general if the overflow happens, some pass must have misbehaved and done something crazy

[Bug ipa/113907] [11/12/13/14 regression] ICU miscompiled on x86 since r14-5109-ga291237b628f41

2024-03-19 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907 --- Comment #64 from Jan Hubicka --- > Are you going to apply this patch, even if it just helps partially with some > tests and not others? I think we should fix this completely, since it is a source of very surprising bugs. I discussed it with

[Bug ipa/113907] [11/12/13/14 regression] ICU miscompiled on x86 since r14-5109-ga291237b628f41

2024-03-13 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907 --- Comment #57 from Jan Hubicka --- > So, we can punt on differences there (that is desirable for backporting and > maybe GCC 14 too), or we could at that point populate an int vector, which > maps Yep, that is what I do. I had a bug in that so

[Bug ipa/114317] Missing optimization for multiple condition statements

2024-03-12 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114317 --- Comment #2 from Jan Hubicka --- > (it would need to elide the stores of course). We do have a way to elide stores, since we can optimize out write-only values. What we do not have readily available is the value written to a reference

[Bug ipa/114262] Over-inlining when optimizing for size with gnu_inline function

2024-03-07 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114262 --- Comment #6 from Jan Hubicka --- > Note GCC has not retuned its -Os heuristics for a long time because it has been > decent enough for most folks and corner cases like this almost never come > up. There were quite a few changes to -Os

[Bug lto/114241] False-positive -Wodr warning when using -flto and -fno-semantic-interposition

2024-03-06 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114241 --- Comment #2 from Jan Hubicka --- This indeed looks like a bug caused by the fact that the class is keyed into one of the two units. Outputting translation unit names is unfortunately hard, since they are object files and often coming from .a

[Bug target/114232] [14 regression] ICE when building rr-5.7.0 with LTO on x86

2024-03-05 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114232 --- Comment #26 from Jan Hubicka --- > I think optimize_function_for_size_p (cfun) isn't always true if > optimize_size is since it looks at the function-specific setting > of that flag, so you'd have to use opt_for_fn (cfun, optimize_size).

[Bug target/114232] [14 regression] ICE when building rr-5.7.0 with LTO on x86

2024-03-05 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114232 --- Comment #21 from Jan Hubicka --- Looking at the prototype patch, why do we also need to change the splitters? My original goal was to use splitters to expand to faster code sequences while having patterns necessary for both variants. This makes

[Bug target/114232] [14 regression] ICE when building rr-5.7.0 with LTO on x86

2024-03-05 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114232 --- Comment #18 from Jan Hubicka --- optimize_function_for_size_p is not really affected by LTO or non-LTO. It does take into account node->count and node->frequency, which is updated during IPA, so it may change between early opts and late

[Bug tree-optimization/114052] [11/12/13/14 Regression] Wrong code at -O2 for well-defined infinite loop

2024-02-22 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114052 --- Comment #7 from Jan Hubicka --- > I see it doesn't do anything if mark_dfs_back_edges returns false, so it > will claim the function is finite even when it calls a non-finite function? > So I assume this is local analysis only and call

[Bug ipa/111960] [14 Regression] ICE: during GIMPLE pass: rebuild_frequencies: SIGSEGV (Invalid read of size 4) with -fdump-tree-rebuild_frequencies-all

2024-02-22 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111960 --- Comment #13 from Jan Hubicka --- > Should be fixed now. Thanks! I was testing with stage3 compiler, so that is the reason. Indeed dropping the buffer is a good idea.

[Bug middle-end/113907] [12/13/14 regression] ICU miscompiled on x86 since r14-5109-ga291237b628f41

2024-02-16 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907 --- Comment #45 from Jan Hubicka --- > > "Once legacy evrp is removed, this won't be an issue, as ranges in the IL > > will tell the truth. However, this will mean that we will no longer > > remove the first __builtin_unreachable combo. But

[Bug middle-end/113907] [12/13/14 regression] ICU miscompiled on x86 since r14-5109-ga291237b628f41

2024-02-16 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907 --- Comment #43 from Jan Hubicka --- > // See discussion here: > // https://gcc.gnu.org/pipermail/gcc-patches/2021-June/571709.html Discussion says: "Once legacy evrp is removed, this won't be an issue, as ranges in the IL will tell the

[Bug middle-end/113907] [14 regression] ICU miscompiled on x86 since r14-5109-ga291237b628f41

2024-02-15 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907 --- Comment #37 from Jan Hubicka --- > Also remember we like to have a fix that's easily backportable, and > that's probably going to be resetting the info. We can do something > more fancy for GCC 15 Rejecting to merge function with

[Bug middle-end/113907] [14 regression] ICU miscompiled on x86 since r14-5109-ga291237b628f41

2024-02-15 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113907 --- Comment #36 from Jan Hubicka --- > > Having a testcase is great. I was just playing with crafting one. > > I am still concerned about value ranges in ipa-prop's jump functions. > > Maybe my imagination is too limited, but if the ipa-prop's

[Bug tree-optimization/113787] [12/13/14 Regression] Wrong code at -O with ipa-modref on aarch64

2024-02-14 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113787 --- Comment #19 from Jan Hubicka --- > Note I didn't check if it helps the testcase .. I will check. > > > > > > > A "nicer" solution might be to add a informational operand > > > to TARGET_MEM_REF, representing the base pointer to be used

[Bug tree-optimization/113787] [12/13/14 Regression] Wrong code at -O with ipa-modref on aarch64

2024-02-14 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113787 --- Comment #17 from Jan Hubicka --- > > I guess PTA gets around by tracking points-to set also for non-pointer > > types and consequently it also gives up on any such addition. > > It does. But note it does _not_ for POINTER_PLUS where it

[Bug tree-optimization/113787] [12/13/14 Regression] Wrong code at -O with ipa-modref on aarch64

2024-02-13 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113787 --- Comment #15 from Jan Hubicka --- > > IVOPTs does the above but it does it (or should) as > > offset = (uintptr) - (uintptr) > val = *((T *)((uintptr)base1 + i + offset)) > > which is OK for points-to as no POINTER_PLUS_EXPR is

[Bug gcov-profile/113646] PGO hurts run-time of 538.imagick_r as much as 68% at -Ofast -march=native

2024-02-01 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113646 --- Comment #4 from Jan Hubicka --- > > With -fprofile-partial-training the znver4 LTO vs LTOPGO regression (on a > newer > master) goes down from 66% to 54%. > > So far I did not find a way to easily train with the reference run (when I

[Bug ipa/113665] [11/12/13/14 regression] Regular for Loop results in Endless Loop with -O2 since r11-4987-g602c6cfc79ce4a

2024-01-30 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113665 --- Comment #8 from Jan Hubicka --- > Honza - ICF seems to fixup points-to sets when merging variables, so there > should be a way to kill off flow-sensitive info inside prevailing bodies > as well. But would that happen before inlining the

[Bug gcov-profile/113646] PGO hurts run-time of 538.imagick_r as much as 68% at -Ofast -march=native

2024-01-29 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113646 --- Comment #2 from Jan Hubicka --- > Did you try with -fprofile-partial-training (is that default on? it probably > should ...). Can you please try training with the rate data instead of train It is not on by default - the problem of

[Bug ipa/113478] -Os does not inline single instruction function

2024-01-19 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113478 --- Comment #4 from Jan Hubicka --- > Possibly, at least when we know it doesn't expand to a libatomic call? OTOH > even then a function just wrapping such call should probably be inlined, > so the question is whether the problem that > is

[Bug ipa/113478] -Os does not inline single instruction function

2024-01-19 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113478 --- Comment #2 from Jan Hubicka --- Probably is_inexpensive_bulitin_p should return true here?

[Bug c++/109753] [13/14 Regression] pragma GCC target causes std::vector not to compile (always_inline on constructor)

2024-01-11 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109753 --- Comment #14 from Jan Hubicka --- > I think the issue might be that whoever is creating > __static_initialization_and_destruction_0 fails to honor the active > target pragma. Which means back to my suggestion to have multiple ones > when

[Bug tree-optimization/110852] [14 Regression] ICE: in get_predictor_value, at predict.cc:2695 with -O -fno-tree-fre and __builtin_expect() since r14-2219-geab57b825bcc35

2024-01-04 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110852 --- Comment #14 from Jan Hubicka --- > I thought the goal was to handle what is in predict-18.c, i.e. > b * __builtin_expect (c, 0) > or similar. If it is about > __builtin_expect_with_probability (b, 42, 0.25) * >

[Bug tree-optimization/110852] [14 Regression] ICE: in get_predictor_value, at predict.cc:2695 with -O -fno-tree-fre and __builtin_expect() since r14-2219-geab57b825bcc35

2024-01-04 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110852 --- Comment #11 from Jan Hubicka --- > > + int p1 = get_predictor_value (*predictor, *probability); > > + int p2 = get_predictor_value (predictor2, probability2); > > + /* If both predictors agrees, it does not matter

[Bug tree-optimization/110852] [14 Regression] ICE: in get_predictor_value, at predict.cc:2695 with -O -fno-tree-fre and __builtin_expect() since r14-2219-geab57b825bcc35

2024-01-04 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110852 --- Comment #9 from Jan Hubicka --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110852 > > --- Comment #7 from Jakub Jelinek --- > So, what about following patch (which also fixes the ICE, would of course need > to add the testcase) and

[Bug tree-optimization/110852] [14 Regression] ICE: in get_predictor_value, at predict.cc:2695 with -O -fno-tree-fre and __builtin_expect() since r14-2219-geab57b825bcc35

2024-01-04 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110852 --- Comment #6 from Jan Hubicka --- > which fixes the ICE by preferring PRED_BUILTIN_EXPECT* over others. > At least in this case when one operand is a constant and another one is > __builtin_expect* result that seems like the right choice to

[Bug target/113233] LoongArch: target options from LTO objects not respected during linking

2024-01-04 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113233 --- Comment #3 from Jan Hubicka --- > Confirm. But option save/restore has been always implemented: > > .section.gnu.lto_.opts,"",@progbits > .ascii "'-fno-openmp' '-fno-openacc' '-fno-pie' '-fcf-protection" > .ascii "=none'

[Bug middle-end/88345] -Os overrides -falign-functions=N on the command line

2023-12-06 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88345 --- Comment #20 from Jan Hubicka --- > > Live patching (user-space) doesn't depend on any particular alignment of > functions, on x86-64 at least. (The plan for other architectures wouldn't > need > any specific alignment either). Note that

[Bug middle-end/109849] suboptimal code for vector walking loop

2023-11-29 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109849 --- Comment #32 from Jan Hubicka --- > /tmp/build/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/stl_algobase.h:437: > warning: 'void* __builtin_memcpy(void*, const void*, long unsigned int)' > writing between 2 and 9223372036854775806 bytes

[Bug middle-end/112653] PTA should handle correctly escape information of values returned by a function

2023-11-27 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112653 --- Comment #15 from Jan Hubicka --- Thanks a lot for working on this! I think it is quite an important part of the puzzle of making libstdc++ vector work reasonably well.

[Bug tree-optimization/112706] missed simplification in FRE

2023-11-24 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112706 --- Comment #3 from Jan Hubicka --- Thanks, new pattern looks like noticeable improvement :) Base+offset is effective for alias analysis and I suppose it happens reasonably enough for compares as well. > _76 = _71 + 4; > # .MEM_154 = VDEF

[Bug tree-optimization/112678] [14 regression] Massive slowdown of compilation time with PGO

2023-11-23 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112678 --- Comment #2 from Jan Hubicka --- Seems we changed default to locking increments. jh@ryzen4:/tmp> cat t.C void test() { } jh@ryzen4:/tmp> ~/trunk-install/bin/g++ -O2 -fprofile-generate t.C -S ; grep lock t.s lock addl $1,
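
The `lock addl` that grep finds in the transcript above suggests that the profile counters are being updated with atomic read-modify-write instructions. As a rough illustration only (this is not the instrumentation GCC emits, and the function and variable names are made up for the example), the C sketch below contrasts a plain counter increment with an atomic one written using the `__atomic_fetch_add` builtin; on x86 the latter is typically compiled to a lock-prefixed add.

```c
/* Hypothetical counters standing in for profile counters. */
long counter_plain;
long counter_atomic;

/* Non-atomic update: an ordinary read-modify-write.  If several threads
   run this concurrently, increments can be lost.  */
void bump_plain(void)
{
  counter_plain++;
}

/* Atomic update: no increment is lost under concurrency, at the price of
   a more expensive instruction on every execution.  */
void bump_atomic(void)
{
  __atomic_fetch_add(&counter_atomic, 1, __ATOMIC_RELAXED);
}
```

Compiling a file like this with `-O2 -S` and grepping the assembly for `lock`, as done in the comment, should show the prefix only in the atomic variant.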

[Bug middle-end/112653] We should optimize memmove to memcpy using alias oracle

2023-11-22 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112653 --- Comment #5 from Jan Hubicka --- > but the issue is that test2 escapes which makes this conflict: It is passed to memmove which is noescape and returned. Why local PTA considers returned values to escape?

[Bug tree-optimization/111498] 951% profile quality regression between g:93996cfb308ffc63 (2023-09-18 03:40) and g:95d2ce05fb32e663 (2023-09-19 03:22)

2023-09-22 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111498 --- Comment #2 from Jan Hubicka --- > That just might cause a tad more early threading. That is, expose latent > profile updating issues elsewhere. Looking at the graph we're also still very > good compared to July. Early threading should

[Bug ipa/111157] [14 Regression] 416.gamess fails with a run-time abort when compiled with -O2 -flto after r14-3226-gd073e2d75d9ed4

2023-08-29 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111157 --- Comment #8 from Jan Hubicka --- > This is what I wanted to ask about. Looking at the dumps, ipa-modref > knows it is "killed." Is that enough or does it need to be also not > read to be known to be useless? The killed info means that the

[Bug tree-optimization/110628] [14 regression] gcc.dg/tree-ssa/update-threading.c fails after r14-2383-g768f00e3e84123

2023-08-24 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110628 --- Comment #8 from Jan Hubicka --- patch posted https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628231.html

[Bug ipa/111088] useless 'xor eax,eax' inserted when a value is not returned and icf

2023-08-21 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111088 --- Comment #3 from Jan Hubicka --- > But adds a return with a value. And then the inliner inlines foo into foo2 but > we still have the return with a value around ... I guess ICF can special case unused return value, but why this is not taken

[Bug tree-optimization/110628] [14 regression] gcc.dg/tree-ssa/update-threading.c fails after r14-2383-g768f00e3e84123

2023-08-17 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110628 --- Comment #6 from Jan Hubicka --- The mismatch happens on: void foo (unsigned int x) { if (x != 0x800 && x != 0x810) abort (); } It is bug in reassoc turning: void foo (unsigned int x) { ;; basic block 2, loop depth 0, count
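
The snippet above is cut off before it shows what reassoc produces, so the following is only a sketch of the kind of folding that is possible for this condition, not the actual dump from the bug. It assumes the two tests are merged into a single masked comparison, which works because 0x800 and 0x810 differ in exactly one bit; the function names are hypothetical.

```c
#include <assert.h>

/* The condition as written in the testcase: true when x is neither value. */
int not_either_plain(unsigned int x)
{
  return x != 0x800 && x != 0x810;
}

/* Folded form: 0x800 ^ 0x810 == 0x10, so clearing bit 0x10 maps both
   constants to 0x800 and a single comparison is enough.  */
int not_either_masked(unsigned int x)
{
  return (x & ~0x10u) != 0x800;
}

/* Compare both forms exhaustively over a range that covers the constants. */
void self_check(void)
{
  for (unsigned int x = 0; x < 0x2000; x++)
    assert(not_either_plain(x) == not_either_masked(x));
}
```

Merging two conditional branches into one changes the shape of the CFG, so the profile (the basic block counts visible in the dump) has to be updated consistently when such a transformation is applied.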

[Bug tree-optimization/106293] [13/14 Regression] 456.hmmer at -Ofast -march=native regressed by 19% on zen2 and zen3 in July 2022

2023-07-28 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106293 --- Comment #19 from Jan Hubicka --- > This heuristic wants to catch > > > if (foo) abort (); > > > and avoid sinking "too far" across a path with "similar enough" > execution count (I think the original motivation was to fix some >

[Bug middle-end/110832] 14% capacita -O2 regression between g:9fdbd7d6fa5e0a76 (2023-07-26 01:45) and g:ca912a39cccdd990 (2023-07-27 03:44) on zen3 and core

2023-07-27 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110832 --- Comment #2 from Jan Hubicka --- I tested that the profile change makes no difference.

[Bug target/110758] [14 Regression] 8% hmmer regression on zen1/3 with -Ofast -march=native -flto between g:8377cf1bf41a0a9d (2023-07-05 01:46) and g:3a61ca1b9256535e (2023-07-06 16:56); g:d76d19c9bc5

2023-07-21 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110758 --- Comment #2 from Jan Hubicka --- > I suspect this is most likely the profile updates changes ... Quite possibly. The goal of this exercise is to figure out if there are some bugs in profile estimate or whether passes somehow prefer broken

[Bug tree-optimization/110628] [14 regression] gcc.dg/tree-ssa/update-threading.c fails after r14-2383-g768f00e3e84123

2023-07-13 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110628 --- Comment #3 from Jan Hubicka --- > -fdump-tree-all-blocks-details produced more than 100 dump files. Which > one(s) > do you want? Can you just zip them and attach all? Thank you! Honza

[Bug ipa/110334] [13/14 Regression] unused functions not eliminated before LTO streaming

2023-07-11 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334 --- Comment #23 from Jan Hubicka --- But it would be nice to see why the functions are not early inlined.

[Bug ipa/110334] [13/14 Regression] unused functions not eliminated before LTO streaming

2023-07-11 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334 --- Comment #22 from Jan Hubicka --- I will cook up the patch to keep multiple variants of nodes pre-inline and we will see how much that affects compile time & how hard it will be to get unit size estimates right.

[Bug ipa/110334] [13/14 Regression] unused functions not eliminated before LTO streaming

2023-06-28 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334 --- Comment #16 from Jan Hubicka --- > > We already have plenty of GF_CALL_ flags, so adding one should be easy? > > We have 3 bits left :/ I was hoping that cgraph_edge lives long > enough? But I suppose we're not keeping them across the

[Bug tree-optimization/109689] [14 Regression] ICE at -O1 with "-ftree-vectorize": in check_loop_closed_ssa_def, at tree-ssa-loop-manip.cc:645 since r14-301-gf2d6beb7a4ddf1

2023-06-28 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109689 --- Comment #10 from Jan Hubicka --- > > So perhaps simply: > > rewrite_into_loop_closed_ssa (NULL, 0); > > in case we unlooped in loop closed ssa form (which is not that common). > > Would that be acceptable? > > Yes, we do that in other

[Bug ipa/110334] [13/14 Regression] unused functions not eliminated before LTO streaming

2023-06-28 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334 --- Comment #14 from Jan Hubicka --- > > why disallow caller->indirect_calls? See testcase in comment #9 > > > + return false; > > + for (cgraph_edge *e2 = callee->callees; e2; e2 = e2->next_callee) > > I don't think this flies -

[Bug ipa/110334] [13/14 Regression] unused functions not eliminated before LTO streaming

2023-06-26 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334 --- Comment #11 from Jan Hubicka --- Hi, what about this. It should make at least quite basic inlining to happen to always_inline. I do not think many critical always_inlines have indirect calls in them. The test for lto is quite bad and I can

[Bug ipa/110334] [13/14 Regression] unused functions not eliminated before LTO streaming

2023-06-23 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334 --- Comment #9 from Jan Hubicka --- Just so it is somewhere, here is a testcase that we can't inline leaf functions to always_inlines unless we do some tracking of what calls were formerly indirect calls. We really overloaded always_inline

[Bug ipa/110334] [13/14 Regression] unused functions not eliminated before LTO streaming

2023-06-23 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110334 --- Comment #8 from Jan Hubicka --- > > I was playing with the idea of warning when at lto time when comdats have > > different command line options, but this triggers way too often in practice. > > Really? :/ Yep, for example firefox consist

[Bug libstdc++/110287] _M_check_len is expensive

2023-06-19 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110287 --- Comment #7 from Jan Hubicka --- > > There is no guarantee that std::vector::max_size() is PTRDIFF_MAX. It > depends on the Allocator type, A. A user-defined allocator could have > max_size() == 100. If inliner we see path to the throw

[Bug libstdc++/110287] _M_check_len is expensive

2023-06-18 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110287 --- Comment #5 from Jan Hubicka --- > Do you mean something like this? I sent my own version, but yours looks nicer. > > diff --git a/libstdc++-v3/include/bits/stl_vector.h > b/libstdc++-v3/include/bits/stl_vector.h > index

[Bug target/109812] GraphicsMagick resize is a lot slower in GCC 13.1 vs Clang 16 on Intel Raptor Lake

2023-05-31 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109812 --- Comment #12 from Jan Hubicka --- > /home/sdp/jun/btl0/install/bin/ld: /tmp/ccnX75zI.ltrans0.ltrans.o: in > function `main': > :(.text.startup+0x1): undefined reference to `GMCommand' I wonder if your plugin is configured correctly. Can

[Bug middle-end/109849] suboptimal code for vector walking loop

2023-05-14 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109849 --- Comment #5 from Jan Hubicka --- > Actually why didn't we copy the loop header in the first place? Because it is considered to be a do-while loop already (thanks to the in-loop conditional, do_while_loop_p is happy).

[Bug middle-end/109849] suboptimal code for vector walking loop

2023-05-14 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109849 --- Comment #4 from Jan Hubicka --- > Rather, because store-motion out of a loop that might iterate zero times would > create a data race. Good point. If we did copy loop headers all the way to the store the problem will go away. Also I

[Bug c++/106943] GCC building clang/llvm with LTO flags causes ICE in clang

2023-05-12 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106943 --- Comment #19 from Jan Hubicka --- > > Is there any need to over-engineer this like that? I would hope enabling > > -fno-lifetime-dse globally would not be controversial for LLVM It would be really nice to have the ranger bug fixed. Since

[Bug c++/106943] GCC building clang/llvm with LTO flags causes ICE in clang

2023-05-12 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106943 --- Comment #15 from Jan Hubicka --- > > Indeed it is quite long time problem with clang not building with lifetime > > DSE and strict aliasing. I wonder why this is not fixed on clang side? > > Because the problems were not communicated? I

[Bug c++/108887] [13 Regression] ICE in process_function_and_variable_attributes since r13-3601

2023-03-09 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108887 --- Comment #8 from Jan Hubicka --- > Also, reset() is only defined in cgraph_node, and I need it to work on both > functions and variables. Aha, this is a good point. I forgot that. I will make reset() working on symbols in general. I think

[Bug c++/101118] coroutines: unexpected ODR warning for coroutine frame type in LTO builds

2023-03-07 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101118 --- Comment #15 from Jan Hubicka --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101118 > > --- Comment #14 from Iain Sandoe --- > (In reply to Jan Hubicka from comment #13) > > > So .. for promotion of target expression temporaries to

[Bug c++/101118] coroutines: unexpected ODR warning for coroutine frame type in LTO builds

2023-03-07 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101118 --- Comment #13 from Jan Hubicka --- > So .. for promotion of target expression temporaries to frame vars, one of: > - a) we need to find a different way to name them I think we can just count number of fields within a given frame type? Honza

[Bug c++/108887] [13 Regression] ICE in process_function_and_variable_attributes since r13-3601

2023-03-03 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108887 --- Comment #5 from Jan Hubicka --- > Perhaps, but shouldn't we also unlink_from_assembler_name_hash (node, false);? > I think the point of the current removal is that we've discovered the mangling > alias clashes with some other symbol.

[Bug c++/101118] coroutines: unexpected ODR warning for coroutine frame type in LTO builds

2023-03-03 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101118 --- Comment #8 from Jan Hubicka --- > > the synthesised functions (actor, destroy) are intended to be TU-local. > the ramp function is what remains of the user's original function after the > coroutine body is outlined - so that has the

[Bug ipa/107931] [12/13 Regression] -Og causes always_inline to fail since r12-6677-gc952126870c92cf2

2023-02-21 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107931 --- Comment #20 from Jan Hubicka --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107931 > > --- Comment #17 from rguenther at suse dot de --- > On Mon, 20 Feb 2023, jakub at gcc dot gnu.org wrote: > > >

[Bug tree-optimization/108552] Linux i386 kernel 5.14 memory corruption for pre_compound_page() when gcov is enabled

2023-01-28 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552 --- Comment #39 from Jan Hubicka --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108552 > > --- Comment #35 from Vladimir Makarov --- > (In reply to Jakub Jelinek from comment #34) > > Seems right now DECL_NONALIASED is only used on these

[Bug bootstrap/107950] partial LTO linking of libbackend.a: gcc/gcc-rich-location.cc:207: undefined reference to `range_label_for_type_mismatch::get_text(unsigned int) const'

2023-01-16 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107950 --- Comment #9 from Jan Hubicka --- > > Feel free to grab my initial patch in c#0 and upstream it. I tried that some > time ago in the following email thread: > https://gcc.gnu.org/pipermail/gcc/2021-May/236096.html Actually I was shooting

[Bug tree-optimization/107467] [12/13 Regression] Miscompilation involving -Os, -flto and -fno-strict-aliasing since r12-656-ga564da506f52be66

2023-01-13 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107467 --- Comment #9 from Jan Hubicka --- > > so it's ICFed compare_pairs having modref TBAA info that makes the > stores dead. I suppose ICF needs to reset / alter the modref summaries? Well, matching that ICF does should be enough to verify that

[Bug target/87832] AMD pipeline models are very costly size-wise

2022-11-16 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87832 --- Comment #9 from Jan Hubicka --- > > Do you mean we should fix modeling of divisions there as well? I don't have > latency/throughput measurements for those CPUs, nor access so I can run > experiments myself, unfortunately. > > I guess you

[Bug tree-optimization/107715] TSVC s161 for double runs at zen4 30 times slower when vectorization is enabled

2022-11-16 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107715 --- Comment #2 from Jan Hubicka --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107715 > > --- Comment #1 from Richard Biener --- > Because store data races are allowed with -Ofast masked stores are not used so > we instead get > >

[Bug target/87832] AMD pipeline models are very costly size-wise

2022-11-16 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87832 --- Comment #7 from Jan Hubicka --- > 53730 r btver2_fp_min_issue_delay > 53760 r znver1_fp_transitions > 93960 r bdver3_fp_transitions > 106102 r lujiazui_core_check > 106102 r lujiazui_core_transitions > 196123 r lujiazui_core_min_issue_delay

[Bug c++/107597] LTO causes static inline variables to get a non-uniqued global symbol

2022-11-11 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107597 --- Comment #2 from Jan Hubicka --- Hi, What happens is that we read the symbol as: Visibility: externally_visible semantic_interposition prevailing_def_ironly_exp public weak comdat comdat_group:_ZN12NonTemplated1xE one_only While in

[Bug ipa/106991] new+delete pair not optimized by g++ at -O3 but optimized at -Os

2022-09-21 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106991 --- Comment #2 from Jan Hubicka --- > Looks like inlining decisions decide to inline new but not delete but for -Os > we inline none and elide the new/delete pair. > > Maybe we can devise some inline hints to keep pairs? Inliner is mostly

[Bug middle-end/106408] PRE with infinite loops

2022-07-22 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106408 --- Comment #2 from Jan Hubicka --- > + /* If block is a loop that is possibly infinite we should not > +hoist across it. */ > + if (block->loop_father->header == block > + && !finite_loop_p (block->loop_father)) > +

[Bug ipa/102581] [12 Regression] ice in forced_merge, at ipa-modref-tree.h:352 with -fno-strict-aliasing and -O2 since r12-3202-gf5ff3a8ed4ca9173

2021-10-06 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102581 --- Comment #8 from Jan Hubicka --- Actually, this is shorter patch - we already should notice that one range is contained in other, but we give up too early. Honza diff --git a/gcc/ipa-modref-tree.h b/gcc/ipa-modref-tree.h index

[Bug ipa/102581] [12 Regression] ice in forced_merge, at ipa-modref-tree.h:352 with -fno-strict-aliasing and -O2 since r12-3202-gf5ff3a8ed4ca9173

2021-10-06 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102581 --- Comment #7 from Jan Hubicka --- Hi, the problem is that we assume that merge is symmetric (merging a to b succeeds if and only if merging b to a succeeds). There was one symmetrical path missing in the (fancy and bit ugly) logic on what we

[Bug tree-optimization/102446] [9/10/11/12 Regression] wrong code at -O3 on x86_64-linux-gnu

2021-09-22 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102446 --- Comment #4 from Jan Hubicka --- > Started with r5-6477-g3620b606822f80863488ca4883542d848d41f9f9 This only affects early inlining decisions, so it may be useful to bisect this with --param early-inlining-insns=14 Honza

[Bug tree-optimization/45178] CDDCE doesn't eliminate conditional code in infinite loop

2021-08-26 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45178 --- Comment #4 from Jan Hubicka --- > and that marks a condition that has nothing to do with loop control. I > suppose > we can elide this when the loop has no exit (we are already marking backedges > of irreducible loops). > > But I never

[Bug tree-optimization/101909] 73% regression on tfft benchmark for -O2 -ftree-loop-vectorize compared to -O2 on zen hardware

2021-08-16 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101909 --- Comment #2 from Jan Hubicka --- > So that's znver1 (split AVX IIRC) compared to znver2? Martin will know how to decode machine names. I am never sure. It is with generic, so split AVX does not make difference. Honza

[Bug testsuite/101902] [12 regression] g++.dg/warn/uninit-1.C has excess errors after r12-2898

2021-08-15 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101902 --- Comment #1 from Jan Hubicka --- Hi, i am testing diff --git a/gcc/tree-ssa-uninit.c b/gcc/tree-ssa-uninit.c index 5d7bc800419..d89ab5423cd 100644 --- a/gcc/tree-ssa-uninit.c +++ b/gcc/tree-ssa-uninit.c @@ -641,7 +641,7 @@

[Bug middle-end/101829] problems with inline + __attribute__ ((malloc (deallocator)))

2021-08-09 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101829 --- Comment #2 from Jan Hubicka --- > It might be possible to inline such functions by creating a "stub" call either > after or before the inlined function body where the "stub" would just be there > to represent the attributes. > > Say,

[Bug fortran/100724] -fwhole-program breaks module use

2021-05-25 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100724 --- Comment #6 from Jan Hubicka --- > Could the manual entry for -fwhole-program just be amended to clarify that > it's > a fallback for when a linker plugin isn't available for -flto. That may be > what it was intended to say, but it's not

[Bug c/100483] Extend -fno-semantic-interposition to global variables

2021-05-16 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100483 --- Comment #6 from Jan Hubicka --- > Thanks for the clarification. I misinterpreted the documentation. > Then it seems that -fno-semantic-interposition is a very safe optimization for > distributions to default to. Closing as intended.

[Bug analyzer/98599] [11 Regression] fatal error: Cgraph edge statement index out of range with -Os -flto -fanalyzer

2021-04-13 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98599 --- Comment #15 from Jan Hubicka --- > Should be fixed by the above patch, though it's more of a workaround for now; > am still not sure about what's going on with clones. I understand it now. The problem is that fixup is missed for one gimple

[Bug middle-end/99857] [11 Regression] FAIL: libgomp.c/declare-variant-1.c (test for excess errors) by r11-7926

2021-04-06 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99857 --- Comment #4 from Jan Hubicka --- > Honza stated that he's "looking into it", > . I do, I just got distracted by Easter. The problem has to be release_body happening mid offloading

[Bug lto/99898] Possible LTO object incompatibility on gcc-10 branch

2021-04-06 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99898 --- Comment #10 from Jan Hubicka --- > Many of the *.opt changes are target specific, so you'd need to test it also > across all targets, and furthermore it depends on what exactly is being > saved/restored, many options might be at the same

[Bug lto/99898] Possible LTO object incompatibility on gcc-10 branch

2021-04-06 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99898 --- Comment #8 from Jan Hubicka --- > Any *.opt changes can break the streaming of optimization or target option > nodes. > And from experience with gcc plugins we have such changes ~ each month even on > release branches. It may make sense to

[Bug lto/99898] Possible LTO object incompatibility on gcc-10 branch

2021-04-06 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99898 --- Comment #6 from Jan Hubicka --- > I only recall backporting the streaming fixes early in gcc10 timeframe > (August) that was reason for the September bump. > Didn't we backport some new command line options/params breaking > streaming of

[Bug lto/99898] Possible LTO object incompatibility on gcc-10 branch

2021-04-06 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99898 --- Comment #5 from Jan Hubicka --- > The LTO minor saw a bump around Sep 10 last year already so the object files > must be younger or LTO should complain. > > I'm not aware of any specific change where we forgot the bumping but there > were

[Bug ipa/99309] [10/11 Regression] Segmentation fault with __builtin_constant_p usage at -O2

2021-03-31 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99309 --- Comment #7 from Jan Hubicka --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99309 > > --- Comment #6 from Jakub Jelinek --- > (In reply to Jan Hubicka from comment #5) > > As discussed, I can prepare patch to make inliner to redirect >

[Bug ipa/99447] [11 Regression] ICE (segfault) in lookup_page_table_entry

2021-03-31 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99447 --- Comment #25 from Jan Hubicka --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99447 > > --- Comment #23 from Jan Hubicka --- > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99447 > > > > --- Comment #21 from Matthias Klose --- > >

[Bug ipa/99447] [11 Regression] ICE (segfault) in lookup_page_table_entry

2021-03-31 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99447 --- Comment #23 from Jan Hubicka --- > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99447 > > --- Comment #21 from Matthias Klose --- > building with trunk 20210330 using these parameters didn't succeed: > > make[1]: Entering directory

[Bug lto/99828] inlining failed in call to ‘always_inline’ ‘memcpy’: --param max-inline-insns-auto limit reached

2021-03-31 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99828 --- Comment #7 from Jan Hubicka --- > Yeah, and then maybe diagnose this "ODR violation". Still I think we do have these kinds of divergence (like glibc's fortification), so I am not sure we want to warn by default. > >

[Bug ipa/99835] missed optimization for dead code elimination at -O3 (vs. -O1)

2021-03-31 Thread hubicka at ucw dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99835 --- Comment #4 from Jan Hubicka --- > But inside a SCC the order is arbitrary anyway. Note I'd only re-order SCCs > and keep the postordering the same otherwise. We compile leaf functions first to be able to propagate to their callers. In
