Re: loop-ch improvements, part 5

2023-07-21 Thread Jan Hubicka via Gcc-patches
> > The patch requires bit of testsuite changes > > - I disabled ch in loop-unswitch-17.c since it tests unswitching of > >loop invariant conditional. > > - pr103079.c needs ch disabled to trigger vrp situation it tests for > >(otherwise we optimize stuff earlier and better) > > -

loop-ch improvements, part 5

2023-07-21 Thread Jan Hubicka via Gcc-patches
Hi, currently loop-ch skips all do-while loops. But when loop is not do-while in addition to original goal of turning it to do-while it can do additional things: 1) move out loop invariant computations 2) duplicate loop invariant conditionals and eliminate them in loop body. 3) prove that

Fix profile_count::to_sreal_scale

2023-07-26 Thread Jan Hubicka via Gcc-patches
Hi, this patch makes profile_count::to_sreal_scale consider the scale unknown when in is 0. This fixes the case where loop has 0 executions in profile feedback and thus we can't determine its trip count. Bootstrapped/regtested x86_64-linux, committed. Honza gcc/ChangeLog: *

Fix profile update in tree_transform_and_unroll_loop

2023-07-27 Thread Jan Hubicka via Gcc-patches
Hi, This patch fixes profile update in tree_transform_and_unroll_loop which is used by predictive commoning. I started by attempting to fix gcc.dg/tree-ssa/update-unroll-1.c, which I xfailed last week, but it turned out to be a harder job. Unrolling was never fixed for changes in duplicate_loop_body_to_header_edge

Fix profile update in tree-ssa-loop-im.cc

2023-07-27 Thread Jan Hubicka via Gcc-patches
Hi, this fixes two bugs in tree-ssa-loop-im.cc. First is that cap probability is not reliable, but it is constructed with adjusted quality. Second is that sometimes the conditional has wrong joiner BB count. This is visible on testsuite/gcc.dg/pr102385.c however the testcase triggers another

Fix profile_count::apply_probability

2023-07-27 Thread Jan Hubicka via Gcc-patches
Hi, profile_count::apply_probability misses a check for uninitialized probability, which leads to completely random results when applying an uninitialized probability to an initialized scale. This can make a difference when, e.g., inlining a -fno-guess-branch-probability function into -fguess-branch-probability

Make store likely in optimize_mask_stores

2023-07-27 Thread Jan Hubicka via Gcc-patches
Hi, as discussed with Richard, we want store to be likely in optimize_mask_stores. Bootstrapped/regtested x86_64-linux, committed. gcc/ChangeLog: * tree-vect-loop.cc (optimize_mask_stores): Make store likely. diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc index

Fix profile update after RTL unrolling

2023-07-27 Thread Jan Hubicka via Gcc-patches
This patch fixes profile update after RTL unroll, that is now done same way as in tree one. We still produce (slightly) corrupted profile for multiple exit loops I can try to fix incrementally. I also updated testcases to look for profile mismatches so they do not creep back in again.

Fix profile update after loop versioning in vectorizer

2023-07-29 Thread Jan Hubicka via Gcc-patches
Hi, Vectorizer while loop versioning produces a versioned loop guarded with two conditionals of the form if (cond1) goto scalar_loop else goto next_bb next_bb: if (cond2) goto scalar_loop else goto vector_loop It wants the combined test to be prob (which is set to likely)

Fix sreal::to_int and implement sreal::to_nearest_int

2023-07-21 Thread Jan Hubicka via Gcc-patches
Fix sreal::to_int and implement sreal::to_nearest_int while exploring new loop estimate dumps, I noticed that loop iterating 1.8 times by profile is estimated as iterating once instead of 2 by nb_estimate. While nb_estimate should really be a sreal and I will convert it incrementally, I found
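
The difference between a truncating and a round-to-nearest conversion can be illustrated with a minimal sketch. It uses plain double as a stand-in for GCC's sreal type, and the helper names truncate_to_int and round_to_nearest_int are hypothetical, not the sreal API:

```c
#include <stdint.h>

/* Truncating conversion: drops the fractional part (rounds toward zero).
   Hypothetical stand-in, operating on double rather than sreal.  */
int64_t truncate_to_int(double x)
{
  return (int64_t) x;
}

/* Round-to-nearest conversion: halfway cases round away from zero.
   Hypothetical stand-in, operating on double rather than sreal.  */
int64_t round_to_nearest_int(double x)
{
  return x >= 0 ? (int64_t) (x + 0.5) : (int64_t) (x - 0.5);
}
```

With these helpers, an estimate of 1.8 iterations truncates to 1 but rounds to 2, which is the discrepancy the message describes.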

Implement flat loop profile detection

2023-07-21 Thread Jan Hubicka via Gcc-patches
Hi, this patch adds maybe_flat_loop_profile which can be used in loop profile update to detect situation where the profile may be unrealistically flat and should not be downscaled after vectorizing, unrolling and other transforms that assume that loop has high iteration count even if the CFG

Re: Fix optimize_mask_stores profile update

2023-07-21 Thread Jan Hubicka via Gcc-patches
> On Mon, Jul 17, 2023 at 12:36 PM Jan Hubicka via Gcc-patches > wrote: > > > > Hi, > > While looking into sphinx3 regression I noticed that vectorizer produces > > BBs with overall probability count 120%. This patch fixes it. > > Richi, I don't know how

Re: [Bug target/110758] [14 Regression] 8% hmmer regression on zen1/3 with -Ofast -march=native -flto between g:8377cf1bf41a0a9d (2023-07-05 01:46) and g:3a61ca1b9256535e (2023-07-06 16:56); g:d76d19c

2023-07-21 Thread Jan Hubicka via Gcc-bugs
> I suspect this is most likely the profile updates changes ... Quite possibly. The goal of this exercise is to figure out if there are some bugs in profile estimate or whether passes somehow prefer broken profile or if it is just bad luck. Looking at sphinx and fatigue it seems that LRA

Fix gcc.dg/tree-ssa/copy-headers-9.c and gcc.dg/tree-ssa/dce-1.c failures

2023-07-21 Thread Jan Hubicka via Gcc-patches
Hi, this patch fixes template in the two testcases so it matches the output correctly. I did not re-test after last changes in the previous patch, sorry for that. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/copy-headers-9.c: Fix template for tree-ssa-loop-ch.cc changes. *

Re: [PATCH]AArch64 fix regexp for live_1.c sve test

2023-07-21 Thread Jan Hubicka via Gcc-patches
Avoid scaling flat loop profiles of vectorized loops As discussed, when vectorizing loop with static profile, it is not always good idea to divide the header frequency by vectorization factor because the profile may not realistically represent the expected number of iterations. Since in such

Fix undefined behaviour in profile_count::differs_from_p

2023-08-10 Thread Jan Hubicka via Gcc-patches
Hi, This patch avoids overflow in profile_count::differs_from_p and also makes it return false if one of the values is undefined while the other is defined. Bootstrapped/regtested x86_64-linux, committed. gcc/ChangeLog: * profile-count.cc (profile_count::differs_from_p): Fix overflow and

Fix profile update in duplicate_loop_body_to_header_edge for loops with 0 count_in

2023-08-10 Thread Jan Hubicka via Gcc-patches
Hi, this patch makes duplicate_loop_body_to_header_edge to not drop profile counts to uninitialized when count_in is 0. This happens because profile_probability in 0 count is undefined. Bootstrapped/regtested x86_64-linux, committed. gcc/ChangeLog: * cfgloopmanip.cc

Fix profile updating bug in tree-ssa-threadupdate

2023-08-10 Thread Jan Hubicka via Gcc-patches
Hi, ssa_fix_duplicate_block_edges later calls update_profile to correct profile after threading. In the testcase this does not work since we lose track of the duplicated edge. This happens because redirect_edge_and_branch returns NULL if the edge already has correct destination which is the

Fix division by zero in tree-ssa-loop-split

2023-08-10 Thread Jan Hubicka via Gcc-patches
Hi, Profile update I added to tree-ssa-loop-split can divide by zero in situation that the conditional is predicted with 0 probability which is triggered by jump threading update in the testcase. gcc/ChangeLog: PR middle-end/110923 * tree-ssa-loop-split.cc (split_loop): Watch for

Re: Disable loop distribution for loops with estimated iterations 0

2023-08-04 Thread Jan Hubicka via Gcc-patches
> On Fri, Aug 4, 2023 at 9:16 AM Jan Hubicka via Gcc-patches > wrote: > > > > Hi, > > this prevents useless loop distribution produced in hmmer. With FDO we now > > correctly work out that the loop created for last iteration is not going to > > iterate howe

Re: Fix profile update after vectorizer peeling

2023-08-04 Thread Jan Hubicka via Gcc-patches
Hi, so I found the problem. We duplicate multiple paths and end up with: ;; basic block 6, loop depth 0, count 365072224 (estimated locally, freq 0.3400) ;; prev block 12, next block 7, flags: (NEW, REACHABLE, VISITED) ;; pred: 4 [never (guessed)] count:0 (estimated locally, freq

Fix profiledbootstrap

2023-08-03 Thread Jan Hubicka via Gcc-patches
Hi, Profiledbootstrap fails with ICE in update_loop_exit_probability_scale_dom_bbs called from loop unrolling. The reason is that under relatively rare situations, we may run into case where loop has multiple exits and all are considered as likely but then we scale down the profile and one of the

Update estimated iterations counts after splitting

2023-08-03 Thread Jan Hubicka via Gcc-patches
Hi, Hmmer's internal function has 4 loops. The following is the profile at start: loop 1: estimate 472 iterations by profile: 473.497707 (reliable) count in:84821 (precise, freq 0.9979) loop 2: estimate 99 iterations by profile: 100.00 (reliable) count in:39848881

Re: Fix profile update after vectorizer peeling

2023-08-04 Thread Jan Hubicka via Gcc-patches
> > > > A couple cycles ago I separated most of code to distinguish between the > > back and forward threaders. There is class jt_path_registry that is > > common to both, and {fwd,back}_jt_path_registry for the forward and > > backward threaders respectively. It's not perfect, but it's a start.

Disable loop distribution for loops with estimated iterations 0

2023-08-04 Thread Jan Hubicka via Gcc-patches
Hi, this prevents useless loop distribution produced in hmmer. With FDO we now correctly work out that the loop created for last iteration is not going to iterate however loop distribution still produces a versioned loop that has no chance to survive loop vectorizer since we only keep distributed

Fix profile update after versioning ifconverted loop

2023-08-06 Thread Jan Hubicka via Gcc-patches
Hi, If a loop is ifconverted and later versioned by the vectorizer, the vectorizer will reuse the scalar loop produced by ifconvert. Curiously enough it does not seem to do so for versions produced by loop distribution while for loop distribution this matters (since both ldist versions survive to

Re: [PATCH] Support -m[no-]gather -m[no-]scatter to enable/disable vectorization for all gather/scatter instructions.

2023-08-10 Thread Jan Hubicka via Gcc-patches
> On Thu, Aug 10, 2023 at 9:42 AM Uros Bizjak wrote: > > > > On Thu, Aug 10, 2023 at 9:40 AM Richard Biener > > wrote: > > > > > > On Thu, Aug 10, 2023 at 3:13 AM liuhongt wrote: > > > > > > > > Currently we have 3 different independent tunes for gather > > > >

Fix profile update after peeled epilogues

2023-08-06 Thread Jan Hubicka via Gcc-patches
Hi, Epilogue peeling expects the scalar loop to have same number of executions as the vector loop which is true at the beginning of vectorization. However if the epilogues are vectorized, this is no longer the case. In this situation the loop preheader is replaced by new guard code with correct

loop-split improvements, part 1

2023-07-28 Thread Jan Hubicka via Gcc-patches
Hi, while looking on profile misupdate on hmmer I noticed that loop splitting pass is not able to handle the loop it has as an example it should apply on: One transformation of loops like: for (i = 0; i < 100; i++) { if (i < 50) A; else B; }
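
The kind of loop quoted above, and what splitting it at the condition's boundary looks like, can be sketched as follows. This is an illustration only, not output of the pass: body_a and body_b are hypothetical stand-ins for A and B, and the split form is written by hand under the assumption that the condition i < 50 flips exactly once.

```c
#include <string.h>

/* Hypothetical stand-ins for the loop bodies A and B.  */
static void body_a(int *out, int i) { out[i] = i; }
static void body_b(int *out, int i) { out[i] = -i; }

/* Original form: the condition is re-tested on every iteration.  */
void loop_original(int *out)
{
  for (int i = 0; i < 100; i++)
    {
      if (i < 50)
        body_a(out, i);
      else
        body_b(out, i);
    }
}

/* Split form: two loops, neither of which tests the condition.  */
void loop_split(int *out)
{
  int i;
  for (i = 0; i < 50; i++)
    body_a(out, i);
  for (; i < 100; i++)
    body_b(out, i);
}

/* Returns 1 when both forms produce identical results.  */
int loops_agree(void)
{
  int a[100], b[100];
  loop_original(a);
  loop_split(b);
  return memcmp(a, b, sizeof a) == 0;
}
```

Both forms execute the same body calls in the same order; only the per-iteration branch disappears.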

Fix profile update after cancelled loop distribution

2023-08-02 Thread Jan Hubicka via Gcc-patches
Hi, Loop distribution and ifcvt introduces versions of loops which may be removed later if vectorization fails. Ifcvt does this by temporarily breaking profile and producing conditional that has two arms with 100% probability because we know one of the versions will be removed. Loop distribution

Loop-split improvements, part 2

2023-07-28 Thread Jan Hubicka via Gcc-patches
Hi, this patch fixes profile update in the first case of loop splitting. The pass still gives up on very basic testcases: __attribute__ ((noinline,noipa)) void test1 (int n) { if (n <= 0 || n > 10) return; for (int i = 0; i <= n; i++) { if (i < n) do_something ();

Re: [Bug tree-optimization/106293] [13/14 Regression] 456.hmmer at -Ofast -march=native regressed by 19% on zen2 and zen3 in July 2022

2023-07-28 Thread Jan Hubicka via Gcc-bugs
> This heuristic wants to catch > > > if (foo) abort (); > > > and avoid sinking "too far" across a path with "similar enough" > execution count (I think the original motivation was to fix some > spilling / register pressure issue). The loop depth test > should be !(bb_loop_depth

Re: Loop-split improvements, part 2

2023-07-28 Thread Jan Hubicka via Gcc-patches
> On Fri, Jul 28, 2023 at 9:58 AM Jan Hubicka via Gcc-patches > wrote: > > > > Hi, > > this patch fixes profile update in the first case of loop splitting. > > The pass still gives up on very basic testcases: > > > > __attribute__ ((noinline,noipa)) >

Loop-split improvements, part 3

2023-07-28 Thread Jan Hubicka via Gcc-patches
Hi, This patch extends tree-ssa-loop-split to understand test of the form if (i==0) and if (i!=0) which triggers only during the first iteration. Naturally we should also be able to trigger last iteration or split into 3 cases if the test indeed can fire in the middle of the loop. Last
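
A test that fires only during the first iteration can be sketched as below. This is a hand-written illustration under stated assumptions, not the pass's actual output: the function names and the arithmetic in the loop bodies are hypothetical.

```c
/* Original form: the i == 0 test is evaluated on every iteration,
   although it can only be true on the first one.  */
long sum_original(const int *v, int n)
{
  long acc = 0;
  for (int i = 0; i < n; i++)
    {
      if (i == 0)
        acc = v[i];
      else
        acc += 2 * v[i];
    }
  return acc;
}

/* Split form: the first iteration is handled separately, and the
   remaining loop no longer contains the test.  */
long sum_split(const int *v, int n)
{
  long acc = 0;
  int i = 0;
  if (i < n)
    {
      acc = v[i];
      i++;
    }
  for (; i < n; i++)
    acc += 2 * v[i];
  return acc;
}

/* Returns 1 when both forms agree, including for an empty loop.  */
int split_matches_original(void)
{
  int v[5] = { 3, 1, 4, 1, 5 };
  return sum_original(v, 5) == sum_split(v, 5)
         && sum_original(v, 0) == sum_split(v, 0);
}
```

The guard on i < n before the separated first iteration keeps the zero-trip case correct.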

Re: Loop-split improvements, part 3

2023-07-28 Thread Jan Hubicka via Gcc-patches
> On Fri, Jul 28, 2023 at 2:57 PM Jan Hubicka via Gcc-patches > wrote: > > > > Hi, > > This patch extends tree-ssa-loop-split to understand test of the form > > if (i==0) > > and > > if (i!=0) > > which triggers only during the first iteration.

Re: [PATCH] Improve DSE to handle stores before __builtin_unreachable ()

2023-06-21 Thread Jan Hubicka via Gcc-patches
> > If I manually add a __builtin_unreachable () to the above case > I see the *(int *)0 = 0; store DSEd. Maybe we should avoid > removing stores that might trap here? POSIX wise such a trap > could be a way to jump out of the path leading to unreachable () > via siglongjmp ... I am not sure

Re: [PATCH] Improve DSE to handle stores before __builtin_unreachable ()

2023-06-22 Thread Jan Hubicka via Gcc-patches
> > > On 6/22/23 00:31, Richard Biener wrote: > > I think there's a difference in that __builtin_trap () is observable > > while __builtin_unreachable () is not and reaching __builtin_unreachable > > () invokes undefined behavior while reaching __builtin_trap () does not. > > > > So the

Re: Do not account __builtin_unreachable guards in inliner

2023-06-23 Thread Jan Hubicka via Gcc-patches
> On Mon, Jun 19, 2023 at 12:15 PM Jan Hubicka wrote: > > > > > On Mon, Jun 19, 2023 at 9:52 AM Jan Hubicka via Gcc-patches > > > wrote: > > > > > > > > Hi, > > > > this was suggested earlier somewhere, but I can not fin

Re: Ping [PATCH v4] Add condition coverage profiling

2023-06-23 Thread Jan Hubicka via Gcc-patches
> > > > gcc/ChangeLog: > > > > * builtins.cc (expand_builtin_fork_or_exec): Check > > profile_condition_flag. > > * collect2.cc (main): Add -fno-profile-conditions to OBSTACK. > > * common.opt: Add new options -fprofile-conditions and > > * doc/gcov.texi: Add --conditions

Re: [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming

2023-06-23 Thread Jan Hubicka via Gcc-bugs
Just so it is somewhere, here is a testcase that we can't inline leaf functions to always_inlines unless we do some tracking of what calls were formerly indirect calls. We really overloaded always_inline from the original semantics "drop inlining heuristics" into "be sure that result is inlined"

Re: Do not account __builtin_unreachable guards in inliner

2023-06-23 Thread Jan Hubicka via Gcc-patches
> > So you need to feed it with extra info on the optimized out stmts because > as-is it will not remove __builtin_unreachable (). That means you're My plan was to add entry point to tree-ssa-dce that will take a set of stmts declared dead by external force and will do the usual mark stage

Re: Tiny phiprop compile time optimization

2023-06-23 Thread Jan Hubicka via Gcc-patches
Hi, here is updated version with TODO_update_ssa_only_virtuals. bootstrapped/regtested x86_64-linux. OK? gcc/ChangeLog: * tree-ssa-phiprop.cc (propagate_with_phi): Compute post dominators on demand. (pass_phiprop::execute): Do not compute it here; return

Enable early inlining into always_inline functions

2023-06-28 Thread Jan Hubicka via Gcc-patches
Hi, early inliner currently skips always_inline functions and moreover we ignore calls from always_inline in ipa_reverse_postorder. This leads to disabling most of propagation done using early optimization that is quite bad when early inline functions are not leaf functions, which is now quite

Extend ipa-fnsummary to skip __builtin_expect

2023-06-29 Thread Jan Hubicka via Gcc-patches
Compute ipa-predicates for conditionals involving __builtin_expect_p std::vector allocator looks as follows: __attribute__((nodiscard)) struct pair * std::__new_allocator >::allocate (struct __new_allocator * const this, size_type __n, const void * D.27753) { bool _1; long int _2; long

[libstdc++] Improve M_check_len

2023-06-18 Thread Jan Hubicka via Gcc-patches
Hi, _M_check_len is used in vector reallocations. It computes __n + __s but does checking for case that (__n + __s) * sizeof (Tp) would overflow ptrdiff_t. Since we know that __s is a size of already allocated memory block if __n is not too large, this will never happen on 64bit systems since

Do not account __builtin_unreachable guards in inliner

2023-06-19 Thread Jan Hubicka via Gcc-patches
Hi, this was suggested earlier somewhere, but I can not find the thread. C++ has assume attribute that expands into if (conditional) __builtin_unreachable () We do not want to account the conditional in inline heuristics since we know that it is going to be optimized out.
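
The guard pattern described here can be written by hand as a minimal sketch. The ASSUME macro and the clamp_index function are hypothetical names used for illustration; __builtin_unreachable is a GCC/Clang builtin, so the sketch assumes one of those compilers.

```c
/* Hypothetical macro: tells the compiler that COND always holds.
   Reaching __builtin_unreachable () is undefined behavior, so callers
   must never violate the assumption.  */
#define ASSUME(cond)                 \
  do                                 \
    {                                \
      if (!(cond))                   \
        __builtin_unreachable ();    \
    }                                \
  while (0)

/* With the assumption in place, an optimizing compiler may drop the
   range check below, since i < 4 is known to hold.  */
unsigned clamp_index(unsigned i)
{
  ASSUME(i < 4);
  return i < 4 ? i : 0;
}
```

The conditional inside the macro exists only to carry information to the optimizer and is expected to disappear from the generated code, which is why counting it as real work would overestimate the function's size.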

Tiny phiprop compile time optimization

2023-06-19 Thread Jan Hubicka via Gcc-patches
Hi, this patch avoids unnecessary post dominator and update_ssa in phiprop. Bootstrapped/regtested x86_64-linux, OK? gcc/ChangeLog: * tree-ssa-phiprop.cc (propagate_with_phi): Add post_dominators_computed; compute post dominators lazily. (const pass_data

Extend fnsummary to predict SRA opportunities

2023-06-18 Thread Jan Hubicka via Gcc-patches
Hi, this patch extends ipa-fnsummary to anticipate statements that will be removed by SRA. This is done by looking for calls passing addresses of automatic variables. In function body we look for dereferences from pointers of such variables and mark them with new not_sra_candidate condition.

Re: Extend fnsummary to predict SRA opportunities

2023-06-18 Thread Jan Hubicka via Gcc-patches
Hi, as noticed by Jeff, this patch also triggers warning in one of LTO testcases. The testcase is reduced and warning seems legit, triggered by extra inlining. So I have just silenced it. Honza gcc/testsuite/ChangeLog: * gcc.dg/lto/20091013-1_0.c: Disable stringop-overread warning.

Optimize std::max early

2023-06-18 Thread Jan Hubicka via Gcc-patches
Hi, we currently produce very bad code on loops using std::vector as a stack, since we fail to inline push_back which in turn prevents SRA and we fail to optimize out some store-to-load pairs (PR109849). I looked into why this function is not inlined and it is inlined by clang. We currently

Re: [libstdc++] Improve M_check_len

2023-06-20 Thread Jan Hubicka via Gcc-patches
> > > > size_type > > _M_check_len(size_type __n, const char* __s) const > > { > > const size_type __size = size(); > > const size_type __max_size = max_size(); > > > > if (__is_same(allocator_type, allocator<_Tp>) > > && __size > __max_size

Re: Do not account __builtin_unreachable guards in inliner

2023-06-19 Thread Jan Hubicka via Gcc-patches
> On Mon, Jun 19, 2023 at 9:52 AM Jan Hubicka via Gcc-patches > wrote: > > > > Hi, > > this was suggested earlier somewhere, but I can not find the thread. > > C++ has assume attribute that expands into > > if (conditional) > > __builtin_un

Re: [libstdc++] Improve M_check_len

2023-06-19 Thread Jan Hubicka via Gcc-patches
> > - if (max_size() - size() < __n) > > - __throw_length_error(__N(__s)); > > + // On 64bit systems vectors of small sizes can not > > + // reach overflow by growing by small sizes; before > > + // this happens, we will run out of memory. > > + if

Re: [Bug libstdc++/110287] _M_check_len is expensive

2023-06-19 Thread Jan Hubicka via Gcc-bugs
> > There is no guarantee that std::vector::max_size() is PTRDIFF_MAX. It > depends on the Allocator type, A. A user-defined allocator could have > max_size() == 100. If the inliner sees a path to the throw functions, it will not determine _M_check_len as early inlinable. Perhaps we can

Re: [libstdc++] Improve M_check_len

2023-06-19 Thread Jan Hubicka via Gcc-patches
> On Mon, 19 Jun 2023 at 12:20, Jakub Jelinek wrote: > > > On Mon, Jun 19, 2023 at 01:05:36PM +0200, Jan Hubicka via Gcc-patches > > wrote: > > > - if (max_size() - size() < __n) > > > - __throw_length_error(__N(__s)); > >

Re: [libstdc++] Improve M_check_len

2023-06-20 Thread Jan Hubicka via Gcc-patches
> > > > > > size_type > > > _M_check_len(size_type __n, const char* __s) const > > > { > > > const size_type __size = size(); > > > const size_type __max_size = max_size(); > > > > > > if (__is_same(allocator_type, allocator<_Tp>) > > > &&

Fix predictions of conditionals with __builtin_expect

2023-06-30 Thread Jan Hubicka via Gcc-patches
Hi, while looking into the std::vector _M_realloc_insert codegen I noticed that call of __throw_bad_alloc is predicted with 10% probability. This is because the conditional guarding it has __builtin_expect (cond, 0) on it. This incorrectly takes precedence over more reliable heuristics

Fix update_bb_profile_for_threading

2023-07-01 Thread Jan Hubicka via Gcc-patches
Hi, this patch fixes some of profile mismatches caused by profile updating. It seems that I misupdated update_bb_profile_for_threading in 2017 which results in invalid updates from rtl threading and threadbackwards. update_bb_profile_for_threading knows that some paths to BB are being redirected

Fix profile updates in copy-header

2023-07-01 Thread Jan Hubicka via Gcc-patches
Hi, most common source of profile mismatches is now copyheader pass. The reason is that in common case the duplicated header condition will become constant true and that needs changes in the loop exit condition probability. While this can be done by jump threading it is not, since it gives up

Re: [PATCH] Improve DSE to handle stores before __builtin_unreachable ()

2023-06-26 Thread Jan Hubicka via Gcc-patches
Hi, playing with testcases for path isolation and const function, I noticed that we do not seem to even try to isolate out of range array accesses: int a[3]={0,1,2}; test(int i) { if (i > 3) return test2(a[i]); return a[i]; } Here call to test2 is dead, since a[i] will

Fix profile of forwarders produced by cd-dce

2023-06-26 Thread Jan Hubicka via Gcc-patches
Hi, compiling the testcase from PR109849 (which uses std::vector based stack to drive a loop) with profile feedback leads to profile mismatches introduced by tree-ssa-dce. This is the new code to produce unified forwarder blocks for PHIs. I am not including the testcase itself since checking it

Re: Enable ranger for ipa-prop

2023-06-28 Thread Jan Hubicka via Gcc-patches
> > On 6/27/23 12:24, Jan Hubicka wrote: > > > On 6/27/23 09:19, Jan Hubicka wrote: > > > > Hi, > > > > as shown in the testcase (which would eventually be useful for > > > > optimizing std::vector's push_back), ipa-prop can use context dependent > > > > ranger > > > > queries for better value

Re: [PATCH] libstdc++: Use RAII in std::vector::_M_realloc_insert

2023-06-28 Thread Jan Hubicka via Gcc-patches
> I think the __throw_bad_alloc() and __throw_bad_array_new_length() > functions should always be rare, so marking them cold seems fine (users who > define their own allocators that want to throw bad_alloc "often" will > probably throw it directly, they shouldn't be using our __throw_bad_alloc() >

Re: [Bug ipa/110334] [13/14 Regresssion] unused functions not eliminated before LTO streaming

2023-06-28 Thread Jan Hubicka via Gcc-bugs
> > why disallow caller->indirect_calls? See testcase in comment #9 > > > + return false; > > + for (cgraph_edge *e2 = callee->callees; e2; e2 = e2->next_callee) > > I don't think this flys - it looks quadratic. Can we compute this > in the inline summary once instead? I guess I

Re: [PATCH] ipa-sra: Disable candidates with no known callers (PR 110276)

2023-06-16 Thread Jan Hubicka via Gcc-patches
> Hi, > > In IPA-SRA we use can_be_local_p () predicate rather than just plain > local call graph flag in order to figure out whether the node is a > part of an external API that we cannot change. Although there are > cases where this can allow more transformations, it also means we can >

Enable ranger for ipa-prop

2023-06-27 Thread Jan Hubicka via Gcc-patches
Hi, as shown in the testcase (which would eventually be useful for optimizing std::vector's push_back), ipa-prop can use context dependent ranger queries for better value range info. Bootstrapped/regtested x86_64-linux, OK? Honza gcc/ChangeLog: PR middle-end/110377 *

Re: [PATCH] libstdc++: Use RAII in std::vector::_M_realloc_insert

2023-06-23 Thread Jan Hubicka via Gcc-patches
> I intend to push this to trunk once testing finishes. > > I generated the diff with -b so the whitespace changes aren't shown, > because there was some re-indenting that makes the diff look larger than > it really is. > > Honza, I don't think this is likely to make much difference for the PR >

Re: [PATCH] Improve DSE to handle stores before __builtin_unreachable ()

2023-06-25 Thread Jan Hubicka via Gcc-patches
> > Also as discussed some time ago, the volatile loads between traps has > > effect of turning previously pure/const functions into non-const which > > is somewhat sad, so it is still on my todo list to change it this stage1 > > to something more careful. We discussed internal functions

Re: Enable ranger for ipa-prop

2023-06-27 Thread Jan Hubicka via Gcc-patches
> > On 6/27/23 09:19, Jan Hubicka wrote: > > Hi, > > as shown in the testcase (which would eventually be useful for > > optimizing std::vector's push_back), ipa-prop can use context dependent > > ranger > > queries for better value range info. > > > > Bootstrapped/regtested x86_64-linux, OK? >

Re: Question on patch -fprofile-partial-training

2023-05-10 Thread Jan Hubicka via Gcc-patches
> Honza, > > Main motivation for this was profiling programs that contain specific > > code paths for different CPUs (such as graphics library in Firefox or Linux > > kernel). In the situation training machine differs from the machine > > program is run later, we end up optimizing for size all

Re: [Bug c++/106943] GCC building clang/llvm with LTO flags causes ICE in clang

2023-05-12 Thread Jan Hubicka via Gcc-bugs
> > Indeed it is quite long time problem with clang not building with lifetime > > DSE and strict aliasing. I wonder why this is not fixed on clang side? > > Because the problems were not communicated? I knew that Firefox needed > -flifetime-dse=1, but it's the first time I hear that any such

Re: Question on patch -fprofile-partial-training

2023-05-09 Thread Jan Hubicka via Gcc-patches
> > > > > > > > From my understanding, -fprofile-partial-training is one important > > > > option for PGO performance. > > > > > > I don't think so, speed benefit would be rather small I guess. > > I saw some articles online to introduce this option for gcc10, > >

Re: [PATCH] tree-optimization/110991 - unroll size estimate after vectorization

2023-08-14 Thread Jan Hubicka via Gcc-patches
> The following testcase shows that we are bad at identifying inductions > that will be optimized away after vectorizing them because SCEV doesn't > handle vectorized defs. The following rolls a simpler identification > of SSA cycles covering a PHI and an assignment with a binary operator > with

Re: [Bug tree-optimization/113787] [12/13/14 Regression] Wrong code at -O with ipa-modref on aarch64

2024-02-14 Thread Jan Hubicka via Gcc-bugs
> > I guess PTA gets around by tracking points-to set also for non-pointer > > types and consequently it also gives up on any such addition. > > It does. But note it does _not_ for POINTER_PLUS where it treats > the offset operand as non-pointer. > > > I think it is

Re: [Bug target/113233] LoongArch: target options from LTO objects not respected during linking

2024-01-04 Thread Jan Hubicka via Gcc-bugs
> Confirm. But option save/restore has been always implemented: > > .section.gnu.lto_.opts,"",@progbits > .ascii "'-fno-openmp' '-fno-openacc' '-fno-pie' '-fcf-protection" > .ascii "=none' '-mabi=lp64d' '-march=loongarch64' '-mfpu=64' '-m" > .ascii "simd=lasx'

[gcc r14-10093] Remove repeated information in -ftree-loop-distribute-patterns doc

2024-04-23 Thread Jan Hubicka via Gcc-cvs
https://gcc.gnu.org/g:6f0a646dd2fc59e9c9cde63718b36085f84a19ba commit r14-10093-g6f0a646dd2fc59e9c9cde63718b36085f84a19ba Author: Jan Hubicka Date: Tue Apr 23 15:51:42 2024 +0200 Remove repeated information in -ftree-loop-distribute-patterns doc We have:

[gcc r15-482] Reduce recursive inlining of always_inline functions

2024-05-14 Thread Jan Hubicka via Gcc-cvs
https://gcc.gnu.org/g:1ec49897253e093e1ef6261eb104ac0c111bac83 commit r15-482-g1ec49897253e093e1ef6261eb104ac0c111bac83 Author: Jan Hubicka Date: Tue May 14 12:58:56 2024 +0200 Reduce recursive inlining of always_inline functions this patch tames down inliner on (multiply)

Re: [Bug libstdc++/109442] Dead local copy of std::vector not removed from function

2024-05-14 Thread Jan Hubicka via Gcc-bugs
This patch attempts to add __builtin_operator_new/delete. So far they are not optimized, which will need to be done by extra flag of BUILT_IN_ code. also the decl.cc code can be refactored to be less of cut and I guess has_builtin hack to return proper value needs to be moved to C++ FE. However

[gcc r15-581] Fix points_to_local_or_readonly_memory_p wrt TARGET_MEM_REF

2024-05-16 Thread Jan Hubicka via Gcc-cvs
https://gcc.gnu.org/g:96d53252aefcbc2fe419c4c3b4bcd3fc03d4d187 commit r15-581-g96d53252aefcbc2fe419c4c3b4bcd3fc03d4d187 Author: Jan Hubicka Date: Thu May 16 15:33:55 2024 +0200 Fix points_to_local_or_readonly_memory_p wrt TARGET_MEM_REF TARGET_MEM_REF can be used to offset

[gcc r15-512] Avoid pointer compares on TYPE_MAIN_VARIANT in TBAA

2024-05-15 Thread Jan Hubicka via Gcc-cvs
https://gcc.gnu.org/g:9b7cad5884f21cc5783075be0043777448db3fab commit r15-512-g9b7cad5884f21cc5783075be0043777448db3fab Author: Jan Hubicka Date: Wed May 15 14:14:27 2024 +0200 Avoid pointer compares on TYPE_MAIN_VARIANT in TBAA while building more testcases for ipa-icf I noticed

[gcc r14-9516] Add missing config/i386/zn4zn5.md file

2024-03-18 Thread Jan Hubicka via Gcc-cvs
https://gcc.gnu.org/g:dfc9d1cc8353bdd7fbc37bc10bb3fd40f49fa4af commit r14-9516-gdfc9d1cc8353bdd7fbc37bc10bb3fd40f49fa4af Author: Jan Hubicka Date: Mon Mar 18 14:24:10 2024 +0100 Add missing config/i386/zn4zn5.md file gcc/ChangeLog: * config/i386/zn4zn5.md: Add

[gcc r14-9515] Add AMD znver5 processor enablement with scheduler model

2024-03-18 Thread Jan Hubicka via Gcc-cvs
https://gcc.gnu.org/g:d0aa0af9a9b7dd709a8c7ff6604ed6b7da0fc23a commit r14-9515-gd0aa0af9a9b7dd709a8c7ff6604ed6b7da0fc23a Author: Jan Hubicka Date: Mon Mar 18 10:22:44 2024 +0100 Add AMD znver5 processor enablement with scheduler model 2024-02-14 Jan Hubicka

Re: [Bug ipa/114262] Over-inlining when optimizing for size with gnu_inline function

2024-03-07 Thread Jan Hubicka via Gcc-bugs
> Note GCC has not retuned its -Os heuristics for a long time because it has been > decent enough for most folks and corner cases like this is almost never come > up. There were quite few changes to -Os heuristics :) One of bigger challenges is that we do see more and more C++ code built with -Os

[gcc r14-9705] Hash operands of PHI in ipa-icf

2024-03-28 Thread Jan Hubicka via Gcc-cvs
https://gcc.gnu.org/g:0923fe2d4808c16b72c1d1bfe28220dd326d8b76 commit r14-9705-g0923fe2d4808c16b72c1d1bfe28220dd326d8b76 Author: Jan Hubicka Date: Thu Mar 28 13:24:54 2024 +0100 Hash operands of PHI in ipa-icf This patch fixes cache collision on function whose body differs only by

Re: [Bug ipa/113907] [11/12/13/14 regression] ICU miscompiled on x86 since r14-5109-ga291237b628f41

2024-04-09 Thread Jan Hubicka via Gcc-bugs
There is still problem with loop bounds. I am testing patch on that and then we should be (finally) finally safe.

Re: [Bug target/114232] [14 regression] ICE when building rr-5.7.0 with LTO on x86

2024-03-05 Thread Jan Hubicka via Gcc-bugs
Looking at the prototype patch, why need to change also the splitters? My original goal was to use splitters to expand to faster code sequences while having patterns necessary for both variants. This makes it possible to use optimize_insn_for_size/speed and make decisions using BB profile, since

Re: [Bug c++/110137] implement clang -fassume-sane-operator-new

2024-06-04 Thread Jan Hubicka via Gcc-bugs
> Is the option supposed to be only about the standard global scope operator > new/delete (_Znam etc.) or also user operator new/delete class methods? If > the > former, then I agree it is a global property (or at least a per shared > library/binary property, one can arrange stuff with symbol

gcc-wwwdocs branch master updated. 3aee4b0adcb86280ecdaec41447e7ff4f8d8c0a7

2024-05-07 Thread Jan Hubicka via Gcc-cvs-wwwdocs
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "gcc-wwwdocs". The branch, master has been updated via 3aee4b0adcb86280ecdaec41447e7ff4f8d8c0a7 (commit) from
