[Bug gcov-profile/114851] Alternative to -Wmisexpect from LLVM in GCC
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114851

--- Comment #3 from Alexander Zaitsev ---
> Though I do wonder if the "hints" are used instead of the PGO here.

We already discussed this question a bit in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112806 . If I understand correctly, there is no clear answer yet: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112806#c4 .
[Bug gcov-profile/114851] New: Alternative to -Wmisexpect from LLVM in GCC
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114851

            Bug ID: 114851
           Summary: Alternative to -Wmisexpect from LLVM in GCC
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: gcov-profile
          Assignee: unassigned at gcc dot gnu.org
          Reporter: zamazan4ik at tut dot by
  Target Milestone: ---

LLVM supports a diagnostic that reports mismatches between user-provided __builtin_expect/[[likely]] hints and PGO profiles: https://clang.llvm.org/docs/DiagnosticsReference.html#wmisexpect + https://llvm.org/docs/MisExpect.html (and an example of its usage in Chromium: https://issues.chromium.org/issues/40694104).

I tried to find a similar diagnostic in GCC but found nothing. Is there anything similar in GCC? If not, can we turn this issue into a Feature Request for it?

Such a diagnostic is helpful in practice because it finds user hints in the source code that contradict the measured profile.
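To make the request concrete, here is a minimal sketch of the kind of mismatch such a diagnostic is meant to catch. The function and type names (`is_negative_hinted`, `BranchStats`, `hint_contradicts_workload`) are hypothetical and not taken from any real code base; the counters are maintained by hand only to mimic what an instrumented build would record for the branch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hand-written stand-in for the per-branch counters of an instrumented build.
struct BranchStats {
    std::size_t taken = 0;
    std::size_t not_taken = 0;
};

// The hint claims "value < 0" is the common case. Whether that is true
// depends entirely on the workload the program actually sees.
inline bool is_negative_hinted(int value, BranchStats& stats) {
    if (__builtin_expect(value < 0, 1)) {
        ++stats.taken;
        return true;
    }
    ++stats.not_taken;
    return false;
}

// Returns true when the hinted branch was taken less often than not,
// i.e. when the observed behaviour contradicts the hint.
inline bool hint_contradicts_workload(const std::vector<int>& workload) {
    BranchStats stats;
    for (int v : workload) {
        is_negative_hinted(v, stats);
    }
    assert(stats.taken + stats.not_taken == workload.size());
    return stats.taken < stats.not_taken;
}
```

With a workload that is mostly non-negative, `hint_contradicts_workload` returns true. That is the situation in which a profile-aware compiler could warn that the hint disagrees with the collected profile.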
[Bug tree-optimization/114761] Ignored [[likely]] attribute with multiple if statements doing the same thing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114761

--- Comment #5 from Alexander Zaitsev ---
> Is this based on real code or you just was looking at the differences between
> gcc and clang here?

No, it is not based on real code. I came up with this example when I found that GCC does not reorder the branches according to PGO profiles here, while Clang does. I wondered about this difference in behavior between the compilers and am trying to figure out which compiler is "right" here.

Regarding the efficiency of the code generated by Clang and GCC in this case: am I right that ignoring branch probabilities (as GCC does) doesn't affect actual code performance here? I am asking because I am not that proficient in compiler optimizations.
[Bug tree-optimization/114761] New: Ignored [[likely]] attribute
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114761

            Bug ID: 114761
           Summary: Ignored [[likely]] attribute
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: zamazan4ik at tut dot by
  Target Milestone: ---

For the following code:

bool foo(int var) {
    if (var == 42) [[unlikely]] return true;
    if (var == 322) [[unlikely]] return true;
    if (var == 1337) [[likely]] return true;

    return false;
}

GCC (trunk) with "-O3 -std=c++20" generates the following:

foo(int):
        cmp     edi, 322
        sete    al
        cmp     edi, 42
        sete    dl
        or      eax, edx
        cmp     edi, 1337
        sete    dl
        or      eax, edx
        ret

Clang (18) with "-O3 -std=c++20", however, generates a slightly different version:

foo(int):                                # @foo(int)
        mov     al, 1
        cmp     edi, 1337
        jne     .LBB0_1
.LBB0_4:
        ret
.LBB0_1:
        cmp     edi, 42
        je      .LBB0_4
        cmp     edi, 322
        je      .LBB0_4
        xor     eax, eax
        ret

For some reason GCC ignores the [[likely]] attribute and doesn't place the comparison with 1337 at the beginning of the function. Clang does. Placing this comparison first should be more optimal.

I also tested GCC 13.2 (on my Fedora machine) with __builtin_expect and with PGO. The result is the same: GCC does not perform this reordering.

Godbolt link: https://godbolt.org/z/o8KMx8M33
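The report mentions a __builtin_expect variant without showing it. The following is a sketch of what such a variant could look like; it is an assumption, not the exact code that was tested. The names `foo_builtin_expect`, `foo_plain` and `check_hints_preserve_results` are made up for illustration. The check only demonstrates that the hints never change the function's results, so any difference between compilers is purely one of code layout.

```cpp
#include <cassert>

// Assumed __builtin_expect spelling of foo() from the report:
// 0 marks a comparison as unlikely to be true, 1 as likely.
bool foo_builtin_expect(int var) {
    if (__builtin_expect(var == 42, 0)) return true;
    if (__builtin_expect(var == 322, 0)) return true;
    if (__builtin_expect(var == 1337, 1)) return true;

    return false;
}

// Reference version without any hints.
bool foo_plain(int var) {
    return var == 42 || var == 322 || var == 1337;
}

// Compares both versions over a range that covers all three constants.
void check_hints_preserve_results() {
    for (int v = -2000; v <= 2000; ++v) {
        assert(foo_builtin_expect(v) == foo_plain(v));
    }
}
```

Both versions can be pasted into Compiler Explorer side by side to compare the layout each compiler picks.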
[Bug gcov-profile/112829] Dump PGO profiles to a memory buffer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112829

--- Comment #2 from Alexander Zaitsev ---
Am I right that GCC currently has no ready-to-use alternative to "int __llvm_profile_write_buffer(char *Buffer)" from LLVM, and that it has to be implemented manually (as you described)?
[Bug gcov-profile/112829] New: Dump PGO profiles to a memory buffer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112829

            Bug ID: 112829
           Summary: Dump PGO profiles to a memory buffer
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: gcov-profile
          Assignee: unassigned at gcc dot gnu.org
          Reporter: zamazan4ik at tut dot by
                CC: marxin at gcc dot gnu.org
  Target Milestone: ---

According to the GCC documentation (https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html), the only option is to dump PGO profiles to a filesystem. I am looking for an option to dump PGO profiles into a memory buffer. LLVM has such an ability, documented here: https://clang.llvm.org/docs/SourceBasedCodeCoverage.html#using-the-profiling-runtime-without-a-filesystem . If GCC has such an ability too, it would be great to have it described somewhere in the Instrumentation documentation (or in any other place you consider better).

The use case is simple: on some systems the filesystem can be read-only (e.g. due to security concerns) or too small to hold the PGO profile. With the in-memory approach, we would be able to collect PGO profiles and then deliver and expose them via other interfaces like HTTP or MQTT.

I guess some related information can be found here (https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libgcc/libgcov-profiler.c) but I am not sure.
[Bug gcov-profile/112806] Profile-Guided Optimization (PGO) policy regarding explicit user optimization hint behavior
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112806

--- Comment #3 from Alexander Zaitsev ---
> https://gcc.gnu.org/onlinedocs/gcc-13.2.0/gcc/Other-Builtins.html#index-fprofile-arcs-1

I have already read this and still do not understand the actual behavior. If PGO profiles show that a branch is "cold" but the user marks the same branch as "hot" via __builtin_expect/[[likely]], which decision will the optimizer make? The page linked above only says: "In general, you should prefer to use actual profile feedback for this (-fprofile-arcs), as programmers are notoriously bad at predicting how their programs actually perform." That does not specify the actual behavior. It is just a recommendation to use PGO instead of manual [[likely]] hints.
[Bug gcov-profile/112717] .gcda profiles compatibility guarantees between GCC versions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112717

--- Comment #3 from Alexander Zaitsev ---
> I thought this was documented but I don't see. There is no guarantee for
> forward or backwards compatibility at all. In fact iirc there is a version
> stored in the files to make sure the correct version is used with the version
> of tools/compiler.

Could we add this information to the documentation? It would be really helpful for users to know this detail.

Given your answer, am I right that it is currently a strong recommendation/requirement to regenerate PGO profiles with each GCC update?
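For a profile cache, the version stored in the files could be checked before reusing a profile. The sketch below rests on several assumptions: that a .gcda file starts with a 32-bit magic word followed by a 32-bit version word (as described in gcc/gcov-io.h), that the file was written on a host with the same endianness as the reader, and that comparing the raw version words is sufficient. All names (`GcdaHeader`, `read_gcda_header`, `same_gcov_version`) are hypothetical, and the code is not part of any GCC API.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// "gcda" as a 32-bit word, per GCOV_DATA_MAGIC in gcc/gcov-io.h (assumption).
constexpr std::uint32_t kGcdaMagic = 0x67636461u;

struct GcdaHeader {
    std::uint32_t magic = 0;
    std::uint32_t version = 0;  // raw word; its encoding is GCC-internal
};

// Reads the first two 32-bit words of a .gcda image held in memory.
// Returns false if the buffer is too short or the magic does not match.
bool read_gcda_header(const std::vector<unsigned char>& bytes, GcdaHeader& out) {
    if (bytes.size() < 8) {
        return false;
    }
    GcdaHeader header;
    std::memcpy(&header.magic, bytes.data(), 4);
    std::memcpy(&header.version, bytes.data() + 4, 4);
    if (header.magic != kGcdaMagic) {
        return false;
    }
    out = header;
    return true;
}

// True only if both buffers look like .gcda data with identical version words.
bool same_gcov_version(const std::vector<unsigned char>& a,
                       const std::vector<unsigned char>& b) {
    GcdaHeader ha, hb;
    return read_gcda_header(a, ha) && read_gcda_header(b, hb) &&
           ha.version == hb.version;
}
```

A cache could store the version word next to each profile and skip profiles whose word differs from one produced by the current compiler. This complements whatever check GCC itself performs and does not replace it.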
[Bug gcov-profile/112717] New: .gcda profiles compatibility guarantees between GCC versions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112717

            Bug ID: 112717
           Summary: .gcda profiles compatibility guarantees between GCC versions
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: gcov-profile
          Assignee: unassigned at gcc dot gnu.org
          Reporter: zamazan4ik at tut dot by
                CC: marxin at gcc dot gnu.org
  Target Milestone: ---

Hi.

I have several questions regarding the reuse of .gcda profiles between GCC versions for Profile-Guided Optimization (PGO) purposes.

The first question is about forward and backward compatibility guarantees for .gcda profiles. I didn't find related information in the GCC documentation. Are there guarantees in this area? Something like "it's guaranteed that .gcda profiles from GCC version N will always be readable by GCC version N+1", where N is a minor/major GCC version. For us it's an important question since we are thinking about caching .gcda profiles in storage so that PGO profiles can be reused later, probably with a newer compiler. The question applies in the other direction too: for example, if we generated the PGO profile with GCC 10 and some time later decided to revert the compiler to GCC 9. If there are guarantees in this area, it would be great to see them documented (probably in a place like https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html).

The second question is about the reusability of PGO profiles between GCC versions. As far as I understand, PGO profiles track some "counters" about the code. Possibly these counters depend on the optimizations GCC performs (it's just my guess). Let's imagine that GCC 11 added more optimization passes that affect the generated code (e.g. much more aggressive inlining compared to GCC 10). In this case, PGO profiles from GCC 10 would probably no longer be useful and we would need to regenerate them with GCC 11. Is this scenario real? If yes, are there ways to mitigate it?

For LLVM I have the same questions, which are discussed here: https://discourse.llvm.org/t/profile-guided-optimization-pgo-related-questions-and-suggestions/75232 . As far as I understand, GCC also implements PGO on something like an "IR" (I don't know what it's properly called in GCC: "Generic" or "GIMPLE"?), so some answers from LLVM would probably apply to GCC as well.
[Bug other/112492] New: Add LLVM BOLT support to the GCC build scripts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112492

            Bug ID: 112492
           Summary: Add LLVM BOLT support to the GCC build scripts
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: other
          Assignee: unassigned at gcc dot gnu.org
          Reporter: zamazan4ik at tut dot by
  Target Milestone: ---

Hi!

According to the Facebook Research Paper (https://research.facebook.com/publications/bolt-a-practical-binary-optimizer-for-data-centers-and-beyond/), LLVM BOLT (https://github.com/llvm/llvm-project/blob/main/bolt/README.md) helps achieve better performance for GCC even on top of a PGO-optimized GCC build.

I think it would be a good idea to add support for building GCC with BOLT, as is already done for the PGO-optimized GCC build with the "make profiledbootstrap" target. Integrating LLVM BOLT into the build scripts would make it much easier for maintainers to enable LLVM BOLT for GCC in their .spec files.

Here are some examples of how LLVM BOLT is already integrated into other projects:

* Rustc: https://github.com/rust-lang/rust/pull/116352
* CPython: https://github.com/python/cpython/pull/95908
* Pyston:
  - https://github.com/pyston/pyston#building
  - https://github.com/pyston/pyston/blob/pyston_main/Makefile#L200
* Clang: https://github.com/llvm/llvm-project/blob/main/clang/cmake/caches/BOLT.cmake

More about LLVM BOLT results for other projects can be found in:

* Rustc:
  - https://github.com/rust-lang/rust/pull/116352
  - https://www.reddit.com/r/rust/comments/y4w2kr/llvm_used_by_rustc_is_now_optimized_with_bolt_on/
* CPython: https://github.com/python/cpython/pull/95908
* YDB: https://github.com/ydb-platform/ydb/issues/140
* Clang:
  - [Slides](https://llvm.org/devmtg/2022-11/slides/Lightning15-OptimizingClangWithBOLTUsingCMake.pdf)
  - [Results on building Clang](https://github.com/ptr1337/llvm-bolt-scripts/blob/master/results.md)
  - [Linaro results](https://android-review.linaro.org/plugins/gitiles/toolchain/llvm_android/+/f36c64eeddf531b7b1a144c40f61d6c9a78eee7a)
  - [on AMD 7950X3D](https://github.com/llvm/llvm-project/issues/65010#issuecomment-1701255347)
* LDC: https://github.com/ldc-developers/ldc/issues/4228#issuecomment-1334499428
* NodeJS: https://aaupov.github.io/blog/2020/10/08/bolt-nodejs
* Chromium: https://aaupov.github.io/blog/2022/11/12/bolt-chromium
* MySQL, MongoDB, memcached, Verilator: https://people.ucsc.edu/~hlitz/papers/ocolos.pdf

More information can be found here: https://github.com/zamazan4ik/awesome-pgo
[Bug c++/96821] [concepts] Incorrect evaluation of concept with ill-formed expression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96821

Alexander Zaitsev changed:

           What    |Removed                     |Added
                 CC|                            |zamazan4ik at tut dot by

--- Comment #6 from Alexander Zaitsev ---
Any updates on the issue? Such behaviour is strange too since Clang and MSVC have a different opinion from GCC for the code.