[Bug target/106453] Redundant zero extension after crc32q

2022-07-28 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106453 --- Comment #1 from Alexander Monakov --- Any idea if the following is reasonable? It compiles and achieves the desired result. diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index bdde577dd..d82656678 100644 ---

[Bug middle-end/106470] Subscribed access to __m256i casted to (uint16_t *) produces garbage or a warning

2022-07-29 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106470 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org ---

[Bug middle-end/106470] Subscribed access to __m256i casted to (uint16_t *) produces garbage or a warning

2022-07-29 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106470 --- Comment #8 from Alexander Monakov --- But that's the point of many warnings, isn't it? To help the user understand what's wrong when the code is bad? And bogus warnings just confuse more.

[Bug middle-end/106421] New: ICE with computed goto from a nested functon

2022-07-23 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106421 Bug ID: 106421 Summary: ICE with computed goto from a nested functon Product: gcc Version: unknown Status: UNCONFIRMED Keywords: ice-on-invalid-code Severity: normal

[Bug tree-optimization/106422] [13 Regression] ice in duplicate_block, at cfghooks.cc:1115

2022-07-24 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106422 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org ---

[Bug tree-optimization/106422] [13 Regression] ice in duplicate_block, at cfghooks.cc:1115

2022-07-24 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106422 --- Comment #4 from Alexander Monakov --- Regarding point 1 above, I should mention that Glibc headers mark both 'vfork' and 'raise' as leaf.

[Bug target/106453] New: Redundant zero extension after crc32q

2022-07-27 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106453 Bug ID: 106453 Summary: Redundant zero extension after crc32q Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: target

[Bug tree-optimization/106422] [13 Regression] ice in duplicate_block, at cfghooks.cc:1115

2022-07-28 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106422 Alexander Monakov changed: What|Removed |Added CC||aldyh at gcc dot gnu.org ---

[Bug target/105504] New: Fails to break dependency for vcvtss2sd xmm, xmm, mem

2022-05-06 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105504 Bug ID: 105504 Summary: Fails to break dependency for vcvtss2sd xmm, xmm, mem Product: gcc Version: unknown Status: UNCONFIRMED Keywords: missed-optimization Severity:

[Bug rtl-optimization/105513] New: [9/10/11/12/13 Regression] Unnecessary SSE spill

2022-05-07 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105513 Bug ID: 105513 Summary: [9/10/11/12/13 Regression] Unnecessary SSE spill Product: gcc Version: unknown Status: UNCONFIRMED Keywords: missed-optimization, ra Severity:

[Bug target/105504] Fails to break dependency for vcvtss2sd xmm, xmm, mem

2022-05-07 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105504 --- Comment #5 from Alexander Monakov --- The strange xmm0 spill issue may affect more code, so I reported an isolated testcase: PR 105513 (regression vs. gcc-8, the complete testcase in this PR also does not spill with gcc-8).

[Bug middle-end/106688] New: leaving SSA emits assignment into the inner loop

2022-08-19 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106688 Bug ID: 106688 Summary: leaving SSA emits assignment into the inner loop Product: gcc Version: 13.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal

[Bug tree-optimization/106781] [13 Regression] ICE: verify_flow_info failed (error: returns_twice call is not first in basic block 2)

2022-08-31 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106781 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org ---

[Bug ipa/106783] New: [12/13 Regression] ICE in ipa-modref.cc:analyze_function

2022-08-31 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106783 Bug ID: 106783 Summary: [12/13 Regression] ICE in ipa-modref.cc:analyze_function Product: gcc Version: 13.0 Status: UNCONFIRMED Keywords: ice-on-valid-code

[Bug tree-optimization/106781] [13 Regression] ICE: verify_flow_info failed (error: returns_twice call is not first in basic block 2) since r13-1754-g7a158a5776f5ca95

2022-08-31 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106781 --- Comment #4 from Alexander Monakov --- (In reply to Martin Liška from comment #3) > > Also ICEs in ipa-modref when 'noclone' added to 'noinline', a 12/13 > > regression (different cause, needs a separate PR). > > Can't reproduce Alexander,

[Bug middle-end/106804] Poor codegen for selecting and incrementing value behind a reference

2022-09-02 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106804 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org ---

[Bug target/106834] GCC creates R_X86_64_GOTOFF64 for 4-bytes immediate

2022-09-05 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106834 --- Comment #10 from Alexander Monakov --- Okay, so this should have been reported against Binutils, but since we are having the conversation here: the current behavior is not good, gas is silently selecting a different relocation kind for no

[Bug target/106453] Redundant zero extension after crc32q

2022-09-05 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106453 Alexander Monakov changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---

[Bug lto/91299] [10/11/12/13 Regression] LTO inlines a weak definition in presence of a non-weak definition from an ELF file

2022-09-06 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91299 Alexander Monakov changed: What|Removed |Added Keywords||wrong-code Summary|LTO

[Bug tree-optimization/106781] [13 Regression] ICE: verify_flow_info failed (error: returns_twice call is not first in basic block 2) since r13-1754-g7a158a5776f5ca95

2022-08-31 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106781 --- Comment #5 from Alexander Monakov --- GCC discovers that 'bar' is noreturn, tries to remove its LHS but unfortunately cgraph.cc:cgraph_edge::redirect_call_stmt_to_callee wants to emit an assignment of SSA default-def to the LHS.

[Bug target/106902] [11/12/13 Regression] Program compiled with -O3 -mfma produces different result

2022-09-27 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106902 --- Comment #15 from Alexander Monakov --- (In reply to Richard Biener from comment #14) > I can't > seem to reproduce any vectorization for your smaller example though. My small C samples omit some detail as they were meant to illustrate what

[Bug middle-end/102380] [meta-bug] visibility (fvisibility=* and attributes) issues

2022-10-20 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102380 Bug 102380 depends on bug 99619, which changed state. Bug 99619 Summary: fails to infer local-dynamic TLS model from hidden visibility https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99619 What|Removed |Added

[Bug middle-end/99619] fails to infer local-dynamic TLS model from hidden visibility

2022-10-20 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99619 Alexander Monakov changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug target/87832] AMD pipeline models are very costly size-wise

2022-10-24 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87832 --- Comment #1 from Alexander Monakov --- Suggested partial fix for the integer-pipe side of the blowup: https://inbox.sourceware.org/gcc-patches/4549f27b-238a-7d77-f72b-cc77df8ae...@ispras.ru/

[Bug other/107353] [13 regression] Numerous ICEs after r13-3416-g1d561e1851c466

2022-10-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107353 --- Comment #8 from Alexander Monakov --- (In reply to Arseny Solokha from comment #7) > I have it on x86_64-pc-linux-gnu… Thanks for the info (I assume you don't have any special configure arguments), but that's surprising, I ran

[Bug other/107353] [13 regression] Numerous ICEs after r13-3416-g1d561e1851c466

2022-10-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107353 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org ---

[Bug other/107353] [13 regression] Numerous ICEs after r13-3416-g1d561e1851c466

2022-10-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107353 --- Comment #9 from Alexander Monakov --- Actually, latest results from H.J. Lu's periodic x86_64 tester don't exhibit such issues either: https://inbox.sourceware.org/gcc-testresults/20221025065901.6dc0062...@gnu-34.sc.intel.com/T/#u

[Bug other/107353] [13 regression] Numerous ICEs after r13-3416-g1d561e1851c466

2022-10-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107353 --- Comment #11 from Alexander Monakov --- I've broken out the C++ issue from comment #10 as PR 107393, thanks for the testcase. It's a separate issue from emutls and Fortran ICEs on other targets.

[Bug c++/107393] New: Wrong TLS model for specialized template

2022-10-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107393 Bug ID: 107393 Summary: Wrong TLS model for specialized template Product: gcc Version: 13.0 Status: UNCONFIRMED Keywords: wrong-code Severity: normal

[Bug other/107353] [13 regression] Numerous ICEs after r13-3416-g1d561e1851c466

2022-10-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107353 --- Comment #12 from Alexander Monakov --- ICE on the emutls-3.c testcase isn't related to emutls. Rather, the frontend invokes decl_default_tls_model before attributes are processed, so the first time around we miss the 'common' attribute when

[Bug other/107353] [13 regression] Numerous ICEs after r13-3416-g1d561e1851c466

2022-10-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107353 --- Comment #13 from Alexander Monakov --- As for the Fortran testcases, the issue is again caused by the front-end invoking decl_default_tls_model before assigning DECL_COMMON, this time in fortran/trans-common.cc:build_common_decl. So I

[Bug other/107353] frontends sometimes select wrong (too strong) TLS access model

2022-10-26 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107353 Alexander Monakov changed: What|Removed |Added Summary|[13 regression] Numerous|frontends sometimes select

[Bug c/107419] New: attributes are ignored when selecting TLS model

2022-10-26 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107419 Bug ID: 107419 Summary: attributes are ignored when selecting TLS model Product: gcc Version: 13.0 Status: UNCONFIRMED Keywords: wrong-code Severity: normal

[Bug fortran/107421] New: problematic interaction of 'common' and 'threadprivate'

2022-10-26 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107421 Bug ID: 107421 Summary: problematic interaction of 'common' and 'threadprivate' Product: gcc Version: 13.0 Status: UNCONFIRMED Keywords: openmp

[Bug target/106902] [11/12/13 Regression] Program compiled with -O3 -mfma produces different result

2022-09-14 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106902 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org ---

[Bug target/106952] Missed optimization: x < y ? x : y not lowered to minss

2022-09-15 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106952 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org ---

[Bug target/106902] [11/12/13 Regression] Program compiled with -O3 -mfma produces different result

2022-09-15 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106902 --- Comment #7 from Alexander Monakov --- Lawrence, thank you for the nice work reducing the testcase. For RawTherapee the recommended course of action would be to compile everything with -ffp-contract=off, then manually reintroduce use of fma

[Bug target/106902] [11/12/13 Regression] Program compiled with -O3 -mfma produces different result

2022-09-19 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106902 --- Comment #11 from Alexander Monakov --- Can we move -ffp-contract=fast under the -ffast-math umbrella and default to -ffp-contract=on/off? Isn't it easy now to implement -ffp-contract=on by a GENERIC-only match.pd rule?

[Bug target/106902] [11/12/13 Regression] Program compiled with -O3 -mfma produces different result

2022-09-19 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106902 --- Comment #13 from Alexander Monakov --- (In reply to Richard Biener from comment #12) > > Isn't it easy now to implement -ffp-contract=on by a GENERIC-only match.pd > > rule? > > You mean in the frontend only for -ffp-contract=on? Yes. >

[Bug lto/107014] flatten+lto fails the kernel build

2022-09-23 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107014 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org ---

[Bug lto/107014] flatten+lto fails the kernel build

2022-09-23 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107014 --- Comment #5 from Alexander Monakov --- (In reply to Jiri Slaby from comment #4) > > I am surprised that "flatten" blows up on this function. Is that with any > > config, or again some specific settings like gcov? Is there an existing lkml >

[Bug lto/107014] flatten+lto fails the kernel build

2022-09-23 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107014 --- Comment #7 from Alexander Monakov --- I wanted to understand what gets exposed in LTO mode that causes a blowup. I'd say flatten is not appropriate for this function (I don't think you want to force inlining of memset or _find_next_bit?),

[Bug tree-optimization/107250] Load unnecessarily happens before malloc

2022-10-13 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107250 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org ---

[Bug middle-end/107115] Wrong codegen from TBAA under stores that change effective type?

2022-10-06 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107115 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org,

[Bug middle-end/107115] Wrong codegen from TBAA under stores that change effective type?

2022-10-06 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107115 --- Comment #8 from Alexander Monakov --- Just optimizing out the redundant store seems difficult because on some targets scheduling is invoked from reorg (and it relies on alias sets). We need a solution that works for combine too — is it

[Bug target/106902] [11/12/13 Regression] Program compiled with -O3 -mfma produces different result

2022-09-30 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106902 --- Comment #19 from Alexander Monakov --- (In reply to rguent...@suse.de from comment #18) > True - but does that catch the cases people are interested and are > allowed by the FP contraction rules? I'm thinking of > > x = a*b + c*d + e +

[Bug tree-optimization/107107] [10/11/12/13 Regression] Wrong codegen from TBAA when stores to distinct same-mode types are collapsed?

2022-10-01 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107107 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org ---

[Bug target/106902] [11/12/13 Regression] Program compiled with -O3 -mfma produces different result

2022-09-29 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106902 --- Comment #17 from Alexander Monakov --- (In reply to Richard Biener from comment #16) > I do think that since the only way to > preserve expression boundaries is by PAREN_EXPR Yes, but... > that the middle-end > shouldn't care about FAST

[Bug target/107250] Load unnecessarily happens before malloc

2022-10-14 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107250 --- Comment #3 from Alexander Monakov --- Well, obviously because in one function both 'f' and 'tmp' are live across the call, and in the other function only 'f' is live across the call. The difference is literally pushing one register vs. two

[Bug tree-optimization/107099] New: uncprop a bit

2022-09-30 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107099 Bug ID: 107099 Summary: uncprop a bit Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization

[Bug c++/106834] GCC creates R_X86_64_GOTOFF64 for 4-bytes immediate

2022-09-05 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106834 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org ---

[Bug c++/106834] GCC creates R_X86_64_GOTOFF64 for 4-bytes immediate

2022-09-05 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106834 Alexander Monakov changed: What|Removed |Added CC||hjl.tools at gmail dot com ---

[Bug c/106835] [i386] Taking an address of _GLOBAL_OFFSET_TABLE_ produces a wrong value

2022-09-05 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106835 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org ---

[Bug c/106835] [i386] Taking an address of _GLOBAL_OFFSET_TABLE_ produces a wrong value

2022-09-05 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106835 --- Comment #3 from Alexander Monakov --- It would be unfortunate if that makes it difficult or even impossible to make a R_386_32 relocation for the address of GOT in hand-written assembly. In any case, it seems GCC is not making the rules

[Bug c++/106834] GCC creates R_X86_64_GOTOFF64 for 4-bytes immediate

2022-09-05 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106834 --- Comment #6 from Alexander Monakov --- (In reply to Martin Liška from comment #5) > Do you mean gas or ld? gas > How did you get this output, please (from foo.o or final executable)? From foo.o like in comment #0.

[Bug c++/106834] GCC creates R_X86_64_GOTOFF64 for 4-bytes immediate

2022-09-05 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106834 --- Comment #8 from Alexander Monakov --- Right, sorry, due to presence of 'main' I overlooked -fPIC in comment #0, and then after my prompt it got dropped in comment #3. If you modify the testcase as follows and compile it with -fPIC, it's

[Bug middle-end/107115] Wrong codegen from TBAA under stores that change effective type?

2022-10-07 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107115 --- Comment #12 from Alexander Monakov --- For reference, the previous whacked mole appears to be PR 106187 (where mems_same_for_tbaa_p comes from).

[Bug rtl-optimization/106553] pre-register allocation scheduler is now RMW aware

2022-08-08 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106553 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org ---

[Bug rtl-optimization/108117] Wrong instruction scheduling on value coming from abnormal SSA

2022-12-22 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108117 --- Comment #16 from Alexander Monakov --- Draft patch for the sched1 issue: https://inbox.sourceware.org/gcc-patches/cf62c3ec-0a9e-275e-5efa-2689ff1f0...@ispras.ru/T/#m95238afa0f92daa0ba7f8651741089e7cfc03481

[Bug target/108229] New: [13 Regression] unprofitable STV transform

2022-12-26 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108229 Bug ID: 108229 Summary: [13 Regression] unprofitable STV transform Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component:

[Bug middle-end/108209] New: goof in genmatch.cc:commutative_op

2022-12-23 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108209 Bug ID: 108209 Summary: goof in genmatch.cc:commutative_op Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end

[Bug target/108229] [13 Regression] unprofitable STV transform since r13-4873-g0b2c1369d035e928

2022-12-28 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108229 --- Comment #3 from Alexander Monakov --- Thank you! I considered this unprofitable for these reasons: 1. As you said, the code grows in size, but the speed benefit is not clear. 2. The transform converts load+add operations in a loop, and

[Bug middle-end/108256] New: Missing integer overflow instrumentation when assignment LHS is narrow

2022-12-31 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108256 Bug ID: 108256 Summary: Missing integer overflow instrumentation when assignment LHS is narrow Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: normal

[Bug target/108315] New: -mcpu=power10 changes ABI

2023-01-06 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108315 Bug ID: 108315 Summary: -mcpu=power10 changes ABI Product: gcc Version: 13.0 Status: UNCONFIRMED Keywords: ABI, wrong-code Severity: normal Priority: P3

[Bug rtl-optimization/108318] Floating point calculation moved out of loop despite fesetround

2023-01-06 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108318 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org ---

[Bug target/108322] Using __restrict parameter with -ftree-vectorize (default with -O2) results in massive code bloat

2023-01-06 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108322 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org ---

[Bug target/108322] Using __restrict parameter with -ftree-vectorize (default with -O2) results in massive code bloat

2023-01-10 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108322 --- Comment #5 from Alexander Monakov --- (In reply to Richard Biener from comment #4) > > For the case at hand loading two vectors from the destination and then > punpck{h,l}bw and storing them again might be the most efficient thing > to do

[Bug middle-end/108376] TSVC s1279 runs 40% faster with aocc than gcc at zen4

2023-01-11 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108376 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org ---

[Bug middle-end/108209] goof in genmatch.cc:commutative_op

2022-12-23 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108209 --- Comment #1 from Alexander Monakov --- Keeping notes as I go... Duplicated checks for 'op0' in lower_for are duplicated.

[Bug middle-end/107905] 2x slowdown versus CLANG and ICL

2022-11-30 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107905 --- Comment #6 from Alexander Monakov --- Let me add that Clang supports GCC's -fprofile-{generate,use} flags for compatibility as well.

[Bug tree-optimization/108008] [12 Regression] wrong code with -O3 and posix_memalign

2022-12-08 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108008 --- Comment #9 from Alexander Monakov --- I think this is tree-ldist placing memset(sameZ, 0, zPlaneCount) after the loop, overwriting conditional 'sameZ[i] = true' assignments that happen in the loop. For the smaller testcase from comment #6,
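To illustrate why the placement of the memset matters, here is a minimal sketch. It is not the testcase from the PR: the function names, the `z` array and the equality condition are hypothetical, chosen only to show a cleared flag array with conditional stores in a loop. The second function mimics, at source level, the effect the comment attributes to loop distribution.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Intended semantics (hypothetical example): clear all flags first,
   then set a flag wherever two neighbouring values are equal. */
void mark_expected(bool *same, const int *z, size_t n)
{
    memset(same, 0, n * sizeof *same);
    for (size_t i = 1; i < n; ++i)
        if (z[i] == z[i - 1])
            same[i] = true;
}

/* Source-level picture of the reported miscompilation: the memset runs
   after the loop, so every conditional store is overwritten. */
void mark_with_late_memset(bool *same, const int *z, size_t n)
{
    for (size_t i = 1; i < n; ++i)
        if (z[i] == z[i - 1])
            same[i] = true;
    memset(same, 0, n * sizeof *same);
}
```

For `z = {5, 5, 7, 7}` the first function sets the flags at indices 1 and 3, while the second leaves every flag cleared.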

[Bug tree-optimization/108008] [12 Regression] wrong code with -O3 and posix_memalign

2022-12-11 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108008 --- Comment #10 from Alexander Monakov --- Looks similar to PR 107323, but needs explicit -ftree-loop-distribution to trigger.

[Bug tree-optimization/108076] [10/11/12/13 Regression] GCC with -O3 produces code which fails to link

2022-12-12 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108076 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug tree-optimization/107879] [13 Regression] ffmpeg-4 test suite fails on FPU arithmetics

2022-12-05 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107879 --- Comment #10 from Alexander Monakov --- If anyone is confused like I was, the commit actually includes a testcase, but the addition is not mentioned in the Changelog. I was sure the server-side receive hook was supposed to reject such

[Bug c/107971] linking an assembler object creates an executable stack

2022-12-05 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107971 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org ---

[Bug middle-end/108140] ICE expanding __rbit

2022-12-16 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108140 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug rtl-optimization/108117] Wrong instruction scheduling on value coming from abnormal SSA

2022-12-15 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108117 Alexander Monakov changed: What|Removed |Added Status|RESOLVED|UNCONFIRMED

[Bug rtl-optimization/108117] Wrong instruction scheduling on value coming from abnormal SSA

2022-12-14 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108117 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org ---

[Bug rtl-optimization/108117] Wrong instruction scheduling on value coming from abnormal SSA

2022-12-15 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108117 --- Comment #9 from Alexander Monakov --- (In reply to Feng Xue from comment #8) > In another angle, because gcc already model control flow and SSA web for > setjmp/longjmp, explicit volatile specification is not really needed. That covers

[Bug tree-optimization/108129] New: nop_atomic_bit_test_and_p is too bloated

2022-12-15 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108129 Bug ID: 108129 Summary: nop_atomic_bit_test_and_p is too bloated Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component:

[Bug rtl-optimization/108117] Wrong instruction scheduling on value coming from abnormal SSA

2022-12-15 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108117 --- Comment #12 from Alexander Monakov --- Shouldn't there be another bug for the sched1 issue specifically? In absence of abnormal control flow, extending lifetimes of pseudos across calls is still likely to be a pessimization.

[Bug rtl-optimization/108117] Wrong instruction scheduling on value coming from abnormal SSA

2022-12-15 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108117 Alexander Monakov changed: What|Removed |Added Resolution|DUPLICATE |FIXED --- Comment #14 from

[Bug rtl-optimization/108117] Wrong instruction scheduling on value coming from abnormal SSA

2022-12-15 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108117 Alexander Monakov changed: What|Removed |Added Resolution|FIXED |DUPLICATE --- Comment #15 from

[Bug rtl-optimization/57067] Missing control flow edges for setjmp/longjmp

2022-12-15 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57067 --- Comment #9 from Alexander Monakov --- *** Bug 108117 has been marked as a duplicate of this bug. ***

[Bug target/87832] AMD pipeline models are very costly size-wise

2022-12-07 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87832 --- Comment #11 from Alexander Monakov --- Factoring out Lujiazui divider shrinks its tables by almost 20x: 3 r lujiazui_decoder_min_issue_delay 20 r lujiazui_decoder_transitions 32 r lujiazui_agu_min_issue_delay 126 r lujiazui_agu_transitions

[Bug c++/108008] Compiler mis-optimization with posix_memalign

2022-12-07 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108008 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org ---

[Bug target/107676] Nonsensical docs for -mrelax-cmpxchg-loop

2022-11-16 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107676 Alexander Monakov changed: What|Removed |Added Status|NEW |RESOLVED CC|

[Bug tree-optimization/107715] TSVC s161 for double runs at zen4 30 times slower when vectorization is enabled

2022-11-16 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107715 --- Comment #3 from Alexander Monakov --- There's a forward dependency over 'c' (read of c[i] vs. write of c[i+1] with 'i' iterating forward), and the vectorized variant takes the hit on each iteration. How is a slowdown even surprising. For
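The access pattern described in this comment can be sketched as follows. This is an illustrative loop modelled loosely on the description (a read of `c[i]` and a write of `c[i+1]` with `i` counting upwards); the function name, the branch condition and the arithmetic are assumptions, not the benchmark source.

```c
#include <stddef.h>

/* Hypothetical kernel: one branch stores to c[i + 1], the other loads
   c[i]. With i iterating forward, a store made in iteration i can be
   read back in iteration i + 1. */
void forward_dep(float *a, const float *b, float *c,
                 const float *d, const float *e, size_t n)
{
    for (size_t i = 0; i + 1 < n; ++i) {
        if (b[i] < 0.0f)
            c[i + 1] = a[i] + d[i] * d[i];  /* write one element ahead */
        else
            a[i] = c[i] + d[i] * e[i];      /* read the current element */
    }
}
```

When a vectorized version processes several values of `i` at once, the wide load of `c` can overlap elements stored just before it, which is one plausible reading of "takes the hit on each iteration".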

[Bug target/87832] AMD pipeline models are very costly size-wise

2022-11-16 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87832 --- Comment #8 from Alexander Monakov --- (In reply to Jan Hubicka from comment #7) > > 53730 r btver2_fp_min_issue_delay > > 53760 r znver1_fp_transitions > > 93960 r bdver3_fp_transitions > > 106102 r lujiazui_core_check > > 106102 r

[Bug target/87832] AMD pipeline models are very costly size-wise

2022-11-16 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87832 --- Comment #10 from Alexander Monakov --- (In reply to Jan Hubicka from comment #9) > Actually for older cores I think the manufacturers do not care much. I > still have a working Bulldozer machine and I can do some testing. > I think in

[Bug middle-end/107719] 14% regression on TSVC s3113 on znve4 compared to GCC 7.5

2022-11-16 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107719 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org ---

[Bug rtl-optimization/107772] function prologue generated even though it's only needed in an unlikely path

2022-11-28 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107772 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org ---

[Bug target/104688] gcc and libatomic can use SSE for 128-bit atomic loads on Intel and AMD CPUs with AVX

2022-11-28 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688 --- Comment #26 from Alexander Monakov --- Sure, the right course of action seems to be to simply document that atomic types and built-ins are meant to be used on "common" (writeback) memory, and no guarantees can be given otherwise, because it

[Bug target/104688] gcc and libatomic can use SSE for 128-bit atomic loads on Intel and AMD CPUs with AVX

2022-11-28 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688 --- Comment #24 from Alexander Monakov --- (In reply to Peter Cordes from comment #23) > But at least on Linux, I don't think there's a way for user-space to even > ask for a page of WT or WP memory (or UC or WC). Only WB memory is easily >

[Bug middle-end/107905] 2x slowdown versus CLANG and ICL

2022-11-29 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107905 Alexander Monakov changed: What|Removed |Added Keywords|ra | CC|

[Bug driver/107787] -Werror=array-bounds=X does not work as expected

2022-11-30 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107787 Alexander Monakov changed: What|Removed |Added Status|NEW |RESOLVED CC|

[Bug target/87832] AMD pipeline models are very costly size-wise

2022-11-16 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87832 --- Comment #6 from Alexander Monakov --- With these patches on trunk, current situation is: nm -CS -t d --defined-only gcc/insn-automata.o | sed 's/^[0-9]* 0*//' | sort -n | tail -40 2496 r slm_base 2527 r bdver3_load_min_issue_delay 2746 r

[Bug tree-optimization/107647] [12/13 Regression] GCC 12.2.0 may produce FMAs even with -ffp-contract=off

2022-11-17 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107647 --- Comment #15 from Alexander Monakov --- I'm confused about the first hunk in the attached patch: --- a/gcc/tree-vect-slp-patterns.cc +++ b/gcc/tree-vect-slp-patterns.cc @@ -1035,8 +1035,10 @@ complex_mul_pattern::matches

[Bug tree-optimization/97832] AoSoA complex caxpy-like loops: AVX2+FMA -Ofast 7 times slower than -O3

2022-11-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97832 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org ---

[Bug middle-end/107879] [13 Regression] ffmpeg-4 test suite fails on FPU arithmetics

2022-11-26 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107879 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org ---
