https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110438
Bug ID: 110438
Summary: generating all-ones zmm needs dep-breaking pxor before
ternlog
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110202
--- Comment #9 from Alexander Monakov ---
(In reply to Hongtao.liu from comment #8)
>
> For this one, we can load *a into %zmm0 to avoid false_dependence.
>
> vmovdqau ZMMWORD PTR [rdi], zmm0
> vpternlogq zmm0, zmm0, zmm0, 85
Yes, since
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110431
Bug ID: 110431
Summary: Incorrect disambiguation of wide accesses from
store-merging or SLP
Product: gcc
Version: 12.3.0
Status: UNCONFIRMED
Keywords:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110237
--- Comment #21 from Alexander Monakov ---
(In reply to rguent...@suse.de from comment #19)
> But the size argument doesn't have anything to do with TBAA (and
> may_alias is about TBAA). I don't think we have any way to circumvent
> C object
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110273
--- Comment #6 from Alexander Monakov ---
Huh? Just compile the supplied testcases without avx512, you'll see proper
stack realignment.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110307
--- Comment #13 from Alexander Monakov ---
Note to self: check how control_flow_insn_p relates.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110307
--- Comment #6 from Alexander Monakov ---
Cross-compiler needs HAVE_AS_EXPLICIT_RELOCS=1.
With checking enabled, we get:
t.c:8:1: error: flow control insn inside a basic block
(call_insn 97 96 98 4 (parallel [
(set (reg:DI 0 $0)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110307
--- Comment #3 from Alexander Monakov ---
Do you have older versions of GCC to check on this testcase?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110260
--- Comment #10 from Alexander Monakov ---
Right, those are different issues. Any chance of a standalone testcase
extracted from Wine? If you already see a function where stack realignment is
missing, just give us preprocessed containing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106943
Alexander Monakov changed:
What|Removed |Added
CC||amonakov at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106943
--- Comment #24 from Alexander Monakov ---
Appreciate the advice. So far I've managed to reduce the number of LTO inputs
down to two files, RegisterBankInfo.cpp.o plus APInt.cpp.o. I also built
gcc-12.3 with lineinfo and have a better
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106943
--- Comment #26 from Alexander Monakov ---
Would that help? GCC raises its own stack limit to 64MB:
gcc.cc: stack_limit_increase (64 * 1024 * 1024);
toplev.cc: stack_limit_increase (64 * 1024 * 1024);
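For readers unfamiliar with the mechanism, here is a minimal sketch of how a program can raise its own soft stack limit on a POSIX system. It is not GCC's implementation of stack_limit_increase; the helper name raise_stack_limit and its return convention are assumptions made for illustration. Only getrlimit, setrlimit and RLIMIT_STACK are real POSIX interfaces.

```c
#include <sys/resource.h>

/* Hypothetical helper (not GCC's code): try to raise the soft stack
   limit to at least `wanted` bytes without exceeding the hard limit.
   Returns 0 on success or when no change was needed, -1 on error. */
int raise_stack_limit(rlim_t wanted)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;

    /* Nothing to do if the soft limit is unlimited or already large enough. */
    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur >= wanted)
        return 0;

    /* An unprivileged process cannot go above the hard limit, so clamp. */
    if (rl.rlim_max != RLIM_INFINITY && wanted > rl.rlim_max)
        wanted = rl.rlim_max;

    rl.rlim_cur = wanted;
    return setrlimit(RLIMIT_STACK, &rl);
}
```

A caller would invoke raise_stack_limit(64 * 1024 * 1024) early in main, before any deep recursion starts.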
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106943
--- Comment #31 from Alexander Monakov ---
(In reply to Xi Ruoyao from comment #28)
> "To put it simply, operator delete for class User inspects memory of the
> object after the end of its lifetime. This shows as a use-after-dtor error
> when
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109841
Bug ID: 109841
Summary: [12/13/14 Regression] ranger ICE in
operator_bitwise_not::fold_range
Product: gcc
Version: 12.3.0
Status: UNCONFIRMED
Keywords:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106943
--- Comment #32 from Alexander Monakov ---
Ranger ICE is PR 109841 (reduced so it doesn't need LTO).
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109849
Alexander Monakov changed:
What|Removed |Added
CC||amonakov at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109806
Alexander Monakov changed:
What|Removed |Added
CC||amonakov at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109690
Alexander Monakov changed:
What|Removed |Added
CC||amonakov at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90746
Alexander Monakov changed:
What|Removed |Added
CC||amonakov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106943
--- Comment #8 from Alexander Monakov ---
Ah, forgot to mention that compiling the offending User.cpp without -flto also
avoids the problem.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106943
--- Comment #7 from Alexander Monakov ---
This problem seems to go way back. I'm told even gcc-9 broke LLVM like that.
For my investigation, I took latest gcc-11 snapshot and llvm-13.0.1.
My conclusion that it is a lifetime-dse violation in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106902
--- Comment #20 from Alexander Monakov ---
I missed it the first time around, but placing PAREN_EXPR around the complete
expression won't work: nothing will prevent GCC from duplicating evaluations of
the sub-expressions, and then randomly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106943
--- Comment #17 from Alexander Monakov ---
Right, thanks, I think SUSE build log confirms that (careful, large file):
https://build.opensuse.org/public/build/openSUSE:Factory/standard/x86_64/llvm16/_log
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106943
--- Comment #21 from Alexander Monakov ---
(In reply to Xi Ruoyao from comment #18)
> Maybe. Should we send a patch?
Yes, if we have a volunteer.
> If I read the LLVM code correctly, -fno-strict-aliasing is enabled for
> Clang, but not other
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106943
--- Comment #22 from Alexander Monakov ---
(In reply to Jan Hubicka from comment #19)
> It would be really nice to have the ranger bug fixed. Since lifetime
> DSE is all handled in C++ FE there is no good reason why it should not
> work to LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106943
--- Comment #10 from Alexander Monakov ---
Indeed, that makes things easier, thanks.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106943
--- Comment #12 from Alexander Monakov ---
That would not fix the problem, lifetime-dse affects code that creates 'class
User' objects, not the implementation of its 'operator new' override.
(also the linked bug says "MDNode has the same
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106943
--- Comment #14 from Alexander Monakov ---
(In reply to Jan Hubicka from comment #13)
> Indeed it is quite long time problem with clang not building with lifetime
> DSE and strict aliasing. I wonder why this is not fixed on clang side?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106902
--- Comment #22 from Alexander Monakov ---
Created attachment 55105
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55105&action=edit
patch 1/3
(In reply to Richard Biener from comment #21)
>
> Sounds reasonable. Though I wouldn't use GENERIC
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106902
--- Comment #25 from Alexander Monakov ---
(In reply to Richard Biener from comment #24)
> As of the patch it looks good, I wonder if we want to check for OPTIMIZE_BOTH
> though since at least when no extra negations are required the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106902
--- Comment #26 from Alexander Monakov ---
> > Did you run into any of NON_LVALUE / C_MAYBE_CONST wrappings of the
> > multiplication btw?
>
> No, I'm not familiar with those, so I didn't try to construct corresponding
> testcases.
I had a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109780
--- Comment #10 from Alexander Monakov ---
(In reply to Martin Liška from comment #9)
> Started with zen tuning revision r13-4839-geef81eefcdc2a5.
The issue is also reproducible with -march=haswell or -march=skylake, so you
can use those for
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109780
Alexander Monakov changed:
What|Removed |Added
CC||amonakov at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109780
Alexander Monakov changed:
What|Removed |Added
Summary|csmith: runtime crash with |[12/13/14 Regression]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109780
--- Comment #12 from Alexander Monakov ---
Eh, that commit sneakily changed avx2 tuning without explaining that in the
Changelog. Anyway, it should be possible to "workaround" that by compiling with
-O2 -mavx2 -mtune=skylake-avx512
instead, in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80922
Alexander Monakov changed:
What|Removed |Added
CC||bruno at clisp dot org
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109916
Alexander Monakov changed:
What|Removed |Added
Resolution|--- |DUPLICATE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109892
Bug ID: 109892
Summary: SLP failure with explicit fma
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113560
Alexander Monakov changed:
What|Removed |Added
CC||amonakov at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113903
Alexander Monakov changed:
What|Removed |Added
CC||amonakov at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113890
Alexander Monakov changed:
What|Removed |Added
CC||amonakov at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44179
Alexander Monakov changed:
What|Removed |Added
CC||amonakov at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113159
Alexander Monakov changed:
What|Removed |Added
CC||amonakov at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113280
Alexander Monakov changed:
What|Removed |Added
CC||amonakov at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113293
Alexander Monakov changed:
What|Removed |Added
CC||amonakov at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113082
Alexander Monakov changed:
What|Removed |Added
CC||amonakov at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=07
Alexander Monakov changed:
What|Removed |Added
CC||amonakov at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112697
--- Comment #9 from Alexander Monakov ---
... as does inserting a nop before the compare ¯\_(ツ)_/¯
--- d.out.ltrans0.ltrans.slow.s 2023-12-01 18:32:54.255841611 +0300
+++ d.out.ltrans0.ltrans.s 2023-12-01 18:53:04.909438690 +0300
@@
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112697
--- Comment #8 from Alexander Monakov ---
Thanks, I can reproduce it. It is pretty tricky though. For instance, just
swapping the mov and the compare is enough to make it fast:
--- d.out.ltrans0.ltrans.slow.s 2023-12-01 18:32:54.255841611
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111655
--- Comment #13 from Alexander Monakov ---
> Then there is the MULT_EXPR x * x case
This is PR 111701.
It would be nice to clarify what "nonnegative" means in the contracts of this
family of functions, because it's ambiguous for NaNs and
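As an illustration of why such a contract can be ambiguous, the sketch below contrasts two plausible readings of "nonnegative" for a double. Both function names are hypothetical and are not GCC internals, and the choice of negative zero as a second problematic input is an assumption of this example, since the comment above is cut off.

```c
#include <math.h>

/* Reading 1 (hypothetical): "nonnegative" means the value compares >= 0.
   Any comparison involving a NaN is false, so NaN is rejected,
   while -0.0 compares equal to 0.0 and is accepted. */
int nonneg_by_compare(double x)
{
    return x >= 0.0;
}

/* Reading 2 (hypothetical): "nonnegative" means the sign bit is clear.
   -0.0 is rejected; for a NaN the answer depends on its sign bit. */
int nonneg_by_signbit(double x)
{
    return !signbit(x);
}
```

The two readings disagree on -0.0, and neither gives an obviously meaningful answer for NaN.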
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112699
--- Comment #2 from Alexander Monakov ---
Sorry, even though GCC's limits.h is installed under include-fixed, it is
generated separately, not by the generic fixincludes mechanism. I was confused.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112699
Alexander Monakov changed:
What|Removed |Added
CC||amonakov at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112701
Bug ID: 112701
Summary: wrong type inference for ternary operator in
preprocessing context
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112697
Alexander Monakov changed:
What|Removed |Added
CC||amonakov at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110307
Alexander Monakov changed:
What|Removed |Added
CC||uros at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114765
Alexander Monakov changed:
What|Removed |Added
CC||amonakov at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114960
Bug ID: 114960
Summary: [12/13/14/15 Regression] fails to clean up vector
casts
Product: gcc
Version: 12.3.1
Status: UNCONFIRMED
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115014
Alexander Monakov changed:
What|Removed |Added
CC||amonakov at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114923
Alexander Monakov changed:
What|Removed |Added
CC||amonakov at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114923
--- Comment #4 from Alexander Monakov ---
You can place points of possible access outside of abstract machine in a
fine-grained manner with volatile asms:
asm volatile("" : "=m"(buf));
This cannot be reordered against accesses to volatile
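A minimal sketch of the idiom quoted above, assuming a GNU C compiler (GCC or Clang) with extended asm. The buffer size, its initial value, and the function name read_after_barrier are made up for illustration; only the asm statement itself comes from the comment.

```c
/* Hypothetical buffer; size and contents are illustrative. */
static char buf[16] = { 42 };

/* Hypothetical function showing the effect of the empty asm. */
int read_after_barrier(void)
{
    /* The asm emits no instructions, but its "=m" output operand tells
       the compiler that buf may have been written at this point, so the
       load below cannot be folded to the static initializer. */
    __asm__ volatile("" : "=m"(buf));

    /* This load is performed after the asm. */
    return buf[0];
}
```

Because the asm body is empty, nothing actually modifies the buffer at run time; the statement only constrains what the optimizer may assume about buf at that point.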
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114944
--- Comment #4 from Alexander Monakov ---
Like this:
pand xmm1, XMMWORD PTR .LC0[rip]
movaps XMMWORD PTR [rsp-40], xmm0
xor eax, eax
xor edx, edx
movaps XMMWORD PTR [rsp-24], xmm1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114944
Alexander Monakov changed:
What|Removed |Added
CC||amonakov at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115091
Alexander Monakov changed:
What|Removed |Added
CC||amonakov at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114261
--- Comment #3 from Alexander Monakov ---
The first attachment is empty (perhaps you made a non-recursive archive when
you meant to recursively zip a directory).
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108866
Alexander Monakov changed:
What|Removed |Added
CC||amonakov at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114261
--- Comment #8 from Alexander Monakov ---
If we want to get rid of the compilation time regression sooner rather than
later, I can suggest limiting my change only to functions that call setjmp:
diff --git a/gcc/sched-deps.cc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114261
--- Comment #10 from Alexander Monakov ---
Indeed, but OTOH according to bug 84402 comment 58 it caused a noticeable hit
on gimple-match.cc compilation:
733a1b777f16cd397b43a242d9c31761f66d3da8 13th January 2023
sched-deps: do not schedule
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114261
Alexander Monakov changed:
What|Removed |Added
CC||mkuvyrkov at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114337
Alexander Monakov changed:
What|Removed |Added
CC||amonakov at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114480
--- Comment #21 from Alexander Monakov ---
It is possible to reduce gcc_qsort workload by improving the presorted-ness of
the array, but of course avoiding quadratic behavior would be much better.
With the following change, we go from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114480
--- Comment #20 from Alexander Monakov ---
(note that if you uninclude the testcase and compile with -fno-exceptions it's
much faster)
On the smaller testcase from comment 14, prune_unused_phi_nodes invokes
gcc_qsort 53386 times. There are two
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66487
--- Comment #28 from Alexander Monakov ---
The bug is about the issue of lacking diagnostics, it should be fine to make
note of various approaches to remedy the problem in one bug report.
(in any case, all discussion of the Valgrind-based
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115161
Alexander Monakov changed:
What|Removed |Added
CC||amonakov at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115170
Alexander Monakov changed:
What|Removed |Added
CC||amonakov at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115132
Alexander Monakov changed:
What|Removed |Added
CC||amonakov at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115161
--- Comment #18 from Alexander Monakov ---
No, allowing value-changing transformations under -ftrapping-math is really not
appropriate. Invoking the intrinsic on a large floating-point value is not UB.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115161
--- Comment #20 from Alexander Monakov ---
(In reply to Jakub Jelinek from comment #19)
> If we guarantee that we never constant fold FIX/UNSIGNED_FIX with
> -ftrapping-math (we shouldn't, as the exceptions should be raised), then
> using
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115161
--- Comment #23 from Alexander Monakov ---
(In reply to Sergei Trofimovich from comment #22)
> Here `pcmpeqd %xmm2,%xmm1` is a problematic instruction. Why does `gcc` use
> `%xmm2` (result of `cvttps2dq`) instead of, say `%xmm0` which contains
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115333
Alexander Monakov changed:
What|Removed |Added
CC||amonakov at gcc dot gnu.org
---