https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860
--- Comment #9 from Tamar Christina ---
(In reply to prathamesh3492 from comment #8)
> Hi Tamar,
> Using -falign-loops=5 indeed brings back the performance.
> The adrp instruction has same address (0x4ae784) by setting -falign-loops=5
> (which
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115130
Tamar Christina changed:
What|Removed |Added
Ever confirmed|0 |1
Last reconfirmed|
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
Blocks: 53947
Target Milestone: ---
Meta tickets about early break vectorization to better keep track of early
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115120
--- Comment #3 from Tamar Christina ---
That makes sense, though I also wonder how it works for scalar multi-exit
loops; IVopts has various checks on single exits.
I guess one problem is that the code in IVopts that does this uses the exit to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860
--- Comment #7 from Tamar Christina ---
Yeah, it's most likely an alignment issue, especially as there are no code
changes.
We run our benchmarking with different flags, so that may be why we don't see it.
The loop seems misaligned; you can try
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114412
--- Comment #5 from Tamar Christina ---
(In reply to Filip Kastl from comment #4)
> (In reply to Tamar Christina from comment #3)
> > Hi Filip,
> >
> > Do you generate these runs with counters based PGO or compiler
> > instrumentation?
> >
>
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
Target Milestone: ---
The testcase in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114151 had another
"regression" in that the same loop see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114412
Tamar Christina changed:
What|Removed |Added
CC||tnfchris at gcc dot gnu.org
|ASSIGNED
Last reconfirmed||2024-05-13
Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot
gnu.org
--- Comment #8 from Tamar Christina ---
(In reply to Richard Biener from comment #7)
> Likely
>
> Base: (integ
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932
--- Comment #6 from Tamar Christina ---
Created attachment 58096
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58096&action=edit
exchange2.fppized-bad.f90.187t.ivopts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932
--- Comment #5 from Tamar Christina ---
Created attachment 58095
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58095&action=edit
exchange2.fppized-good.f90.187t.ivopts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932
--- Comment #4 from Tamar Christina ---
reduced more:
---
module brute_force
integer, parameter :: r=9
integer block(r, r, 0)
contains
subroutine brute
do
do
do
do
do
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932
--- Comment #3 from Tamar Christina ---
(In reply to Andrew Pinski from comment #2)
> > which is harder for prefetchers to follow.
>
> This seems like a limitation in the HW prefetcher rather than anything else.
> Maybe the cost model for
: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
CC: rguenth at gcc dot gnu.org
Target Milestone: ---
With the original fix from PR114074 applied (e.g
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92538
Tamar Christina changed:
What|Removed |Added
CC||jamborm at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860
--- Comment #3 from Tamar Christina ---
I cannot reproduce this, even when recompiling libc.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860
Tamar Christina changed:
What|Removed |Added
CC||tnfchris at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860
--- Comment #1 from Tamar Christina ---
Hmm
I am unable to reproduce this with -O3 -flto -mcpu=neoverse-v2 on a
neoverse-v2 machine.
Is any other option required?
Also that code was new in gcc 14 and was partially reverted due to register
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114766
--- Comment #2 from Tamar Christina ---
(In reply to Vladimir Makarov from comment #1)
> (In reply to Tamar Christina from comment #0)
> > The documentation for ^ states:
>
> If it works for you, we could try to use the patch (although it needs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114769
Tamar Christina changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114769
--- Comment #2 from Tamar Christina ---
I believe this is safe, but the interface is definitely not the cleanest.
vect_recog_absolute_difference has two callers:
1. vect_recog_sad_pattern where if you return true with unprom not set, then
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113625
Tamar Christina changed:
What|Removed |Added
CC||tnfchris at gcc dot gnu.org
-optimization
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
CC: vmakarov at gcc dot gnu.org
Target Milestone: ---
The documentation for ^ states
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114741
Tamar Christina changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114513
Bug 114513 depends on bug 114741, which changed state.
Bug 114741 Summary: [14 regression] aarch64 sve: unnecessary fmov for scalar
int bit operations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114741
What|Removed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114741
Tamar Christina changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114741
--- Comment #6 from Tamar Christina ---
And the exact armv9-a cost model you quoted also produces the right codegen.
https://godbolt.org/z/obafoT6cj
There is just an inexplicable penalty being applied to the r->r alternative.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114741
Tamar Christina changed:
What|Removed |Added
CC||tnfchris at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113552
Tamar Christina changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403
Tamar Christina changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403
--- Comment #26 from Tamar Christina ---
(In reply to Richard Biener from comment #25)
> That means, when the loop takes the early exit we _must_ take that during
> the vector iterations. Peeling for gaps means if we would take the early
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403
--- Comment #24 from Tamar Christina ---
(In reply to Richard Biener from comment #23)
> Maybe easier to understand testcase:
>
> with -O3 -msse4.1 -fno-vect-cost-model we return 20 instead of 8. Adding
> -fdisable-tree-cunroll avoids the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403
--- Comment #22 from Tamar Christina ---
Note that, due to the secondary exit, the actual full vector iteration count is
8 scalar elements at VF=4, i.e. 2.
It's at this boundary condition that we fail, since ceil (8/4) == 2. Any
other value would
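The boundary case mentioned in this comment can be checked with plain integer arithmetic. The sketch below is not GCC code: `floor_div` and `ceil_div` are hypothetical helpers, introduced only to show that rounding up adds nothing when the division is exact, as with 8 scalar elements at VF=4.

```c
#include <assert.h>

/* Hypothetical helpers for illustration only; these are not GCC functions.  */

/* Unsigned division rounding down.  */
static unsigned floor_div (unsigned n, unsigned vf)
{
  return n / vf;
}

/* Unsigned division rounding up.  */
static unsigned ceil_div (unsigned n, unsigned vf)
{
  return (n + vf - 1) / vf;
}
```

With these helpers, `ceil_div (8, 4)` and `floor_div (8, 4)` both give 2, while `ceil_div (9, 4)` gives 3 and `floor_div (9, 4)` gives 2. Only the exact case loses the extra iteration that rounding up would otherwise provide.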
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403
--- Comment #21 from Tamar Christina ---
Created attachment 57932
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57932&action=edit
loop.c
Attached a reduced testcase that reproduces the issue and also checks the buffer
position and copied values.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635
--- Comment #6 from Tamar Christina ---
(In reply to Jakub Jelinek from comment #4)
> Now, with SVE/RISCV vectors the actual vectorization factor is a poly_int
> rather than constant. One possibility would be to use VLA arrays in those
>
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
Target Milestone: ---
The following testcase reduced from an HPC workload:
#include
#define RESTRICT restrict
void work(int n, float
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
Target Milestone: ---
Target: aarch64*
The following sequence:
#include
svint32_t f (int *a, int *b)
{
int32x4_t va = vld1q_s32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114510
Tamar Christina changed:
What|Removed |Added
CC||tnfchris at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114515
Tamar Christina changed:
What|Removed |Added
CC||tnfchris at gcc dot gnu.org
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
Target Milestone: ---
Target: aarch64*
Created attachment 57864
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113682
--- Comment #9 from Tamar Christina ---
(In reply to Andrew Pinski from comment #8)
> This might be the path splitting running on the gimple level causing issues
> too; see PR 112402 .
Ah that's a good shout. It looks like Richi already
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113682
Tamar Christina changed:
What|Removed |Added
Status|UNCONFIRMED |NEW
Ever confirmed|0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403
--- Comment #20 from Tamar Christina ---
This is a bad interaction between early break and peeling for gaps.
When peeling for gaps we set bias_for_lowest to 0, which then negates the ceil
for the upper bound calculation when the division is exact.
We
||2024-04-02
Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot
gnu.org
Ever confirmed|0 |1
--- Comment #19 from Tamar Christina ---
Thanks! Back from holidays and looking into it now.
Mine.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114345
--- Comment #5 from Tamar Christina ---
(In reply to Richard Biener from comment #4)
> Well, the shuffling in .LOAD_LANES will be a bit awkward to do, but sure. We
> basically lack "constant folding" of .LOAD_LANES and similarly of course
> we
-optimization
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
Target Milestone: ---
Target: aarch64*
The following example:
#include
svfloat64_t widening (svint32_t
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114345
--- Comment #3 from Tamar Christina ---
(In reply to Andrew Pinski from comment #2)
> Oh VN does have some knowledge of MASK_STORE and LEN_STORE. Just not
> LOAD_LANES .
>
>
> See PR 106365 for MASK_STORE and LEN_STORE implementation.
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
Target Milestone: ---
The following example:
---
double f(int n, double *data, double b) {
double res = b;
for (int i=0;i
_48
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
Target Milestone: ---
The following testcase:
---
long tdiff = 10412095;
int main() {
struct {
long maximum;
int
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114339
--- Comment #6 from Tamar Christina ---
vectorizer generates:
mask_patt_21.19_58 = vect_perm_even_49 >= vect_cst__57;
mask_patt_21.19_59 = vect_perm_even_55 >= vect_cst__57;
vexit_reduc_63 = mask_patt_21.19_58 | mask_patt_21.19_59;
if
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114151
--- Comment #17 from Tamar Christina ---
> So doing in the vectorizer sth like the following should get us the best
> possible ranges? Ah, probably only global ranges since the SCEV query
> itself would still lack context sensitive info (but
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114234
Tamar Christina changed:
What|Removed |Added
Last reconfirmed||2024-03-05
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114192
Tamar Christina changed:
What|Removed |Added
Ever confirmed|0 |1
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98877
--- Comment #12 from Tamar Christina ---
And it's not the first time we have conditional lowering. We already do so for
e.g. shifts, where shifting by an amount >= bitsize of a vector element is
defined behavior on AArch64.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98877
--- Comment #11 from Tamar Christina ---
(In reply to Andrew Pinski from comment #10)
> (In reply to Tamar Christina from comment #9)
> > While RA should be able to deal with this,
> > shouldn't we also just lower TBLs in gimple?
> >
> > This
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114151
--- Comment #3 from Tamar Christina ---
>
> This was a correctness fix btw, so I'm not sure we can easily recover - we
> could try using niter information for CHREC_VARIABLE but then there's
> variable niter here so I don't see a chance.
>
: 14.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
CC: rguenth at gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98877
--- Comment #9 from Tamar Christina ---
While RA should be able to deal with this,
shouldn't we also just lower TBLs in gimple?
There's no reason why this can't be a VEC_PERM_EXPR, which would also get the
copies removed at the gimple level and
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102171
--- Comment #3 from Tamar Christina ---
(In reply to Andrew Pinski from comment #2)
> I think I am going to implement this (or assign it internally to someone else
> to implement).
If you do, please also remove them from arm_neon.h and use the
--- Comment #29 from Tamar Christina ---
(In reply to rguent...@suse.de from comment #28)
> On Mon, 26 Feb 2024, tnfchris at gcc dot gnu.org wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
> >
> > --- Comment #27 from Tamar Christina ---
> > Cre
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86530
--- Comment #8 from Tamar Christina ---
(In reply to Andrew Pinski from comment #6)
> With my patch for V4QI, we still don't get the best code:
> vect_perm_even_271 = VEC_PERM_EXPR 4, 6 }>;
> vect_perm_even_273 = VEC_PERM_EXPR 4, 6 }>;
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
--- Comment #27 from Tamar Christina ---
Created attachment 57538
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57538&action=edit
proposed1.patch
Proposed patch; this gets the gathers and scatters back. Doing a regression run.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114099
--- Comment #8 from Tamar Christina ---
Created attachment 57537
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57537&action=edit
uses.patch
The new code seems sensitive to visitation order, as get_virtual_phi returns NULL
for blocks which don't
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114081
Tamar Christina changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114068
--- Comment #14 from Tamar Christina ---
patch submitted
https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646415.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114068
--- Comment #13 from Tamar Christina ---
Created attachment 57510
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57510&action=edit
candidate-patch1.patch
Candidate patch being tested.
I was hoping to correct it during peeling itself when the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114068
--- Comment #12 from Tamar Christina ---
It looks like moving the store didn't update a stray out-of-block use of
the MEM.
Working on a patch.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114068
Tamar Christina changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot
gnu.org
-optimization
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
Target Milestone: ---
Target: aarch64*
The following example:
void fn (short *a, short *b, short *c, int
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114061
--- Comment #4 from Tamar Christina ---
(In reply to Andrew Pinski from comment #3)
> Confirmed.
>
> Though maybe we should drop them in the vectorized version of the loop. HW
> prefetchers usually do a decent job and sometimes (maybe most) SW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114061
--- Comment #2 from Tamar Christina ---
(In reply to Andrew Pinski from comment #1)
> I thought there was already one recorded about this.
I could only find https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103938 about an
ICE when prefetching a
: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
Target Milestone: ---
The following example:
void foo(double * restrict a, double * restrict b, int n){
int i;
for(i=0; i
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113257
--- Comment #5 from Tamar Christina ---
(In reply to Sam James from comment #3)
> (In reply to Richard Earnshaw from comment #2)
> I'm missing why the combination then works though?
So we've made several changes here over time.
-mcpu=native
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
Tamar Christina changed:
What|Removed |Added
Ever confirmed|0 |1
Summary|[14 Regression]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113295
Tamar Christina changed:
What|Removed |Added
Keywords|needs-bisection |
Summary|[14 Regression]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113295
--- Comment #3 from Tamar Christina ---
However, I'm able to reproduce it at -Ofast alone; there's no need for `-flto`.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113295
--- Comment #2 from Tamar Christina ---
Bisected to
commit g:2f46e3578d45ff060a0a329cb39d4f52878f9d5a
Author: Richard Sandiford
Date: Thu Dec 14 13:46:16 2023 +
aarch64: Improve handling of accumulators in early-ra
Being very
||2024-02-19
Ever confirmed|0 |1
Priority|P3 |P1
CC||tnfchris at gcc dot gnu.org
--- Comment #1 from Tamar Christina ---
Ah, I missed this. Yeah we've seen it but didn't
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56
--- Comment #21 from Tamar Christina ---
(In reply to Richard Biener from comment #18)
> diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> index 7cf9504398c..8deeecfd4aa 100644
> --- a/gcc/tree-vect-slp.cc
> +++ b/gcc/tree-vect-slp.cc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56
--- Comment #15 from Tamar Christina ---
and just -O3 -march=armv8-a+sve
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56
--- Comment #14 from Tamar Christina ---
(In reply to Richard Biener from comment #13)
> I didn't add STMT_VINFO_SLP_VECT_ONLY, I'm quite sure we can now do both SLP
> of masked loads and stores, so yes, STMT_VINFO_SLP_VECT_ONLY (when we formed
,
||tnfchris at gcc dot gnu.org
--- Comment #1 from Tamar Christina ---
This is a jumpthreading regression caused by:
commit g:0cfc9c953d0221ec3971a25e6509ebe1041f142e
Author: Andrew MacLeod
Date: Thu Aug 17 12:34:59 2023 -0400
Phi analyzer - Initialize with range instead
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56
Tamar Christina changed:
What|Removed |Added
CC||rguenth at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107071
Tamar Christina changed:
What|Removed |Added
CC||tnfchris at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113903
--- Comment #2 from Tamar Christina ---
(In reply to Alexander Monakov from comment #1)
> Lifting those insns from the L8 BB to the L10 BB requires duplicating them
> on all incoming edges targeting L8, doesn't it?
>
No, because they're
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113734
Tamar Christina changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
Target Milestone: ---
The following testcase:
#define N 306
#define NEEDLE 136
int table[N];
int foo (int i, unsigned short parse_tables_n
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113734
Tamar Christina changed:
What|Removed |Added
Component|middle-end |tree-optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113734
--- Comment #24 from Tamar Christina ---
The case I thought would go wrong with the above fix is:
#include
#include
#include
#define N 306
#define NEEDLE 135
__attribute__ ((noipa, noinline))
int use(int x[N])
{
printf("res=%d\n",
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113734
--- Comment #23 from Tamar Christina ---
small standalone reducer:
#include
#include
#include
#define N 306
#define NEEDLE 136
__attribute__ ((noipa, noinline))
int use(int x[N])
{
printf("res=%d\n", x[NEEDLE]);
return x[NEEDLE];
}
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113734
--- Comment #22 from Tamar Christina ---
(In reply to Richard Biener from comment #21)
> loop->nb_iterations_upper_bound exactly is an upper bound on the number of
> latch executions, so maybe I'm missing the point here. When we update it it
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113734
--- Comment #20 from Tamar Christina ---
[local count: 21718864]:
...
_54 = (short unsigned int) bits_106;
_26 = _54 >> 9;
_88 = _139 + 7;
_89 = _88 & 7;
_111 = _26 + 10;
[local count: 181308616]:
# i_66 = PHI
#
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113734
--- Comment #19 from Tamar Christina ---
Ok, removing all the noise shows that this is the same issue as I saw before.
The code out of the vectorizer is correct, but cunroll does a dodgy unrolling.
-fdisable-tree-cunroll confirms it's the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113734
--- Comment #18 from Tamar Christina ---
The loop that gets miscompiled is the initialization loop:
while (parse_tables_n-- && i < 306)
table[i++] = 0;
and indeed, the compiler seems to also be
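For reference, the quoted loop has two exits: the counter running out and `i` reaching the array bound. The scalar sketch below states the behaviour a correct compilation should preserve. The wrapper `init_table` and its signature are assumptions made for illustration; only the loop itself and the bound 306 come from the testcase.

```c
#include <assert.h>

#define N 306

/* Hypothetical wrapper around the loop quoted above: zero up to COUNT
   entries of TABLE starting at index I, stopping early once I reaches
   the end of the array.  Returns the index after the last entry written.  */
static int init_table (int *table, int i, unsigned short count)
{
  while (count-- && i < N)
    table[i++] = 0;
  return i;
}
```

Starting at index 300 with a count of 10, the bound check exits first, so exactly entries 300 to 305 are cleared and 306 is returned. Starting at index 0 with a count of 3, the counter exits first and 3 is returned.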
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113734
--- Comment #17 from Tamar Christina ---
(In reply to Sam James from comment #16)
> Created attachment 57393 [details]
> test.c
>
> OK, all done now (I figured I'd let cvise finish). No more :)
>
> By the way, this fails on arm64 too (at
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113734
--- Comment #15 from Tamar Christina ---
(In reply to Sam James from comment #14)
> Created attachment 57390 [details]
> test.c
>
> I'll try reducing it preprocessed now (couldn't do it before as checking w/
> clang as well in the reduction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113808
Tamar Christina changed:
What|Removed |Added
Resolution|--- |FIXED
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113808
Tamar Christina changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113808
--- Comment #2 from Tamar Christina ---
I guess whether that code is correct depends on which exit was picked though.
I'll look at the dump too.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113750
Tamar Christina changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED