Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
Blocks: 53947
Target Milestone: ---
Target: aarch64*
Given the following vectors
a = [A1 A0]
b = [C D ]
c = [E D ]
we
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121869
--- Comment #9 from Tamar Christina ---
bootstrap and regtest finished, pushed revert of
a632becefad29206a980cc080eee74ed808f9cd3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121869
--- Comment #4 from Tamar Christina ---
reproducer
---
short *elf_uncompress_lzma_block_probs;
void __glibcxx_backtrace_uncompress_lzma() {
  int i = 0;
  for (; i < 300; i++)
    elf_uncompress_lzma_block_probs[i] = 10;
}
---
compiled with -O
-sun-solaris2.11 |sparc-sun-solaris2.11,
||arm-none-linux-gnu
CC||tnfchris at gcc dot gnu.org
Ever confirmed|0 |1
Status|UNCONFIRMED |NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121869
--- Comment #3 from Tamar Christina ---
(In reply to rguent...@suse.de from comment #2)
> > Am 09.09.2025 um 09:34 schrieb tnfchris at gcc dot gnu.org
> > :
> >
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121869
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121766
--- Comment #4 from Tamar Christina ---
The original change happened because with the cost model disabled we started
costing inductions again and stopped costing truncations.
Not costing truncations is just a missing feature, but I think
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121290
--- Comment #22 from Tamar Christina ---
(In reply to Ramana Radhakrishnan from comment #17)
>
>
> (In reply to Tamar Christina from comment #16)
> > (In reply to Soumya AR from comment #13)
> > > Hi Tamar,
> > >
> > > Thanks for the fix.
> >
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121290
--- Comment #23 from Tamar Christina ---
(In reply to Dhruv Chawla from comment #18)
> Hi Tamar, here is a (somewhat-)minimized repro for the RAJAPerf kernel that
> Ramana mentioned: https://godbolt.org/z/jh8Ke6hPx - I'll also attach the
> code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121766
--- Comment #3 from Tamar Christina ---
Ok, have a patch that teaches the cost model about truncating stores.
We now get the SVE code again, but this makes me wonder about the widening
loads code.
We also pick a ridiculous unroll factor for th
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121290
--- Comment #16 from Tamar Christina ---
(In reply to Soumya AR from comment #13)
> Hi Tamar,
>
> Thanks for the fix.
>
> This has now brought back performance for the mentioned kernels with -Ofast
> but is now regressing with -O3 ...
>
> Is
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121290
--- Comment #15 from Tamar Christina ---
(In reply to Soumya AR from comment #13)
> Hi Tamar,
>
> Thanks for the fix.
>
> This has now brought back performance for the mentioned kernels with -Ofast
> but is now regressing with -O3 ...
>
> Is
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121290
--- Comment #14 from Tamar Christina ---
(In reply to Soumya AR from comment #13)
> Hi Tamar,
>
> Thanks for the fix.
>
> This has now brought back performance for the mentioned kernels with -Ofast
> but is now regressing with -O3 ...
>
> Is
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121766
Tamar Christina changed:
What|Removed |Added
CC|tamar.christina at arm dot com |tnfchris at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121766
--- Comment #1 from Tamar Christina ---
The change just brought back the costing to what it was before
https://godbolt.org/z/vWdrf6jPd
But it does look like the costing for the SVE modes does not take into account
that for some modes using a na
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121290
--- Comment #12 from Tamar Christina ---
Fixed, there's a throughput costing issue in the middle-end that I still need
to look at, but this should fix the reported failures.
I'll keep the ticket open so I remember the mid-end issue.
|unassigned at gcc dot gnu.org |tnfchris at gcc dot
gnu.org
Component|tree-optimization |target
--- Comment #10 from Tamar Christina ---
Ok mine, testing some patches.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121290
--- Comment #9 from Tamar Christina ---
Right, so the real tsvc calls
dummy(a, b, c, d, e, aa, bb, cc, chksum);
in each of these to make it so you can't elide the outer loop.
So back to costing. The scalar loop inner costing is quite weird,
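A minimal sketch of why such a call keeps the outer loop alive. This is not the TSVC source: the two-argument dummy(), the array names, LEN, and the volatile counter are assumptions made for illustration (the real call passes all the benchmark arrays plus chksum, as quoted above).

```c
#include <stddef.h>

#define LEN 64

static float a[LEN], b[LEN];

/* volatile: every increment is an observable side effect, so the
   compiler has to keep each call to dummy(). */
static volatile unsigned dummy_calls;

/* Hypothetical stand-in for the TSVC dummy(). */
static int dummy(float *x, float *y)
{
    (void)x;
    (void)y;
    dummy_calls++;
    return 0;
}

/* TSVC-style shape: the inner loop is the kernel being measured, the
   outer loop only repeats it.  Without the dummy() call the repeated
   iterations would all store the same values and could be collapsed
   into one. */
static void kernel(unsigned ntimes)
{
    for (unsigned nl = 0; nl < ntimes; nl++) {
        for (size_t i = 0; i < LEN; i++)
            a[i] = b[i] + 1.0f;
        dummy(a, b);
    }
}
```

Calling kernel(3) runs the inner loop three times and bumps dummy_calls three times; in the real benchmark the same effect comes from dummy() being opaque to the optimizer rather than from a volatile counter.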
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121536
Tamar Christina changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121536
--- Comment #5 from Tamar Christina ---
(In reply to rguent...@suse.de from comment #4)
> On Thu, 14 Aug 2025, tnfchris at gcc dot gnu.org wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121536
> >
> > -
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120996
Tamar Christina changed:
What|Removed |Added
Last reconfirmed||2025-08-15
Status|UNCONFI
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121536
--- Comment #3 from Tamar Christina ---
Richi: g:fb59c5719c17a04ecfd58b5e566eccd6d2ac583a is problematic for us because
without the type we can't tell which one of our scalar register file the
operation is working on.
e.g. a + b has different c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121536
--- Comment #2 from Tamar Christina ---
(In reply to Jennifer Schmitz from comment #0)
>
> Comparing the 185t.vect dumps shows that scalar_stmt previously cost 2,
> while it now costs 1, making vectorization less profitable.
>
Yeah it looks l
||2025-08-13
CC||tnfchris at gcc dot gnu.org
Status|UNCONFIRMED |ASSIGNED
Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot
gnu.org
--- Comment #1 from Tamar Christina ---
The main loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121290
--- Comment #7 from Tamar Christina ---
(In reply to rguent...@suse.de from comment #6)
> On Wed, 13 Aug 2025, tnfchris at gcc dot gnu.org wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121290
> >
> > -
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121290
--- Comment #5 from Tamar Christina ---
In gimple that's
[local count: 108459]:
x_22 = a[0];
_69 = {x_22, x_22, x_22, x_22};
[local count: 10737416]:
# ivtmp_83 = PHI
[local count: 1063004408]:
# i_43 = PHI
# ivtmp_34 = P
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121290
Tamar Christina changed:
What|Removed |Added
CC||tnfchris at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121382
Tamar Christina changed:
What|Removed |Added
CC||rguenth at gcc dot gnu.org
--- Commen
|unassigned at gcc dot gnu.org |tnfchris at gcc dot
gnu.org
--- Comment #6 from Tamar Christina ---
Looks like because it thinks that
  f = f - e;
  if (c - 9 * f) {
    __builtin_abort();
is always true, but misses the obvious case that when c == 0 and e ==
2147483647 c will be 0 and stay 0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121382
--- Comment #5 from Tamar Christina ---
Ok, taking a look at c0 and C2.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121383
--- Comment #3 from Tamar Christina ---
(In reply to Richard Biener from comment #2)
> if-conversion does nothing to loops with multiple exits. One could think to
> sub-divide its work, but in this case there's a lack of PHIs and the
> expected
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120996
Tamar Christina changed:
What|Removed |Added
CC||tnfchris at gcc dot gnu.org
: normal
Priority: P3
Component: tree-optimization
Assignee: tnfchris at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
Blocks: 53947, 115130
Target Milestone: ---
The following example loop:
#define N 1000
int a[N] = {0};
int b[N] = {0};
int c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115130
Bug 115130 depends on bug 54013, which changed state.
Bug 54013 Summary: Loop with control flow not vectorized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54013
What|Removed |Added
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
Bug 53947 depends on bug 54013, which changed state.
Bug 54013 Summary: Loop with control flow not vectorized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54013
What|Removed |Added
--
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54013
Tamar Christina changed:
What|Removed |Added
Resolution|--- |FIXED
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120805
Tamar Christina changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120805
Tamar Christina changed:
What|Removed |Added
Status|UNCONFIRMED |ASSIGNED
Ever confirmed|0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
Bug 53947 depends on bug 121190, which changed state.
Bug 121190 Summary: [15 regression] Segmentation fault when executing
vectorized loops with -O3 -march=znver2 since r15-6807-g68326d5d1a593d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121190
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121190
Tamar Christina changed:
What|Removed |Added
Resolution|--- |FIXED
Blocks|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121190
--- Comment #7 from Tamar Christina ---
Backported to GCC 15 to make RC1.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121020
Tamar Christina changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121190
Tamar Christina changed:
What|Removed |Added
Summary|[15/16 regression] |[15 regression]
|Se
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119577
Tamar Christina changed:
What|Removed |Added
Blocks||53947, 115130
--- Comment #5 from Tam
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120805
--- Comment #19 from Tamar Christina ---
(In reply to Avinash Jayakar from comment #18)
> (In reply to Tamar Christina from comment #17)
> > (In reply to Avinash Jayakar from comment #16)
> > > Created attachment 61956 [details]
> > > I have jus
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121240
--- Comment #2 from Tamar Christina ---
(In reply to Andrew Pinski from comment #1)
> Can't use section anchors due to merging at link time.
We can, you just have to make them a subsection, then you control the
unit of merging.
-optimization
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
Target Milestone: ---
Couldn't find an existing ticket for this, but the following example
const dou
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120805
--- Comment #17 from Tamar Christina ---
(In reply to Avinash Jayakar from comment #16)
> Created attachment 61956 [details]
> I have just changed the order within the conditional where the const_vf gets
> assigned, to have it behave in similar
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120805
--- Comment #13 from Tamar Christina ---
(In reply to Avinash Jayakar from comment #11)
>
> > I think the code before worked because a non-partial epilogue would have
> > niters_vector
> > be a const (e.g. a gimple value) but the partial iterati
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120805
--- Comment #9 from Tamar Christina ---
(In reply to Avinash Jayakar from comment #8)
> (In reply to Tamar Christina from comment #7)
> > (In reply to Avinash Jayakar from comment #6)
> > No, const_vf will be 0 when vector length agnostic code i
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120805
--- Comment #7 from Tamar Christina ---
(In reply to Avinash Jayakar from comment #6)
> (In reply to Tamar Christina from comment #5)
> > (In reply to Avinash Jayakar from comment #4)
> > > So my main doubt here is const_vf, is supposed to be 0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120805
--- Comment #5 from Tamar Christina ---
(In reply to Avinash Jayakar from comment #4)
> So my main doubt here is const_vf, is supposed to be 0 for the epilogue
> block right, just like log_vf was null for the epilogue. If so, this is a
> simple
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120922
Tamar Christina changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120922
--- Comment #9 from Tamar Christina ---
(In reply to Robin Dapp from comment #6)
> (In reply to Tamar Christina from comment #5)
> > Question, can I count on
> >
> > -march=rv64gcv_zvl1024b -mrvv-vector-bits=zvl -mrvv-max-lmul=m8
> >
> > alway
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120980
--- Comment #10 from Tamar Christina ---
Could we perhaps emit additional annotation into gimple to describe what the
vectorizer thinks is safe? And the tool verifies the claims?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120922
--- Comment #5 from Tamar Christina ---
Question, can I count on
-march=rv64gcv_zvl1024b -mrvv-vector-bits=zvl -mrvv-max-lmul=m8
always being available as a codegen option for RVV? or do I need some
require-effective-target checks?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120980
--- Comment #5 from Tamar Christina ---
(In reply to Richard Biener from comment #4)
>
> Now, the testcase shows a missed optimization - we are unnecessarily
> using a large VF because of
>
> t.c:2:21: note: ==> examining phi: ivtmp_21 = PHI
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120922
Tamar Christina changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120922
Tamar Christina changed:
What|Removed |Added
CC||tnfchris at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120817
--- Comment #17 from Tamar Christina ---
(In reply to Richard Biener from comment #16)
> No, that cannot be required for correct operation. I think DSE is wrong in
> assessing that the store covers more than 5 bytes. The following fixes it
> f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120817
--- Comment #15 from Tamar Christina ---
(In reply to Richard Biener from comment #13)
> (In reply to Tamar Christina from comment #12)
> > Looks like the problem is that during ao_ref_init_from_ptr_and_range when
> > initializing vectp_target.1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120980
--- Comment #1 from Tamar Christina ---
I'm not sure that I'd draw the same conclusion. I view it as the vectorizer has
put a 32-byte alignment requirement on the object and so I'd consider the
object itself to be 32-bytes sized.
So to the not
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120817
--- Comment #12 from Tamar Christina ---
Looks like the problem is that during ao_ref_init_from_ptr_and_range when
initializing vectp_target.14_54 = &targetD.4595 + _55;
we don't enter the block splitting apart POINTER_PLUS_EXPR.
So it ends up
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120817
--- Comment #11 from Tamar Christina ---
(In reply to Richard Biener from comment #10)
> (In reply to Tamar Christina from comment #8)
> > C testcase
> >
> > typedef struct {
> > int _M_current;
> > } __normal_iterator;
> >
> > typedef str
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120805
--- Comment #3 from Tamar Christina ---
before the bounds variable didn't have any range attached to it.
e.g.
bnd.704_180 = _181 - _132;
but now it shows
# RANGE [irange] unsigned int [1, 2147483647]
bnd.704_180 = _181 - _132;
For some reas
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120959
Tamar Christina changed:
What|Removed |Added
Assignee|tnfchris at gcc dot gnu.org|unassigned at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120959
Tamar Christina changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120817
--- Comment #9 from Tamar Christina ---
So the key to triggering it here is the pass by value.
Testing a patch.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120817
Tamar Christina changed:
What|Removed |Added
Target|aarch64-linux-gnu |aarch64-*
Build|x86_64-l
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120817
--- Comment #7 from Tamar Christina ---
(In reply to Richard Biener from comment #6)
> A C testcase would be really nice.
Have not been able to reproduce it as C yet, but here's a cut-down C++ version
#include
#include
#include
extern int
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120817
--- Comment #4 from Tamar Christina ---
Confirmed, bisecting and taking a look.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120805
Tamar Christina changed:
What|Removed |Added
CC||tnfchris at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115842
--- Comment #12 from Tamar Christina ---
(In reply to Hongtao Liu from comment #11)
> (In reply to Tamar Christina from comment #9)
> > (In reply to Hongtao Liu from comment #8)
> > > (In reply to Tamar Christina from comment #7)
> > > > (In rep
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119187
--- Comment #10 from Tamar Christina ---
(In reply to ktkachov from comment #9)
> (In reply to Tamar Christina from comment #8)
> > (In reply to ktkachov from comment #7)
> > > Could this be extended to scale Neon intrinsics code to SVE by
> > >
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
Bug 53947 depends on bug 116855, which changed state.
Bug 116855 Summary: [14 Regression] Unsafe early-break vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116855
What|Removed |Added
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116855
Tamar Christina changed:
What|Removed |Added
Resolution|--- |WONTFIX
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860
Tamar Christina changed:
What|Removed |Added
Resolution|--- |WONTFIX
Status|UNCONFIRME
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120447
--- Comment #5 from Tamar Christina ---
I could be mistaken, but VNx4QI is a partial vector, so every QI element
occupies 32-bits (so we'd use a widening load here).
I'm not sure this operation is valid for partial vectors as it means you're
ta
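A scalar model of the layout described above, assuming that reading is right (the comment itself hedges). Each 8-bit element gets its own 32-bit container, so filling the vector from a packed byte array behaves like an extending load. The function name and the choice of zero extension are assumptions for illustration, not taken from the bug.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: load n packed bytes into n 32-bit containers.
   Only the low 8 bits of each container carry data; zero extension is
   an assumption of this sketch. */
static void widening_load_u8_to_u32(uint32_t *containers,
                                    const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        containers[i] = src[i];
}
```

In this model four QI elements occupy 16 bytes of register space rather than 4, which is the point the comment is making about partial vectors.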
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120357
--- Comment #9 from Tamar Christina ---
(In reply to Richard Biener from comment #8)
> The following fixes this. I'm not 100% convinced but it does seem "obvious"
> (but for the "peeled" case we seem to eventually create duplicate COND
> reduct
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120447
Tamar Christina changed:
What|Removed |Added
Status|WAITING |NEW
Keywords|needs-source
Status: UNCONFIRMED
Keywords: aarch64-sve, ice-on-valid-code
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
CC: jschmitz at gcc dot
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120383
--- Comment #2 from Tamar Christina ---
(In reply to Richard Biener from comment #1)
> Sure, I'm OK with an optab for it. So it's like (half-type)((unsigned)(a +
> b) >> (sizeof(a)*4))?
Yeah, and I was planning on if an optab was acceptable to
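For concreteness, the quoted expression can be written out for one element width. This is a scalar sketch only, assuming 32-bit unsigned inputs and a 16-bit "half-type"; the function name is hypothetical and says nothing about how the optab itself would be named or defined.

```c
#include <stdint.h>

/* Scalar model of (half-type)((unsigned)(a + b) >> (sizeof(a)*4))
   for 32-bit inputs: add with wraparound, then keep the high half. */
static uint16_t add_high_half_u32(uint32_t a, uint32_t b)
{
    uint32_t sum = a + b;                       /* wraps modulo 2^32 */
    return (uint16_t)(sum >> (sizeof(a) * 4));  /* sizeof(a)*4 == 16 */
}
```

Note that sizeof(a)*4 is half the bit width of a (4 bytes times 4 gives 16), so the shift selects exactly the upper half of the sum.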
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120357
--- Comment #7 from Tamar Christina ---
(In reply to Richard Biener from comment #5)
> Confirmed on trunk. I'll eventually have a look.
Sorry I'm on holiday till Tuesday, I'm happy to take a look then if you prefer.
I did not mean to dump my b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120383
Tamar Christina changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot
gnu.org
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
Blocks: 53947, 115130
Target Milestone: ---
Target: aarch64*
Today if we unroll an early break loop such as:
#define N 640
long
: missed-optimization
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
Blocks: 53947, 115130
Target Milestone: ---
The following sequence
#define N 4
int a[N
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116855
--- Comment #14 from Tamar Christina ---
(In reply to Richard Biener from comment #13)
> Too late for backporting to 14.3 IMO, also not sure how important it is - we
> did not have an actual case where this caused problems AFAIK. early-break
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120164
--- Comment #9 from Tamar Christina ---
(In reply to rguent...@suse.de from comment #8)
> On Thu, 8 May 2025, tnfchris at gcc dot gnu.org wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120164
> >
> > -
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120164
--- Comment #7 from Tamar Christina ---
(In reply to Richard Biener from comment #6)
> (In reply to Tamar Christina from comment #5)
> > The given example is an easy one to drop, but I wonder what would happen if
> > the block had other instruct
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120164
--- Comment #5 from Tamar Christina ---
(In reply to Richard Biener from comment #4)
> Note with "vectorizing" prefetches I meant adjusting the prefetched address,
> "vectorizing" it as an induction but only prefetching on the first (or
> last?)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120164
--- Comment #3 from Tamar Christina ---
(In reply to Tamar Christina from comment #2)
> (In reply to Richard Biener from comment #1)
> > As of today this is a job for the vectorizer if-conversion pass then.
> >
> > OTOH I believe we should work
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120164
--- Comment #2 from Tamar Christina ---
(In reply to Richard Biener from comment #1)
> As of today this is a job for the vectorizer if-conversion pass then.
>
> OTOH I believe we should work towards vectorizing the prefetches themselves
> rathe
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120157
--- Comment #5 from Tamar Christina ---
(In reply to ktkachov from comment #4)
> > Ah indeed, -msve-vector-bits= does do what I expected. Feel free to close
> > this if it's not tracking anything new then.
>
> Ok. FWIW the original testcase for
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120157
Tamar Christina changed:
What|Removed |Added
Status|UNCONFIRMED |ASSIGNED
Assignee|unassigne
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120157
--- Comment #1 from Tamar Christina ---
(In reply to ktkachov from comment #0)
> Not sure if this is a target-specific issue or not. For input:
> int f11(float *x, float val, int n)
> {
> int i;
> for (i = 0; i < n; i++) {
> if (
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116140
--- Comment #20 from Tamar Christina ---
We're currently working on it.
The improvements come from architectures where the code vectorized. The
performance losses come from those where it didn't vectorize, or the vectorizer
generated inefficien
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119351
Tamar Christina changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118892
Tamar Christina changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119921
Tamar Christina changed:
What|Removed |Added
Version|13.3.1 |16.0
Target Milestone|---
-code
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: tnfchris at gcc dot gnu.org
Target Milestone: ---
Target: aarch64*
Created attachment 61193
--> https://gcc.gnu.org/bugzi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119881
--- Comment #2 from Tamar Christina ---
(In reply to Richard Biener from comment #1)
> I wonder where this matters in practice and my usual stance is educating
> users
> about __restrict or #pragma GCC ivdep or OMP simd safelen is better than
>