https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113682
--- Comment #9 from Tamar Christina ---
(In reply to Andrew Pinski from comment #8)
> This might be the path splitting running on the gimple level causing issues
> too; see PR 112402 .
Ah that's a good shout. It looks like Richi already
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114577
Bug ID: 114577
Summary: Inefficient codegen for SVE/NEON bridge
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114510
Tamar Christina changed:
What|Removed |Added
CC||tnfchris at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403
--- Comment #20 from Tamar Christina ---
This is a bad interaction with early break and peeling for gaps.
when peeling for gaps we set bias_for_lowest to 0, which then negates the ceil
for the upper bound calculation when the div is exact.
We
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114403
Tamar Christina changed:
What|Removed |Added
Status|UNCONFIRMED |ASSIGNED
Last reconfirmed|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113682
Tamar Christina changed:
What|Removed |Added
Status|UNCONFIRMED |NEW
Ever confirmed|0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
Tamar Christina changed:
What|Removed |Added
Ever confirmed|0 |1
Summary|[14 Regression]
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114061
Bug ID: 114061
Summary: GCC fails vectorization when using __builtin_prefetch
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113257
--- Comment #5 from Tamar Christina ---
(In reply to Sam James from comment #3)
> (In reply to Richard Earnshaw from comment #2)
> I'm missing why the combination then works though?
So we've made several changes here over time.
-mcpu=native
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114061
--- Comment #2 from Tamar Christina ---
(In reply to Andrew Pinski from comment #1)
> I thought there was already one recorded about this.
I could only find https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103938 about an
ICE when prefetching a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114061
--- Comment #4 from Tamar Christina ---
(In reply to Andrew Pinski from comment #3)
> Confirmed.
>
> Though maybe we should drop them in the vectorized version of the loop. HW
> prefetchers usually do a decent job and sometimes (maybe most) SW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114063
Bug ID: 114063
Summary: Use IFN_CHECK_RAW_PTRS/IFN_CHECK_WAR_PTRS for
Advanced. SIMD
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114068
--- Comment #12 from Tamar Christina ---
looks like the moving of the store didn't update a stray out of block use of
the MEM.
working on patch.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102171
--- Comment #3 from Tamar Christina ---
(In reply to Andrew Pinski from comment #2)
> I think I am going to implement this (or assign it interally to someone else
> to implement).
If you do, please also remove them from arm_neon.h and use the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86530
--- Comment #8 from Tamar Christina ---
(In reply to Andrew Pinski from comment #6)
> With my patch for V4QI, we still don't get the best code:
> vect_perm_even_271 = VEC_PERM_EXPR 4, 6 }>;
> vect_perm_even_273 = VEC_PERM_EXPR 4, 6 }>;
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
Tamar Christina changed:
What|Removed |Added
CC||rsandifo at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114099
--- Comment #8 from Tamar Christina ---
Created attachment 57537
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57537=edit
uses.patch
new code seems sensitive to visitation order as get_virtual_phi returns NULL
for blocks which don't
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
--- Comment #27 from Tamar Christina ---
Created attachment 57538
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57538=edit
proposed1.patch
proposed patch, this gets the gathers and scatters back. doing regression run.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114192
Tamar Christina changed:
What|Removed |Added
Ever confirmed|0 |1
Status|UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98877
--- Comment #9 from Tamar Christina ---
While RA should be able to deal with this,
shouldn't we also just lower TBLs in gimple?
This no reason why this can't be a VEC_PERM_EXPR which would also get the
copies
removed at the gimple level and
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114151
Bug ID: 114151
Summary: [14 Regression] weird and inefficient codegen and
addressing modes since
g:a0b1798042d033fd2cc2c806afbb77875dd2909b
Product: gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114151
--- Comment #3 from Tamar Christina ---
>
> This was a correctness fix btw, so I'm not sure we can easily recover - we
> could try using niter information for CHREC_VARIABLE but then there's
> variable niter here so I don't see a chance.
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98877
--- Comment #11 from Tamar Christina ---
(In reply to Andrew Pinski from comment #10)
> (In reply to Tamar Christina from comment #9)
> > While RA should be able to deal with this,
> > shouldn't we also just lower TBLs in gimple?
> >
> > This
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98877
--- Comment #12 from Tamar Christina ---
and it's not the first time we have conditional lowering. We already do so for
e.g. shifts, where shifting by an amount => bitsize of a vector element is
defined behavior or AArch64.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114234
Tamar Christina changed:
What|Removed |Added
Last reconfirmed||2024-03-05
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114068
Tamar Christina changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |tnfchris at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114068
--- Comment #13 from Tamar Christina ---
Created attachment 57510
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57510=edit
candidate-patch1.patch
candidate patch being tested.
I was hoping to correct it during peeling itself when the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115120
--- Comment #3 from Tamar Christina ---
That makes sense, though I also wonder how it works for scalar multi exit
loops, IVops has various checks on single exits.
I guess one problem is that the code in IVops that does this uses the exit to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115130
Bug ID: 115130
Summary: (early-break) [meta-bug] early break vectorization
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: meta-bug, missed-optimization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115130
Tamar Christina changed:
What|Removed |Added
Ever confirmed|0 |1
Last reconfirmed|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635
--- Comment #6 from Tamar Christina ---
(In reply to Jakub Jelinek from comment #4)
> Now, with SVE/RISCV vectors the actual vectorization factor is a poly_int
> rather than constant. One possibility would be to use VLA arrays in those
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635
Bug ID: 114635
Summary: OpenMP reductions fail dependency analysis
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114860
--- Comment #9 from Tamar Christina ---
(In reply to prathamesh3492 from comment #8)
> Hi Tamar,
> Using -falign-loops=5 indeed brings back the performance.
> The adrp instruction has same address (0x4ae784) by setting -falign-loops=5
> (which
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932
--- Comment #11 from Tamar Christina ---
(In reply to Richard Biener from comment #10)
> I think the question is why IVOPTs ends up using both the signed and
> unsigned variant of the same IV instead of expressing all uses of both with
> one
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54013
Tamar Christina changed:
What|Removed |Added
Blocks||115130
--- Comment #4 from Tamar
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932
--- Comment #13 from Tamar Christina ---
(In reply to rguent...@suse.de from comment #12)
> > since we don't care about overflow here, it looks like the stripping should
> > be recursive as long as it's a NOP expression between two integral
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932
--- Comment #15 from Tamar Christina ---
(In reply to rguent...@suse.de from comment #14)
> On Thu, 6 Jun 2024, tnfchris at gcc dot gnu.org wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932
> >
> > --- Comment #13 from Tamar
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115464
--- Comment #10 from Tamar Christina ---
Thanks for the fix, but I don't think it's sufficient.
what I meant with the earlier comment was that the subregs are broken in
general, so not just the one generated by the undef fast path.
i.e.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115531
--- Comment #3 from Tamar Christina ---
(In reply to Andrew Pinski from comment #1)
> I suspect PR 20999 would fix this ...
> but we have to be careful since without masked stores, you could still
> vectorize this unlike the transformed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115531
Bug ID: 115531
Summary: vectorizer generates inefficient code for masked
conditional update loops
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Keywords:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115534
Bug ID: 115534
Summary: intermediate stack use not eliminated
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115534
--- Comment #2 from Tamar Christina ---
(In reply to Andrew Pinski from comment #1)
> I suspect there is a dup of this already. See the bug which I made this one
> blocking for a list of related bugs.
Most of the other bugs relate to the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115537
Bug ID: 115537
Summary: [15 Regression] vectorizable_reduction ICEs after
g:d66b820f392aa9a7c34d3cddaf3d7c73bf23f82d
Product: gcc
Version: 15.0
Status: UNCONFIRMED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115537
--- Comment #5 from Tamar Christina ---
Thanks for the fix!
I think the testcase needs SVE enabled to ICE no?
shouldn't that be -mcpu=neoverse-v1 and not -mcpu=neoverse-n1?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115534
--- Comment #5 from Tamar Christina ---
(In reply to Andrew Pinski from comment #4)
> This might be improved by
> https://gcc.gnu.org/pipermail/gcc-patches/2024-June/654819.html . Or it
> might be the case the vectorizer case needs to be
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114932
--- Comment #9 from Tamar Christina ---
It's taken me a bit of time to track down all the reasons for the speedup with
the earlier patch.
This comes from two parts:
1. Signed IVs don't get simplified. Due to possible UB with signed overflows
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115464
Tamar Christina changed:
What|Removed |Added
CC||rsandifo at gcc dot gnu.org
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115464
Tamar Christina changed:
What|Removed |Added
Last reconfirmed||2024-06-12
CC|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115464
--- Comment #7 from Tamar Christina ---
(In reply to Tamar Christina from comment #6)
> (In reply to Richard Sandiford from comment #5)
> > In this kind of situation, we should go through a fresh pseudo rather than
> > try to take the subreg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115464
--- Comment #6 from Tamar Christina ---
(In reply to Richard Sandiford from comment #5)
> In this kind of situation, we should go through a fresh pseudo rather than
> try to take the subreg directly.
I did try that but fwprop pushed it back
701 - 750 of 750 matches
Mail list logo