gnu.org |rsandifo at gcc dot
gnu.org
--- Comment #1 from Richard Sandiford ---
Testing a patch.
-on-valid-code
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: rsandifo at gcc dot gnu.org
CC: avieira at gcc dot gnu.org
Target Milestone: ---
Target: aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115258
Richard Sandiford changed:
What|Removed |Added
Summary|[14/15 Regression] register |[14 Regression] register
|unassigned at gcc dot gnu.org |rsandifo at gcc dot
gnu.org
--- Comment #2 from Richard Sandiford ---
I agree with Andrew that this seems mostly to be luck. On that basis, I'm not
sure (either way) whether we should backport the fix.
Now that we're hopefully getting better at tracking
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115192
Richard Sandiford changed:
What|Removed |Added
Summary|[11/12/13/14/15 regression] |[11/12/13/14 regression]
|unassigned at gcc dot gnu.org |rsandifo at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114635
--- Comment #19 from Richard Sandiford ---
(In reply to Richard Biener from comment #14)
> Usually targets do have a limit on the actual length but I see
> constant_upper_bound_with_limit doesn't query such. But it would
> be a more
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114664
--- Comment #14 from Richard Sandiford ---
Yeah, I think so.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114664
--- Comment #12 from Richard Sandiford ---
(In reply to Peter Bergner from comment #11)
> > > but how are users supposed to know whether
> > > -fno-omit-frame-pointer is in effect or not? I've looked and there is no
> > > pre-defined macro a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114664
--- Comment #10 from Richard Sandiford ---
(In reply to Peter Bergner from comment #7)
> Then that would seem to indicate that mentioning the frame pointer reg in
> the asm clobber list is an error
Yeah, I agree it's an error. The PR says
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114607
--- Comment #2 from Richard Sandiford ---
Fixed on trunk. I'll backport in a few weeks if there's no fallout.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114607
Richard Sandiford changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |rsandifo at gcc dot
gnu.org
: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: rsandifo at gcc dot gnu.org
Target Milestone: ---
Target: aarch64*-*-*
svsudot is supposed to expand to USDOT with the second and third arguments
swapped. However, there is a thinko
|UNCONFIRMED |ASSIGNED
Assignee|unassigned at gcc dot gnu.org |rsandifo at gcc dot
gnu.org
Ever confirmed|0 |1
--- Comment #2 from Richard Sandiford ---
Fixed on trunk so far, but I'll backport if possible.
: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: rsandifo at gcc dot gnu.org
Target Milestone: ---
Target: aarch64*-*-*
An overly lax condition on the cnot combine pattern means that we optimise:
#include
svint32_t foo(svbool_t pg
||rsandifo at gcc dot gnu.org
Status|UNCONFIRMED |RESOLVED
--- Comment #2 from Richard Sandiford ---
Fixed.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114521
Richard Sandiford changed:
What|Removed |Added
CC||rsandifo at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114515
--- Comment #5 from Richard Sandiford ---
For the record, the associated new testsuite failures are:
FAIL: gcc.target/aarch64/ashltidisi.c scan-assembler-times asr 3
FAIL: gcc.target/aarch64/asimd-mull-elem.c scan-assembler-times
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114515
--- Comment #4 from Richard Sandiford ---
(In reply to Richard Biener from comment #1)
> Btw, why does forwprop not do this?
Not 100% sure (I wasn't involved in choosing the current heuristics). But
fwprop can propagate across blocks, so there
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114515
--- Comment #3 from Richard Sandiford ---
In RTL terms, the dup is vec_duplicate. The combination is:
Trying 10 -> 13:
10: r107:V4SF=vec_duplicate(r115:SF)
REG_DEAD r115:SF
13: r110:V4SF=r111:V4SF*r107:V4SF
REG_DEAD
-optimization
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: rsandifo at gcc dot gnu.org
Target Milestone: ---
The following test regressed on aarch64 after
g
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97696
Richard Sandiford changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114302
--- Comment #5 from Richard Sandiford ---
(In reply to Andrew Stubbs from comment #4)
> Yes, that's what the simd-math-3* tests do.
Ah, OK.
> The simd-math-5* tests are explicitly supposed to be doing this in the
> context of the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114302
--- Comment #3 from Richard Sandiford ---
Ah, ok. If the main aim is to test the libgcc routines, it might be safer to
use something like:
typedef char v64qi __attribute__((vector_size(64)));
v64qi f(v64qi x, v64qi y) { return x / y; }
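As a rough sketch of how such a function could be checked at run time (the helper name check_f and the input values are illustrative assumptions, not taken from the actual simd-math tests), the vector result can be compared lane by lane against scalar division:

```c
#include <string.h>

typedef char v64qi __attribute__((vector_size(64)));

/* The function suggested in the comment: one vector division.  */
v64qi f(v64qi x, v64qi y) { return x / y; }

/* Hypothetical checker: returns 1 if every lane of f(x, y) matches
   the scalar quotient, 0 otherwise.  */
int check_f(void)
{
  char xs[64], ys[64], out[64];
  v64qi x, y, r;

  for (int i = 0; i < 64; i++) {
    xs[i] = (char)(i * 3 - 90);   /* arbitrary dividends */
    ys[i] = (char)(i % 7 + 1);    /* divisors 1..7, never zero */
  }

  /* Copy through memcpy to avoid any aliasing questions.  */
  memcpy(&x, xs, sizeof x);
  memcpy(&y, ys, sizeof y);
  r = f(x, y);
  memcpy(out, &r, sizeof r);

  for (int i = 0; i < 64; i++)
    if (out[i] != (char)(xs[i] / ys[i]))
      return 0;
  return 1;
}
```

The divisors are kept small and positive so that no lane divides by zero or overflows, whether char is signed or unsigned on the target.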
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114302
--- Comment #1 from Richard Sandiford ---
The decision to stop narrowing division was deliberate, see the comments in
PR113281 for details. Is the purpose of the test to check vectorisation
quality, or to check for the right ABI routines?
-on-valid-code
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: rsandifo at gcc dot gnu.org
Target Milestone: ---
The following test ICEs with -Ofast on aarch64:
void bar();
float
foo (float x
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
Richard Sandiford changed:
What|Removed |Added
Attachment #57602|0 |1
is obsolete|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
--- Comment #41 from Richard Sandiford ---
(In reply to Richard Biener from comment #40)
> So I wonder if we can use "local costing" to decide a gather is always OK
> compared to the alternative with peeling for gaps. On x86 gather tends
> to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
--- Comment #39 from Richard Sandiford ---
(In reply to Richard Sandiford from comment #38)
> (In reply to Richard Biener from comment #37)
> > Even more iteration looks bad. I do wonder why when gather can avoid
> > peeling for GAPs using
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
--- Comment #38 from Richard Sandiford ---
(In reply to Richard Biener from comment #37)
> Even more iteration looks bad. I do wonder why when gather can avoid
> peeling for GAPs using load-lanes cannot?
Like you say, we don't realise that all
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
--- Comment #36 from Richard Sandiford ---
Created attachment 57602
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57602&action=edit
proof-of-concept patch to suppress peeling for gaps
This patch does what I suggested in the previous comment:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
--- Comment #35 from Richard Sandiford ---
Maybe I've misunderstood the flow of the ticket, but it looks to me like we do
still correctly recognise the truncating scatter stores. And, on their own, we
would be able to convert them into masked
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
--- Comment #33 from Richard Sandiford ---
Can you give me a chance to look at it a bit when I'm back? This doesn't feel
like the way to go to me.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113441
--- Comment #31 from Richard Sandiford ---
(In reply to Tamar Christina from comment #29)
> This works fine for normal gather and scatters but doesn't work for widening
> gathers and narrowing scatters which only the pattern seems to handle.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98877
Richard Sandiford changed:
What|Removed |Added
CC||rsandifo at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97696
--- Comment #3 from Richard Sandiford ---
Created attachment 57520
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57520&action=edit
Candidate patch
The attached patch seems to fix it. I'm taking next week off, but I'll run the
patch through
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97696
Richard Sandiford changed:
What|Removed |Added
CC||rsandifo at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113205
Richard Sandiford changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113205
--- Comment #12 from Richard Sandiford ---
Created attachment 57511
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57511&action=edit
Candidate patch
Sorry for the very slow response on this. I'm testing the attached.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113205
Richard Sandiford changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |rsandifo at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112922
Richard Sandiford changed:
What|Removed |Added
Resolution|--- |FIXED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 112922, which changed state.
Bug 112922 Summary: [14 Regression] 465.tonto from SPECFP 2006 fails train run
on Aarch64-linux with -O2 and -flto
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112922
What
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113295
Richard Sandiford changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 113295, which changed state.
Bug 113295 Summary: [14 Regression] SPEC 2006 416.gamess miscompares on Aarch64
when built with -Ofast -mcpu=native since
g:2f46e3578d45ff060a0a329cb39d4f52878f9d5a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113613
Richard Sandiford changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113295
--- Comment #6 from Richard Sandiford ---
For me the miscompilation is in jkdmem_, where we end up allocating the same
registers to both arms of an fcsel. It sounds like it occurs elsewhere too.
I have a candidate fix, but need to think a bit
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114007
--- Comment #14 from Richard Sandiford ---
I might have misunderstood the suggestion and so be arguing against something
that no-one is suggesting, but I think [[__extension__ …]] should accept the
same things for all standard versions (C23,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113995
Richard Sandiford changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113220
Richard Sandiford changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113295
Richard Sandiford changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |rsandifo at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112922
--- Comment #2 from Richard Sandiford ---
I don't remember there being a deliberate bug fix in that patch,
but there were some others later. I suppose the optimistic case
is that this first went latent and then was fixed “properly”
afterwards.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113778
Richard Sandiford changed:
What|Removed |Added
Resolution|--- |DUPLICATE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113220
Richard Sandiford changed:
What|Removed |Added
CC||zsojka at seznam dot cz
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113805
Richard Sandiford changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113995
Richard Sandiford changed:
What|Removed |Added
CC||rsandifo at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113988
Richard Sandiford changed:
What|Removed |Added
CC||rsandifo at gcc dot gnu.org
|1
Last reconfirmed||2024-02-15
Assignee|unassigned at gcc dot gnu.org |rsandifo at gcc dot
gnu.org
--- Comment #3 from Richard Sandiford ---
Possibly the same as PR113220
||2024-02-12
Ever confirmed|0 |1
Assignee|unassigned at gcc dot gnu.org |rsandifo at gcc dot
gnu.org
--- Comment #1 from Richard Sandiford ---
Heh. I knew this was coming after seeing the previous -mtrack-speculation bug, but
didn't
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113763
--- Comment #14 from Richard Sandiford ---
AFAIK, the constructor shouldn't be necessary. (And without it, the whole
thing would fit on one line.) LGTM (and preapproved) otherwise. Thanks for
doing this.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113763
--- Comment #11 from Richard Sandiford ---
Currently away so can't try it myself, but how about just using an ad-hoc
structure instead?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #34 from Richard Sandiford ---
(In reply to Richard Biener from comment #32)
> Btw, AVX512 knotb will invert all 8 bits and there's no knot just affecting
> the lowest 4 or 2 bits.
>
> It all feels like disaster waiting to happen
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #30 from Richard Sandiford ---
(In reply to Richard Biener from comment #29)
> But that's just for CONSTRUCTORs, we got the VIEW_CONVERT_EXPR path for
> VECTOR_CSTs. But yeah, that _might_ argue we should perform the same
> masking
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113636
Richard Sandiford changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113623
Richard Sandiford changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111677
--- Comment #21 from Richard Sandiford ---
(In reply to Alex Coplan from comment #13)
> The problem seems to be this code in aarch64_process_components:
>
> while (regno != last_regno)
> {
> bool frame_related_p =
|unassigned at gcc dot gnu.org |rsandifo at gcc dot
gnu.org
--- Comment #6 from Richard Sandiford ---
Mine.
|1
Assignee|unassigned at gcc dot gnu.org |rsandifo at gcc dot
gnu.org
Last reconfirmed||2024-01-29
--- Comment #10 from Richard Sandiford ---
Mine.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281
--- Comment #24 from Richard Sandiford ---
Fixed on trunk so far, but it's latent on branches. I'll see what
the trunk fallout is like before asking about backports.
|unassigned at gcc dot gnu.org |rsandifo at gcc dot
gnu.org
--- Comment #7 from Richard Sandiford ---
early-ra does try to avoid reusing registers too soon, to increase scheduling
freedom. But in this case I imagine it handles the two statements as separate
regions. Should be fixable by carrying
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #18 from Richard Sandiford ---
(In reply to Tamar Christina from comment #17)
> Well the mid-end has generated the right precision. The type it generates is
> vector(4) vexit_reduc_67;
> so it does say it's a single bit boolean.
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #16 from Richard Sandiford ---
(In reply to Richard Biener from comment #15)
> I think the problem is the cbranch pattern which looks at all of the
> QImode mask - but of course it doesn't know it's really V4BImode it's
> working on
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113550
Richard Sandiford changed:
What|Removed |Added
CC||rsandifo at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113485
Richard Sandiford changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113572
Richard Sandiford changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576
--- Comment #13 from Richard Sandiford ---
I don't think there's any principle that upper bits must be zero.
How do we end up with a pattern that depends on that being the case?
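To make the concern concrete, here is a minimal model in plain C. It is not GCC's internal representation, and every name in it is a hypothetical chosen for illustration. It treats a 4-lane boolean vector as the low four bits of an 8-bit mask and shows that a whole-byte test and a lane-only test disagree once the upper bits are nonzero:

```c
#include <stdint.h>

/* Hypothetical model: 4 boolean lanes live in the low 4 bits of a byte.  */
enum { NLANES = 4 };
#define LANE_MASK ((uint8_t)((1u << NLANES) - 1))

/* Invert every bit of the byte, as an 8-bit NOT would.
   The four upper bits flip along with the four lanes.  */
uint8_t invert_all_bits(uint8_t m) { return (uint8_t)~m; }

/* Tests the whole byte: only valid if the upper bits are known zero.  */
int any_lane_set_naive(uint8_t m) { return m != 0; }

/* Tests just the meaningful lanes, ignoring the upper bits.  */
int any_lane_set_masked(uint8_t m) { return (m & LANE_MASK) != 0; }
```

Starting from a mask with all four lanes set (0x0F), invert_all_bits gives 0xF0. No lane is set any more, yet any_lane_set_naive still reports one, while any_lane_set_masked does not.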
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113281
Richard Sandiford changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |rsandifo at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113572
Richard Sandiford changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |rsandifo at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113485
Richard Sandiford changed:
What|Removed |Added
Status|NEW |ASSIGNED
--- Comment #7 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109929
--- Comment #7 from Richard Sandiford ---
Hmm, yeah, like you say, neither of those commits should have made a difference
to whether bootstrap works. I guess the problem is just latent now.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111267
--- Comment #12 from Richard Sandiford ---
I don't object to the patch, but for the record: the current heuristics go back
a long way. Although I reworked the pass to use rtl-ssa a few years ago, I
tried as far as possible to preserve the old
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113196
Richard Sandiford changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112989
Richard Sandiford changed:
What|Removed |Added
Resolution|--- |FIXED
Status|ASSIGNED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112989
--- Comment #12 from Richard Sandiford ---
> another is try
> #pragma GCC aarch64 "arm_sve.h"
> after a couple of intentional declarations of the SVE builtins with
> non-standard return/argument types and make sure that while it emits some
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112989
Richard Sandiford changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |rsandifo at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113270
--- Comment #8 from Richard Sandiford ---
Thanks for trying it, and sorry for not doing it myself.
The patch LGTM FWIW, so preapproved if it passes testing (which I'm sure it
will :))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113270
--- Comment #6 from Richard Sandiford ---
I think we want the patch in comment 3, but in addition, I then also needed to
use the following for a similar SVE case:
extern GTY(()) tree scalar_types[NUM_VECTOR_TYPES + 1];
tree
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113104
Richard Sandiford changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68703
Richard Sandiford changed:
What|Removed |Added
CC||rsandifo at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113220
Richard Sandiford changed:
What|Removed |Added
CC|richard.sandiford at arm dot com |rsandifo at gcc dot
gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113196
Richard Sandiford changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |rsandifo at gcc dot
gnu.org
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: rsandifo at gcc dot gnu.org
CC: tnfchris at gcc dot gnu.org
Target Milestone: ---
Target: aarch64*-*-*
For this testcase, adapted from the one for PR110625
||2023-12-30
Ever confirmed|0 |1
CC||rsandifo at gcc dot gnu.org
Assignee|unassigned at gcc dot gnu.org |rsandifo at gcc dot
gnu.org
--- Comment #4 from Richard Sandiford ---
FWIW, we
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113091
--- Comment #5 from Richard Sandiford ---
> The issue here is that because the "outer" pattern consumes
> patt_64 = (int) patt_63 it should have adjusted _2 = (int) _1
> stmt-to-vectorize
> as being the outer pattern root stmt for all this
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113094
Richard Sandiford changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112948
Richard Sandiford changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113094
Richard Sandiford changed:
What|Removed |Added
Assignee|unassigned at gcc dot gnu.org |rsandifo at gcc dot
gnu.org
|RESOLVED
CC||rsandifo at gcc dot gnu.org
--- Comment #5 from Richard Sandiford ---
Fixed.
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: rsandifo at gcc dot gnu.org
Target Milestone: ---
The lack of vec_set and vec_extract optabs for structure modes means that the
following testcase spills to the stack when compiled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109543
--- Comment #5 from Richard Sandiford ---
I think the loop in compute_mode_layout needs to be smarter
for unions. At the moment it's sensitive to field order,
which doesn't make much conceptual sense.
E.g. for the admittedly contrived
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80283
--- Comment #39 from Richard Sandiford ---
(In reply to Andrew Pinski from comment #38)
> For aarch64, the test from comment #11 is so much worse on the trunk than in
> GCC 13.2.0.
I've been working on a fix for that. I'm hoping to post it