https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90693
--- Comment #2 from Wilco ---
(In reply to Dávid Bolvanský from comment #1)
> >> __builtin_popcount (x) == 1 into x == (x & -x)
>
>
> This will not work for x = 0.
>
> Should work:
> x && x == (x & -x)
> x && (x & x-1) == 0
Good point,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64242
--- Comment #24 from Wilco ---
Author: wilco
Date: Mon Jun 3 13:55:15 2019
New Revision: 271870
URL: https://gcc.gnu.org/viewcvs?rev=271870&root=gcc&view=rev
Log:
Fix PR64242 - Longjmp expansion incorrect
Improve the fix for PR64242. Various
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90684
--- Comment #3 from Wilco ---
Author: wilco
Date: Mon Jun 3 11:27:50 2019
New Revision: 271864
URL: https://gcc.gnu.org/viewcvs?rev=271864&root=gcc&view=rev
Log:
Fix alignment option parser (PR90684)
Fix the alignment option parser to always allow up
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82853
--- Comment #36 from Wilco ---
(In reply to Orr Shalom Dvory from comment #35)
> Hi, thanks for your respond. can someone mark this bug as need to be
> improved?
> Does anyone agree/disagree with my new proposed method?
It's best to create a
Assignee: unassigned at gcc dot gnu.org
Reporter: wilco at gcc dot gnu.org
Target Milestone: ---
While GCC optimizes __builtin_popcount (x) != 0 into x != 0, we can also
optimize __builtin_popcount (x) == 1 into x == (x & -x), and __builtin_popcount
(x) > 1 into (x & (x-1)) != 0.
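The identities above can be checked with a small C sketch. The helper names are hypothetical and only `__builtin_popcount` is a GCC builtin. Note that the unguarded form `x == (x & -x)` also holds for `x = 0`, where the popcount is 0 rather than 1, so this sketch adds an explicit zero check.

```c
#include <stdbool.h>

/* Reference forms written with the GCC builtin. */
static bool popcount_is_one(unsigned x) { return __builtin_popcount(x) == 1; }
static bool popcount_gt_one(unsigned x) { return __builtin_popcount(x) > 1; }

/* Candidate replacements (hypothetical helper names).
   x & -x isolates the lowest set bit; x & (x - 1) clears it. */
static bool single_bit(unsigned x)    { return x != 0 && x == (x & -x); }
static bool multiple_bits(unsigned x) { return (x & (x - 1)) != 0; }

/* Returns true if both pairs of forms agree for every value below limit. */
static bool forms_agree(unsigned limit)
{
    for (unsigned x = 0; x < limit; x++) {
        if (popcount_is_one(x) != single_bit(x))
            return false;
        if (popcount_gt_one(x) != multiple_bits(x))
            return false;
    }
    return true;
}
```

Calling `forms_agree` over a small range is only a spot check, not a proof, but it does cover the zero case.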
gnu.org |wilco at gcc dot gnu.org
--- Comment #1 from Wilco ---
Proposed patch: https://gcc.gnu.org/ml/gcc-patches/2019-05/msg02030.html
: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: wilco at gcc dot gnu.org
Target Milestone: ---
GCC9 always reports an error when using -falign-functions=16:8:8
cc1: error: invalid number of arguments for ‘-falign-functions’ option:
‘16:8:8’
This is not working
||wilco at gcc dot gnu.org
Resolution|--- |FIXED
--- Comment #6 from Wilco ---
Fixed in GCC9.
||2019-05-29
CC||wilco at gcc dot gnu.org
Ever confirmed|0 |1
--- Comment #1 from Wilco ---
Confirmed
-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: wilco at gcc dot gnu.org
Target Milestone: ---
The following testcase emits a popcount which computes the final pointer value.
This is redundant given the loop already computes the pointer value. The
popcount causes
||wilco at gcc dot gnu.org
Resolution|--- |WORKSFORME
--- Comment #4 from Wilco ---
Works since at least GCC4.5.4.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=16996
Bug 16996 depends on bug 38570, which changed state.
Bug 38570 Summary: [arm] -mthumb generates sub-optimal prolog/epilog
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38570
What|Removed |Added
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38570
Wilco changed:
What|Removed |Added
Status|NEW |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38570
Wilco changed:
What|Removed |Added
CC||wilco at gcc dot gnu.org
--- Comment #12 from
||wilco at gcc dot gnu.org
Resolution|--- |WONTFIX
--- Comment #8 from Wilco ---
There doesn't appear to be anything that can be improved here. Literal pool
loads can't be easily peepholed into LDM, and there aren't many opportunities
anyway.
||wilco at gcc dot gnu.org
Resolution|--- |WORKSFORME
--- Comment #6 from Wilco ---
This has been fixed since at least GCC5.4: https://www.godbolt.org/z/6IAGfh
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263
--- Comment #20 from Wilco ---
(In reply to Martin Liška from comment #19)
> Created attachment 46265 [details]
> Patch candidate v2
>
> Update patch that should be fine. Tests on x86_64 work except:
> FAIL:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263
--- Comment #18 from Wilco ---
(In reply to Martin Liška from comment #14)
> Created attachment 46262 [details]
> Patch candidate
>
> Patch candidate that handles:
>
> $ cat ~/Programming/testcases/mempcpy.c
> int *mempcopy2 (int *p, int *q,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263
--- Comment #17 from Wilco ---
(In reply to Wilco from comment #16)
> (In reply to Martin Sebor from comment #15)
> > I just noticed I have been misreading mempcpy as memccpy and so making no
> > sense. Sorry about that! Please ignore my
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263
--- Comment #16 from Wilco ---
(In reply to Martin Sebor from comment #15)
> I just noticed I have been misreading mempcpy as memccpy and so making no
> sense. Sorry about that! Please ignore my comments.
I see, yes we have too many and the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90249
Wilco changed:
What|Removed |Added
CC||wilco at gcc dot gnu.org
--- Comment #5 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263
--- Comment #12 from Wilco ---
(In reply to Martin Sebor from comment #11)
> My concern is that transforming memccpy to memcpy would leave little
> incentive for libraries like glibc to provide a more optimal implementation.
> Would implementing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263
--- Comment #9 from Wilco ---
(In reply to Martin Sebor from comment #7)
> Rather than unconditionally transforming mempcpy to memcpy I would prefer to
> see libc implementations of memccpy optimized. WG14 N2349 discusses a
> rationale for
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263
--- Comment #6 from Wilco ---
(In reply to Martin Liška from comment #5)
> The discussion looks familiar to me. Isn't that PR70140, where I was
> suggesting something like:
>
> https://marc.info/?l=gcc-patches&m=150166433909242&w=2
>
> with a new
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263
--- Comment #4 from Wilco ---
(In reply to Jakub Jelinek from comment #3)
> Because then you penalize properly maintained targets which do have
> efficient mempcpy. And even if some targets don't have efficient mempcpy
> right now, that doesn't
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90263
--- Comment #2 from Wilco ---
(In reply to Jakub Jelinek from comment #1)
> As stated several times in the past, I strongly disagree.
Why? GCC already does this for bzero and bcopy.
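The conversion argued about in this thread relies on mempcpy differing from memcpy only in its return value: it returns the end of the destination instead of the start. A minimal sketch of that relationship, using a hypothetical helper name rather than anything taken from the GCC sources:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the result of mempcpy expressed through memcpy.
   memcpy returns dst, so adding n gives the one-past-the-end pointer
   that mempcpy would return. */
static void *mempcpy_via_memcpy(void *dst, const void *src, size_t n)
{
    return (char *)memcpy(dst, src, n) + n;
}
```

A caller that chains copies can keep using the returned pointer as the next destination, which is the usual reason for calling mempcpy in the first place.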
Assignee: unassigned at gcc dot gnu.org
Reporter: wilco at gcc dot gnu.org
Target Milestone: ---
While GCC now inlines fixed-size mempcpy like memcpy, GCC still emits calls to
mempcpy rather than converting to memcpy. Since most libraries, including
GLIBC, do not have optimized
Assignee: unassigned at gcc dot gnu.org
Reporter: wilco at gcc dot gnu.org
Target Milestone: ---
GCC does not inline fixed-size memmoves. However memmove can be as easily
inlined as memcpy. The existing memcpy infrastructure could be reused/expanded
for this - all loads would
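As an illustration of why a fixed-size memmove is no harder to expand inline than a memcpy, here is a sketch for a 16-byte move. It assumes the expansion performs every load before any store (the snippet above is cut off, so this ordering is an assumption of the sketch), and `move16` is a hypothetical name.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical inline expansion of memmove(dst, src, 16).
   Both 8-byte loads complete before either store, so the result is
   correct even when the source and destination ranges overlap. */
static void move16(void *dst, const void *src)
{
    uint64_t lo, hi;
    memcpy(&lo, src, 8);                    /* load bytes 0..7   */
    memcpy(&hi, (const char *)src + 8, 8);  /* load bytes 8..15  */
    memcpy(dst, &lo, 8);                    /* store bytes 0..7  */
    memcpy((char *)dst + 8, &hi, 8);        /* store bytes 8..15 */
}
```

The small memcpy calls into local variables stand in for plain register loads and stores; a compiler would normally turn each into a single instruction.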
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87871
--- Comment #52 from Wilco ---
(In reply to Segher Boessenkool from comment #48)
> With just Peter's and Jakub's patch, it *improves* code size by 0.090%.
> That does not fix this PR though :-/
But it does fix most of the codesize regression.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87871
--- Comment #47 from Wilco ---
(In reply to Segher Boessenkool from comment #46)
> With all three patches together (Peter's, mine, Jakub's), I get a code size
> increase of only 0.047%, much more acceptable. Now looking what that diff
> really
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87871
--- Comment #38 from Wilco ---
(In reply to Segher Boessenkool from comment #37)
> Yes, it is a balancing act. Which option works better?
Well the question really is what is bad about movsi_compare0 that could be
easily fixed?
The move is for
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87763
--- Comment #54 from Wilco ---
(In reply to Jeffrey A. Law from comment #53)
> Realistically the register allocation issues are not going to get addressed
> this cycle nor are improvements to the overall handling of RMW insns in
> combine. So
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87763
--- Comment #52 from Wilco ---
(In reply to Jeffrey A. Law from comment #49)
> I think the insv_1 (and it's closely related insv_2) regressions can be
> fixed by a single ior/and pattern in the backend or by hacking up combine a
> bit. I'm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87871
--- Comment #21 from Wilco ---
(In reply to Vladimir Makarov from comment #20)
> (In reply to Wilco from comment #19)
> > (In reply to Peter Bergner from comment #18)
> > > (In reply to Segher Boessenkool from comment #15)
> > > > Popping
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87871
--- Comment #19 from Wilco ---
(In reply to Peter Bergner from comment #18)
> (In reply to Segher Boessenkool from comment #15)
> > Popping a5(r116,l0) -- assign reg 3
> > Popping a3(r112,l0) -- assign reg 4
> > Popping
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81800
--- Comment #16 from Wilco ---
(In reply to Jakub Jelinek from comment #15)
> (In reply to Wilco from comment #14)
> > (In reply to Jakub Jelinek from comment #13)
> > > Patches should be pinged after a week if they aren't reviewed,
> > >
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81800
--- Comment #14 from Wilco ---
(In reply to Jakub Jelinek from comment #13)
> Patches should be pinged after a week if they aren't reviewed, furthermore,
> it is better to CC explicitly relevant maintainers.
I've got about 10 patches waiting,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834
--- Comment #16 from Wilco ---
(In reply to kugan from comment #15)
> (In reply to Wilco from comment #11)
> > There is also something odd with the way the loop iterates, this doesn't
> > look right:
> >
> > whilelo p0.s, x3, x4
> >
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88834
Wilco changed:
What|Removed |Added
CC||wilco at gcc dot gnu.org
--- Comment #11 from
||wilco at gcc dot gnu.org
Resolution|--- |FIXED
Target Milestone|--- |9.0
--- Comment #9 from Wilco ---
Fixed in GCC9 already, so closing.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89493
Wilco changed:
What|Removed |Added
CC||wilco at gcc dot gnu.org
--- Comment #2 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89752
--- Comment #10 from Wilco ---
It seems that rewriting "+rm" into "=rm" and "0" is not equivalent. Eg.
__asm__ ("" : [a0] "=m" (A0) : "0" (A0));
gives a million warnings "matching constraint does not allow a register", so
"0" appears to imply
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89752
--- Comment #4 from Wilco ---
Small example which generates the same ICE on every GCC version:
typedef struct { int x, y, z; } X;
void f(void)
{
X A0, A1;
__asm__ ("" : [a0] "+rm" (A0),[a1] "+rm" (A1));
}
So it's completely invalid inline
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89752
--- Comment #3 from Wilco ---
Full instruction:
(insn 531 530 532 19 (parallel [
(set (mem/c:BLK (reg:DI 3842) [29 A0+0 S2 A64])
(asm_operands:BLK ("") ("=rm") 0 [
(mem/c:BLK (reg:DI 3846) [29
||2019-03-18
CC||wilco at gcc dot gnu.org
Ever confirmed|0 |1
--- Comment #2 from Wilco ---
Confirmed. It ICEs in Eigen::internal::gebp_kernel, 2, 4,
false, false>::operator()
It seems to choke on this asm dur
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89222
Wilco changed:
What|Removed |Added
Target Milestone|--- |8.5
Summary|[7/8/9 regression] ARM
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89222
--- Comment #9 from Wilco ---
Author: wilco
Date: Tue Mar 5 15:04:01 2019
New Revision: 269390
URL: https://gcc.gnu.org/viewcvs?rev=269390&root=gcc&view=rev
Log:
[ARM] Fix PR89222
The GCC optimizer can generate symbols with non-zero offset from simple
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89437
Wilco changed:
What|Removed |Added
Status|UNCONFIRMED |RESOLVED
Resolution|---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89437
--- Comment #1 from Wilco ---
Author: wilco
Date: Mon Mar 4 12:36:04 2019
New Revision: 269364
URL: https://gcc.gnu.org/viewcvs?rev=269364&root=gcc&view=rev
Log:
Fix PR89437
Fix PR89437. Fix the sinatan-1.c testcase to not run without
a C99 target
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86829
Wilco changed:
What|Removed |Added
CC||wilco at gcc dot gnu.org
--- Comment #8 from
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: wilco at gcc dot gnu.org
Target Milestone: ---
A recently added optimization uses an inline expansion for sinl (atanl (x)). As
it involves computing sqrtl (x * x + 1) which can overflow for large x
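For reference, the identity behind this expansion and the overflow it runs into can be written out as follows. This is a sketch of the standard derivation, not text from the bug report; $M$ stands for the largest finite long double value and is a symbol introduced here.

```latex
\begin{align*}
\theta = \arctan x
  &\;\Rightarrow\; \tan\theta = x, \qquad
  \cos\theta = \frac{1}{\sqrt{x^2 + 1}}
  \quad \text{(since } \theta \in (-\tfrac{\pi}{2}, \tfrac{\pi}{2})\text{)} \\
\sin(\arctan x) &= \tan\theta \, \cos\theta = \frac{x}{\sqrt{x^2 + 1}} \\
|x| > \sqrt{M}
  &\;\Rightarrow\; x^2 + 1 \text{ evaluates to } +\infty
  \;\Rightarrow\; \frac{x}{\sqrt{+\infty}} = \pm 0,
  \quad \text{whereas } \lim_{x \to \pm\infty} \sin(\arctan x) = \pm 1
\end{align*}
```

So for sufficiently large finite inputs the inline form returns a signed zero where the exact answer is very close to 1 in magnitude.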
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78314
--- Comment #25 from Wilco ---
(In reply to Steve Ellcey from comment #24)
> See email strings at:
>
> https://gcc.gnu.org/ml/fortran/2019-01/msg00276.html
> https://gcc.gnu.org/ml/fortran/2019-02/msg00057.html
>
> For more discussion.
Sure,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78314
Wilco changed:
What|Removed |Added
CC||wilco at gcc dot gnu.org
--- Comment #23 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89037
Wilco changed:
What|Removed |Added
CC||wilco at gcc dot gnu.org
Version|9.0
||2019-02-15
CC||wilco at gcc dot gnu.org
Summary|ICE in |[8 regression] ICE in
|aarch64_classify_address, |aarch64_classify_address
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89190
--- Comment #2 from Wilco ---
Author: wilco
Date: Wed Feb 13 16:22:25 2019
New Revision: 268848
URL: https://gcc.gnu.org/viewcvs?rev=268848&root=gcc&view=rev
Log:
[ARM] Fix Thumb-1 ldm (PR89190)
This patch fixes an ICE in the Thumb-1 LDM peepholer.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89222
--- Comment #8 from Wilco ---
(In reply to Wilco from comment #7)
> Patch: https://gcc.gnu.org/ml/gcc-patches/2019-02/msg00780.html
Updated patch: https://gcc.gnu.org/ml/gcc-patches/2019-02/msg00947.html
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86637
--- Comment #14 from Wilco ---
Author: wilco
Date: Mon Feb 11 18:14:37 2019
New Revision: 268777
URL: https://gcc.gnu.org/viewcvs?rev=268777&root=gcc&view=rev
Log:
[COMMITTED] Fix pthread errors in pr86637-2.c
Fix test errors on targets which do not
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89222
--- Comment #7 from Wilco ---
Patch: https://gcc.gnu.org/ml/gcc-patches/2019-02/msg00780.html
||2019-02-08
Assignee|unassigned at gcc dot gnu.org |wilco at gcc dot gnu.org
Ever confirmed|0 |1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89222
Wilco changed:
What|Removed |Added
CC||wilco at gcc dot gnu.org
--- Comment #5 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87871
Wilco changed:
What|Removed |Added
CC||wilco at gcc dot gnu.org
--- Comment #8 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89195
--- Comment #11 from Wilco ---
(In reply to Segher Boessenkool from comment #9)
> That patch is pre-approved if it regchecks fine (on more than just x86).
> Thanks!
check-gcc is clean on aarch64_be-none-elf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89195
--- Comment #10 from Wilco ---
(In reply to Jakub Jelinek from comment #8)
> Created attachment 45606 [details]
> gcc9-pr89195.patch
>
> Now in patch form (untested so far).
That works fine indeed. It avoids accessing the object out of bounds
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89195
--- Comment #4 from Wilco ---
(In reply to Segher Boessenkool from comment #3)
> (In reply to Wilco from comment #1)
> > len is unsigned HOST_WIDE_INT, so bits_to_bytes_round_down does an unsigned
> > division...
>
> That shouldn't make a
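To show why the signedness of the division can matter at all, here is a small sketch. It assumes a negative bit count reaches the division, which the snippet above does not confirm, and uses 64-bit types as a stand-in for HOST_WIDE_INT. The function names and the `BITS_PER_BYTE` constant are hypothetical.

```c
#include <stdint.h>

#define BITS_PER_BYTE 8  /* hypothetical constant for this sketch */

/* Signed division: -16 bits is -2 bytes. */
static int64_t bytes_signed(int64_t bits)
{
    return bits / BITS_PER_BYTE;
}

/* Unsigned division: the same bit pattern is first read as a huge
   positive number, so the quotient is huge as well. */
static uint64_t bytes_unsigned(uint64_t bits)
{
    return bits / BITS_PER_BYTE;
}
```

For example, `bytes_signed(-16)` is -2, while `bytes_unsigned((uint64_t)-16)` is 0x1FFFFFFFFFFFFFFE, which is not the same value once converted back to a signed offset.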
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89195
--- Comment #1 from Wilco ---
make_extraction does:
if (MEM_P (inner))
{
poly_int64 offset;
/* POS counts from lsb, but make OFFSET count in memory order. */
if (BYTES_BIG_ENDIAN)
offset
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89195
Wilco changed:
What|Removed |Added
Target||aarch64
Target Milestone|---
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: wilco at gcc dot gnu.org
Target Milestone: ---
The following testcase generates incorrect stack offsets on AArch64 since GCC7
when compiled with -O1 -mbig-endian:
struct S
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89190
Wilco changed:
What|Removed |Added
Status|UNCONFIRMED |ASSIGNED
Last reconfirmed|
: target
Assignee: unassigned at gcc dot gnu.org
Reporter: wilco at gcc dot gnu.org
Target Milestone: ---
The following testcase ICEs with -march=armv8-m.base on arm-none-eabi:
long long a;
int b, c;
int d(int e, int f) { return e << f; }
void g() {
long long h;
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89104
--- Comment #5 from Wilco ---
(In reply to Jakub Jelinek from comment #4)
> I really don't like these aarch64 warnings, declare simd is an optimization
> (admittedly with ABI consequences) and warning about this by default is
> weird,
> + it is
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89104
Wilco changed:
What|Removed |Added
CC||wilco at gcc dot gnu.org
--- Comment #3 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89101
Wilco changed:
What|Removed |Added
Status|WAITING |NEW
Known to work|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89101
Wilco changed:
What|Removed |Added
CC||wilco at gcc dot gnu.org
--- Comment #1 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88760
--- Comment #23 from Wilco ---
(In reply to ktkachov from comment #22)
> helps even more. On Cortex-A72 it gives a bit more than 6% (vs 3%)
> improvement on parest, and about 5.3% on a more aggressive CPU.
> I tried unrolling 8x in a similar
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87763
--- Comment #32 from Wilco ---
Author: wilco
Date: Fri Jan 25 13:29:06 2019
New Revision: 268265
URL: https://gcc.gnu.org/viewcvs?rev=268265&root=gcc&view=rev
Log:
[PATCH][AArch64] Fix generation of tst (PR87763)
The TST instruction no longer matches in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88760
--- Comment #21 from Wilco ---
(In reply to rguent...@suse.de from comment #20)
> On Thu, 24 Jan 2019, wilco at gcc dot gnu.org wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88760
> >
> > --
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88760
--- Comment #19 from Wilco ---
(In reply to rguent...@suse.de from comment #18)
> > 1) Unrolling for load-pair-forming vectorisation (Richard Sandiford's
> > suggestion)
>
> If that helps, sure (I'd have guessed uarchs are going to split
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87763
--- Comment #23 from Wilco ---
Author: wilco
Date: Tue Jan 22 17:49:46 2019
New Revision: 268159
URL: https://gcc.gnu.org/viewcvs?rev=268159&root=gcc&view=rev
Log:
Fix vect-nop-move.c test
Fix a failing test - changes in Combine mean the test now fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87763
--- Comment #22 from Wilco ---
(In reply to Steve Ellcey from comment #21)
> If I look at this specific example:
>
> int f2 (int x, int y)
> {
> return (x & ~0x0ff000) | ((y & 0x0ff) << 12);
> }
>
> Is this because of x0 (a hard register)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87763
--- Comment #19 from Wilco ---
(In reply to Segher Boessenkool from comment #18)
> https://gcc.gnu.org/ml/gcc/2019-01/msg00112.html
Thanks, I hadn't noticed that yet... I need to look at it in more detail, but
are you saying that combine no
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87763
--- Comment #17 from Wilco ---
(In reply to Vladimir Makarov from comment #14)
> I've checked cvtf_1.c generated code and I don't see additional fmov
> anymore. I guess it was fixed by an ira-costs.c change (a special
> consideration of moves
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87763
--- Comment #13 from Wilco ---
(In reply to Segher Boessenkool from comment #12)
> Before the change combine forwarded all argument (etc.) hard registers
> wherever
> it could, doing part of RA's job (and doing a lousy job of it). If after the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87763
--- Comment #11 from Wilco ---
A SPEC2006 run shows the codesize cost of make_more_copies is 0.05%.
Practically all tests are worse, the largest increases are perlbench at 0.20%,
gromacs 0.12%, calculix 0.12%, soplex 0.08%, xalancbmk 0.07%, wrf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88560
--- Comment #6 from Wilco ---
(In reply to Vladimir Makarov from comment #5)
> We have too many tests checking expected generated code. We should more
> focus on overall effect of the change. SPEC would be a good criterium
> although it is
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88760
--- Comment #16 from Wilco ---
(In reply to rguent...@suse.de from comment #15)
> which is what I refered to for branch prediction. Your & prompts me
> to a way to do sth similar as duffs device, turning the loop into a nest.
>
> head:
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88760
--- Comment #14 from Wilco ---
(In reply to rguent...@suse.de from comment #13)
> Usually the peeling is done to improve branch prediction on the
> prologue/epilogue.
Modern branch predictors do much better on a loop than with this kind of
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88739
--- Comment #37 from Wilco ---
(In reply to rsand...@gcc.gnu.org from comment #35)
> Yeah, the expr.c patch makes the original testcase work, but we still fail
> for:
That's the folding in ccp1 after inlining, which will require a similar fix.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88739
--- Comment #34 from Wilco ---
With just the expr.c patch the gcc regression tests all pass on big-endian
AArch64. Interestingly this includes the new torture test, ie. it does not
trigger the union bug.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88739
--- Comment #33 from Wilco ---
(In reply to Richard Biener from comment #32)
> >
> > Index: gcc/expr.c
> > ===
> > --- gcc/expr.c (revision 267553)
> > +++ gcc/expr.c (working
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88760
--- Comment #7 from Wilco ---
(In reply to rguent...@suse.de from comment #6)
> On Wed, 9 Jan 2019, wilco at gcc dot gnu.org wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88760
> >
> > --- Comment #5 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88760
--- Comment #5 from Wilco ---
(In reply to Wilco from comment #4)
> (In reply to ktkachov from comment #2)
> > Created attachment 45386 [details]
> > aarch64-llvm output with -Ofast -mcpu=cortex-a57
> >
> > I'm attaching the full LLVM aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88760
Wilco changed:
What|Removed |Added
CC||wilco at gcc dot gnu.org
--- Comment #4 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88739
--- Comment #29 from Wilco ---
(In reply to Richard Biener from comment #26)
> Did anybody test the patch? Testing on x86_64 will be quite pointless...
Well that generates _18 = BIT_FIELD_REF <_2, 16, 14>; and becomes:
ubfx x1, x20, 2, 16
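For readers less familiar with AArch64, the ubfx instruction shown here extracts an unsigned bitfield: with operands 2 and 16 it takes 16 bits starting at bit 2, counted from the least significant bit. A minimal C model of that behaviour (the function is a hypothetical helper and says nothing about whether the generated code is right for this bug):

```c
#include <stdint.h>

/* Models AArch64 "ubfx xd, xn, #lsb, #width" for 0 < width < 64:
   shift the field down to bit 0, then mask off everything above it. */
static uint64_t ubfx(uint64_t value, unsigned lsb, unsigned width)
{
    return (value >> lsb) & ((UINT64_C(1) << width) - 1);
}
```

With these operands, `ubfx(v, 2, 16)` is the same as `(v >> 2) & 0xFFFF`.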
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88739
--- Comment #27 from Wilco ---
(In reply to Eric Botcazou from comment #22)
> > Is it really pure RTL, therefore not used in tree? So the above patch using
> > BITS_BIG_ENDIAN for tree stuff would be incorrect to use it?
>
> I wouldn't say
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88739
--- Comment #25 from Wilco ---
(In reply to rguent...@suse.de from comment #17)
> On Tue, 8 Jan 2019, wilco at gcc dot gnu.org wrote:
>
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88739
> >
> > --- Comment #16 fro
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88739
--- Comment #21 from Wilco ---
(In reply to Eric Botcazou from comment #20)
> > BITS_BIG_ENDIAN is just a convenience to the target code writer. The other
> > four do matter, and are quite obvious really (and all four are necessary).
>
> Yes,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88739
--- Comment #19 from Wilco ---
(In reply to Segher Boessenkool from comment #18)
> Well, it is always possible to generate code with the opposite endianness to
> what the hardware "wants". It just won't be very fast code.
>
> BITS_BIG_ENDIAN
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88398
--- Comment #20 from Wilco ---
I see Kyrill added some examples that show LLVM knows how to unroll loops:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88760
This kind of thing is much worse than a trailing loop, both for branch
prediction and
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88398
--- Comment #19 from Wilco ---
(In reply to Jakub Jelinek from comment #18)
> The duffs device doesn't need to be done with computed jump, it can be done
> with 3 conditional branches + 3 comparisons too. The advantage of doing
> that is
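One reading of the quoted remark, sketched in C: for a loop unrolled four times, the up to three leftover elements can be handled with three compare-and-branch steps rather than a computed jump into the loop body. The function name and the choice of a summation loop are assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical example: sum n ints with a 4x unrolled main loop. */
static long sum_unrolled(const int *a, size_t n)
{
    long s = 0;
    size_t i = 0;
    size_t rem = n & 3;  /* 0..3 leftover elements */

    /* Up to three leftover elements, each guarded by one comparison
       and one conditional branch. */
    if (rem >= 1) s += a[i++];
    if (rem >= 2) s += a[i++];
    if (rem >= 3) s += a[i++];

    /* Main body: n - i is now a multiple of 4. */
    for (; i < n; i += 4)
        s += (long)a[i] + a[i + 1] + a[i + 2] + a[i + 3];
    return s;
}
```

Here the leftovers are peeled off before the main loop; handling them after it works the same way and costs the same three comparisons.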