https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63503
Wilco wdijkstr at arm dot com changed:
What|Removed |Added
CC||wdijkstr at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63503
--- Comment #10 from Wilco wdijkstr at arm dot com ---
The loops shown are not the correct inner loops for those options - with
-ffast-math they are vectorized. LLVM unrolls 2x but GCC doesn't. So the
question is why GCC doesn't unroll vectorized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63503
--- Comment #13 from Wilco wdijkstr at arm dot com ---
(In reply to Andrew Pinski from comment #11)
(In reply to Wilco from comment #10)
The loops shown are not the correct inner loops for those options - with
-ffast-math they are vectorized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63503
--- Comment #15 from Wilco wdijkstr at arm dot com ---
(In reply to Evandro Menezes from comment #14)
Compiling the test-case above with just -O2, I can reproduce the code I
mentioned initially and easily measure the cycle count to run
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61915
Wilco wdijkstr at arm dot com changed:
What|Removed |Added
CC||wdijkstr at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61915
--- Comment #10 from Wilco wdijkstr at arm dot com ---
(In reply to Andrew Pinski from comment #2)
https://gcc.gnu.org/ml/gcc/2014-05/msg00160.html
Note currently it is not possible to use FP registers for spilling using the
hooks - basically
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63503
--- Comment #19 from Wilco wdijkstr at arm dot com ---
(In reply to Evandro from comment #16)
(In reply to Wilco from comment #15)
Using -Ofast is not any different from -O3 -ffast-math when compiling
non-Fortran code. As comment 10 shows
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61915
--- Comment #15 from Wilco wdijkstr at arm dot com ---
(In reply to Evandro from comment #12)
(In reply to Evandro from comment #11)
Do you have an idea of the performance impact of this patch?
At least in Dhrystone, it improved by over 2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61915
--- Comment #16 from Wilco wdijkstr at arm dot com ---
(In reply to Andrew Pinski from comment #13)
(In reply to Wilco from comment #9)
I committed a workaround
(http://gcc.gnu.org/ml/gcc-patches/2014-09/msg00362.html) by increasing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61915
--- Comment #18 from Wilco wdijkstr at arm dot com ---
(In reply to Andrew Pinski from comment #17)
(In reply to Wilco from comment #16)
(In reply to Andrew Pinski from comment #13)
(In reply to Wilco from comment #9)
I committed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63503
--- Comment #22 from Wilco wdijkstr at arm dot com ---
(In reply to Evandro from comment #21)
(In reply to ramana.radhakrish...@arm.com from comment #20)
What's the kind of performance delta you see if you managed to unroll
the loop just
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63503
--- Comment #24 from Wilco wdijkstr at arm dot com ---
(In reply to Evandro from comment #23)
(In reply to Wilco from comment #22)
Unrolling alone isn't good enough in sum reductions. As I mentioned before,
GCC doesn't enable any
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61915
Wilco wdijkstr at arm dot com changed:
What|Removed |Added
Status|ASSIGNED|RESOLVED
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60580
Wilco wdijkstr at arm dot com changed:
What|Removed |Added
CC||wdijkstr at arm dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64151
--- Comment #2 from Wilco wdijkstr at arm dot com ---
(In reply to H.J. Lu from comment #1)
Revert the reg_class change:
diff --git a/gcc/ira-costs.c b/gcc/ira-costs.c
index 72c00cc..16fd6e8 100644
--- a/gcc/ira-costs.c
+++ b/gcc/ira
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64156
--- Comment #3 from Wilco wdijkstr at arm dot com ---
(In reply to Michael Meissner from comment #2)
Note, the fix proposed in PR64151 DOES NOT work on the PowerPC, so it may be
a dup in terms of what change broke the build, but the potential
-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
As PR rtl-optimization/64151 showed, the longjmp expansion on i386 is incorrect
if the base register is spilled. It turns out it is trivial to write an example
that reproduces this without my patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64151
--- Comment #7 from Wilco wdijkstr at arm dot com ---
See PR rtl-optimization/64242 for the longjmp issue on i386.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64242
--- Comment #3 from Wilco wdijkstr at arm dot com ---
(In reply to H.J. Lu from comment #2)
Dup of PR 59039?
No that talks about not using __builtin_setjmp and __builtin_longjmp within the
same function. I only used longjmp. Or are they so
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65862
--- Comment #4 from Wilco wdijkstr at arm dot com ---
(In reply to Vladimir Makarov from comment #3)
But I can not just revert the patch making ALL_REGS available
to make
coloring heuristic more fortunate for your particular case, as it
reopens
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65862
--- Comment #13 from Wilco wdijkstr at arm dot com ---
(In reply to Vladimir Makarov from comment #9)
Created attachment 35503 [details]
ira-hook.patch
Here is the patch. Could you try it and give me your opinion about it.
Thanks.
I
: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
The isinf, isnan, isnormal, isfinite, fpclassify and signbit builtins use FP
arithmetic to compute their result even with -fsignaling-nans (signbit only
when -ffast-math
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66462
--- Comment #1 from Wilco wdijkstr at arm dot com ---
Note when this is fixed, GLIBC math/math.h should be updated to enable the
isinf builtins even with -fsignaling-nans.
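The concern in this report is that classifying a value with FP arithmetic can raise an exception on a signaling NaN. A minimal sketch of the alternative, classifying from the bit pattern with integer operations only, is shown below. It assumes IEEE-754 binary64 `double`; the helper names (`double_bits`, `int_isnan`, `int_isinf`, `int_signbit`) are hypothetical and are not GCC's or glibc's implementation.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for illustration only; assumes IEEE-754 binary64. */

/* Copy the object representation of a double into a 64-bit integer.
   memcpy avoids aliasing problems and performs no FP arithmetic. */
static inline uint64_t double_bits(double x)
{
    uint64_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}

/* NaN: exponent all ones and a non-zero mantissa, i.e. the magnitude
   bits compare greater than the bit pattern of +infinity. */
static inline int int_isnan(double x)
{
    return (double_bits(x) & 0x7fffffffffffffffULL) > 0x7ff0000000000000ULL;
}

/* Infinity: exponent all ones and a zero mantissa. */
static inline int int_isinf(double x)
{
    return (double_bits(x) & 0x7fffffffffffffffULL) == 0x7ff0000000000000ULL;
}

/* Sign bit is the top bit, also set for -0.0 and negative NaNs. */
static inline int int_signbit(double x)
{
    return (int)(double_bits(x) >> 63);
}
```

Because only loads, masks and integer compares are involved, these checks do not depend on the FP exception state, which is the property the report asks the builtins to have under -fsignaling-nans.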
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304
Wilco wdijkstr at arm dot com changed:
What|Removed |Added
CC||wdijkstr at arm dot com
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
Created attachment 36016
-- https://gcc.gnu.org/bugzilla/attachment.cgi?id=36016&action=edit
preprocessed iso-2022-cn-ext.c
Since recently (around May) GCC6 has started to emit
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66946
--- Comment #4 from Wilco wdijkstr at arm dot com ---
(In reply to Andrew Pinski from comment #2)
Comment on attachment 36021 [details]
minimal example
written == ((wchar_t) 0xfffd)
Will ever be true or is there some sign extending going
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66946
--- Comment #1 from Wilco wdijkstr at arm dot com ---
Created attachment 36021
-- https://gcc.gnu.org/bugzilla/attachment.cgi?id=36021&action=edit
minimal example
Minimal example which still reports the spurious warning.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63304
--- Comment #33 from Wilco ---
(In reply to Evandro from comment #32)
> (In reply to Ramana Radhakrishnan from comment #31)
> > (In reply to Evandro from comment #30)
> > > The performance impact of always referring to constants as if they were
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69176
--- Comment #9 from Wilco ---
(In reply to Andrew Pinski from comment #8)
> (In reply to Wilco from comment #7)
> > > > I think the problem is the constraints on *add3_pluslong allows
> > > > all immediates.
> > >
> > > I'm not sure what you
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69176
--- Comment #15 from Wilco ---
(In reply to Richard Henderson from comment #14)
> (In reply to Wilco from comment #12)
> > The only remaining question I had whether it would be possible to use
> > peephole expansions rather than the late splits.
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
There are 2 new failures in the tail-call-2.c test on recent trunk builds:
FAIL: gcc.dg/plugin/must-tail-call-2.c -fplugin
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69176
--- Comment #12 from Wilco ---
(In reply to Wilco from comment #11)
> With your patch expand always emits add instructions with complex immediates
> which then can't be optimized.
OK, so I can change your patch to do the right thing with 2 minor
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69176
--- Comment #11 from Wilco ---
(In reply to Richard Henderson from comment #10)
> Created attachment 37267 [details]
> proposed patch
>
> Andrew is exactly right re plus being special.
>
> The pluslong hoops that are being jumped through are
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368
--- Comment #29 from Wilco ---
(In reply to rguent...@suse.de from comment #28)
> On Fri, 5 Feb 2016, alalaw01 at gcc dot gnu.org wrote:
> > Should I raise a new bug for this, as both this and 53068 are CLOSED?
>
> I think this has been
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #2 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368
--- Comment #5 from Wilco ---
This still fails on AArch64 in exactly the same way with latest trunk - can
someone reopen this? I don't seem to have the right permissions...
(In reply to Richard Biener from comment #4)
> So - can you please
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368
--- Comment #8 from Wilco ---
In a few functions GCC decides that the assignments in loops are redundant. The
loops still execute but have their loads and stores removed. Eg. the first DO
loop in MP2NRG should be:
.L1027:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69657
--- Comment #5 from Wilco ---
(In reply to Andrew Pinski from comment #4)
> (In reply to Jonathan Wakely from comment #3)
> > Recategorising as component=c++, and removing the regression marker (because
> > the change in libstdc++ that reveals
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69336
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #13 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69619
--- Comment #2 from Wilco ---
Changing to c = 3 generates code after a short time. The issue is recursive
calls to expand_ccmp_expr during the 2 possible options tried to determine
costs. That makes the algorithm exponential.
A fix would be to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69619
--- Comment #3 from Wilco ---
A simple workaround is to calculate cost1 early and only try the 2nd option if
the cost is low (ie. it's not a huge expression that may evaluate into lots of
ccmps). A slightly more advanced way would be to walk
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
Since a recent C++ header change abs() no longer gets inlined if we include an
unrelated header before it.
#include
#include
int
wrap_abs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69619
--- Comment #5 from Wilco ---
Proposed patch: https://gcc.gnu.org/ml/gcc-patches/2016-02/msg00206.html
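The exponential behaviour described in comments #2 and #3 of this PR can be illustrated with a toy call-count model. This is only a sketch under stated assumptions: it does not reproduce expand_ccmp_expr, the functions `naive_calls` and `bounded_calls` are hypothetical, and the nesting depth stands in for the expression cost.

```c
/* Toy model, not GCC code. Each level of a nested comparison tries two
   options, and each option recurses into the next level, so the number
   of calls doubles per level: 2^(depth+1) - 1 in total. */
unsigned long naive_calls(int depth)
{
    if (depth == 0)
        return 1;
    return 1 + 2 * naive_calls(depth - 1);
}

/* Workaround in the spirit of comment #3: always evaluate the first
   option, but only try the second when the expression is cheap. Here
   "cheap" is modelled (as an assumption) by depth <= limit. */
unsigned long bounded_calls(int depth, int limit)
{
    if (depth == 0)
        return 1;
    unsigned long calls = 1 + bounded_calls(depth - 1, limit);
    if (depth <= limit)
        calls += bounded_calls(depth - 1, limit);
    return calls;
}
```

In this model a depth of 10 costs 2047 calls when both options are always tried, but only 15 when the second option is limited to sub-expressions of depth 2 or less, since the work above the limit grows linearly.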
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368
--- Comment #6 from Wilco ---
This still fails on AArch64 in exactly the same way with latest trunk - can
someone reopen this? I don't seem to have the right permissions...
(In reply to Richard Biener from comment #4)
> So - can you please
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69416
--- Comment #2 from Wilco ---
Started looking at this - it looks like line 1833 in emit-rtl.c gets miscompiled
in combine:
(insn 397 389 394 38 (set (reg:SI 462)
(const_int 29 [0x1d])) ./emit-rtl.c:1833 49 {*movsi_aarch64}
(nil))
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69416
--- Comment #6 from Wilco ---
(In reply to Andrew Pinski from comment #4)
> Actually I think the problem is (const_int 8 [0x8]) does that make sense
> for CC mode? I don't think it does.
It should make sense as a CCmode immediate. It relies
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69416
--- Comment #7 from Wilco ---
(In reply to Richard Henderson from comment #5)
> Created attachment 37419 [details]
> proposed patch
>
> I'm testing the following, but it does produce correct results
> on a spot check of emit-rtl.c:1833.
Yes,
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368
--- Comment #41 from Wilco ---
(In reply to Jerry DeLisle from comment #40)
> Do you have a reduced test case of the Fortran code we can look at?
See comment 13/14, the same common array is declared with different sizes in
various places.
> I
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368
--- Comment #9 from Wilco ---
The loops get optimized away in dom2. The info this phase emits is hard to
figure out, so it's not obvious why it thinks the array assignments are
redundant (the array is used all over the place so clearly cannot be
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70048
--- Comment #15 from Wilco ---
(In reply to Richard Biener from comment #14)
> The regression in the original description looks severe enough to warrant
> some fixing even if regressing some other cases.
Agreed, I think the improvement from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70048
--- Comment #17 from Wilco ---
(In reply to Jiong Wang from comment #16)
> * for the second patch at #c10, if we always do the following no matter
> op0 is virtual & eliminable or not
>
> "op1 = force_operand (op1, NULL_RTX);"
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70048
--- Comment #20 from Wilco ---
(In reply to Richard Henderson from comment #19)
> I wish that message had been a bit more complete with the description
> of the performance issue. I must guess from this...
>
> > ldr dst1, [reg_base1,
-end
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
The expansion of __builtin_mempcpy is inefficient on many targets (eg. AArch64,
ARM, PPC). The issue is due to not using the same expansion options that memcpy
uses
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70055
--- Comment #9 from Wilco ---
(In reply to H.J. Lu from comment #8)
> Inlining mempcpy uses a callee-saved register:
>
...
>
> Not inlining mempcpy is preferred.
If codesize is the only thing that matters... The cost is not at the caller
side
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70048
--- Comment #5 from Wilco ---
(In reply to amker from comment #4)
> (In reply to ktkachov from comment #3)
> > Started with r233136.
>
> That's why I forced base+offset out of memory reference and kept register
> scaling in in the first place.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70055
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #2 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70055
--- Comment #5 from Wilco ---
(In reply to Jakub Jelinek from comment #3)
> If some arch in glibc implements memcpy.S and does not implement mempcpy.S,
> then obviously the right fix is to add mempcpy.S for that arch, usually it
> is just a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70055
--- Comment #6 from Wilco ---
(In reply to Jakub Jelinek from comment #4)
> Note the choice of this in a header file is obviously wrong, if you at some
> point fix this up, then apps will still call memcpy rather than mempcpy,
> even when the
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70048
--- Comment #12 from Wilco ---
(In reply to Jiong Wang from comment #11)
> (In reply to Richard Henderson from comment #10)
> > Created attachment 37890 [details]
> > second patch
> >
> > Still going through full testing, but I wanted to post
: target
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
The following example generates very inefficient code on AArch64:
int f1(int i) { int p[1000]; p[i] = 1; return p[i + 10] + p[i + 20]; }
f1:
sub sp, sp, #4000
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70048
--- Comment #1 from Wilco ---
The regression seems to have appeared on trunk around Feb 3-9.
: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
GCC emits the same code for caller-saves in all cases, even if the caller-save
is an immediate which can be trivially rematerialized. The caller-save code
should
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
The following code in ira-costs.c tries to improve the memory cost for
rematerializeable loads. There are several issues with this though:
1. The memory cost can
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70861
--- Comment #3 from Wilco ---
(In reply to Andrew Pinski from comment #2)
> Note I think if we had gotos instead of assignment here we should do the
> similar thing for the switch table itself.
Absolutely, that was my point.
> Note also the
-end
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
GCC uses a very basic check to determine whether to use a switch table. A
simple example from https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11823 still
generates a huge table
: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
IVOpt chooses between using indexing for induction variables or incrementing
pointers. Due to the way loop unrolling works, a decision that is optimal if
unrolling
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70946
--- Comment #1 from Wilco ---
PR36712 seems related to this
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70961
--- Comment #5 from Wilco ---
As for a simple example, Proc_4 in Dhrystone is a good one. With -O2 and
-fno-rename-registers I get the following on Thumb-2:
00c8 :
c8: b430      push {r4, r5}
ca: f240 0300 movw r3,
-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
When deciding which register to use regrename.c calls the target function
preferred_rename_class. However in pass 2 in find_rename_reg it then just
ignores this preference
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70961
--- Comment #3 from Wilco ---
(In reply to Eric Botcazou from comment #2)
> Pass #2 ignores it since the preference simply couldn't be honored.
In which case it should not rename that chain rather than just ignore the
preference (and a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71022
--- Comment #2 from Wilco ---
(In reply to Richard Biener from comment #1)
> IRA might choose to do this as part of life-range splitting/shortening. Note
> that reg-reg moves may be cheaper code-size wise (like on CISC archs with
> non-fixed
-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
When assigning the same immediate value to different registers, GCC will always
CSE the immediate and emit a register move for subsequent uses. This creates
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
With -Ofast GCC doesn't reassociate constant multiplies or negates away from
divisors to allow for more reciprocal division optimizations. It is also
possible to avoid divisions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77484
--- Comment #29 from Wilco ---
(In reply to Jan Hubicka from comment #28)
> > On SPEC2000 the latest changes look good, compared to the old predictor gap
> > improved by 10% and INT/FP by 0.8%/0.6%. I'll run SPEC2006 tonight.
>
> It is rather
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77484
--- Comment #31 from Wilco ---
(In reply to Jan Hubicka from comment #30)
> >
> > When I looked at gap at the time, the main change was the reordering of a
> > few
> > if statements in several hot functions. Incorrect block frequencies also
>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #10 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69847
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #27 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66946
Wilco changed:
What|Removed |Added
Status|WAITING |RESOLVED
Resolution|---
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
Changes in the static branch predictor (around August last year) caused
regressions on SPEC2000. The PRED_CALL predictor causes GAP
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
The __builtin_eh_return implementation on AArch64 generates incorrect code for
many cases due to using an incorrect offset/pointer when writing the new return
address to the stack
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65068
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #3 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77455
Wilco changed:
What|Removed |Added
Target||AArch64
Known to fail|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77568
--- Comment #5 from Wilco ---
(In reply to Andrew Pinski from comment #2)
> Note there are two different issues here.
Well they are 3 examples of the same underlying issue - don't do a CSE when
it's not profitable. How they are resolved might
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
The recently introduced code hoisting aggressively moves common subexpressions
that might otherwise be mergeable with other
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77568
--- Comment #3 from Wilco ---
(In reply to Andrew Pinski from comment #1)
> I think this is just a pass ordering issue. We create fmas after PRE.
> Maybe we should do it both before and after ...
> Or enhance the pass which produces FMA to
Assignee: unassigned at gcc dot gnu.org
Reporter: wdijkstr at arm dot com
Target Milestone: ---
A commonly used benchmark contains a hot loop which calls one of 2 virtual
functions via a static variable which is set just before. A reduced example is:
int f1(int x) { return x + 1; }
int f2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71026
--- Comment #3 from Wilco ---
(In reply to ktkachov from comment #2)
> The transforms
>
> int f4(float x) { return (1.0f / x) < 0.0f; } // -> x < 0.0f
> int f5(float x) { return (x / 2.0f) <= 0.0f; }// -> x <= 0.0f
>
> can be
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32650
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #2 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78041
--- Comment #8 from Wilco ---
(In reply to Bernd Edlinger from comment #7)
> (In reply to Richard Earnshaw from comment #6)
> > (In reply to Bernd Edlinger from comment #5)
> > > (In reply to Wilco from comment #4)
> > > > However dealing with
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308
--- Comment #14 from Wilco ---
(In reply to Bernd Edlinger from comment #13)
> I am still trying to understand why thumb1 seems to outperform thumb2.
>
> Obviously thumb1 does not have the shiftdi3 pattern,
> but even if I remove these from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308
--- Comment #12 from Wilco ---
It looks like we need a different approach, I've seen the extra SETs use up
more registers in some cases, and in other cases being optimized away early
on...
Doing shift expansion at the same time as all other DI
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78041
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #2 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78041
--- Comment #4 from Wilco ---
(In reply to Bernd Edlinger from comment #3)
> (In reply to Wilco from comment #2)
> > (In reply to Bernd Edlinger from comment #1)
> > > some background about this bug can be found here:
> > >
> > >
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78041
--- Comment #11 from Wilco ---
(In reply to ktkachov from comment #10)
> Confirmed then. Wilco, if you're working on this can you please assign it to
> yourself?
Unfortunately the form doesn't allow me to do anything with the headers...
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308
--- Comment #32 from Wilco ---
(In reply to Bernd Edlinger from comment #31)
> Sure, combine can't help, especially because it runs before split1.
>
> But I wondered why this peephole2 is not enabled:
>
> (define_peephole2 ; ldrd
> [(set
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71951
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #8 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71951
--- Comment #11 from Wilco ---
(In reply to Icenowy Zheng from comment #10)
> In my environment (glibc 2.25, and both the building scripts of glibc and
> gcc have -fomit-frame-pointer automatically enabled), this bug is not fully
> resolved yet.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82439
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #2 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #4 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78468
Wilco changed:
What|Removed |Added
CC||wdijkstr at arm dot com
--- Comment #38 from