x23, x24, [sp,#32]
ldp x25, x26, [sp,#48]
ldp x27, x28, [sp,#64]
ldr x30, [sp,#80]
ldp x19, x20, [sp],#96
ret
Passes bootstrap, OK for commit (and backport to GCC7)?
ChangeLog:
2018-01-05 Wilco Dijkstra
* config/aarch64/aarch64.c (aarch64_components_for_bb
Andrew Pinski wrote:
> Seems like you should do something similar to the integer madd/msub
> instructions too (aarch64_mla is already correct but aarch64_mla_elt
> needs this too).
Integer madd/msub may benefit too; however, it wouldn't make a difference
for a 3-operand mla since the register allo
Segher Boessenkool wrote:
> On Fri, Jan 05, 2018 at 12:22:44PM +0000, Wilco Dijkstra wrote:
>> An example epilog in a shrinkwrapped function before:
>>
>> ldp x21, x22, [sp,#16]
>> ldr x23, [sp,#32]
>> ldr x24, [sp,#40]
>> ldp x25, x26, [sp,#48
Segher Boessenkool wrote:
> On Mon, Jan 08, 2018 at 01:27:24PM +0000, Wilco Dijkstra wrote:
>
>> Peepholing is very conservative about instructions using SP and won't touch
>> anything frame related. If this was working better then the backend could
>> just
>&g
explicitly checking for a subset of GENERAL_REGS and FP_REGS.
Add a missing ? to aarch64_get_lane to fix a failure in the testsuite.
Passes regress, OK for commit? Since it is a regression introduced in GCC8,
OK to backport to GCC8?
ChangeLog:
2018-05-25 Wilco Dijkstra
* config/aarch64/
James Greenhalgh wrote:
> > Add a missing ? to aarch64_get_lane to fix a failure in the testsuite.
>
> > I'd prefer more detail than this for a workaround; which test, why did it
> > start to fail, why is this the right solution, etc.
It was gcc.target/aarch64/vect_copy_lane_1.c generating:
test
Richard Sandiford
> The "?" change seems to make intrinsic sense given the extra cost of the
> GPR alternative. But I think the real reason for this failure is that
> we define no V1DF patterns, and target-independent code falls back to
> using moves in the corresponding *integer* mode. So for
Richard Sandiford wrote:
>> This has probably been reported elsewhere already but I can't find
>> such a report, so sorry for possible duplicate,
>> but this patch is causing ICEs on aarch64
>> FAIL: gcc.target/aarch64/sve/reduc_1.c -march=armv8.2-a+sve
>> (internal compiler error)
>> FAIL:
Since PR64946 has been fixed, we can remove the xfail from this test.
Committed as obvious.
ChangeLog:
2018-06-18 Wilco Dijkstra
PR tree-optimization/64946
* gcc.target/aarch64/vect-abs-compile.c: Remove xfail.
--
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-abs
Add missing target pthread to ensure test doesn't fail on bare-metal
targets. Committed as obvious.
ChangeLog:
2018-06-18 Wilco Dijkstra
PR tree-optimization/86076
* gcc.dg/pr86076.c: Add target pthread for bare-metal targets.
--
diff --git a/gcc/testsuite/gcc.dg/pr8607
:
f:
str x30, [sp, -16]!
bl lroundf
add x0, x0, 1
ldr x30, [sp], 16
ret
With -fno-math-errno:
f:
fcvtas x0, s0
add x0, x0, 1
ret
Passes regress on AArch64. OK for commit?
ChangeLog:
2018-06-18 Wilco Dijkstra
Richard Biener wrote:
> There are a number of regression tests that check for errno handling
> (I added some to avoid aliasing for example). Please make sure to
> add explicit -fmath-errno to those that do not already have it set
> (I guess such patch would be obvious and independent of this one)
Joseph Myers wrote:
> On Thu, 21 Jun 2018, Jeff Law wrote:
>
> > I think all this implies that the setting of -fno-math-errno by default
> > really depends on the math library in use since it's the library that
> > has to arrange for either errno to get set or for an exception to be raised.
>
> If
Eric Botcazou wrote:
> > The AArch64 parts are OK. I've been holding off approving the patch while
> > I wait for Eric to reply on the x86_64 fails with your new testcase.
>
> The test is not portable in any case since it uses the "optimize" attribute
> so
> I'd just make it specific to AArch64
Eric Botcazou wrote:
>> This test can easily be changed not to use optimize since it doesn't look
>> like it needs it. We really need to tests these builtins properly,
>> otherwise they will continue to fail on most targets.
>
> As far as I can see PR target/84521 has been reported only for Aarch6
Joseph Myers wrote:
> On Tue, 26 Jun 2018, Wilco Dijkstra wrote:
> > That looks incorrect indeed but that's mostly a problem with -fmath-errno
> > as it
> > would result in GCC assuming the function is const/pure when in fact it
> > isn't.
> > Does
Fix and simplify the testcase so it generates dup even on latest trunk.
This fixes the failure reported in:
https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01799.html
Committed as obvious.
ChangeLog:
2018-06-28 Wilco Dijkstra
* gcc.target/aarch64/f16_mov_immediate_3.c: Fix testcase
Segher Boessenkool wrote:
> On Mon, Jan 08, 2018 at 0:25:47PM +0000, Wilco Dijkstra wrote:
>> > Always pairing two registers together *also* degrades code quality.
>>
>> No, while it's not optimal, it means smaller code and fewer memory accesses.
>
> It means
Richard Biener wrote:
>On Thu, Jan 4, 2018 at 10:27 PM, Marc Glisse wrote:
>> I don't understand how the check you added helps.
It simply blocks the transformation for infinity:
+ (if (!REAL_VALUE_ISINF (TREE_REAL_CST (@0)))
+ (switch
+ (if (real_less (&dconst0, TREE_REAL_CST_P
Segher Boessenkool wrote:
> Of course I see that ldp is useful. I don't think that this particular
> way of forcing more pairs is a good idea. Needs testing / benchmarking /
> instrumentation, and we haven't seen any of that.
I wouldn't propose a patch if it caused slowdowns. In fact I am see
>= and <= for now since C / x can underflow if C is small.
Simplify (x * C1) > C2 into x > (C2 / C1) with -funsafe-math-optimizations.
If C1 is negative the comparison is reversed.
OK for commit?
ChangeLog
2018-01-10 Wilco Dijkstra
Jackson Woodruff
gcc/
as x0, s0
4: d65f03c0  ret
With -fno-math-errno:
f:
fcvtas x0, s0
add x0, x0, 1
ret
OK for commit?
2018-01-12 Wilco Dijkstra
* common.opt (fmath-errno): Change default to 0.
* opts.c (set_fast_math_flags): Force -fno-ma
ommit?
ChangeLog:
2018-01-15 Wilco Dijkstra
Richard Sandiford
gcc/
PR target/82964
* config/aarch64/aarch64.md (movti_aarch64): Use Uti constraint.
* config/aarch64/aarch64.c (aarch64_mov128_immediate): New function.
(aarch64_legitimate_constant_p):
Joseph Myers wrote:
> Another question to consider: what about configurations (mostly
> soft-float) where floating-point exceptions are not supported? (glibc
> wrongly defines math_errhandling to include MATH_ERREXCEPT there, but the
> only option actually permitted by C99 in that case would b
egister for same-size int<->fp conversions.
Passes regress & bootstrap, OK for commit?
ChangeLog:
2018-01-16 Wilco Dijkstra
* config/aarch64/aarch64.md (mov): Remove '*' in alternatives.
(movsi_aarch64): Likewise.
(load_pairsi): Likewise.
(load_p
Hi,
In general I think the best way to achieve this would be to use the
existing cost models which are there for exactly this purpose. If
this doesn't work well enough then we should fix those. As is,
this patch disables a whole class of instructions for a specific
target rather than simply tellin
(finished version this time, somehow Outlook loves to send emails early...)
Hi,
In general I think the best way to achieve this would be to use the
existing cost models which are there for exactly this purpose. If
this doesn't work well enough then we should fix those. As is,
this patch disables
Siddhesh Poyarekar wrote:
> The current cost model will disable reg offset for loads as well as
> stores, which doesn't work well since loads with reg offset are faster
> for falkor.
Why is that a bad thing? With the patch as is, the testcase generates:
.L4:
ldr q0, [x2, x3]
James Greenhalgh wrote:
> - /* Do not allow wide int constants - this requires support in movti. */
> + /* Only allow simple 128-bit immediates. */
> if (CONST_WIDE_INT_P (x))
> - return false;
> + return aarch64_mov128_immediate (x);
> I can see why this could be correct, but it is
Siddhesh Poyarekar wrote:
On Wednesday 17 January 2018 08:31 PM, Wilco Dijkstra wrote:
> Why is that a bad thing? With the patch as is, the testcase generates:
>
> .L4:
> ldr q0, [x2, x3]
> add x5, x1, x3
> add x3, x3, 16
> cmp
s the failures and has no effect otherwise. Committed as trivial fix.
ChangeLog:
2018-01-18 Wilco Dijkstra
gcc/
PR target/82964
* config/aarch64/aarch64.c (aarch64_legitimate_constant_p):
Use GET_MODE_CLASS for scalar floating point.
--
diff --git a/gcc/config/aarch64/aarc
Christophe Lyon wrote:
> After this patch (r256800), I have noticed new failures on aarch64:
> gcc.target/aarch64/f16_mov_immediate_1.c scan-assembler-times
> mov\tw[0-9]+, #?19520 3 (found 0 times)
Thanks for spotting these, the scripts appear to have missed those
(contrib/dg-cmp-results.sh s
- libquantum and SPECv6 performance improves.
OK for commit?
ChangeLog:
2018-01-22 Wilco Dijkstra
PR target/79262
* config/aarch64/aarch64.c (generic_vector_cost): Adjust
vec_to_scalar_cost.
--
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index
sorted using the pressure model, and instructions outside it will use
RFS_DEP_COUNT and/or RFS_TIE for their order.
Bootstrap OK on AArch64, OK for commit?
ChangeLog:
2018-01-31 Wilco Dijkstra
PR rtl-optimization/84068
* haifa-sched.c (rank_for_schedule): Fix SCHED_PRESSURE_MODEL
Richard Sandiford wrote:
> This was the original intent, but was changed in r213708. TBH I'm not
> sure what the second hunk in that revision fixed, since model_index is
> supposed to return an index greater than all valid indices when passed
> an instruction outside the current block. Maxim, do
James Greenhalgh wrote:
> Please queue for GCC 9. OK when trunk is back open for new code.
This fixes the regressions introduced by the SVE merge conflicts and
the failures of aarch64/pr62178.c, both of which are new regressions,
so we should fix these now.
Wilco
Richard Sandiford wrote:
> But why wasn't the index 0 as expected for the insns outside of the block?
Well it seems it checks for index 0 and sets the model_index as the current
maximum model_index count. This means the target_bb check isn't
strictly required - I build all of SPECINT2017 using t
sort on model_index. If the model_index is the same
we defer to RFS_DEP_COUNT and/or RFS_TIE.
Bootstrap OK, OK for commit?
ChangeLog:
2018-02-02 Wilco Dijkstra
PR rtl-optimization/84068
* haifa-sched.c (rank_for_schedule): Fix SCHED_PRESSURE_MODEL sorting.
PR rtl
Hi Kugan,
> Based on the previous discussions, I tried to implement a tree loop
> unroller for partial unrolling. I would like to queue this RFC patches
> for next stage1 review.
This is a great plan - GCC urgently requires a good unroller!
> * Cost-model for selecting the loop uses the same par
Hi Adhemerval,
A few comments on the assembly code:
+# This function is called with a non-standard calling convention: on entry
+# x10 is the requested stack pointer, x11 is the previous stack pointer (if
+# the function has stacked arguments which need to be restored), and x12 is
+# the caller link reg
Hi Siddhesh,
I still don't like the idea of disabling a whole class of instructions in the
md file.
It seems much better to adjust the costs here so that you get most of the
improvement now, and fine tune it once we can differentiate between
loads and stores.
Taking your example, adding -funroll
Richard Biener wrote:
>> This is a great plan - GCC urgently requires a good unroller!
>
> How so?
I thought it has been well-known for many years that the rtl unroller doesn't
work properly. In practically all cases where LLVM beats GCC, it is due to
unrolling small loops.
You may have noticed how p
Richard Biener wrote:
> With Ooo CPUs speculatively executing the next iterations I very much doubt
> that.
OoO execution is like really dumb loop unrolling: you still have all the
dependencies between iterations, all the branches, all the pointer increments
etc. Optimizing those reduces instr
Siddhesh Poyarekar wrote:
> On Thursday 15 February 2018 07:50 PM, Wilco Dijkstra wrote:
>> So it seems to me using existing cost mechanisms is always preferable, even
>> if you
>> currently can't differentiate between loads and stores.
>
> Luis is working on addr
testcase
and gives 1% speedup on SPECFP2017, fixing the performance regression.
OK for commit?
ChangeLog:
2018-02-23 Wilco Dijkstra
PR tree-optimization/84114
* config/aarch64/aarch64.c (aarch64_reassociation_width):
Avoid reassociation of FLOAT_MODE addition.
--
diff
Richard Biener
> It happens that on some targets doing two FMAs in parallel and one
> non-FMA operation merging them is faster than chaining three FMAs...
Like I mentioned in the PR, long chains should be broken, but for that we need
a new parameter to state how long a chain may be before it is
Richard Sandiford wrote:
> But there's the third question of whether the frame pointer is available
> for general allocation. By removing frame_pointer_required, we're saying
> that the frame pointer is always available for general use.
Unlike on ARM/Thumb-2, the frame pointer is unfortunately
Jakub Jelinek wrote:
> On Thu, Apr 12, 2018 at 03:52:09PM +0200, Richard Biener wrote:
>> Not sure if I missed some important part of the discussion but
>> for the testcase we want to preserve the tailcall, right? So
>> it would be enough to set avoid_libcall to
>> endp != 0 && CALL_EXPR_TAILCALL
Jakub Jelinek wrote:
> On Thu, Apr 12, 2018 at 03:53:13PM +0000, Wilco Dijkstra wrote:
>> The tailcall issue is just a distraction. Historically the handling of
>> mempcpy
>> has been horribly inefficient in both GCC and GLIBC for practically all
>> targets.
>&
Jakub Jelinek wrote:
>On Thu, Apr 12, 2018 at 04:30:07PM +0000, Wilco Dijkstra wrote:
>> Jakub Jelinek wrote:
>> Frankly I don't see why it is a P1 regression. Do you have a benchmark that
>
>That is how regression priorities are defined.
How can one justify consider
Jakub Jelinek wrote:
>On Thu, Apr 12, 2018 at 05:29:35PM +0000, Wilco Dijkstra wrote:
>> > Depending on what you mean old, I see e.g. in 2010 power7 mempcpy got
>> > added,
>> > in 2013 other power versions, in 2016 s390*, etc. Doing a decent mempcpy
>> >
Hi Denis,
> We are working on applying Address/LeakSanitizer for the full Tizen OS
> distribution. It's about ~1000 packages, ASan/LSan runtime is installed
> to ld.so.preload. As we know ASan/LSan has interceptors for
> allocators/deallocators such as (malloc/realloc/calloc/free) and so on.
> O
Hi Denis,
>> Adding support for a frame chain would require an ABI change. It
> would have to
> > work across GCC, LLVM, Arm, Thumb-1 and Thumb-2 - not a trivial amount of
> > effort.
> Clang already works that way.
No, that's incorrect like Richard pointed out. Only a single register can be
u
.8b
fmov w0, s0
ret
After:
fmov s0, w0
cnt v0.8b, v0.8b
addv b0, v0.8b
fmov w0, s0
ret
Passes regress on AArch64, OK for commit?
ChangeLog:
2018-09-27 Wilco Dijkstra
gcc/
* config/aarch64/aarch64.md (zero_extendsidi2
Matthew wrote:
> The canonical way to require even-odd pairs of registers to implement a TImode
> pseudo register as mentioned in the documentation is to limit *all* TImode
> registers to being even-odd by using the TARGET_HARD_REGNO_MODE_OK hook.
And that is the best approach for cases like this
cnt v0.8b, v0.8b
addv b0, v0.8b
fmov w0, s0
ret
After:
fmov s0, w0
cnt v0.8b, v0.8b
addv b0, v0.8b
fmov w0, s0
ret
Passes regress on AArch64, OK for commit?
ChangeLog:
2018-09-28 Wilco Dijkstra
gcc/
* conf
Richard Henderson wrote:
> If you're going to add moves r->w, why not also go ahead and add w->r.
> There are also HImode fmov zero-extensions, fwiw.
Well in principle it would be possible to support all 8/16/32-bit zero
extensions for all combinations of int and fp registers. However I prefer t
or commit?
ChangeLog:
2018-10-03 Wilco Dijkstra
gcc/
* config/aarch64/aarch64.md (zero_extendsidi2_aarch64): Add alternatives
to zero-extend between int and floating-point registers.
(load_pair_zero_extendsidi2_aarch64): Add alternative to emit
zero-extended
ldp in
Hi Jeff,
> So I went back and reviewed all the discussion around this. I'm still
> having trouble getting comfortable with flipping the default -- unless
> we know ahead of time that the target runtime doesn't set errno on any
> of the math routines. That implies a target hook to describe the
>
Richard Biener wrote:
> why use BUILT_IN_ISUNORDERED but not a GIMPLE_COND with
> UNORDERED_EXPR? Note again that might trap/throw with -fsignaling-nans
> so better avoid this transform for flag_signaling_nans as well...
Both currently trap on signalling NaNs due to the implementation of the C
James Greenhalgh wrote:
> On Tue, Jan 16, 2018 at 04:32:36PM +0000, Wilco Dijkstra wrote:
>> v2: Rebased after the big SVE commits
>>
>> Remove the remaining uses of '*' from aarch64.md.
>> Using '*' in alternatives is typically incorrect as it
ping
From: Wilco Dijkstra
Sent: 04 January 2018 17:46
To: GCC Patches
Cc: nd
Subject: [PATCH][AArch64] Improve register allocation of fma
This patch improves register allocation of fma by preferring to update the
accumulator register. This is done by adding fma insns with operand 1 as the
ping
From: Wilco Dijkstra
Sent: 25 October 2017 16:29
To: GCC Patches
Cc: nd
Subject: [PATCH][AArch64] Simplify frame pointer logic
Simplify frame pointer logic based on review comments here
(https://gcc.gnu.org/ml/gcc-patches/2017-10/msg01727.html).
This patch incrementally adds to these
ping
From: Wilco Dijkstra
Sent: 17 November 2017 15:21
To: GCC Patches
Cc: nd
Subject: [PATCH][AArch64] Set SLOW_BYTE_ACCESS
Contrary to all documentation, SLOW_BYTE_ACCESS simply means accessing
bitfields by their declared type, which results in better code generation on
practically any
Hi,
> I see nothing about you addressing James' comment from 17th November...
I addressed that in a separate patch, see
https://patchwork.ozlabs.org/patch/839126/
Wilco
Hi,
> Which doesn't appear to have been approved. Did you follow up with Jeff?
I'll get back to that one at some point - it'll take some time to agree on a way
forward with the callback.
Wilco
Hi,
James Greenhalgh wrote:
>
> This seems like a fairly horrible hack around the register allocator
> behaviour.
That is why I proposed to improve the register allocator so one can explicitly
specify the copy preference in the md syntax. However that wasn't accepted,
so we'll have to use a hack
Richard Earnshaw wrote:
>>> Which doesn't appear to have been approved. Did you follow up with Jeff?
>>
>> I'll get back to that one at some point - it'll take some time to agree on a
>> way
>> forward with the callback.
>>
>> Wilco
>>
>>
>
> So it seems to me that this should then be q
Kyrill Tkachov wrote:
> That patch would look like the attached. Is this preferable?
> For the above example it generates the desired:
> foo_v4sf:
> ldr s0, [x0]
> ldr s1, [x1, 8]
> ins v0.s[1], v1.s[0]
> ld1 {v0.s}[2], [x2]
> ld1 {v0.s}[3], [x3]
>
James Greenhalgh wrote:
> +/* Determine whether a frame chain needs to be generated. */
> +static bool
> +aarch64_needs_frame_chain (void)
> +{
> + /* Force a frame chain for EH returns so the return address is at FP+8. */
> + if (frame_pointer_needed || crtl->calls_eh_return)
> + return tr
ND_FP_REGS register class which is now used instead of
ALL_REGS.
Add a missing ? to aarch64_get_lane to fix a failure in the testsuite.
Passes regress, OK for commit?
Since it is a regression introduced in GCC8, OK to backport to GCC8?
ChangeLog:
2018-05-22 Wilco Dijkstra
* config/aarch64
Richard Sandiford wrote:
> - if (allocno_class != ALL_REGS)
> + if (allocno_class != POINTER_AND_FP_REGS)
> return allocno_class;
>
> - if (best_class != ALL_REGS)
> + if (best_class != POINTER_AND_FP_REGS)
> return best_class;
>
> mode = PSEUDO_REGNO_MODE (regno);
> I think
Wilco Dijkstra
* gcc/ree.c (combine_reaching_defs):
Ensure inserted copy writes a single register.
---
gcc/ree.c | 8 +++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/gcc/ree.c b/gcc/ree.c
index 856745f..9aa1e36 100644
--- a/gcc/ree.c
+++ b/gcc/ree.c
@@ -743,6
Hi,
This is a set of patches improving register costs on AArch64. The first fixes
aarch64_register_move_cost() to support CALLER_SAVE_REGS and POINTER_REGS so
costs are calculated
correctly in the register allocator.
ChangeLog:
2014-09-04 Wilco Dijkstra
* gcc/config/aarch64/aarch64
This patch fixes a bug in aarch64_register_move_cost(): GET_MODE_SIZE is in
bytes not bits. As a
result the FP2FP cost doesn't need to be set to 4 to catch the special case for
Q register moves.
ChangeLog:
2014-09-04 Wilco Dijkstra
* gcc/config/aarch64/aarc
Cleanup inconsistent use of __extension__.
ChangeLog:
2014-09-04 Wilco Dijkstra
* gcc/config/aarch64/aarch64.c: Cleanup use of __extension__.
---
gcc/config/aarch64/aarch64.c | 38 +++---
1 file changed, 11 insertions(+), 27 deletions(-)
diff --git a
://gcc.gnu.org/ml/gcc-patches/2014-09/msg00356.html).
OK for commit?
Wilco
ChangeLog:
2014-09-04 Wilco Dijkstra
* gcc/config/aarch64/aarch64.c:
Add cortexa57_regmove_cost and cortexa53_regmove_cost to avoid
spilling from integer to FP registers.
---
gcc/config/aarch64
> From: Marcus Shawcroft [mailto:marcus.shawcr...@gmail.com]
> > - NAMED_PARAM (FP2FP, 4)
> > + NAMED_PARAM (FP2FP, 2)
>
> This is not directly related to the change below and it is missing
> from the ChangeLog. Originally this number had to be > 2 in order
> for secondary reload to kick in.
> Thanks! Jakub noticed a potential problem in this area a while back,
> but I never came up with any code to trigger and have kept that issue on
> my todo list ever since.
>
> Rather than ensuring the inserted copy writes a single register, it seems
> to me we're better off ensuring that the numb
> Here is a new rematerialization sub-pass of LRA.
>
> I've tested and benchmarked the sub-pass on x86-64 and ARM. The
> sub-pass permits to generate a smaller code in average on both
> architecture (although improvement no-significant), adds < 0.4%
> additional compilation time in -O2 mode o
> Vladimir Makarov wrote:
> > On SPECINT2k performance is ~0.5% worse (5.5% regression on perlbmk), and
> > SPECFP is ~0.2% faster.
> Thanks for reporting this. It is important for me as I have no aarch64
> machine for benchmarking.
>
> Perlbmk performance degradation is too big and I'll definite
> Wilco Dijkstra wrote:
> > Vladimir Makarov wrote:
> > > On SPECINT2k performance is ~0.5% worse (5.5% regression on perlbmk), and
> > > SPECFP is ~0.2% faster.
> > Thanks for reporting this. It is important for me as I have no aarch64
> > machine for be
> James Greenhalgh wrote:
> On Mon, Apr 27, 2015 at 05:57:26PM +0100, Wilco Dijkstra wrote:
> > > James Greenhalgh wrote:
> > > On Mon, Apr 27, 2015 at 02:42:36PM +0100, Wilco Dijkstra wrote:
> > > > > -Original Message-
> > > &
illing.
2015-04-27 Wilco Dijkstra
* gcc/config/aarch64/aarch64.md (aarch64_ashl_sisd_or_int_3):
Place integer variant first.
---
gcc/config/aarch64/aarch64.md | 14 +++---
1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/gcc/config/aarch64/aarch64.md b/gc
ping
> -Original Message-
> From: Wilco Dijkstra [mailto:wdijk...@arm.com]
> Sent: 03 March 2015 16:19
> To: GCC Patches
> Subject: [PATCH][AArch64] Use conditional negate for abs expansion
>
> Expand abs into a compare and conditional negate. This is the most
ping
> -Original Message-
> From: Wilco Dijkstra [mailto:wdijk...@arm.com]
> Sent: 03 March 2015 18:06
> To: GCC Patches
> Subject: [PATCH][AArch64] Make aarch64_min_divisions_for_recip_mul
> configurable
>
> This patch makes aarch64_min_divisions_for_recip_mu
ping
> -Original Message-
> From: Wilco Dijkstra [mailto:wdijk...@arm.com]
> Sent: 04 March 2015 15:38
> To: GCC Patches
> Subject: [PATCH][AArch64] Fix aarch64_rtx_costs of PLUS/MINUS
>
> Include the cost of op0 and op1 in all cases in PLUS and MINUS in
> aarch6
ping
> -Original Message-
> From: Wilco Dijkstra [mailto:wdijk...@arm.com]
> Sent: 05 March 2015 14:49
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH][AArch64] Fix Cortex-A53 shift costs
>
> This patch fixes the shift costs for Cortex-A53 so they are more accurate
> Jeff Law wrote:
> On 12/10/14 06:26, Wilco Dijkstra wrote:
> >
> > If recomputing is best does that mean that record_reg_classes should not
> > give a boost to the preferred class in the 2nd pass?
> Perhaps. I haven't looked deeply at this part of IRA. I was re
> James Greenhalgh wrote:
> On Mon, Apr 27, 2015 at 02:42:36PM +0100, Wilco Dijkstra wrote:
> > > -Original Message-----
> > > From: Wilco Dijkstra [mailto:wdijk...@arm.com]
> > > Sent: 03 March 2015 16:19
> > > To: GCC Patches
> > > Subje
> Marcus Shawcroft wrote:
> On 27 April 2015 at 14:43, Wilco Dijkstra wrote:
>
> >> static unsigned int
> >> -aarch64_min_divisions_for_recip_mul (enum machine_mode mode
> >> ATTRIBUTE_UNUSED)
> >> +aarch64_min_divisions_for_recip_mul (enu
> Marcus Shawcroft wrote:
> On 5 March 2015 at 14:49, Wilco Dijkstra wrote:
> > This patch fixes the shift costs for Cortex-A53 so they are more accurate -
> > immediate shifts
> use
> > SBFM/UBFM which takes 2 cycles, register controlled shifts take 1 cycle.
>
> Marcus Shawcroft wrote:
> On 1 May 2015 at 12:26, Wilco Dijkstra wrote:
> >
> >
> >> Marcus Shawcroft wrote:
> >> On 27 April 2015 at 14:43, Wilco Dijkstra wrote:
> >>
> >> >> static unsigned int
> >> >> -aarch64_mi
lls.
OK for commit?
ChangeLog:
2014-11-14 Wilco Dijkstra
* gcc/config/aarch64/aarch64.c (generic_regmove_cost):
Increase FP move cost.
---
gcc/config/aarch64/aarch64.c | 6 --
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/gcc/config/aarch64/aarch64.c
Hi Jiong,
Can you commit this please?
2014-11-19 Wilco Dijkstra
* gcc/config/aarch64/aarch64.c (generic_regmove_cost):
Increase FP move cost (PR61915).
---
gcc/config/aarch64/aarch64.c | 6 --
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/gcc/config
to 2 did give an improvement,
but vector had no effect, so I'll leave it at 1 for now. The patch is the same
as last time, it just sets integer to 2, and uses the same settings for all
CPUs.
OK for commit?
ChangeLog:
2014-11-24 Wilco Dijkstra
* gcc/config/aarch64/aarch64-protos.h
> Jeff Law wrote:
> Do you have a testcase that shows the expected improvements from this
> change? It's OK if it's specific to a target.
>
> Have you bootstrapped and regression tested this change?
>
> With a test for the testsuite and assuming it passes bootstrap and
> regression testing, this
> Jeff Law wrote:
> OK with the appropropriate ChangeLog entires. THe original for
> ira-costs.c was fine, so you just need the trivial one for the testcase.
ChangeLog below - Jiong, could you commit for me please?
2014-12-02 Wilco Dijkstra
* gcc/ira-costs.c (scan_one_insn)
rcx,%rdi), %eax
ret
After:
cmp w0, 4
csneg w0, w0, w0, lt
ret
movl %edi, %edx
movl %edi, %eax
negl %edx
cmpl $4, %edi
cmovge %edx, %eax
ret
ChangeLog:
2015-02-26 Wilco Dijkstra wdijk...@arm.
> Richard Biener wrote:
> On Thu, Feb 26, 2015 at 11:20 PM, Jeff Law wrote:
> > On 02/26/15 10:30, Wilco Dijkstra wrote:
> >>
> >> Several GCC versions ago a conditional negate optimization was introduced
> >> as a workaround for
> >> PR45685.