Fix test errors on targets which do not support pthreads.
Committed as obvious.
ChangeLog:
2019-02-11 Wilco Dijkstra
PR tree-optimization/86637
* gcc.c-torture/compile/pr86637-2.c: Test pthread and graphite target.
---
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr86637
.
ARMv5te bootstrap OK, regression tests pass. OK for commit?
ChangeLog:
2019-02-06 Wilco Dijkstra
gcc/
PR target/89222
* config/arm/arm.md (movsi): Use arm_cannot_force_const_mem
to decide when to split off an offset from a symbol.
* config/arm/arm.c
Hi Olivier,
> Sorry, I had -mapcs-frame in mind.
That's identical to -mapcs, and equally deprecated. It was superseded 2 decades
ago. -mapcs-frame bugs have been reported multiple times, including on VxWorks.
For example https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64379 suggests
VxWorks doesn't n
Hi Olivier,
> Below is a description of a very annoying bug we are witnessing
> on ARM.
...
> compiled with -Og -mapcs
Do you know -mapcs has been deprecated for more than 4 years now?
Is there a reason you are still using it? It was deprecated since -mapcs
is both extremely inefficient and buggy
Hi Steve,
>> After special cases you could do something like t = mask2 + (HWI_1U <<
>> shift);
>> return t == (t & -t) to check for a valid bfi.
>
> I am not sure I follow this logic and my attempts to use this did not
> work so I kept my original code.
It's similar to the initial code in aarch6
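The check quoted above can be sketched as a small standalone helper (hypothetical names and types; not the actual patch code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t HWI;  /* stands in for unsigned HOST_WIDE_INT */

/* Sketch of the suggested check: add the lowest bit implied by the
   shift to the mask; if the sum has at most one bit set, the mask was
   a contiguous run of ones starting at 'shift', i.e. valid for a
   single BFI.  t == (t & -t) is true iff t is zero or a power of two
   (the classic single-set-bit test).  */
static bool valid_bfi_mask_p (HWI mask2, unsigned shift)
{
  HWI t = mask2 + ((HWI) 1 << shift);
  return t == (t & -t);
}
```

For example, mask 0x38 (bits 3-5) with shift 3 gives t = 0x40, a power of two, so it is accepted; the non-contiguous mask 0x28 is rejected.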
Hi Steve,
Thanks for looking at this. A few comments on the patch:
+bool
+aarch64_masks_and_shift_for_bfi_p (scalar_int_mode mode,
+ unsigned HOST_WIDE_INT mask1,
+ unsigned HOST_WIDE_INT shft_amnt,
+
is by explicitly checking whether the base is
loaded. Also enable LDMs which load the first register.
Bootstrap OK on armhf, testsuite passes. OK for commit?
ChangeLog:
2019-02-04 Wilco Dijkstra
PR target/89190
* config/arm/arm.c (ldm_stm_operation_p): Set
addr_reg_in_re
Hi,
Segher wrote:
>On Tue, Jan 29, 2019 at 02:51:30PM -0800, Andrew Pinski wrote:
>
>> Seems to me rather this should have been simplified to just:
>> (set (reg:SI 93)
>> (ashift:SI (sign_extract:SI (reg:SI 95)
>> (const_int 3 [0x3])
>> (const_int 0 [0]))
>>
The TST instruction no longer matches in all cases due to changes in
Combine. The fix is simple, we now need to allow a subreg as well when
selecting the cc_mode. This fixes the tst_5.c and tst_6.c failures.
AArch64 regress & bootstrap OK.
ChangeLog:
2019-01-23 Wilco Dijkstra
Fix a failing test - changes in Combine mean the test now fails
even though the generated code is the same. Given there are several
AArch64-specific tests for vec-select, remove the scanning of Combine
output. Committed as trivial fix.
ChangeLog:
2019-01-22 Wilco Dijkstra
PR rtl
e testcase now passes - committed as obvious.
ChangeLog
2019-01-09 Wilco Dijkstra
testsuite/
* gcc.target/aarch64/pr62178.c: Relax scan-assembler checks.
--- gcc/testsuite/gcc.target/aarch64/pr62178.c (revision 266178)
+++ gcc/testsuite/gcc.target/aarch64/pr62178.c (working copy
Hi Jakub,
Any other comments? I'd like to finish this rather than leaving it in its
current
half-done state.
Wilco
Hi,
Jakub Jelinek wrote:
On Fri, Dec 07, 2018 at 04:19:22PM +0000, Wilco Dijkstra wrote:
>> The test case doesn't need an aligned object to fail, so wh
ping
From: Wilco Dijkstra
Sent: 14 December 2018 13:16
To: GCC Patches
Cc: nd
Subject: [PATCH] Fix PR84521
This fixes and simplifies the setjmp and non-local goto implementation.
Currently the virtual frame pointer is saved when using __builtin_setjmp or
a non-local goto. Depending on
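A minimal usage sketch of the builtins this area covers (my own example, not the PR testcase). __builtin_setjmp uses a 5-word buffer in which the (virtual) frame pointer is saved, and the longjmp is reached from a separate function as in typical testcases:

```c
#include <assert.h>

/* Buffer for __builtin_setjmp: 5 words, holding among other things
   the saved (virtual) frame pointer discussed above.  */
static void *jmp_buffer[5];

__attribute__ ((noinline)) static void do_jump (void)
{
  /* The second argument of __builtin_longjmp must be 1.  */
  __builtin_longjmp (jmp_buffer, 1);
}

static int run (void)
{
  if (__builtin_setjmp (jmp_buffer))
    return 1;   /* reached via __builtin_longjmp */
  do_jump ();
  return 0;     /* not reached */
}
```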
Hi James,
TImode is an integer mode so we strongly prefer using integer registers
to avoid inefficient allocations using SIMD registers. We might be able to
use TFmode since that prefers Q registers. However we don't support
TFmode LDP/STP unless emitted explicitly like in prolog/epilog. LDP of
TI
Hi Alejandro,
+emit_move_insn (mask,
+ aarch64_simd_gen_const_vector_dup (mode,
+ HOST_WIDE_INT_M1U
+ << bits));
+
+emit_insn (gen_and3 (sign, arg2, mask));
Is there
Hi Sam,
This is a trivial test fix, so it falls under the obvious rule and can be
committed without approval - https://www.gnu.org/software/gcc/svnwrite.html
Cheers,
Wilco
Hi Olivier,
> I'm experimenting with the idea of adjusting the
> stack probing code using r9 today, to see if it could
> save/restore that reg if it happens to be the static chain
> as well.
>
> If that can be made to work, maybe that would be a better
> alternative than just swapping and have the
Hi Martin,
> There is a similar mechanism for pointer-to-member-functions
> used by C++. Is this correct on aarch64?
/* By default, the C++ compiler will use the lowest bit of the pointer
to function to indicate a pointer-to-member-function points to a
virtual member function. However, if
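The default scheme that comment describes can be illustrated as follows (simplified; the real layout is defined by the Itanium C++ ABI, and some targets instead put the flag in the adjustment field):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative pointer-to-member-function representation: the low bit
   of 'ptr' flags a virtual member function (vtable offset | 1) versus
   a plain function address, which is at least 2-byte aligned.  Not
   GCC's actual data structure.  */
struct ptrmemfunc
{
  uintptr_t ptr;   /* function address, or (vtable offset | 1) */
  ptrdiff_t adj;   /* 'this' pointer adjustment */
};

static int pmf_is_virtual (const struct ptrmemfunc *p)
{
  return (p->ptr & 1) != 0;  /* low bit set => virtual member function */
}
```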
Hi,
Jakub Jelinek wrote:
> On Wed, Dec 19, 2018 at 07:53:48PM +, Uecker, Martin wrote:
>> What do you think about making the trampoline a single call
>> instruction and have a large memory region which is the same
>> page mapped many times?
This sounds like a good idea, but given a function d
Hi Hans-Peter,
> While the choice of static-chain register does not affect the
> ABI, it's the other way round: the choice of static-chain
> register matters, specifically it's call-clobberedness.
Agreed.
> It looks like the current aarch64 static-chain register R18 is
> call-saved but without s
seems incorrect since the helper
function moves the frame pointer value into the static chain register
(so this patch does nothing to make it better or worse).
AArch64 bootstrap OK, new test passes on AArch64, x86-64 and Arm.
ChangeLog:
2018-12-13 Wilco Dijkstra
gcc/
PR middle-end/
Hi Martin,
> One could also argue that it creates a false sense of security
> and diverts resources from properly fixing the real problems
> i.e. the buffer overflows which lets an attacker write to the
> stack in the first place. A program without buffer overflows
> is secure even without an exec
Hi Martin,
Uecker, Martin wrote:
>Am Mittwoch, den 12.12.2018, 22:04 + schrieb Wilco Dijkstra:
>> Hi Martin,
>>
>> > Does a non-executable stack actually improve security?
>>
>> Absolutely, it's like closing your front door rather than just leave i
Hi Martin,
> Does a non-executable stack actually improve security?
Absolutely, it's like closing your front door rather than just leave it open
for anyone.
> For the alternative implementation using (custom) function
> descriptors (-fno-trampolines) the static chain becomes
> part of the ABI or
Hi,
>> On 12 Dec 2018, at 18:21, Richard Earnshaw (lists)
>> wrote:
>
>> However, that introduces an issue that that
>> code is potentially used across multiple versions of gcc, with
>> potentially different choices of the static chain register. Hmm, this
>> might need some more careful though
Hi Oliver,
+#define FIXED_R18 0
{ \
0, 0, 0, 0, 0, 0, 0, 0, /* R0 - R7 */ \
0, 0, 0, 0, 0, 0, 0, 0, /* R8 - R15 */ \
- 0, 0, 0, 0, 0, 0, 0, 0, /* R16 - R23 */ \
+ 0, 0, FIXED_R18, 0, 0, 0, 0, 0, /* R16 - R23 */
Hi,
>> Ultimately, the best solution here will probably depend on which we
>> think is more likely, copysign or the example I give above.
> I'd tend to suspect we'd see more pure integer bit twiddling than the
> copysign stuff.
All we need to do is to clearly separate the integer and FP/SIMD case
Hi,
Jakub Jelinek wrote:
On Fri, Dec 07, 2018 at 04:19:22PM +, Wilco Dijkstra wrote:
>> The test case doesn't need an aligned object to fail, so why did you add it?
>
> It needed it on i686, because otherwise it happened to see the value it
> wanted in the caller's
Hi,
Jakub Jelinek wrote:
> On Fri, Dec 07, 2018 at 02:52:48PM +0000, Wilco Dijkstra wrote:
>> - struct __attribute__((aligned (32))) S { int a[4]; } s;
>>
Log:
2018-12-07 Wilco Dijkstra
gcc/
PR middle-end/64242
* builtins.c (expand_builtin_longjmp): Add frame clobbers and schedule
block.
(expand_builtin_nonlocal_goto): Likewise.
testsuite/
PR middle-end/64242
* gcc.c-torture/execute/pr64242.c: Update test.
--
Hi,
Florian wrote:
> For userland, I would like to eventually copy the OpenBSD approach for
> architectures which have some form of PC-relative addressing: we can
> have multiple random canaries in (RELRO) .rodata in sufficiently close
> to the code that needs them (assuming that we have split .ro
eLog:
2018-11-29 Wilco Dijkstra
gcc/
PR middle-end/64242
* builtins.c (expand_builtin_longjmp): Use a temporary when restoring
the frame pointer.
(expand_builtin_nonlocal_goto): Likewise.
testsuite/
PR middle-end/64242
* gcc.c-torture/execute/pr642
Hi,
> I checked it. They are all the same on x86_64:
> https://pastebin.com/e63FxDAy
> I even forced to call the glibc sinh and atanh, but use the sqrtsd
> instruction.
> But I do agree that there may be an arch that sets an errno for sinh
> or cosh but not for sqrt, implying in a unexpected beha
Hi Segher,
> On Wed, Nov 14, 2018 at 12:37:05PM +0000, Wilco Dijkstra wrote:
>> +/* { dg-final { scan-assembler-not { dup } } } */
>> +/* { dg-final { scan-assembler-not { fmov } } } */
>
> { dup } is the same as " dup " , that is, with spaces and all.
>
Hi,
> Indeed. After plotting the graph of both functions, it is very clear
> that this check isn't required. Sorry about that.
It wouldn't be clear from the graph, you need to check that +0.0, -0.0,
out of range values, infinities, NaNs give the same answer before/after
your transformation. If s
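A comparison helper for that kind of before/after check might look like this (a hypothetical helper, not part of any patch): two results agree if both are NaN, or if they are equal with matching sign, so that +0.0 and -0.0 are distinguished.

```c
#include <assert.h>
#include <math.h>

/* Compare the result of an expression before and after a proposed
   simplification.  NaNs never compare equal to themselves, so treat
   any NaN as matching any NaN; use signbit so that +0.0 and -0.0
   count as different results.  */
static int same_result (double a, double b)
{
  if (isnan (a) && isnan (b))
    return 1;
  return a == b && !signbit (a) == !signbit (b);
}
```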
other. However the
generated vector loop is fast either way since it generates MLA and
merges the DUP either with a load or MLA. So relax the conditions
slightly and check we still generate MLA and there is no DUP or FMOV.
The testcase now passes - committed as obvious.
ChangeLog
2018-11-14
is to disable lrint/llrint on double if the size of a long is
smaller (i.e. ilp32).
Passes regress and bootstrap on AArch64. OK for commit?
ChangeLog
2018-11-13 Wilco Dijkstra
gcc/
PR target/81800
* gcc/config/aarch64/aarch64.md (lrint): Disable lrint pattern i
Hi James,
>On Mon, Jan 22, 2018 at 09:22:27AM -0600, Richard Biener wrote:
>> It would be better to dissect this cost into vec_to_scalar and vec_extract
>> where
>> vec_to_scalar really means getting at the scalar value of a vector of
>> uniform values
>> which most targets can do without any ins
Hi James,
> We have 7 unique target tuning structures in the AArch64 backend, of which
> only one has a 2x ratio between scalar_int_cost and vec_to_scalar_cost. Other
> ratios are 1, 3, 8, 3, 4, 6.
I wouldn't read too much in the exact value here - the costs are simply
relative to
other values f
e C / x can underflow to zero if x is huge, it's not safe otherwise).
If C is negative the comparison is reversed.
Simplify (x * C1) > C2 into x > (C2 / C1) with -funsafe-math-optimizations.
If C1 is negative the comparison is reversed.
OK for commit?
ChangeLog
2018-11-09 Wil
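The fold described above can be illustrated in plain C (the real transformation works on the compiler's IR; these function names are made up). Since C2 / C1 may not be exact in floating point, the fold is only valid under -funsafe-math-optimizations:

```c
#include <assert.h>
#include <stdbool.h>

/* Original form of the comparison: (x * C1) > C2.  */
static bool cmp_original (double x, double c1, double c2)
{
  return x * c1 > c2;
}

/* Folded form: x > C2/C1, with the comparison reversed when C1 is
   negative (dividing both sides by a negative number flips '>').  */
static bool cmp_folded (double x, double c1, double c2)
{
  double q = c2 / c1;
  return c1 > 0 ? x > q : x < q;
}
```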
, OK for commit?
ChangeLog:
2018-11-09 Wilco Dijkstra
gcc/
* config/aarch64/aarch64.c (aarch64_classify_symbol):
Apply reasonable limit to symbol offsets.
testsuite/
* gcc.target/aarch64/symbol-range.c (foo): Set new limit.
* gcc.target/aarch64/symbol-r
for commit until we get rid of it?
ChangeLog:
2017-11-17 Wilco Dijkstra
gcc/
* config/aarch64/aarch64.h (SLOW_BYTE_ACCESS): Set to 1.
--
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index
056110afb228fb919e837c04aa5e55
- libquantum and SPECv6
performance improves.
OK for commit?
ChangeLog:
2018-01-22 Wilco Dijkstra
PR target/79262
* config/aarch64/aarch64.c (generic_vector_cost): Adjust
vec_to_scalar_cost.
--
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index
and regress OK on arm-none-linux-gnueabihf.
OK for stage 1?
ChangeLog:
2017-04-12 Wilco Dijkstra
* gcc/config/arm/arm.c (arm_cortex_a53_tune): Set max_cond_insns to 2.
(arm_cortex_a35_tune): Likewise.
---
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index
Hi,
> But the max. error in sinh/cosh/atanh is less than 2 ULP, with some math
> libraries. It could be < 1 ULP, in theory, so sinh(atanh(x)) less than
> 2 ULP even.
You can't add ULP errors in general - a tiny difference in the input can
make a huge difference in the result if the derivative i
Hi Jeff,
> So if we're going from 0->2 ULPs in some cases, do we want to guard it
> with one of the various options, if so, which? Giuliano's follow-up
> will still have the potential for 2ULPs.
The ULP difference is not important since the individual math functions
already have ULP of 3 or hig
Hi Prathamesh,
Prathamesh Kulkarni wrote:
> Thanks for the suggestions. The last time I benchmarked the patch
> (around Jan 2016)
> I got following results with the patch for SPEC2006:
>
> a15: +0.64% overall, 481.wrf: +6.46%
> a53: +0.21% overall, 416.gamess: -1.39%, 481.wrf: +6.76%
> a57: +0.35%
Prathamesh Kulkarni wrote:
> This is a rebased version of patch that adds a pattern to neon.md for
> implementing division with multiplication by reciprocal using
> vrecpe/vrecps with -funsafe-math-optimizations excluding -Os.
> The newly added test-cases are not vectorized on armeb target with
>
Hi,
>> Generally the goal is 1ULP in round to nearest
>
> Has that changed recently? At least in the past for double the goal has
> been always .5ULP in round to nearest.
Yes. 0.5 ULP (perfect rounding) as a goal was insane as it caused ridiculous
slowdowns in the 10x range for no apparent r
Hi,
>> So I think the runtime math libraries shoot for .5 ULP (yes, they don't
>> always make it, but that's their goal). We should probably have the
>> same goal. Going from 0 to 2 ULPs would be considered bad.
Generally the goal is 1ULP in round to nearest - other rounding modes may have
high
Hi,
>> Maybe I am crazy, or the labels here are wrong, but that looks like the
>> error is three times as *big* after the patch. I.e. it worsened instead
>> of improving.
This error is actually 1ULP, so just a rounding error. Can't expect any better
than that!
> with input : = 9.98807907
Jakub Jelinek wrote:
> At this point this seems like something that shouldn't be done inline
> anymore, so either we don't do this optimization at all, because the errors
> are far bigger than what is acceptable even for -ffast-math, or we have a
> library function that does the sinh (tanh (x)) an
Hi,
>> Did you enable FMA? I'd expect 1 - x*x to be accurate with FMA, so the
>> relative error
>> should be much better. If there is no FMA, 2*(1-fabs(x)) - (1-fabs(x))^2
>> should be
>> more accurate when abs(x)>0.5 and still much faster.
>
>No, but I will check how to enable it if FMA is avai
Hi,
> Well, I compared the results before and after the simplifications with a
> 512-bit
> precise mpfr value. Unfortunately, I found that sometimes the error is very
> noticeable :-( .
Did you enable FMA? I'd expect 1 - x*x to be accurate with FMA, so the relative
error
should be much better.
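The algebraic identity behind the suggestion above: with u = 1 - |x|, we have 2*u - u*u = (1 - |x|)(1 + |x|) = 1 - x*x. Written this way the subtraction involves only the small quantity u, so the catastrophic cancellation of computing 1 - x*x directly (without FMA) when |x| is close to 1 is avoided. A sketch:

```c
#include <assert.h>
#include <math.h>

/* Compute 1 - x*x via 2*(1-|x|) - (1-|x|)^2, which is exact algebra
   (expand: 2u - u^2 with u = 1-|x| gives (1-|x|)(1+|x|) = 1 - x^2)
   but numerically better for |x| near 1 when no FMA is available.  */
static double one_minus_x2 (double x)
{
  double u = 1.0 - fabs (x);
  return 2.0 * u - u * u;
}
```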
Hi Olivier,
> STATIC_CHAIN_REGNUM still needs to be adjusted directly I think.
>
> I wondered if we could set it to R11 unconditionally and picked
> the way ensuring no change for !vxworks ports, especially since I
> don't have means to test more than what I described above.
Yes it should always
fmov w0, s0
ret
After:
fmov s0, w0
cnt v0.8b, v0.8b
addv b0, v0.8b
fmov w0, s0
ret
Passes regress on AArch64, OK for commit?
ChangeLog:
2018-10-11 Wilco Dijkstra
gcc/
* config/aarch64/aarch64.md (zero_extendsidi2_aarc
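The before/after assembly in this message comes from popcount-style code; a source sketch (my guess at the shape, not the actual testcase):

```c
#include <assert.h>

/* On AArch64, popcount of a 32-bit value moves the input into a SIMD
   register (fmov), counts the bits in each byte (cnt) and sums the
   byte counts (addv), then moves the result back (fmov).  */
int popcount32 (unsigned x)
{
  return __builtin_popcount (x);
}
```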
As mentioned in PR87511, the shift used in aarch64_mask_and_shift_for_ubfiz_p
should be evaluated as a HOST_WIDE_INT rather than int.
Passes bootstrap, OK for commit and backport?
ChangeLog:
2018-10-11 Wilco Dijkstra
gcc/
* config/aarch64/aarch64.c (aarch64_mask_and_shift_for_ubfiz_p
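The bug class being fixed is illustrative on its own: a shift such as 1 << amount evaluated in 32-bit int is undefined for amounts of 31 or more, so the shift operand must be widened first (a sketch with a hypothetical helper, not the GCC code):

```c
#include <assert.h>
#include <stdint.h>

/* Build a mask of the low 'amount' bits.  The shift must be done in
   a 64-bit type (HOST_WIDE_INT in GCC): evaluating 1 << amount in
   plain 'int' is undefined behaviour once amount reaches 31.  */
static uint64_t low_bits_mask (unsigned amount)
{
  return (((uint64_t) 1) << amount) - 1;
}
```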
Hi,
> if (math_errhandling & MATH_ERRNO) == 0 a math
> function may still set errno.
>
> it can only set it if there was an error though,
> not arbitrarily clobber it, but this means that
>
> (1) reordering errno access around math calls is
> invalid even with -fno-math-errno.
It's typically th
Hi,
> Note that "will ever set errno" includes possibly setting it in the
> future, since code may be built with one libm version and used with
> another. So it wouldn't be correct to have a "never sets errno" attribute
> on glibc logb / lround / llround / lrint / llrint / fma / remquo (missin
Joseph Myers wrote:
> On Mon, 8 Oct 2018, Richard Biener wrote:
>> So I think it would be fine if we'd have -fno-math-errno as documented
>> and then the C library would annotate their math functions according
>> to whether they will ever set errno or not. Once a math function is
>> const or pure
Hi Jeff,
> So I went back and reviewed all the discussion around this. I'm still
> having trouble getting comfortable with flipping the default -- unless
> we know ahead of time that the target runtime doesn't set errno on any
> of the math routines. That implies a target hook to describe the
>
or commit?
ChangeLog:
2018-10-03 Wilco Dijkstra
gcc/
* config/aarch64/aarch64.md (zero_extendsidi2_aarch64): Add alternatives
to zero-extend between int and floating-point registers.
(load_pair_zero_extendsidi2_aarch64): Add alternative to emit
zero-extended
ldp in
Richard Henderson wrote:
> If you're going to add moves r->w, why not also go ahead and add w->r.
> There are also HImode fmov zero-extensions, fwiw.
Well in principle it would be possible to support all 8/16/32-bit zero
extensions
for all combinations of int and fp registers. However I prefer t
cnt v0.8b, v0.8b
addv b0, v0.8b
fmov w0, s0
ret
After:
fmov s0, w0
cnt v0.8b, v0.8b
addv b0, v0.8b
fmov w0, s0
ret
Passes regress on AArch64, OK for commit?
ChangeLog:
2018-09-28 Wilco Dijkstra
gcc/
* conf
Matthew wrote:
> The canonical way to require even-odd pairs of registers to implement a TImode
> pseudo register as mentioned in the documentation is to limit *all* TImode
> registers to being even-odd by using the TARGET_HARD_REGNO_MODE_OK hook.
And that is the best approach for cases like this
.8b
fmov w0, s0
ret
After:
fmov s0, w0
cnt v0.8b, v0.8b
addv b0, v0.8b
fmov w0, s0
ret
Passes regress on AArch64, OK for commit?
ChangeLog:
2018-09-27 Wilco Dijkstra
gcc/
* config/aarch64/aarch64.md (zero_extendsidi2
Hi Denis,
>> Adding support for a frame chain would require an ABI change. It
> would have to
> > work across GCC, LLVM, Arm, Thumb-1 and Thumb-2 - not a trivial amount of
> > effort.
> Clang already works that way.
No, that's incorrect like Richard pointed out. Only a single register can be
u
Hi Denis,
> We are working on applying Address/LeakSanitizer for the full Tizen OS
> distribution. It's about ~1000 packages, ASan/LSan runtime is installed
> to ld.so.preload. As we know ASan/LSan has interceptors for
> allocators/deallocators such as (malloc/realloc/calloc/free) and so on.
> O
Hi Steve,
The latest version compiles the examples I used correctly, so it looks fine
from that perspective (but see comments below). However the key point of
the ABI is to enable better code generation when calling a vector function,
and that will likely require further changes that may conflict
ping
From: Wilco Dijkstra
Sent: 18 June 2018 15:01
To: GCC Patches
Cc: nd; Joseph Myers
Subject: [PATCH v3] Change default to -fno-math-errno
GCC currently defaults to -fmath-errno. This generates code assuming math
functions set errno and the application checks errno. Few applications
Hi,
> But we still have an issue with performance, when we are using default
> unwinder, which uses unwind tables. It could be up to 10 times faster to
> use frame based stack unwinder instead "default unwinder".
Switching on the frame pointer typically costs 1-2% performance, so it's a bad
idea
Hi,
A quick benchmark shows it's faster up to about 10 bytes, but after that it
becomes extremely slow. At 16 bytes it's already 2.5 times slower and for
larger sizes its over 13 times slower than the GLIBC implementation...
> The implementation falls back to the library call if the
> string is
Nicolas Pitre wrote:
>> However if r4 is non-zero, the carry will be set, and the tsths will be
>> executed. This
>> clears the carry and sets the Z flag based on bit 20.
>
> No, not at all. The carry is not affected. And that's the point of the
> tst instruction here rather than a cmp: it sets
Hi Nicolas,
I think your patch doesn't quite work as expected:
@@ -238,9 +238,10 @@ LSYM(Lad_a):
movs ip, ip, lsl #1
adcs xl, xl, xl
adc xh, xh, xh
- tst xh, #0x0010
- sub r4, r4, #1
- bne LSYM(Lad_e)
+ subs r4, r4, #1
+
Steve Ellcey wrote:
> OK, I think I understand this a bit better now. I think my main
> problem is with the term 'writeback' which I am not used to seeing.
> But if I understand things correctly we are saving one or two registers
> and (possibly) updating the stack pointer using auto-increment/a
Umesh Kalappa wrote:
> We tested on the SP and yes the problem persist on the SP too and
> attached patch will fix the both SP and DP issues for the denormal
> resultant.
The patch now looks correct to me (but I can't approve).
> We bootstrapped the compiler ,look ok to us with minimal testing
Umesh Kalappa wrote:
> We tried some of the normalisation numbers and the fix works and please
> could you help us with the input ,where if you see that fix breaks down.
Well try any set of inputs which require normalisation. You'll find these no
longer get normalised and so will get incorrect r
Hi Umesh,
Looking at your patch, this would break all results which need to be normalized.
Index: libgcc/config/arm/ieee754-df.S
===
--- libgcc/config/arm/ieee754-df.S (revision 262850)
+++ libgcc/config/arm/ieee754-df.S (
Steve Ellcey wrote:
> Yes, I see where I missed this in aarch64_push_regs
> and aarch64_pop_regs. I think that is why the second of
> Wilco's two examples (f2) is wrong. I am unclear about
> exactly what is meant by writeback and why we have it and
> how that and callee_adjust are used. Any cha
Hi Steve,
> This patch checks for SIMD functions and saves the extra registers when
> needed. It does not change the caller behavour, so with just this patch
> there may be values saved by both the caller and callee. This is not
> efficient, but it is correct code.
I tried a few simple test cas
Fix and simplify the testcase so it generates dup even on latest trunk.
This fixes the failure reported in:
https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01799.html
Committed as obvious.
ChangeLog:
2018-06-28 Wilco Dijkstra
* gcc.target/aarch64/f16_mov_immediate_3.c: Fix testcase
Joseph Myers wrote:
> On Tue, 26 Jun 2018, Wilco Dijkstra wrote:
> > That looks incorrect indeed but that's mostly a problem with -fmath-errno
> > as it
> > would result in GCC assuming the function is const/pure when in fact it
> > isn't.
> > Does
Eric Botcazou wrote:
>> This test can easily be changed not to use optimize since it doesn't look
>> like it needs it. We really need to tests these builtins properly,
>> otherwise they will continue to fail on most targets.
>
> As far as I can see PR target/84521 has been reported only for Aarch6
Eric Botcazou wrote:
> > The AArch64 parts are OK. I've been holding off approving the patch while
> > I wait for Eric to reply on the x86_64 fails with your new testcase.
>
> The test is not portable in any case since it uses the "optimize" attribute
> so
> I'd just make it specific to Aarch64
Joseph Myers wrote:
> On Thu, 21 Jun 2018, Jeff Law wrote:
>
> > I think all this implies that the setting of -fno-math-errno by default
> > really depends on the math library in use since it's the library that
> > has to arrange for either errno to get set or for an exception to be raised.
>
> If
Richard Biener wrote:
> There are a number of regression tests that check for errno handling
> (I added some to avoid aliasing for example). Please make sure to
> add explicit -fmath-errno to those that do not already have it set
> (I guess such patch would be obvious and independent of this one)
:
f:
str x30, [sp, -16]!
bl lroundf
add x0, x0, 1
ldr x30, [sp], 16
ret
With -fno-math-errno:
f:
fcvtas x0, s0
add x0, x0, 1
ret
Passes regress on AArch64. OK for commit?
ChangeLog:
2018-06-18 Wilco Dijkstra
Add missing target pthread to ensure test doesn't fail on bare-metal
targets. Committed as obvious.
ChangeLog:
2018-06-18 Wilco Dijkstra
PR tree-optimization/86076
* gcc.dg/pr86076.c: Add target pthread for bare-metal targets.
--
diff --git a/gcc/testsuite/gcc.dg/pr8607
Since PR64946 has been fixed, we can remove the xfail from this test.
Committed as obvious.
ChangeLog:
2018-06-18 Wilco Dijkstra
PR tree-optimization/64946
* gcc.target/aarch64/vect-abs-compile.c: Remove xfail.
--
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-abs
Richard Sandiford wrote:
>> This has probably been reported elsewhere already but I can't find
>> such a report, so sorry for possible duplicate,
>> but this patch is causing ICEs on aarch64
>> FAIL: gcc.target/aarch64/sve/reduc_1.c -march=armv8.2-a+sve
>> (internal compiler error)
>> FAIL:
Richard Sandiford
> The "?" change seems to make intrinsic sense given the extra cost of the
> GPR alternative. But I think the real reason for this failure is that
> we define no V1DF patterns, and target-independent code falls back to
> using moves in the corresponding *integer* mode. So for
James Greenhalgh wrote:
> > Add a missing ? to aarch64_get_lane to fix a failure in the testsuite.
>
> > I'd prefer more detail than this for a workaround; which test, why did it
> > start to fail, why is this the right solution, etc.
It was gcc.target/aarch64/vect_copy_lane_1.c generating:
test
explicitly checking for a subset of GENERAL_REGS and FP_REGS.
Add a missing ? to aarch64_get_lane to fix a failure in the testsuite.
Passes regress, OK for commit? Since it is a regression introduced in GCC8, OK
to
backport to GCC8?
ChangeLog:
2018-05-25 Wilco Dijkstra
* config/aarch64/
Richard Sandiford wrote:
> - if (allocno_class != ALL_REGS)
> + if (allocno_class != POINTER_AND_FP_REGS)
> return allocno_class;
>
> - if (best_class != ALL_REGS)
> + if (best_class != POINTER_AND_FP_REGS)
> return best_class;
>
> mode = PSEUDO_REGNO_MODE (regno);
> I think
ND_FP_REGS register class which is now used instead of
ALL_REGS.
Add a missing ? to aarch64_get_lane to fix a failure in the testsuite.
Passes regress, OK for commit?
Since it is a regression introduced in GCC8, OK to backport to GCC8?
ChangeLog:
2018-05-22 Wilco Dijkstra
* config/aarch64
James Greenhalgh wrote:
> +/* Determine whether a frame chain needs to be generated. */
> +static bool
> +aarch64_needs_frame_chain (void)
> +{
> + /* Force a frame chain for EH returns so the return address is at FP+8. */
> + if (frame_pointer_needed || crtl->calls_eh_return)
> + return tr
Kyrill Tkachov wrote:
> That patch would look like the attached. Is this preferable?
> For the above example it generates the desired:
> foo_v4sf:
> ldr s0, [x0]
> ldr s1, [x1, 8]
> ins v0.s[1], v1.s[0]
> ld1 {v0.s}[2], [x2]
> ld1 {v0.s}[3], [x3]
>
Richard Earnshaw wrote:
>>> Which doesn't appear to have been approved. Did you follow up with Jeff?
>>
>> I'll get back to that one at some point - it'll take some time to agree on a
>> way
>> forward with the callback.
>>
>> Wilco
>>
>>
>
> So it seems to me that this should then be q
Hi,
James Greenhalgh wrote:
>
> This seems like a fairly horrible hack around the register allocator
> behaviour.
That is why I proposed to improve the register allocator so one can explicitly
specify the copy preference in the md syntax. However that wasn't accepted,
so we'll have to use a hack
Hi,
> Which doesn't appear to have been approved. Did you follow up with Jeff?
I'll get back to that one at some point - it'll take some time to agree on a way
forward with the callback.
Wilco