[Bug target/91130] [9/10 Regression] -MF clashes with -flto on aarch64

2019-07-16 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91130

--- Comment #12 from Richard Earnshaw  ---
How do I invoke lto-wrapper inside gdb?  It seems to pick up some magic options
via the environment...

[Bug target/91130] [9/10 Regression] -MF clashes with -flto on aarch64

2019-07-16 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91130

--- Comment #11 from Richard Earnshaw  ---
Some pass in the compilation process must create that temporary file with the
options to gcc, but whichever pass this is doesn't appear in the output of
"gcc -v".

[Bug target/91130] [9/10 Regression] -MF clashes with -flto on aarch64

2019-07-16 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91130

--- Comment #10 from Richard Earnshaw  ---
I'm not particularly familiar with how LTO is supposed to work.  I can
reproduce the crash on ARM as Martin described (but not on AArch64).  The
problem seems to be an assert that the number of files to analyse passed on the
command line matches the number of files described in the resolution file.  We
have:

lto1 -quiet -dumpbase  -mcpu=cortex-a72 -mfloat-abi=hard -mtls-dialect=gnu
-mcpu=cortex-a72 -mfloat-abi=hard -mtls-dialect=gnu -marm
-march=armv8-a+crc+simd -auxbase  -version -fno-openmp -fno-openacc
-fno-pie -fltrans-output-list=/tmp/ccoC6W9f.ltrans.out -fwpa
-fresolution=main.res -flinker-output=exec @/tmp/ccHCf4bh

Where ccHCf4bh contains:

main.o

and main.res contains:
1
main.o 1
195 4bf9aa6f81679e65 PREVAILING_DEF main

So if I've understood this correctly, the resolution file says to expect one
file to analyse, but two are passed.  It looks like something is interpreting
 as an additional file.

Going back another step, we see

/home/rearnsha/scratch/gnu/gcc-install/armv8l/gcc-9.1.0/libexec/gcc/armv8l-unknown-linux-gnueabihf/9.1.0/lto-wrapper -fresolution=main.res -flinker-output=exec main.o
/home/rearnsha/scratch/gnu/gcc-install/armv8l/gcc-9.1.0/bin/gcc @/tmp/cc8Su59l

where cc8Su59l contains:
-xlto
-c
-fno-openmp
-fno-openacc
-fno-pie
-mcpu=cortex-a72
-mfloat-abi=hard
-mtls-dialect=gnu
-marm
-march=armv8-a+crc+simd
-v

-save-temps
-mcpu=cortex-a72
-mfloat-abi=hard
-mtls-dialect=gnu
-marm
-march=armv8-a+crc+simd
-fltrans-output-list=/tmp/ccoC6W9f.ltrans.out
-fwpa
-fresolution=main.res
-flinker-output=exec
main.o

I think the problem is that '' mid-way through the list of options.  It
looks as though it has had a preceding -MF gobbled but not its argument.
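
The hypothesis above can be illustrated with a small sketch.  It is not code
from lto-wrapper or lto1: count_input_files and the STRAY token used in the
usage note are hypothetical, and the rule that every argument not starting
with '-' counts as an input file is an assumption made purely for
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not GCC code: count the arguments that a naive
   classifier would treat as input files, assuming that anything not
   starting with '-' (including an empty string) is a file name.  */
size_t count_input_files(const char *const *args, size_t n)
{
    size_t files = 0;

    assert(args != NULL);
    for (size_t i = 0; i < n; i++)
        if (args[i][0] != '-')
            files++;
    return files;
}
```

Under that assumption, an option list such as { "-xlto", "-c", "-v", "STRAY",
"-save-temps", "-fwpa", "main.o" } yields two files, while the resolution file
declares only one.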

Does this help any?

[Bug lto/91163] ARM lto optimalization fail in big-endian case (error: could not unlink output file)

2019-07-16 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91163

Richard Earnshaw  changed:

   What|Removed |Added

 Resolution|WONTFIX |INVALID

--- Comment #10 from Richard Earnshaw  ---
Changing to invalid, as not a bug in gcc.

[Bug lto/91163] ARM lto optimalization fail in big-endian case (error: could not unlink output file)

2019-07-15 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91163

--- Comment #7 from Richard Earnshaw  ---
Suggest you run the application under "strace -f" to try to identify what is
being duplicated.

[Bug lto/91163] ARM lto optimalization fail in big-endian case (error: could not unlink output file)

2019-07-15 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91163

--- Comment #6 from Richard Earnshaw  ---
Sounds like a dup of PR93069

[Bug bootstrap/91034] In tree build of gmp fails on Raspberry Pi4 (ARM Cortex A72) with `mls r1,r4,r8,r11' not supported in ARM mode

2019-07-01 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91034

--- Comment #10 from Richard Earnshaw  ---
(In reply to Andrew Roberts from comment #9)
> For completeness I've also built gcc 8.3.0 with in tree gmp 6.1.2 using the
> newly built 9.1.0. And then in turn used this gcc 8.3.0 to rebuild gcc 9.1.0
> with in tree gmp.
> 
> So the host gcc 8.3.0 doesn't work building gcc with in tree gmp.
> But all the versions I have built (9.1.0 and 8.3.0) build this correctly.

So if I've understood all this correctly, this is happening when assembling a
file written in assembly, rather than one generated by GCC itself.

The most likely source of the problem here is that your system compiler is
trying to set the CPU for the assembler to use on the command line, but that
CPU is then older than the one the file is expecting (MLS was new in ARMv7
(strictly, ARMv6t2, but that's probably not relevant to the pi)).
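
For reference, a sketch of what the instruction computes, assuming the usual
operand order "mls Rd, Rn, Rm, Ra" (Rd = Ra - Rn * Rm, truncated to 32 bits).
The name mls32 is made up for illustration and is not a GCC builtin or
intrinsic.

```c
#include <stdint.h>

/* Model of MLS Rd, Rn, Rm, Ra: Rd = Ra - Rn * Rm, modulo 2^32.
   mls32 is an illustrative name only.  */
uint32_t mls32(uint32_t rn, uint32_t rm, uint32_t ra)
{
    return ra - rn * rm;
}
```

For the instruction in the bug summary, `mls r1,r4,r8,r11', this corresponds
to r1 = mls32(r4, r8, r11).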

So can you run the command with -v and show what options are being passed to
the assembler?

[Bug target/9663] [arm] gcc-20030127 misses an optimization opportunity

2019-06-06 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=9663

--- Comment #11 from Richard Earnshaw  ---
(In reply to Fredrik Hederstierna from comment #10)
> Created attachment 46457 [details]
> Testcase from CSiBE teem sources
> 
> Testcase from CSiBE teem sources
> Tested with gcc-9.1.0 for ARM 32bit targets.
> 
> Without peephole2
> 
>  :
>    0: e92d407f  push  {r0, r1, r2, r3, r4, r5, r6, lr}
>    4: e2504000  subs  r4, r0, #0
>    8: 0a00003f  beq   10c
>    c: e3510000  cmp   r1, #0
>   10: e1a05001  mov   r5, r1
> 
> With peephole2
> 
>  :
>    0: e92d407f  push  {r0, r1, r2, r3, r4, r5, r6, lr}
>    4: e2504000  subs  r4, r0, #0
>    8: 0a00003e  beq   108
>    c: e2515000  subs  r5, r1, #0
> 
> /Fredrik

Can you run this through your preprocessor to remove the dependencies on
external headers?

[Bug target/90678] [10 Regression] ICE in aarch64_return_address_signing_enabled, at config/aarch64/aarch64.c:4865 since r271735

2019-05-31 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90678

Richard Earnshaw  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from Richard Earnshaw  ---
Fixed as per comment #4

[Bug target/88656] [7/8/9/10 Regression] lr clobbered by thumb prologue before __builtin_return_address(0) reads from it

2019-05-28 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88656

--- Comment #5 from Richard Earnshaw  ---
(In reply to gerd from comment #4)
> (In reply to Richard Earnshaw from comment #3)
> > A regression is not a bug that applies in all previous releases.  A
> > regression is where something worked in some previous releases but does not
> > work now.
> 
> using __builtin_return_address(0) as described in the bug report worked in
> gcc 5.4.0 and before. To be more specific, it worked before the fix for bug
> 77933 was incorporated. So to me this is a regression.

then please update the known-to-work field accordingly.

[Bug target/88656] [7/8/9/10 Regression] lr clobbered by thumb prologue before __builtin_return_address(0) reads from it

2019-05-28 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88656

--- Comment #3 from Richard Earnshaw  ---
(In reply to gerd from comment #2)
> Could it be that this is a duplicate of bug 88167?
> 
> I compiled a gcc 7.4.0 patched with the fix for 88167 and now get this
> result:
> 
> push  {r4, lr}
> mov   r3, r8
> mov   r4, r9
> push  {r3, r4}
> mov   r0, lr
> 
> The patch for bug 88167 seems to be just in trunk for now. As the problem is
> a regression in all releases till gcc 7 I'd prefer if it could be backported
> into the corresponding branches.

A regression is not a bug that applies in all previous releases.  A regression
is where something worked in some previous releases but does not work now.

[Bug demangler/88783] integer overflow in libiberty, heap overflow will be triggered in nm

2019-05-24 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88783

--- Comment #3 from Richard Earnshaw  ---
(In reply to Nick Clifton from comment #2)
> Hi tfx,
> 
>   Thank you very much for reporting this bug.  Unfortunately the binutils
>   project does not maintain the libiberty library (which contains the 
>   cplus-dem.c source file).  Instead this is handled by the gcc project.
>   So please could you refile your bug report on the gcc bugzilla system:
> 
>https://gcc.gnu.org/bugzilla/enter_bug.cgi?product=gcc
> 
>   Thank you.
> 
> Cheers
>   Nick

??? This is the gcc bugzilla.

[Bug c++/90459] gcc-arm-none-eabi-8-2018-q4-major

2019-05-14 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90459

Richard Earnshaw  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #1 from Richard Earnshaw  ---
You've asked for both -p (prof format) and -pg (gprof format) profiling at the
same time.  I don't think you can have both; just pick one or the other.

[Bug target/90405] [10 Regression] ICE in thumb_find_work_register, at config/arm/arm.c:7701

2019-05-09 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90405

--- Comment #4 from Richard Earnshaw  ---
Author: rearnsha
Date: Thu May  9 16:35:56 2019
New Revision: 271037

URL: https://gcc.gnu.org/viewcvs?rev=271037&root=gcc&view=rev
Log:
[arm] PR target/90405 New test.

This time really add the test.

gcc/testsuite:
PR target/90405
* gcc.target/arm/pr90405.c: New test.


Added:
trunk/gcc/testsuite/gcc.target/arm/pr90405.c
Modified:
trunk/gcc/testsuite/ChangeLog

[Bug tree-optimization/88709] Improve store-merging

2019-05-09 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88709

--- Comment #11 from Richard Earnshaw  ---
And in the testcase that prompted Ramana's original patch it clearly wanted to
ask something else.

We can't have it both ways.

[Bug target/90405] [10 Regression] ICE in thumb_find_work_register, at config/arm/arm.c:7701

2019-05-09 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90405

Richard Earnshaw  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Richard Earnshaw  ---
Fixed

[Bug target/90405] [10 Regression] ICE in thumb_find_work_register, at config/arm/arm.c:7701

2019-05-09 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90405

--- Comment #2 from Richard Earnshaw  ---
Author: rearnsha
Date: Thu May  9 16:00:23 2019
New Revision: 271036

URL: https://gcc.gnu.org/viewcvs?rev=271036&root=gcc&view=rev
Log:
[arm] PR target/90405 fix regression for thumb1 with -mtpcs-leaf-frame

-mtpcs-leaf-frame causes an APCS-style backtrace frame to be created
on the stack.  This should probably be deprecated, but it did reveal
an issue with the patch I committed previously to improve the code
generation when pushing high registers, in that
thumb_find_work_register had a different idea as to which registers
were available as scratch registers.

The new code actually does a better job of finding a viable work
register and doesn't rely so much on assumptions about the ABI, so it
seems better to adapt thumb_find_work_register to the new approach.
This way we can eliminate some rather crufty code.

gcc:
PR target/90405
* config/arm/arm.c (callee_saved_reg_p): Move before
thumb_find_work_register.
(thumb1_prologue_unused_call_clobbered_lo_regs): Move before
thumb_find_work_register.  Only call df_get_live_out once.
(thumb1_epilogue_unused_call_clobbered_lo_regs): Likewise.
(thumb_find_work_register): Use
thumb1_prologue_unused_call_clobbered_lo_regs instead of ad hoc
algorithms to locate a spare call clobbered reg.

gcc/testsuite:
PR target/90405
* gcc.target/arm/pr90405.c: New test.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/arm/arm.c

[Bug tree-optimization/88709] Improve store-merging

2019-05-09 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88709

--- Comment #9 from Richard Earnshaw  ---
(In reply to Jakub Jelinek from comment #7)
> (In reply to Christophe Lyon from comment #6)
> > I've noticed that the new test store_merging_29.c fails on
> > arm-none-eabi --with-cpu cortex-a9
> > FAIL: gcc.dg/store_merging_29.c scan-tree-dump store-merging "New sequence
> > of 3 stores to replace old one of 6 stores"
> 
> That is because target-supports.exp lies on arm, even for -mcpu=cortex-a9
> STRICT_ALIGNMENT is 1 on arm (it is 1 unconditionally), but
> check_effective_target_non_strict_align returns true on arm anyway.
> I've already added hacks for it in r256783 in another testcase though, guess
> I'll do something similar now, but I must say I'm not very excited about
> that.

Support for misaligned accesses is a three(.5!)-valued problem:

1) There's no support in the architecture at all
2) There's some support with a limited set of instructions
3) There's full support: any memory access can handle any alignment.
3.5) There's full support: but some accesses may be very slow

I would think that these days most CPU architectures actually fall into either
1 or 2.  Many architectures have limitations, for example on atomic accesses
that are unaligned.

STRICT_ALIGNMENT only covers, in reality, case 3.  I'm not even sure if it would
be defined on a machine with case 3.5.

I think the real problem here is that it's not clear what question this
target-supports macro is really asking: does the CPU have the capability to do
(some) unaligned accesses?  Or can it arbitrarily support casts from unaligned
pointers to standard types?
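
The second reading of the question, casting an unaligned pointer to a
standard type, is the one that portable C cannot rely on: dereferencing a
misaligned uint32_t pointer is undefined behaviour whatever the hardware
does.  A common workaround, sketched here with the illustrative name
load_u32_unaligned, is to go through memcpy and let the compiler choose an
access sequence that the target supports.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from an address with no alignment guarantee.
   load_u32_unaligned is an illustrative name, not a GCC interface.  */
uint32_t load_u32_unaligned(const unsigned char *p)
{
    uint32_t v;

    assert(p != NULL);
    memcpy(&v, p, sizeof v);   /* no misaligned pointer is dereferenced */
    return v;
}
```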

[Bug target/90405] [10 Regression] ICE in thumb_find_work_register, at config/arm/arm.c:7701

2019-05-09 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90405

Richard Earnshaw  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |rearnsha at gcc dot gnu.org

--- Comment #1 from Richard Earnshaw  ---
Mine

[Bug target/88167] [ARM] Function __builtin_return_address returns invalid address

2019-05-08 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88167

Richard Earnshaw  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |10.0

--- Comment #2 from Richard Earnshaw  ---
Fixed on trunk.

[Bug target/88167] [ARM] Function __builtin_return_address returns invalid address

2019-05-08 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88167

--- Comment #1 from Richard Earnshaw  ---
Author: rearnsha
Date: Wed May  8 14:36:15 2019
New Revision: 271012

URL: https://gcc.gnu.org/viewcvs?rev=271012&root=gcc&view=rev
Log:
[arm][PR88167] Fix __builtin_return_address returns invalid address

This patch fixes a problem with the thumb1 prologue code where the link
register could be unconditionally used as a scratch register even if the
return value was still live at the end of the prologue.

Additionally, the patch improves the code generated when we are not
using many low call-saved registers to make use of any unused call
clobbered registers to help with the saving of high registers that
cannot be pushed directly (quite rare in normal code as the register
allocator correctly prefers low registers).
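
A minimal sketch of the kind of code affected, assuming GCC or a compatible
compiler.  This is not the testcase added for the PR, and get_caller_address
is an illustrative name.  The function reads its own return address, which on
thumb1 is the value that must not be clobbered in LR before it is read.

```c
/* Return the address this function will return to.  noinline keeps the
   call, and therefore a real return address, in place.  */
__attribute__((noinline)) void *get_caller_address(void)
{
    return __builtin_return_address(0);
}
```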

2019-05-08  Mihail Ionescu  
Richard Earnshaw  

gcc:

PR target/88167
* config/arm/arm.c (thumb1_prologue_unused_call_clobbered_lo_regs): New
function.
(thumb1_epilogue_unused_call_clobbered_lo_regs): New function.
(thumb1_compute_save_core_reg_mask): Don't force a spare work
register if both the epilogue and prologue can use call-clobbered
regs.
(thumb1_unexpanded_epilogue): Use
thumb1_epilogue_unused_call_clobbered_lo_regs.  Reverse the logic for
picking temporaries for restoring high regs to match that of the
prologue where possible.
(thumb1_expand_prologue): Add any usable call-clobbered low registers to
the list of work registers.  Detect if the return address is still live
at the end of the prologue and avoid using it for a work register if so.
If the return address is not live, add LR to the list of pushable regs
after the first pass.

gcc/testsuite:

PR target/88167
* gcc.target/arm/pr88167-1.c: New test.
* gcc.target/arm/pr88167-2.c: New test.


Added:
trunk/gcc/testsuite/gcc.target/arm/pr88167-1.c
trunk/gcc/testsuite/gcc.target/arm/pr88167-2.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/arm/arm.c
trunk/gcc/testsuite/ChangeLog

[Bug target/48429] ARM __attribute__((interrupt("FIQ"))) not optimizing register allocation

2019-05-08 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48429

Richard Earnshaw  changed:

   What|Removed |Added

   Priority|P3  |P4
   Severity|normal  |enhancement

--- Comment #3 from Richard Earnshaw  ---
This is a missed optimization, not a bug.  Given that we have many more
pressing issues it is unlikely to be addressed soon.

Of course, patches are always welcome...

[Bug target/89400] [7/8/9 Regression] ICE: output_operand: invalid %-code with -march=armv6kz -mthumb -munaligned-access

2019-05-03 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89400

Richard Earnshaw  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
Summary|[7/8/9/10 Regression] ICE:  |[7/8/9 Regression] ICE:
   |output_operand: invalid |output_operand: invalid
   |%-code with -march=armv6kz  |%-code with -march=armv6kz
   |-mthumb -munaligned-access  |-mthumb -munaligned-access

--- Comment #7 from Richard Earnshaw  ---
Fixed on trunk so far.

[Bug target/89400] [7/8/9/10 Regression] ICE: output_operand: invalid %-code with -march=armv6kz -mthumb -munaligned-access

2019-05-03 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89400

--- Comment #6 from Richard Earnshaw  ---
Author: rearnsha
Date: Fri May  3 13:45:59 2019
New Revision: 270853

URL: https://gcc.gnu.org/viewcvs?rev=270853&root=gcc&view=rev
Log:
[arm] PR target/89400 fix thumb1 unaligned access expansion

Armv6 has support for unaligned accesses to memory.  However, the
thumb1 code patterns were trying to use the 32-bit code constraints.
One failure mode from this was that the patterns are designed to be
compatible with conditional execution and this was then causing an
assert in the compiler.

The unaligned_loadhis pattern is only used for expanding extv, which
in turn is only enabled for systems supporting thumb2.  Given that
there is no simple expansion for a thumb1 sign-extending load (the
instruction has no immediate offset form and requires two registers in
the address) it seems simpler to just disable this for thumb1.
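
At the source level, and assuming the pattern name unaligned_loadhis denotes
a sign-extending halfword load (as the mention of a sign-extending load above
suggests), the operation corresponds to something like the following sketch.
load_s16_unaligned is an illustrative name, and memcpy stands in for whatever
instruction sequence the compiler selects.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a signed 16-bit value from an address with no alignment
   guarantee and widen it to 32 bits.  */
int32_t load_s16_unaligned(const unsigned char *p)
{
    int16_t v;

    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;   /* conversion to int32_t sign-extends */
}
```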

Fixed thusly:

PR target/89400
* config/arm/arm.md (unaligned_loadsi): Add variant for thumb1.
Restrict 'all' variant to 32-bit configurations.
(unaligned_loadhiu): Likewise.
(unaligned_storehi): Likewise.
(unaligned_storesi): Likewise.
(unaligned_loadhis): Disable when compiling for thumb1.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/arm/arm.md

[Bug target/89400] [7/8/9/10 Regression] ICE: output_operand: invalid %-code with -march=armv6kz -mthumb -munaligned-access

2019-05-02 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89400

Richard Earnshaw  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |rearnsha at gcc dot gnu.org

--- Comment #5 from Richard Earnshaw  ---
testing patch

[Bug middle-end/90075] [7/8 Regression] [AArch64] ICE during RTL pass when member of union passed to copysignf

2019-04-30 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90075

Richard Earnshaw  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #9 from Richard Earnshaw  ---
Fixed on gcc-7 and gcc-8

[Bug middle-end/90075] [7/8 Regression] [AArch64] ICE during RTL pass when member of union passed to copysignf

2019-04-30 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90075

--- Comment #8 from Richard Earnshaw  ---
Author: rearnsha
Date: Tue Apr 30 09:31:04 2019
New Revision: 270684

URL: https://gcc.gnu.org/viewcvs?rev=270684&root=gcc&view=rev
Log:
PR target/90075 Prefer bsl/bit/bif for copysignf. (backport GCC-7)

This patch fixes the ICE caused by the expand pattern of the copysignf
builtin.  It is a backport of r267019 from trunk.
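
The idea behind using a bit-select instruction for copysign can be sketched
in portable C: take the sign bit from one operand and every other bit from
the other.  This is an illustration only, assuming float is IEEE 754
binary32; copysign_bitselect is a made-up name and the code is not taken from
the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Combine the magnitude bits of mag with the sign bit of sgn.
   Assumes a 32-bit IEEE 754 float.  */
float copysign_bitselect(float mag, float sgn)
{
    const uint32_t sign_mask = UINT32_C(0x80000000);
    uint32_t m, s, r;
    float out;

    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&m, &mag, sizeof m);
    memcpy(&s, &sgn, sizeof s);
    r = (s & sign_mask) | (m & ~sign_mask);   /* per-bit select on a mask */
    memcpy(&out, &r, sizeof out);
    return out;
}
```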

gcc:

2019-04-30  Srinath Parvathaneni  

PR target/90075
* config/aarch64/iterators.md (V_INT_EQUIV): Add mode for
integer equivalent of floating point values.

Backport from mainline
2018-12-11  Richard Earnshaw  

PR target/37369
* config/aarch64/iterators.md (sizem1): Add sizes for
SFmode and DFmode.
(Vbtype): Add SFmode mapping.
* config/aarch64/aarch64.md (copysigndf3, copysignsf3): Delete.
(copysign3): New expand pattern.
(copysign3_insn): New insn pattern.

testsuite:

2019-04-30  Srinath Parvathaneni  

PR target/90075
* gcc.target/aarch64/pr90075.c: New test.

Added:
branches/gcc-7-branch/gcc/testsuite/gcc.target/aarch64/pr90075.c
Modified:
branches/gcc-7-branch/gcc/ChangeLog
branches/gcc-7-branch/gcc/config/aarch64/aarch64.md
branches/gcc-7-branch/gcc/config/aarch64/iterators.md
branches/gcc-7-branch/gcc/testsuite/ChangeLog

[Bug c/37369] ice for legal C code

2019-04-30 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=37369

--- Comment #5 from Richard Earnshaw  ---
Author: rearnsha
Date: Tue Apr 30 09:31:04 2019
New Revision: 270684

URL: https://gcc.gnu.org/viewcvs?rev=270684&root=gcc&view=rev
Log:
PR target/90075 Prefer bsl/bit/bif for copysignf. (backport GCC-7)

This patch fixes the ICE caused by the expand pattern of the copysignf
builtin.  It is a backport of r267019 from trunk.

gcc:

2019-04-30  Srinath Parvathaneni  

PR target/90075
* config/aarch64/iterators.md (V_INT_EQUIV): Add mode for
integer equivalent of floating point values.

Backport from mainline
2018-12-11  Richard Earnshaw  

PR target/37369
* config/aarch64/iterators.md (sizem1): Add sizes for
SFmode and DFmode.
(Vbtype): Add SFmode mapping.
* config/aarch64/aarch64.md (copysigndf3, copysignsf3): Delete.
(copysign3): New expand pattern.
(copysign3_insn): New insn pattern.

testsuite:

2019-04-30  Srinath Parvathaneni  

PR target/90075
* gcc.target/aarch64/pr90075.c: New test.

Added:
branches/gcc-7-branch/gcc/testsuite/gcc.target/aarch64/pr90075.c
Modified:
branches/gcc-7-branch/gcc/ChangeLog
branches/gcc-7-branch/gcc/config/aarch64/aarch64.md
branches/gcc-7-branch/gcc/config/aarch64/iterators.md
branches/gcc-7-branch/gcc/testsuite/ChangeLog

[Bug middle-end/90075] [7/8 Regression] [AArch64] ICE during RTL pass when member of union passed to copysignf

2019-04-30 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90075

--- Comment #7 from Richard Earnshaw  ---
Author: rearnsha
Date: Tue Apr 30 09:25:31 2019
New Revision: 270683

URL: https://gcc.gnu.org/viewcvs?rev=270683&root=gcc&view=rev
Log:
PR target/90075 Prefer bsl/bit/bif for copysignf. (backport GCC-8)

This patch fixes the ICE caused in the expand pattern of the copysignf
builtin.  It is a backport of r267019 from trunk.

gcc:

2019-04-29  Srinath Parvathaneni  

Backport from mainline
2018-12-11  Richard Earnshaw 

PR target/37369
* config/aarch64/iterators.md (sizem1): Add sizes for
SFmode and DFmode.
(Vbtype): Add SFmode mapping.
* config/aarch64/aarch64.md (copysigndf3, copysignsf3): Delete.
(copysign3): New expand pattern.
(copysign3_insn): New insn pattern.

testsuite:

2019-04-29  Srinath Parvathaneni  

PR target/90075
* gcc.target/aarch64/pr90075.c: New test.


Added:
branches/gcc-8-branch/gcc/testsuite/gcc.target/aarch64/pr90075.c
Modified:
branches/gcc-8-branch/gcc/ChangeLog
branches/gcc-8-branch/gcc/config/aarch64/aarch64.md
branches/gcc-8-branch/gcc/config/aarch64/iterators.md
branches/gcc-8-branch/gcc/testsuite/ChangeLog

[Bug c/37369] ice for legal C code

2019-04-30 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=37369

--- Comment #4 from Richard Earnshaw  ---
Author: rearnsha
Date: Tue Apr 30 09:25:31 2019
New Revision: 270683

URL: https://gcc.gnu.org/viewcvs?rev=270683&root=gcc&view=rev
Log:
PR target/90075 Prefer bsl/bit/bif for copysignf. (backport GCC-8)

This patch fixes the ICE caused in the expand pattern of the copysignf
builtin.  It is a backport of r267019 from trunk.

gcc:

2019-04-29  Srinath Parvathaneni  

Backport from mainline
2018-12-11  Richard Earnshaw 

PR target/37369
* config/aarch64/iterators.md (sizem1): Add sizes for
SFmode and DFmode.
(Vbtype): Add SFmode mapping.
* config/aarch64/aarch64.md (copysigndf3, copysignsf3): Delete.
(copysign3): New expand pattern.
(copysign3_insn): New insn pattern.

testsuite:

2019-04-29  Srinath Parvathaneni  

PR target/90075
* gcc.target/aarch64/pr90075.c: New test.


Added:
branches/gcc-8-branch/gcc/testsuite/gcc.target/aarch64/pr90075.c
Modified:
branches/gcc-8-branch/gcc/ChangeLog
branches/gcc-8-branch/gcc/config/aarch64/aarch64.md
branches/gcc-8-branch/gcc/config/aarch64/iterators.md
branches/gcc-8-branch/gcc/testsuite/ChangeLog

[Bug rtl-optimization/90249] [9/10 Regression] Code size regression on thumb2 due to sub-optimal register allocation starting with r265398

2019-04-29 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90249

--- Comment #8 from Richard Earnshaw  ---
(In reply to Richard Earnshaw from comment #7)
> (In reply to Segher Boessenkool from comment #4)
> > That is code *size*.  Code size is expected to grow a tiny bit, because of
> > *better* register allocation.
> > 
> > But we could not do make_more_copies at -Os, if that helps?  (The hard
> > register
> > changes themselves are required for correctness).
> 
> In this case, however, we get *worse* register allocation, since it is using
> the expensive register more frequently than a cheaper register which is
> hardly used at all.
> 
> In this particular case, all the uses of the "cheap" register (r7) could use
> the 'expensive' register at no additional cost, since the cheap register is
> being used only to hold a value that will be moved to another register (a
> cheap operation regardless of the register used).

FTR, I don't think the combine changes are directly implicated in this
regression.  They just expose a latent issue with register allocation and its
costing.

[Bug rtl-optimization/90249] [9/10 Regression] Code size regression on thumb2 due to sub-optimal register allocation starting with r265398

2019-04-29 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90249

--- Comment #7 from Richard Earnshaw  ---
(In reply to Segher Boessenkool from comment #4)
> That is code *size*.  Code size is expected to grow a tiny bit, because of
> *better* register allocation.
> 
> But we could not do make_more_copies at -Os, if that helps?  (The hard
> register
> changes themselves are required for correctness).

In this case, however, we get *worse* register allocation, since it is using
the expensive register more frequently than a cheaper register which is
hardly used at all.

In this particular case, all the uses of the "cheap" register (r7) could use
the 'expensive' register at no additional cost, since the cheap register is
being used only to hold a value that will be moved to another register (a cheap
operation regardless of the register used).

[Bug rtl-optimization/90249] [9/10 Regression] Code size regression on thumb2 due to sub-optimal register allocation starting with r265398

2019-04-26 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90249

--- Comment #3 from Richard Earnshaw  ---
(In reply to Segher Boessenkool from comment #2)
> What difference is there on some code of significant size?  Do you see
> regressions then?
> 
> Of course there are some tiny examples where it now does worse, just like
> there are examples where it now does better.

Across the entirety of CSiBE, thumb2 regresses by 0.05% (tested by effectively
disabling r265398 on tip of tree).

It seems to be specific to Thumb2 code, though.  Thumb1 and Arm code now get
worse when that specific patch is disabled.  Though all three are still worse
than gcc-8 overall.

[Bug rtl-optimization/90255] [9 regression] r266385 caused code size regressions on Arm, thumb and thumb2

2019-04-25 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90255

Richard Earnshaw  changed:

   What|Removed |Added

   Keywords||missed-optimization, ra

--- Comment #3 from Richard Earnshaw  ---
If the same testcase is compiled with the additional options -mfpu=vfp
-mfloat-abi=softfp then we also see an example of some poor register allocation
leading to additional spills.

Before the patch we have

ldmib   r4, {r6, r8}  // r6 callee saved, so no need to spill
ldr r1, .L14+12
mov r0, r6
add r2, sp, #28
ldr r7, [r4, #12]

bl  sscanf
cmp r0, #1
beq .L3
ldr r3, .L14+16
ldr r0, [r3]
mov r3, r6  // Now we can copy r6 into r3
ldr r2, [r5]

and afterwards

ldmib   r4, {r3, r8}  // r3 call-clobbered...
ldr r1, .L14+12
mov r0, r3
add r2, sp, #36
ldr r7, [r4, #12]
str r3, [sp, #28] // So must spill here
bl  sscanf
cmp r0, #1
beq .L3
ldr r2, .L14+16
ldr r3, [sp, #28] // and reload it again here
ldr r0, [r2]

ldr r1, .L14+20
ldr r2, [r5]

As far as I can see r6 is not live in the new version of the code, so this
looks just like a poor choice of register.

[Bug rtl-optimization/90255] [9 regression] r266385 caused code size regressions on Arm, thumb and thumb2

2019-04-25 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90255

--- Comment #2 from Richard Earnshaw  ---
Command to reproduce

cc1 -fpreprocessed bow.i -quiet -dumpbase bow.i -marm -mcpu=arm7tdmi
-march=armv4t -auxbase-strip test/bow.o -Os -w -version -fno-short-enums
-fgnu89-inline -o bow.s

[Bug rtl-optimization/90255] [9 regression] r266385 caused code size regressions on Arm, thumb and thumb2

2019-04-25 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90255

Richard Earnshaw  changed:

   What|Removed |Added

 CC||ramana.radhakrishnan at arm dot com,
   ||vmakarov at redhat dot com,
   ||wdijkstr at arm dot com

--- Comment #1 from Richard Earnshaw  ---
[committed too early]

It looks like a 64-bit constant 0 is held over a function call when the code
could just initialize the registers directly.

Code before commit:
main:
@ Function supports interworking.
@ args = 0, pretend = 0, frame = 24
@ frame_needed = 0, uses_anonymous_args = 0
push    {r4, r5, r6, r7, r8, r9, r10, lr}  // 8 registers saved.
ldr r3, [r1]
ldr r5, .L14
cmp r0, #4
mov r4, r1
str r3, [r5]
sub sp, sp, #48   // 48 bytes stack space
blne    usage
.L2:
ldmib   r4, {r6, r8}
ldr r1, .L14+4
mov r0, r6
add r2, sp, #28
ldr r7, [r4, #12]
bl  sscanf
cmp r0, #1
beq .L3
ldr r3, .L14+8
ldr r0, [r3]
mov r3, r6
ldr r2, [r5]
ldr r1, .L14+12
.L12:
ldr r0, [r0, #8]
bl  fprintf
mov r0, #1
.L13:
bl  exit
.L3:
mov r0, r8
ldr r1, .L14+16
add r2, sp, #44
bl  sscanf
cmp r0, #1
beq .L4
ldr r3, .L14+8
ldr r2, [r5]
ldr r0, [r3]
ldr r1, .L14+20
mov r3, r8
b   .L12
.L4:
mov r0, r7
ldr r1, .L14+24
bl  fopen
subs    r4, r0, #0
bne .L5
ldr r3, .L14+8
ldr r2, [r5]
ldr r0, [r3]
ldr r1, .L14+28
mov r3, r7
b   .L12
.L5:
mov r1, r4
ldr r0, .L14+32
bl  fputs
mov r5, #0
mov r8, #1065353216
ldr r9, .L14+36
.L6:
ldr r10, [sp, #28]
cmp r10, r5
bgt .L7
mov r0, r4
bl  fclose
mov r0, #0
b   .L13
.L7:
mov r0, r5
bl  __aeabi_i2d
mov r6, r0
mov r0, r10
mov r7, r1
bl  __aeabi_i2d
mov r2, r0
mov r3, r1
mov r0, r6
mov r1, r7
bl  __aeabi_ddiv
mov r2, #0
mov r3, #0
bl  __aeabi_dadd
bl  __aeabi_d2f
mov r6, r0
mov r3, r0
add r2, sp, #40
add r1, sp, #36
add r0, sp, #32
str r8, [sp, #4]    @ float
str r8, [sp]        @ float
bl  dyeHSVtoRGB
mov r0, r6
bl  __aeabi_f2d
ldr r10, [sp, #44]  @ float
mov r6, r0
mov r7, r1
mov r0, r10
ldr r1, [sp, #40]   @ float
bl  __aeabi_fmul
bl  __aeabi_f2d
str r0, [sp, #16]
str r1, [sp, #20]
ldr r1, [sp, #36]   @ float
mov r0, r10
bl  __aeabi_fmul
bl  __aeabi_f2d
str r0, [sp, #8]
str r1, [sp, #12]
ldr r1, [sp, #32]   @ float
mov r0, r10
bl  __aeabi_fmul
bl  __aeabi_f2d
mov r2, r6
stm sp, {r0-r1}
mov r3, r7
mov r1, r9
mov r0, r4
bl  fprintf
add r5, r5, #1
b   .L6

Code after r266385:
main:
@ Function supports interworking.
@ args = 0, pretend = 0, frame = 32
@ frame_needed = 0, uses_anonymous_args = 0
push {r4, r5, r6, r7, r8, r9, r10, fp, lr}  // 9 regs saved
ldr r3, [r1]
ldr r5, .L14
cmp r0, #4
mov r4, r1
str r3, [r5]
sub sp, sp, #60 // 60 bytes stack space
blne usage
.L2:
ldmib   r4, {r6, r8}
ldr r1, .L14+4
mov r0, r6
add r2, sp, #36
ldr r7, [r4, #12]
bl  sscanf
cmp r0, #1
beq .L3
ldr r3, .L14+8
ldr r0, [r3]
mov r3, r6
ldr r2, [r5]
ldr r1, .L14+12
.L12:
ldr r0, [r0, #8]
bl  fprintf
mov r0, #1
.L13:
bl  exit
.L3:
mov r0, r8
ldr r1, .L14+16
add r2, sp, #52
bl  sscanf
cmp r0, #1
beq .L4
ldr r3, .L14+8
ldr r2, [r5]
ldr r0, [r3]
ldr r1, .L14+20
mov r3, r8
 

[Bug rtl-optimization/90255] New: [9 regression] r266385 caused code size regressions on Arm, thumb and thumb2

2019-04-25 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90255

Bug ID: 90255
   Summary: [9 regression] r266385 caused code size regressions on
Arm, thumb and thumb2
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rearnsha at gcc dot gnu.org
  Target Milestone: ---

Created attachment 46247
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46247&action=edit
testcase

Overall, when building CSiBE, r266385 caused a
 0.16% regression on Arm and thumb1 -Os
 0.08% regression on thumb2 -Os.

Some non-trivial files, however, regressed significantly, some by over 3%.

For example, teem-1.6.0-src src/dye/test/bow regresses by 3.36% on Arm due to
additional spills and the need for another register to be allocated.

[Bug rtl-optimization/90249] New: [9 regression] Code size regression on thumb2 due to sub-optimal register allocation.

2019-04-25 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90249

Bug ID: 90249
   Summary: [9 regression] Code size regression on thumb2 due to
sub-optimal register allocation.
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Keywords: ra
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rearnsha at gcc dot gnu.org
CC: ramana.radhakrishnan at arm dot com, vmakarov at redhat dot com,
wdijkstr at arm dot com
  Target Milestone: ---
Target: arm

Created attachment 46244
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46244&action=edit
testcase

GCC 9 has regressed on code size due to some sub-optimal register allocation. 
For this example, the only difference in the output is that the assignments for
r7 and r8 have been switched, but the result is significant growth in code size
since r8 requires predominantly 32-bit instructions to be used while r7
requires predominantly 16-bit instructions.

cc1 -fpreprocessed binding2.i -quiet -dumpbase binding2.i -mthumb
-mcpu=cortex-a8 -march=armv7-a -auxbase-strip binding.o -Os -w -version
-fno-short-enums -fgnu89-inline -o binding2.s

In gcc-8 the output was
DefineConnectorBinding:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
push {r0, r1, r2, r3, r4, r5, r6, r7, r8, lr}
mov r4, r1
mov r8, r0
mov r1, r2
mov r0, r4
mov r5, r2
mov r7, r3
bl  LookupBinding
mov r6, r0
cbz r0, .L2
ldr r7, .L5
mov r1, r4
ldr r0, [r7] // 16-bit instruction
bl  GetAtomString
mov r1, r5
mov r4, r0
ldr r0, [r7] // 16-bit instruction
bl  GetAtomString
ldrh r1, [r6, #8]
mov r5, r0
ldr r0, [r7] // 16-bit instruction
bl  GetAtomString
ldrh r3, [r6, #10]
ldr r2, .L5+4
movs r1, #103
str r5, [sp]
strd r0, r3, [sp, #4]
mov r3, r4
mov r0, r8
bl  SemanticError
add sp, sp, #16
@ sp needed
pop {r4, r5, r6, r7, r8, pc}
.L2:
mov r3, r7
mov r2, r5
mov r1, r4
mov r0, r8
bl  NewConnectorBindingTree
add sp, sp, #16
@ sp needed
pop {r4, r5, r6, r7, r8, lr}
b   AddBinding

In gcc-9 we get

DefineConnectorBinding:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
push {r0, r1, r2, r3, r4, r5, r6, r7, r8, lr}
mov r4, r1
mov r7, r0
mov r1, r2
mov r0, r4
mov r5, r2
mov r8, r3
bl  LookupBinding
mov r6, r0
cbz r0, .L2
ldr r8, .L5+4
mov r1, r4
ldr r0, [r8] // 32-bit instruction
bl  GetAtomString
mov r1, r5
mov r4, r0
ldr r0, [r8] // 32-bit instruction
bl  GetAtomString
ldrh r1, [r6, #8]
mov r5, r0
ldr r0, [r8] // 32-bit instruction
bl  GetAtomString
ldrh r3, [r6, #10]
ldr r2, .L5
movs r1, #103
str r5, [sp]
strd r0, r3, [sp, #4]
mov r3, r4
mov r0, r7
bl  SemanticError
add sp, sp, #16
@ sp needed
pop {r4, r5, r6, r7, r8, pc}
.L2:
mov r3, r8
mov r2, r5
mov r1, r4
mov r0, r7
bl  NewConnectorBindingTree
add sp, sp, #16
@ sp needed
pop {r4, r5, r6, r7, r8, lr}
b   AddBinding

R8 is used more often than R7, so it seems odd that it is preferred over the
latter.
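
The size difference can be illustrated with a small model of the Thumb-2
encoding rule for "ldr Rt, [Rn]": the 16-bit encoding is only available when
both registers are low registers (r0-r7).  This is a minimal sketch; the
helper names are hypothetical, it covers only this one instruction form with
r0-r12 operands, and it is not how GCC itself costs register choices.

```c
#include <assert.h>

/* Size in bytes of a Thumb-2 "ldr Rt, [Rn]" with zero offset.
   Only r0-r12 are modelled; sp and pc have their own encodings.
   The 16-bit encoding requires both registers to be in r0-r7.  */
static int thumb2_ldr_size(int rt, int rn)
{
    assert(rt >= 0 && rt <= 12 && rn >= 0 && rn <= 12);
    return (rt <= 7 && rn <= 7) ? 2 : 4;
}

/* Total size of COUNT loads of r0 through the base register BASE.  */
static int loads_size(int base, int count)
{
    int total = 0;
    for (int i = 0; i < count; i++)
        total += thumb2_ldr_size(0, base);
    return total;
}
```

With the three loads shown above, loads_size(7, 3) models the gcc-8 output and
loads_size(8, 3) the gcc-9 output.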

[Bug middle-end/90075] [7/8 Regression] [AArch64] ICE during RTL pass when member of union passed to copysignf

2019-04-24 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90075

--- Comment #6 from Richard Earnshaw  ---
(In reply to ramana.radhakrish...@arm.com from comment #5)
> For the release branches, I think backporting your patch (and any followups,
> do you remember any?) should be fine and we should just do it.

I don't recall any.  Certainly none are recorded in the PR.

[Bug middle-end/90075] [7/8 Regression] [AArch64] ICE during RTL pass when member of union passed to copysignf

2019-04-23 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90075

--- Comment #4 from Richard Earnshaw  ---
(In reply to Ramana Radhakrishnan from comment #3)
> Seems to have been "fixed" by the commit to fix PR87369,
> 
> Richard, is this something to backport ? Prima-facie , it appears not and we
> will need an appropriate fix for the release branches.

Given that the patch for PR87369 eliminates the ICE, it's probably preferable
to backport it rather than write a separate patch that is only used on the
release branches.  That patch has been soaking on trunk for a while now, so is
likely to be pretty safe.

I am a bit worried, however, that the patch papers over a likely trunk ICE that
isn't really fixed.  It would be nice to investigate further whether some
additional mitigation is warranted.

[Bug rtl-optimization/87871] [9 Regression] testcases fail after r265398 on arm

2019-04-18 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87871

--- Comment #51 from Richard Earnshaw  ---
(In reply to Segher Boessenkool from comment #50)
> The insn is
> 
> (insn 7 3 8 2 (parallel [
> (set (reg:CC 100 cc)
> (compare:CC (reg:SI 0 r0 [116])
> (const_int 0 [0])))
> (set (reg/v:SI 4 r4 [orig:112 a ] [112])
> (reg:SI 0 r0 [116]))
> ]) "ira-shrinkwrap-prep-1.c":17:6 188 {*movsi_compare0}
>  (nil))
> 
> and that isn't split, and then prepare_shrink_wrap gives up on it.

In the more general case splitting this would produce worse code, not better,
since then we'd end up with two instructions rather than one.

[Bug rtl-optimization/87871] [9 Regression] testcases fail after r265398 on arm

2019-04-18 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87871

--- Comment #36 from Richard Earnshaw  ---
(In reply to Segher Boessenkool from comment #35)
> Peter's patch solves this particular problem, but not the PR unfortunately.
> 
> I finally understand Jakub's comment 30.  This patch solves the PR (also
> without Peter's patch):
> 
> ===
> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> index 0aecd03..67dddb2 100644
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md
> @@ -6340,7 +6340,7 @@ (define_insn "*movsi_compare0"
> (const_int 0)))
> (set (match_operand:SI 0 "s_register_operand" "=r,r")
> (match_dup 1))]
> -  "TARGET_32BIT"
> +  "TARGET_32BIT && reload_completed"
>"@
> cmp%?\\t%0, #0
> subs%?\\t%0, %1, #0"
> ===

And what about all the cases where the move and compare are not adjacent in the
instruction stream so don't get matched by peepholing?

[Bug target/89146] arm: "nor" constraint prefers memory reference over constant

2019-04-17 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89146

--- Comment #2 from Richard Earnshaw  ---
(In reply to Jakub Jelinek from comment #1)
> I've looked for constraints that include [ijnIJKLMNO] together with [mo] and
> couldn't find any.  So, not really sure what note_invalid_constants is
> supposed to handle (why would reload let a constant get through as constant
> if it required only memory).

GCC simply doesn't know how to deal with architectures that don't have
unlimited offsets from the PC for the constant pool without generating stupidly
bad code.  So we run an additional pass late on to fix up constants that aren't
valid by dumping them into 'minipools' that get inlined within the function
code.  We do this by using special constraints, knowing that the final pass
will deal with them.

All *real* patterns in the back-end can deal with this; but this artificial asm
is confusing things.  Perhaps for ASM insns we should just skip them entirely
and assume that the user knows what they are doing, but I'm worried that users
might somehow be relying on the existing behaviour.
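
To make the range limit concrete, here is a minimal sketch of the reachability
test for an Arm-state literal load, assuming the usual "ldr Rt, [pc, #+/-imm12]"
form.  The function name is hypothetical, and the back end's real minipool
placement logic is considerably more involved than this single check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Arm-state "ldr Rt, [pc, #+/-imm12]": the offset is taken relative to the
   instruction address + 8 and its magnitude must fit in 12 bits.  */
static bool arm_literal_in_range(uint32_t insn_addr, uint32_t pool_addr)
{
    assert((insn_addr & 3) == 0);   /* Arm-state insns are word aligned */
    int64_t offset = (int64_t)pool_addr - ((int64_t)insn_addr + 8);
    return offset >= -4095 && offset <= 4095;
}
```

A constant that would land outside this window is what forces a pool to be
dumped inline within the function body.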

[Bug target/90016] aarch64: reference to undeclared N in help for command line option

2019-04-10 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90016

Richard Earnshaw  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |9.0

--- Comment #2 from Richard Earnshaw  ---
Fixed

[Bug target/90016] aarch64: reference to undeclared N in help for command line option

2019-04-10 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90016

--- Comment #1 from Richard Earnshaw  ---
Author: rearnsha
Date: Wed Apr 10 09:51:16 2019
New Revision: 270248

URL: https://gcc.gnu.org/viewcvs?rev=270248&root=gcc&view=rev
Log:
[aarch64] PR90016 - aarch64: reference to undeclared N in help for command line
option

'to N' is now redundant and misleading given the earlier change to use
.

Removed.

PR target/90016
* config/aarch64/aarch64.opt (msve-vector-bits): Remove redundant and
obsolete reference to N.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/aarch64/aarch64.opt

[Bug target/89794] combine incorrectly forwards register value through auto-inc operation

2019-04-09 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89794

--- Comment #6 from Richard Earnshaw  ---
There seems to be more to this than initially thought.  Another insn is in
play.

(insn 12 10 14 2 (set (reg:SI 129)
(bswap:SI (subreg:SI (reg:DI 127 [ i ]) 4))) "/tmp/test3.c":10:7 331
{*arm_rev}
 (expr_list:REG_DEAD (reg:DI 127 [ i ])
(nil)))

Which uses the value loaded by the pre-modify instruction.

Combine manages to combine (and simplify) insns 10 and 12, but the
simplification is to

(set (reg:SI 129) (const_int 0))

and we've lost the pre-inc entirely.

[Bug target/89794] combine incorrectly forwards register value through auto-inc operation

2019-04-09 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89794

Richard Earnshaw  changed:

   What|Removed |Added

 CC||segher at gcc dot gnu.org
Summary|wrong code with -Og |combine incorrectly
   |-fno-forward-propagate  |forwards register value
   ||through auto-inc operation

--- Comment #5 from Richard Earnshaw  ---
This appears to be combine missing a PRE_MODIFY operation.

After expand we have:

(insn 10 7 11 2 (set (reg:DI 127)
(zero_extend:DI (mem/c:HI (plus:SI (reg/f:SI 103 afp)
(const_int 8 [0x8])) [1 i+0 S2 A32]))) "/tmp/test3.c":10:7
160 {zero_extendhidi2}
 (nil))
...
(insn 24 23 25 2 (set (reg:SI 133)
(plus:SI (reg/f:SI 103 afp)
(const_int 8 [0x8]))) "/tmp/test3.c":12:3 4 {*arm_addsi3}
 (nil))
...
(insn 33 32 34 2 (set (mem/c:HI (reg:SI 133) [0 MEM[(void *)]+0 S2 A16])
(reg:HI 141)) "/tmp/test3.c":12:3 189 {*movhi_insn_arch4}
 (nil))

The auto-inc-dec pass transforms this into:

(insn 50 7 10 2 (set (reg/f:SI 133)
(reg/f:SI 103 afp)) "/tmp/test3.c":10:7 -1
 (nil))
(insn 10 50 12 2 (set (reg:DI 127 [ i ])
(zero_extend:DI (mem/c:HI (pre_modify:SI (reg/f:SI 133)
(plus:SI (reg/f:SI 133)
(const_int 8 [0x8]))) [1 i+0 S2 A32])))
"/tmp/test3.c":10:7 160 {zero_extendhidi2}
 (expr_list:REG_INC (reg/f:SI 133)
(nil)))
...
(insn 33 49 34 2 (set (mem/c:HI (reg/f:SI 133) [0 MEM[(void *)]+0 S2 A16])
(subreg:HI (reg:SI 140) 0)) "/tmp/test3.c":12:3 189 {*movhi_insn_arch4}
 (expr_list:REG_DEAD (reg:SI 140)
(expr_list:REG_DEAD (reg/f:SI 133)
(nil))))

And combine, missing the pre_modify, then substitutes insn 50 directly into
insn 33

Trying 50 -> 33:
   50: r133:SI=afp:SI
   33: [r133:SI]=r140:SI#0
  REG_DEAD r140:SI
  REG_DEAD r133:SI
Successfully matched this instruction:
(set (mem/c:HI (reg/f:SI 103 afp) [0 MEM[(void *)]+0 S2 A16])
(subreg:HI (reg:SI 140) 0))

Which is clearly wrong as it has now lost the pre-modify operation.
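
The effect can be mimicked with a toy model in C, assuming offsets into a small
buffer stand in for addresses relative to afp.  The function names and the
buffer are hypothetical; only the insn numbers and the increment of 8 come from
the dumps above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Pre-modify load: the base register is updated first, then used as the
   address of the access.  */
static uint16_t load_pre_modify(const uint8_t *frame, size_t *reg, size_t inc)
{
    uint16_t v;
    *reg += inc;
    memcpy(&v, frame + *reg, sizeof v);
    return v;
}

/* Returns the offset that the store in insn 33 ends up writing to.
   SUBSTITUTE_INSN50 models combine replacing r133 with afp in insn 33.  */
static size_t store_offset(int substitute_insn50)
{
    uint8_t frame[16] = {0};
    size_t afp = 0;
    size_t r133 = afp;                        /* insn 50: r133 = afp */
    (void)load_pre_modify(frame, &r133, 8);   /* insn 10: pre_modify load */
    assert(r133 == afp + 8);                  /* side effect on r133 */
    return substitute_insn50 ? afp : r133;    /* insn 33: store address */
}
```

Without the substitution the store goes to afp+8; with it, the store goes to
afp+0, which is the wrong-code behaviour described above.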

[Bug target/89794] wrong code with -Og -fno-forward-propagate

2019-04-09 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89794

--- Comment #4 from Richard Earnshaw  ---
(In reply to Jakub Jelinek from comment #3)
> Guess with PR89475 fix this will be latent, unless one disables ccp.
> Anyway, to me this looks like a backend bug.  The function is leaf, but for
> some strange reason LRA uses the lr register and so lr needs to be pushed
> and popped, but that push/pop doesn't seem to be accounted for in the afp to
> sp elimination offset computation.

I'm still seeing it in a build from 2019/04/04, so not latent.

Current suspect is the code in arm_compute_elimination_offset (in arm.c), where
we eliminate from the arg pointer to the stack pointer.  The comment says that
if there has been nothing pushed on the stack at all, then the offset result
should be '-4' (and asserts strongly in the comments that this is the correct
result) --- I don't understand why that should be the case.  However, that code
is essentially 18 years old, so I'm not going to try messing with it until I
understand it better.

[Bug other/89863] [meta-bug] Issues that cppcheck finds that gcc misses

2019-04-08 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89863
Bug 89863 depends on bug 83033, which changed state.

Bug 83033 Summary: aarch64/cortex-a57-fma-steering.c: 3 * poor C++ style ?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83033

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

[Bug target/83033] aarch64/cortex-a57-fma-steering.c: 3 * poor C++ style ?

2019-04-08 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83033

Richard Earnshaw  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |9.0

--- Comment #5 from Richard Earnshaw  ---
Fixed

[Bug target/83033] aarch64/cortex-a57-fma-steering.c: 3 * poor C++ style ?

2019-04-08 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83033

--- Comment #4 from Richard Earnshaw  ---
Author: rearnsha
Date: Mon Apr  8 12:59:24 2019
New Revision: 270207

URL: https://gcc.gnu.org/viewcvs?rev=270207&root=gcc&view=rev
Log:
The fma_forest, fma_root_node and func_fma_steering classes lack a
copy constructor.  However, they contain pointers to allocated memory
so this omission can be regarded as poor style.  We don't need to copy
such objects, so declare the copy constructor private to inhibit
accidental copying.

2019-04-08  Andrea Corallo  

PR target/83033
* config/aarch64/cortex-a57-fma-steering.c (fma_forest): Prohibit copy
construction.
(fma_root_node): Likewise.
(func_fma_steering): Likewise.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/aarch64/cortex-a57-fma-steering.c

[Bug rtl-optimization/87871] [9 Regression] testcases fail after r265398 on arm

2019-04-05 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87871

Richard Earnshaw  changed:

   What|Removed |Added

 CC||rearnsha at gcc dot gnu.org

--- Comment #10 from Richard Earnshaw  ---
I wonder if this could be picked up in the post-reload CSE pass?  (ie rewriting
the CBZ to use the incoming hard reg?)

[Bug rtl-optimization/87871] [9 Regression] testcases fail after r265398 on arm

2019-04-05 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87871

--- Comment #9 from Richard Earnshaw  ---
(In reply to Wilco from comment #8)
> (In reply to Segher Boessenkool from comment #5)
> > The first one just needs an xfail.  I don't know if it should be *-*-* there
> > or only arm*-*-* should be added.
> > 
> > The other two need some debugging by someone who knows the target and/or
> > these tests.
> 
> The previous code for Arm was:
> 
>   cbz r0, .L5
>   push {r4, lr}
>   mov r4, r0
>   bl  foo
>   movw r2, #:lower16:.LANCHOR0
>   movt r2, #:upper16:.LANCHOR0
>   add r4, r4, r0
>   str r4, [r2]
>   pop {r4, pc}
> .L5:
>   movs r0, #1
>   bx  lr
> 
> Now it fails to shrinkwrap:
> 
>   push {r4, lr}
>   mov r4, r0
>   cmp r4, #0
>   moveq   r0, #1
>   beq .L3
>   bl  foo
>   ldr r2, .L7
>   add r3, r4, r0
>   str r3, [r2]
> .L3:
>   pop {r4, lr}
>   bx  lr
> 
> It seems shrinkwrapping is more random, sometimes it's done as expected,
> sometimes it is not. It was more consistent on older GCC's.

This looks like another fallout of not allowing combine to merge with hard
regs.  Previously the CBZ could be moved outside of the prologue because it
operated directly on the incoming hard reg.  Now it only sees the value after
the copy into the pseudo, which is a call-saved reg because it's live over the
call.
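
For orientation, a function of roughly the following shape gives rise to code
like that quoted above.  This is a hedged reconstruction from the assembly, not
necessarily the actual test source; the names bar, foo and g are assumptions.

```c
#include <assert.h>

static int g;   /* hypothetical global that receives the sum */

/* Hypothetical callee, kept out of line so that the call survives.  */
static int __attribute__((noinline)) foo(int a)
{
    return a + 5;
}

int bar(int a)
{
    if (a == 0)
        return 1;       /* early exit: needs no call-saved registers */
    int r = foo(a);     /* 'a' is live across the call */
    g = a + r;
    return r;
}
```

When the test on 'a' can be done on the incoming hard register, the early exit
can run before any registers are pushed; once the test only sees the copy in a
call-saved register, the prologue has to come first.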

[Bug middle-end/89544] Argument marshalling incorrectly assumes stack slots are naturally aligned.

2019-03-01 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89544

--- Comment #4 from Richard Earnshaw  ---
An alternative way of fixing this might be if the backend could somehow control
DECL_ARG_TYPE for the parameter, to set it to a variant without the additional
alignment.

[Bug middle-end/89544] Argument marshalling incorrectly assumes stack slots are naturally aligned.

2019-03-01 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89544

--- Comment #1 from Richard Earnshaw  ---
I think things start to go wrong in assign_parm_find_data_types.  That calls
promote_function_mode, but that then has no target-specific action when the
type is a RECORD_TYPE, and it never calls the back-end in this case.

If it did, then the backend could change the promoted mode to BLKmode to
represent the fact that the object was not correctly aligned on the stack, and
presumably the code would then be forced into a more sensible action.

[Bug middle-end/89544] New: Argument marshalling incorrectly assumes stack slots are naturally aligned.

2019-03-01 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89544

Bug ID: 89544
   Summary: Argument marshalling incorrectly assumes stack slots
are naturally aligned.
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Keywords: wrong-code
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rearnsha at gcc dot gnu.org
CC: bernd.edlinger at hotmail dot de, ebotcazou at gcc dot gnu.org
  Target Milestone: ---
Target: arm

The AAPCS requires that an object which is artificially overaligned is passed
at either its natural alignment (if <= 8) or on an 8-byte boundary if it is
more aligned than that.  So a struct of the form

struct s
{ int a; int b;} __attribute__((aligned(8)));

has a natural alignment of 4 and must be passed by value with 4-byte alignment.
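
The distinction between declared and natural alignment can be checked with a
short C sketch.  The helper names are hypothetical and aapcs_pass_align merely
restates the rule as worded above; it is not the back end's implementation,
and it assumes a host where int has 4-byte alignment.

```c
#include <assert.h>
#include <stddef.h>

struct s { int a; int b; } __attribute__((aligned(8)));

/* Natural alignment of struct s: the largest member alignment, ignoring
   the aligned attribute on the type.  */
static size_t natural_align_s(void)
{
    size_t a = _Alignof(int);   /* member a */
    size_t b = _Alignof(int);   /* member b */
    return a > b ? a : b;
}

/* Alignment used when passing by value, per the rule stated above:
   the natural alignment if it is <= 8, otherwise 8.  */
static size_t aapcs_pass_align(size_t natural)
{
    assert(natural != 0 && (natural & (natural - 1)) == 0);
    return natural > 8 ? 8 : natural;
}
```

Here _Alignof(struct s) reports the declared alignment of 8, while
aapcs_pass_align(natural_align_s()) gives the 4-byte passing alignment.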

The middle end code ignores this and assign_parm_find_stack_rtl ends up
generating invalid rtl of the form

(mem/c:DI (plus:SI (reg/f:SI 104 virtual-incoming-args)
(const_int 4 [0x4])) [1 f+0 S8 A32])

i.e. a DImode object with 32-bit alignment.  It then proceeds to ignore the
under-alignment when expanding to RTL and incorrectly calls gen_movdi directly,
rather than falling back to the misaligned move code.

Test case:

struct s {
  int a, b;
} __attribute__((aligned(8)));

struct s f0;
int f(int a, int b, int c, int d, int e, struct s f)
{
  f0 = f;
  return __alignof(f);
}

When compiled with "-O -march=armv5te" generates:

@ args = 12, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
ldrd r0, [sp, #4]    <== Invalid alignment for ldrd
ldr r3, .L2
strd r0, [r3]
mov r0, #8
bx  lr

[Bug target/88469] [7/8 regression] AAPCS/AAPCS64 - Struct with 64-bit bitfield (128-bit on AArch64) may be passed in wrong registers

2019-02-26 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88469

--- Comment #13 from Richard Earnshaw  ---
(In reply to Stefan Ring from comment #12)
> Unfortunately my armv5 device has died in the meantime, so I cannot verify
> my original use case. The behavior is indeed different on armv7. It does not
> trap, even for the original misaligned code. And contrary to x86, where the
> alignment check flag can be changed by user space, this is a privileged
> operation on arm, so I cannot even selectively enable it.

Note that if you have root access on your board you can modify the kernel's
behaviour for various unaligned accesses by changing /proc/cpu/alignment (see
Documentation/arm/mem_alignment in the kernel sources).  You might want to try
setting this to 3 to get the kernel to report (but fix up) any misaligned
accesses.

[Bug target/89101] [Aarch64] vfmaq_laneq_f32 generates unnecessary dup instrcutions

2019-02-25 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89101

Richard Earnshaw  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Richard Earnshaw  ---
Fixed on trunk (aka gcc-9).  Not a regression, so no backport.

[Bug target/88469] [7/8 regression] AAPCS/AAPCS64 - Struct with 64-bit bitfield (128-bit on AArch64) may be passed in wrong registers

2019-01-25 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88469

--- Comment #11 from Richard Earnshaw  ---
Author: rearnsha
Date: Fri Jan 25 17:09:33 2019
New Revision: 268273

URL: https://gcc.gnu.org/viewcvs?rev=268273&root=gcc&view=rev
Log:
This is pretty unlikely in real code, but similar to Arm, the AArch64
ABI has a bug with the handling of 128-bit bit-fields, where if the
bit-field dominates the overall alignment the back-end code may end up
passing the argument incorrectly.  This is a regression that started in
gcc-6 when the ABI support code was updated to support overaligned
types.  The fix is very similar in concept to the Arm fix.  128-bit
bit-fields are fortunately extremely rare, so I'd be very surprised if
anyone has been bitten by this.

PR target/88469
gcc/
* config/aarch64/aarch64.c (aarch64_function_arg_alignment): Add new
argument ABI_BREAK.  Set to true if the calculated alignment has
changed in gcc-9.  Check bit-fields for their base type alignment.
(aarch64_layout_arg): Warn if argument passing has changed in gcc-9.
(aarch64_function_arg_boundary): Likewise.
(aarch64_gimplify_va_arg_expr): Likewise.

gcc/testsuite/
* gcc.target/aarch64/aapcs64/test_align-10.c: New test.
* gcc.target/aarch64/aapcs64/test_align-11.c: New test.
* gcc.target/aarch64/aapcs64/test_align-12.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/aarch64/aapcs64/test_align-10.c
trunk/gcc/testsuite/gcc.target/aarch64/aapcs64/test_align-11.c
trunk/gcc/testsuite/gcc.target/aarch64/aapcs64/test_align-12.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/aarch64/aarch64.c
trunk/gcc/testsuite/ChangeLog

[Bug target/88469] [7/8 regression] AAPCS - Struct with 64-bit bitfield may be passed in wrong registers

2019-01-24 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88469

--- Comment #10 from Richard Earnshaw  ---
Author: rearnsha
Date: Thu Jan 24 16:10:06 2019
New Revision: 268241

URL: https://gcc.gnu.org/viewcvs?rev=268241&root=gcc&view=rev
Log:
Mitigation for PR target/88469 on arm-based systems bootstrapping with
gcc-6/7/8

This patch, for gcc 8/9, is a mitigation patch for PR target/88469
where gcc-6/7/8 miscompile a structure whose alignment is dominated by
a 64-bit bitfield member.  Since the PCS rules for such a type must
ignore any overalignment of the base type we cannot address this by
simply adding a larger alignment to the class.  We can, however, force
the alignment of the bit-field itself and GCC will handle that as
desired.

PR target/88469
* profile-count.h (profile_count): On ARM systems using GCC 6/7/8
force the alignment of m_val.


Modified:
branches/gcc-8-branch/gcc/ChangeLog
branches/gcc-8-branch/gcc/profile-count.h

[Bug target/88469] [7/8 regression] AAPCS - Struct with 64-bit bitfield may be passed in wrong registers

2019-01-24 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88469

--- Comment #9 from Richard Earnshaw  ---
Author: rearnsha
Date: Thu Jan 24 16:06:34 2019
New Revision: 268240

URL: https://gcc.gnu.org/viewcvs?rev=268240&root=gcc&view=rev
Log:
Mitigation for PR target/88469 on arm-based systems bootstrapping with
gcc-6/7/8

This patch, for gcc 8/9, is a mitigation patch for PR target/88469
where gcc-6/7/8 miscompile a structure whose alignment is dominated by
a 64-bit bitfield member.  Since the PCS rules for such a type must
ignore any overalignment of the base type we cannot address this by
simply adding a larger alignment to the class.  We can, however, force
the alignment of the bit-field itself and GCC will handle that as
desired.

PR target/88469
* profile-count.h (profile_count): On ARM systems using GCC 6/7/8
force the alignment of m_val.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/profile-count.h

[Bug middle-end/89037] New: checking ice emitting 128-bit bit-field initializer

2019-01-24 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89037

Bug ID: 89037
   Summary: checking ice emitting 128-bit bit-field initializer
   Product: gcc
   Version: 9.0
Status: UNCONFIRMED
  Keywords: ice-checking, ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rearnsha at gcc dot gnu.org
  Target Milestone: ---
Target: aarch64

/* ./cc1 testcase.c  */
struct s
{
  __int128 y : 65;
};
typedef struct s T;
T a = { 1 };

Results in 

besttry.c:7:1: internal compiler error: tree check: accessed elt 2 of
tree_int_cst with 1 elts in output_constructor_bitfield, at varasm.c:5410
7 | T a = { 1 };
  | ^
0x5be4f1 tree_int_cst_elt_check_failed(int, int, char const*, int, char const*)
/home/rearnsha/gnusrc/gcc-cross/trunk/gcc/tree.c:
0x740049 tree_int_cst_elt_check(tree_node*, int, char const*, int, char const*)
/home/rearnsha/gnusrc/gcc-cross/trunk/gcc/tree.h:3378
0xf8d530 output_constructor_bitfield
/home/rearnsha/gnusrc/gcc-cross/trunk/gcc/varasm.c:5410
0xf8db12 output_constructor
/home/rearnsha/gnusrc/gcc-cross/trunk/gcc/varasm.c:5524
0xf8c27f output_constant
/home/rearnsha/gnusrc/gcc-cross/trunk/gcc/varasm.c:5037
0xf81a96 assemble_variable_contents
/home/rearnsha/gnusrc/gcc-cross/trunk/gcc/varasm.c:2144
0xf82519 assemble_variable(tree_node*, int, int, int)
/home/rearnsha/gnusrc/gcc-cross/trunk/gcc/varasm.c:2323
0xfa03c4 varpool_node::assemble_decl()
/home/rearnsha/gnusrc/gcc-cross/trunk/gcc/varpool.c:586
0x7ac56c output_in_order
/home/rearnsha/gnusrc/gcc-cross/trunk/gcc/cgraphunit.c:2443
0x7ac56c symbol_table::compile()
/home/rearnsha/gnusrc/gcc-cross/trunk/gcc/cgraphunit.c:2683
0x7aef96 symbol_table::compile()
/home/rearnsha/gnusrc/gcc-cross/trunk/gcc/cgraphunit.c:2865
0x7aef96 symbol_table::finalize_compilation_unit()
/home/rearnsha/gnusrc/gcc-cross/trunk/gcc/cgraphunit.c:2862
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

[Bug rtl-optimization/87763] [9 Regression] aarch64 target testcases fail after r265398

2019-01-23 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87763

--- Comment #28 from Richard Earnshaw  ---
Yes, it's always possible to write patterns for this, but as you point out, we
end up with many variants: insert in bottom (no left shift), insert in top
(left shift then doesn't need an additional AND mask because there are no top
bits to remove) and insert in middle.

The matching of all the immediate values to ensure that the insn makes sense is
not all that trivial - and you have to then convert those into the relevant bit
offsets during assembly output.

Finally, of course, we still have to deal with the fact that the compiler
might somehow decide to canonicalize a pattern into the existing zero_extract
bit-field insert idiom, so we don't get to remove any insns.

[Bug rtl-optimization/87763] [9 Regression] aarch64 target testcases fail after r265398

2019-01-23 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87763

--- Comment #26 from Richard Earnshaw  ---
(In reply to Jakub Jelinek from comment #25)
> We have BIT_INSERT_EXPR on GIMPLE, which in the end is a quaternary
> operation previous value, value to insert, bit position and bit size (the
> last one is implicit in this GIMPLE op), so you're arguing we should have a
> similar expression in RTL, right?  Say BIT_INSERT or INSV?

Yes, something like that.  I know new RTL codes can be problematic, but clearly
zero_extract on a SET_DEST isn't cutting it any more, if it ever really did.

[Bug rtl-optimization/87763] [9 Regression] aarch64 target testcases fail after r265398

2019-01-23 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87763

--- Comment #24 from Richard Earnshaw  ---
(In reply to Steve Ellcey from comment #21)
> Successfully matched this instruction:
> (set (zero_extract:SI (reg/i:SI 0 x0)
> (const_int 8 [0x8])
> (const_int 12 [0xc]))
> (zero_extend:SI (reg:QI 1 x1 [ y ])))
> allowing combination of insns 8, 9 and 15
> original costs 4 + 4 + 4 = 12
> replacement cost 4
> deferring deletion of insn with uid = 9.

zero_extract on a destination register is a read-modify write operation, which
means that we'll almost never generate this through combine now as it would
require the same pseudo register as both a source and a destination in the
insns to be combined.  In the past we'd sometimes see this in real code due to
hard registers appearing to combine and giving it the opportunity to create the
pattern.

Perhaps it's time for a new way of expressing bit-field insert operations in the
compiler so that the entire operation is expressed on the right hand side of
the set and the entire result can then be assigned to a pseudo (the or-and-and
mess with constants is a nightmare to match, and a nightmare to emit the final
instructions).  Register allocation can then tie that to an input register if
required.  This would be a much better match for RISC based ISAs with bitfield
insert operations, but probably wouldn't be much use on CISC architectures that
can do bit-field inserts directly to memory.

But clearly that's not a gcc-9 type change.
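
As a sketch of what "the entire operation on the right-hand side" means, a
bit-field insert can be written as a pure function of the old value, the value
to insert, the bit position and the width.  The function name bit_insert is
hypothetical; this is plain C for illustration, not a proposed RTL code.

```c
#include <assert.h>
#include <stdint.h>

/* Return DST with the low WIDTH bits of SRC inserted at bit position LSB.
   The result is a fresh value, so it can be assigned to any destination.  */
static uint32_t bit_insert(uint32_t dst, uint32_t src,
                           unsigned lsb, unsigned width)
{
    assert(width >= 1 && lsb < 32 && width <= 32 - lsb);
    uint32_t field = (width == 32) ? UINT32_MAX
                                   : ((UINT32_C(1) << width) - 1);
    uint32_t mask = field << lsb;
    return (dst & ~mask) | ((src << lsb) & mask);
}
```

The pattern quoted from comment #21 (width 8 at bit 12) corresponds to
bit_insert(x0, x1, 12, 8); the and/or/shift expression in the return statement
is the masking form that is awkward to match and emit.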

[Bug target/88469] [7/8 regression] AAPCS - Struct with 64-bit bitfield may be passed in wrong registers

2019-01-22 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88469

--- Comment #8 from Richard Earnshaw  ---
Author: rearnsha
Date: Tue Jan 22 17:56:02 2019
New Revision: 268160

URL: https://gcc.gnu.org/viewcvs?rev=268160&root=gcc&view=rev
Log:
[arm] Further fixes for PR88469

A bitfield that is exactly the same size as an integral type and
naturally aligned will have DECL_BIT_FIELD cleared.  So we need to
check DECL_BIT_FIELD_TYPE to be sure whether or not the underlying
type was declared with a bitfield declaration.

I've also added a test for bitfields that are based on overaligned types.

PR target/88469
gcc:
* config/arm/arm.c (arm_needs_doubleword_align): Check
DECL_BIT_FIELD_TYPE.

gcc/testsuite:
* gcc.target/arm/aapcs/bitfield2.c: New test.
* gcc.target/arm/aapcs/bitfield3.c: New test.


Added:
trunk/gcc/testsuite/gcc.target/arm/aapcs/bitfield2.c
trunk/gcc/testsuite/gcc.target/arm/aapcs/bitfield3.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/arm/arm.c
trunk/gcc/testsuite/ChangeLog
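
As a hedged illustration of the case the log above describes, the sketch below
declares a bit-field that is exactly as wide as its declared type.  The struct
and function names are made up for this example and are not taken from the new
tests.

```c
/* 'bits' is written as a bit-field, but it is as wide as its type and
   naturally aligned.  According to the log, this is the case in which
   DECL_BIT_FIELD is cleared and DECL_BIT_FIELD_TYPE has to be checked.
   Assumes unsigned long long is 64 bits wide.  */
struct full_width
{
  unsigned long long bits : 64;
};

/* Hypothetical callee: the leading int means the struct argument cannot
   simply start in the first argument register.  */
static unsigned long long
take_full_width (int tag, struct full_width s)
{
  return s.bits + (unsigned long long) tag;
}
```

Which registers the struct lands in is decided by the target's PCS; the sketch
only shows the shape of the declaration, not the expected register assignment.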

[Bug target/88469] [7/8 regression] AAPCS - Struct with 64-bit bitfield may be passed in wrong registers

2019-01-22 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88469

Richard Earnshaw  changed:

   What|Removed |Added

Summary|[7/8/9 regression] AAPCS -  |[7/8 regression] AAPCS -
   |Struct with 64-bit bitfield |Struct with 64-bit bitfield
   |may be passed in wrong  |may be passed in wrong
   |registers   |registers

--- Comment #7 from Richard Earnshaw  ---
Fixed on trunk.  Still need mitigation for gcc-7/8 and to deal with
bootstrapping gcc-9 with gcc-6/7/8.

[Bug target/88469] [7/8/9 regression] AAPCS - Struct with 64-bit bitfield may be passed in wrong registers

2019-01-22 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88469

--- Comment #6 from Richard Earnshaw  ---
Author: rearnsha
Date: Tue Jan 22 14:03:22 2019
New Revision: 268151

URL: https://gcc.gnu.org/viewcvs?rev=268151&root=gcc&view=rev
Log:
[arm] PR target/88469 fix incorrect argument passing with 64-bit bitfields

Unfortunately another PCS bug has come to light with the layout of
structs whose alignment is dominated by a 64-bit bitfield element.
Such fields in the type list appear to have alignment 1, but in
reality, for the purposes of alignment of the underlying structure,
the alignment is derived from the underlying bitfield's type.  We've
been getting this wrong since support for over-aligned record types
was added several releases back.  Worse still, the existing code may
generate unaligned memory accesses that may fault on some versions of
the architecture.

I've taken the opportunity to add a few more tests that check the
passing of arguments with overalignment in the PCS.  Looking through the
existing tests it looked like they were really only checking
self-consistency and not the precise location of the arguments.

PR target/88469

gcc:
* config/arm/arm.c (arm_needs_doubleword_align): Return 2 if a record's
alignment is dominated by a bitfield with 64-bit aligned base type.
(arm_function_arg): Emit a warning if the alignment has changed since
earlier GCC releases.
(arm_function_arg_boundary): Likewise.
(arm_setup_incoming_varargs): Likewise.

gcc/testsuite:
* gcc.target/arm/aapcs/bitfield1.c: New test.
* gcc.target/arm/aapcs/overalign_rec1.c: New test.
* gcc.target/arm/aapcs/overalign_rec2.c: New test.
* gcc.target/arm/aapcs/overalign_rec3.c: New test.

Added:
trunk/gcc/testsuite/gcc.target/arm/aapcs/bitfield1.c
trunk/gcc/testsuite/gcc.target/arm/aapcs/overalign_rec1.c
trunk/gcc/testsuite/gcc.target/arm/aapcs/overalign_rec2.c
trunk/gcc/testsuite/gcc.target/arm/aapcs/overalign_rec3.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/arm/arm.c
trunk/gcc/testsuite/ChangeLog
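
A small sketch of the kind of record the log above is talking about, assuming
a C11 compiler (for _Alignof).  The names are hypothetical and the struct is
not copied from the new tests.

```c
#include <stddef.h>

/* The only member whose declared type is 64 bits wide is a one-bit
   bit-field.  Per the log, the record's alignment is derived from that
   underlying type even though the field itself looks like it has
   alignment 1.  */
struct dominated
{
  unsigned long long flag : 1;
  char c;
};

/* Report what the compiler in use picked for the record.  */
static size_t
dominated_alignment (void)
{
  return _Alignof (struct dominated);
}

static size_t
dominated_size (void)
{
  return sizeof (struct dominated);
}
```

The values returned depend on the target ABI, so no particular numbers are
claimed here; comparing them between compiler versions is one way to see
whether a given build is affected.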

[Bug target/88799] [8/9 Regression] Arm -mcpu=PROCESSOR does not result in assembly directives for .arch and .arch_extension

2019-01-18 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88799

Richard Earnshaw  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Richard Earnshaw  ---
Fixed on trunk and gcc-8 branch.

[Bug target/88799] [8/9 Regression] Arm -mcpu=PROCESSOR does not result in assembly directives for .arch and .arch_extension

2019-01-18 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88799

--- Comment #3 from Richard Earnshaw  ---
Author: rearnsha
Date: Fri Jan 18 13:25:37 2019
New Revision: 268077

URL: https://gcc.gnu.org/viewcvs?rev=268077&root=gcc&view=rev
Log:
[arm] PR target/88799 Add +mp and +sec extensions to ARMv7-a (gcc-8 backport)

Most armv7-a implementations support a number of basic extensions to
the architecture which are not particularly important to the compiler,
but can matter if code contains inline assembly.  This patch adds
support for these extensions, based on the capabilities that GAS
already provides for the appropriate CPUs.  For the purposes of
multilib selection we ignore these extensions entirely and map the
extended architecture versions down to the base versions we already
have support for.

gcc:
PR target/88799
* config/arm/arm-cpus.in (mp): New feature.
(sec): New feature.
(fgroup ARMv7ve): Add mp and sec features.
(arch armv7-a): Add options to allow mp and sec extensions.
(cpu generic-armv7-a): Add options to allow mp and sec extensions.
(cpu cortex-a5, cpu cortex-a7, cpu cortex-a9): Add mp and sec
extensions to the base architecture.
(cpu cortex-a8): Add sec extension to the base architecture.
(cpu marvell-pj4): Add mp and sec extensions to the base architecture.
* config/arm/t-aprofile (MULTILIB_MATCHES): Map all armv7-a arch
variants down to the base v7-a variant.
* config/arm/t-multilib (v7_a_arch_variants): New variable.
* doc/invoke.texi (ARM Options): Add +mp and +sec to the list
of permitted extensions for -march=armv7-a and for
-mcpu=generic-armv7-a.

testsuite:
* gcc.target/arm/multilib.exp (config "aprofile"): Add tests for
mp and sec extensions to armv7-a.


Modified:
branches/gcc-8-branch/gcc/ChangeLog
branches/gcc-8-branch/gcc/config/arm/arm-cpus.in
branches/gcc-8-branch/gcc/config/arm/t-aprofile
branches/gcc-8-branch/gcc/config/arm/t-multilib
branches/gcc-8-branch/gcc/doc/invoke.texi
branches/gcc-8-branch/gcc/testsuite/ChangeLog
branches/gcc-8-branch/gcc/testsuite/gcc.target/arm/multilib.exp

[Bug target/88799] [8/9 Regression] Arm -mcpu=PROCESSOR does not result in assembly directives for .arch and .arch_extension

2019-01-18 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88799

--- Comment #2 from Richard Earnshaw  ---
Author: rearnsha
Date: Fri Jan 18 11:49:56 2019
New Revision: 268072

URL: https://gcc.gnu.org/viewcvs?rev=268072&root=gcc&view=rev
Log:
PR target/88799 Add +mp and +sec extensions to ARMv7-a

Most armv7-a implementations support a number of basic extensions to
the architecture which are not particularly important to the compiler,
but can matter if code contains inline assembly.  This patch adds
support for these extensions, based on the capabilities that GAS
already provides for the appropriate CPUs.  For the purposes of
multilib selection we ignore these extensions entirely and map the
extended architecture versions down to the base versions we already
have support for.

gcc:
PR target/88799
* config/arm/arm-cpus.in (mp): New feature.
(sec): New feature.
(fgroup ARMv7ve): Add mp and sec features.
(arch armv7-a): Add options to allow mp and sec extensions.
(cpu generic-armv7-a): Add options to allow mp and sec extensions.
(cpu cortex-a5, cpu cortex-a7, cpu cortex-a9): Add mp and sec
extensions to the base architecture.
(cpu cortex-a8): Add sec extension to the base architecture.
(cpu marvell-pj4): Add mp and sec extensions to the base architecture.
* config/arm/t-aprofile (MULTILIB_MATCHES): Map all armv7-a arch
variants down to the base v7-a variant.
* config/arm/t-multilib (v7_a_arch_variants): New variable.
* doc/invoke.texi (ARM Options): Add +mp and +sec to the list
of permitted extensions for -march=armv7-a and for
-mcpu=generic-armv7-a.

testsuite:
* gcc.target/arm/multilib.exp (config "aprofile"): Add tests for
mp and sec extensions to armv7-a.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/arm/arm-cpus.in
trunk/gcc/config/arm/t-aprofile
trunk/gcc/config/arm/t-multilib
trunk/gcc/doc/invoke.texi
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/arm/multilib.exp
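
The log above says these extensions matter mainly for inline assembly.  A
minimal sketch of such code, assuming PLDW (an instruction from the Armv7
Multiprocessing Extensions) as the example; the helper name is hypothetical
and the non-Arm branch exists only so the sketch builds elsewhere.

```c
/* Hypothetical helper: prefetch a location with the intent to write.  */
static inline void
prefetch_for_write (const void *p)
{
#if defined (__arm__)
  /* PLDW needs the mp extension, so the assembler must be told it is
     available, e.g. by building with -march=armv7-a+mp.  */
  __asm__ volatile ("pldw\t[%0]" : : "r" (p));
#else
  /* Portable fallback for other targets.  */
  __builtin_prefetch (p, 1);
#endif
}
```

On an Arm target this is expected to assemble only when the selected
architecture or CPU includes mp, which is what the +mp option (or a -mcpu
value whose base architecture now carries mp) provides.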

[Bug target/86891] [9 Regression] __builtin_sub_overflow incorrect for unsigned types

2019-01-16 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86891

--- Comment #12 from Richard Earnshaw  ---
Author: rearnsha
Date: Wed Jan 16 15:22:08 2019
New Revision: 267972

URL: https://gcc.gnu.org/viewcvs?rev=267972&root=gcc&view=rev
Log:
__builtin__overflow issues on AArch64 (redux) (cont)

And the ChangeLog for PR target/86891 fix.


Modified:
trunk/gcc/ChangeLog

[Bug target/86891] [9 Regression] __builtin_sub_overflow incorrect for unsigned types

2019-01-16 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86891

--- Comment #11 from Richard Earnshaw  ---
Author: rearnsha
Date: Wed Jan 16 15:18:05 2019
New Revision: 267971

URL: https://gcc.gnu.org/viewcvs?rev=267971&root=gcc&view=rev
Log:
__builtin__overflow issues on AArch64 (redux)

Further investigation showed that my previous patch for this issue was
still incomplete.

The problem stemmed from what I suspect was a misunderstanding of the
way overflow is calculated on aarch64 when values are subtracted (and
hence in comparisons).  In this case, unlike addition, the carry flag
is /cleared/ if there is overflow (technically, underflow) and set
when that does not happen.  This patch clears up this issue by using
CCmode for all subtractive operations (this can fully describe the
normal overflow conditions without anything particularly fancy);
clears up the way we express normal unsigned overflow using CC_Cmode
(the result of a sum is less than one of the operands) and adds a new
mode, CC_ADCmode to handle expressing overflow of an add-with-carry
operation, where the standard idiom is no longer sufficient to
describe the overflow condition.

PR target/86891
* config/aarch64/aarch64-modes.def: Add comment about how the carry
bit is set by add and compare.
(CC_ADC): New CC_MODE.
* config/aarch64/aarch64.c (aarch64_select_cc_mode): Use variables
to cache the code and mode of X.  Adjust the shape of a CC_Cmode
comparison.  Add detection for CC_ADCmode.
(aarch64_get_condition_code_1): Update code support for CC_Cmode.  Add
CC_ADCmode.
* config/aarch64/aarch64.md (uaddv4): Use LTU with CCmode.
(uaddvti4): Comparison result is in CC_ADCmode and the condition is
GEU.
(add3_compareC_cconly_imm): Delete.  Merge into...
(add3_compareC_cconly): ... this.  Restructure the comparison
to eliminate the need for zero-extending the operands.
(add3_compareC_imm): Delete.  Merge into ...
(add3_compareC): ... this.  Restructure the comparison to
eliminate the need for zero-extending the operands.
(add3_carryin): Use LTU for the overflow detection.
(add3_carryinC): Use CC_ADCmode for the result of the carry out.
Reexpress comparison for overflow.
(add3_carryinC_zero): Update for change to add3_carryinC.
(add3_carryinC): Likewise.
(add3_carryinV): Use LTU for carry between partials.
* config/aarch64/predicates.md (aarch64_carry_operation): Update
handling of CC_Cmode and add CC_ADCmode.
(aarch64_borrow_operation): Likewise.


Modified:
trunk/gcc/config/aarch64/aarch64-modes.def
trunk/gcc/config/aarch64/aarch64.c
trunk/gcc/config/aarch64/aarch64.md
trunk/gcc/config/aarch64/predicates.md
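
The flag behaviour described in the log above can be modelled in plain C.
This is only a sketch of the arithmetic, not of the RTL patterns; all function
names are hypothetical, and 32-bit operands are used so that a 64-bit type can
serve as the "wide enough" reference.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unsigned add: the standard idiom, the sum is less than an operand.  */
static bool
add_overflows (uint32_t a, uint32_t b)
{
  return (uint32_t) (a + b) < a;
}

/* Model of the carry flag after a subtract or compare: it is SET when
   there is no borrow and CLEARED on underflow.  */
static bool
sub_carry_flag (uint32_t a, uint32_t b)
{
  return a >= b;
}

/* Add-with-carry using the same idiom as add_overflows.  This misses
   cases such as b == UINT32_MAX with carry-in set, where the truncated
   sum equals a.  */
static bool
adc_overflows_naive (uint32_t a, uint32_t b, bool carry_in)
{
  return (uint32_t) (a + b + carry_in) < a;
}

/* Reference: do the sum in a wider type and look at the bit carried out.  */
static bool
adc_overflows (uint32_t a, uint32_t b, bool carry_in)
{
  uint64_t wide = (uint64_t) a + b + carry_in;
  return (wide >> 32) != 0;
}
```

For a = 7, b = UINT32_MAX and carry-in set, the naive form reports no overflow
while the wide reference does, which is the kind of gap the log says the
standard idiom leaves for add-with-carry.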

[Bug target/88799] Arm -mcpu=PROCESSOR does not result in assembly directives for .arch and .arch_extension

2019-01-11 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88799

Richard Earnshaw  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2019-01-11
 Ever confirmed|0   |1

--- Comment #1 from Richard Earnshaw  ---
GCC needs to be taught about the mp extension to armv7-a.

[Bug middle-end/88739] [7,8,9 Regression ] Big-endian union bug

2019-01-08 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88739

--- Comment #5 from Richard Earnshaw  ---
Also on Arm and probably other big-endian machines as well.

[Bug middle-end/88739] Big-endian union bug

2019-01-07 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88739

--- Comment #4 from Richard Earnshaw  ---
Manual inspection of the output from gcc-5.4.0 suggests this version produces
correct code.

[Bug middle-end/88739] Big-endian union bug

2019-01-07 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88739

--- Comment #2 from Richard Earnshaw  ---
>   _23 = BIT_FIELD_REF <_2, 16, 0>;// WRONG: should be _2, 14, 0

_2 is declared as a 30-bit integer, so perhaps the statement is right, but
expand needs to understand that the shift extract of the top 16 bits comes from
a different location in big-endian.
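
A rough sketch of the "different location" point, under the assumption that
on a big-endian target the bit position is counted from the most significant
end of the value.  The function names and the prec parameter are hypothetical;
this models the arithmetic only and is not GCC's expand code.

```c
#include <stdint.h>

/* Extract SIZE bits starting POS bits above the least significant bit.
   Assumes 0 < size < 32.  */
static uint32_t
extract_from_lsb (uint32_t v, unsigned size, unsigned pos)
{
  return (v >> pos) & ((UINT32_C (1) << size) - 1);
}

/* Extract SIZE bits starting POS bits below the most significant bit of
   a PREC-bit value.  Assumes pos + size <= prec <= 32 and size < 32.  */
static uint32_t
extract_from_msb (uint32_t v, unsigned prec, unsigned size, unsigned pos)
{
  return (v >> (prec - pos - size)) & ((UINT32_C (1) << size) - 1);
}
```

With a 30-bit value, extract_from_msb (v, 30, 16, 0) shifts by 14, whereas
treating the same value as 32 bits wide shifts by 16 and picks up different
bits.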

[Bug target/86891] [9 Regression] __builtin_sub_overflow incorrect for unsigned types

2019-01-07 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86891

Richard Earnshaw  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #9 from Richard Earnshaw  ---
Fixed.

[Bug target/86891] [9 Regression] __builtin_sub_overflow incorrect for unsigned types

2019-01-07 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86891

--- Comment #8 from Richard Earnshaw  ---
Author: rearnsha
Date: Mon Jan  7 14:49:00 2019
New Revision: 267650

URL: https://gcc.gnu.org/viewcvs?rev=267650&root=gcc&view=rev
Log:
Investigating PR target/86891 revealed a number of issues with the way
the AArch64 backend was handling overflow detection patterns.  Firstly,
expansion for signed and unsigned types is not the same as in one form
the overflow is detected via the C flag and in the other it is done
via the V flag in the PSR.  Secondly, particular care has to be taken
when describing overflow of signed types: the comparison has to be
performed conceptually on a value that cannot overflow and compared to
a value that might have overflowed.

It became apparent that some of the patterns were simply unmatchable
(they collapse to NEG in the RTL rather than subtracting from zero)
and a number of patterns were overly restrictive in terms of the
immediate constants that they supported.  I've tried to address all of
these issues as well.

gcc:

PR target/86891
* config/aarch64/aarch64.c (aarch64_expand_subvti): New parameter
unsigned_p.  Handle signed and unsigned overflow correction as
required.
* config/aarch64/aarch64-protos.h (aarch64_expand_subvti): Update
prototype.
* config/aarch64/aarch64.md (addv4): Use aarch64_plus_operand
for operand 2.
(add3_compareV_imm): Make this callable for expanding.
(subv4): Use register_operand for operand 1.  Use
aarch64_plus_operand for operand 2.
(subv_insn): New insn pattern.
(subv_imm): Likewise.
(negv3): New expand pattern.
(negv_insn): New insn pattern.
(negv_cmp_only): Likewise.
(cmpv_insn): Likewise.
(subvti4): Use register_operand for operand 1.  Update call to
aarch64_expand_subvti.
(usubvti4): Likewise.
(negvti3): New expand pattern.
(negdi_carryout): New insn pattern.
(negvdi_carryinV): New insn pattern.
(sub_compare1_imm): Delete named insn pattern, make anonymous
version the named version.
(peepholes to convert to sub_compare1_imm): Adjust order of
operands.
(usub3_carryinC, usub3_carryinC_z1): New insn
patterns.
(usub3_carryinC_z2, usub3_carryinC): New insn
patterns.
(sub3_carryinCV, sub3_carryinCV_z1_z2): Delete.
(sub3_carryinCV_z1, sub3_carryinCV_z2): Delete.
(sub3_carryinCV): Delete.
(sub3_carryinV): New expand pattern.
(sub3_carryinV, sub3_carryinV_z2): New insn patterns.

testsuite:

* gcc.target/aarch64/subs_compare_2.c: Make '#' immediate prefix
optional in scan pattern.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/aarch64/aarch64-protos.h
trunk/gcc/config/aarch64/aarch64.c
trunk/gcc/config/aarch64/aarch64.md
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/aarch64/subs_compare_2.c
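
The two points in the log above (unsigned and signed overflow are detected
differently, and signed overflow compares a value that cannot overflow with
one that might have) can be sketched in C as follows.  The function names are
hypothetical; the narrowing conversion relies on GCC's documented modulo
behaviour for out-of-range values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unsigned subtract: overflow (a borrow) happens exactly when a < b.  */
static bool
usub_overflows (uint32_t a, uint32_t b)
{
  return a < b;
}

/* Signed subtract: compute the exact result in a wider type, where it
   cannot overflow, and compare it with the sign-extended narrow result,
   which may have wrapped.  */
static bool
ssub_overflows (int32_t a, int32_t b)
{
  int64_t exact = (int64_t) a - (int64_t) b;
  int32_t wrapped = (int32_t) ((uint32_t) a - (uint32_t) b);
  return exact != (int64_t) wrapped;
}
```

The results should agree with __builtin_sub_overflow for the same operand
types, which gives an easy cross-check.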

[Bug target/86891] [9 Regression] __builtin_sub_overflow incorrect for unsigned types

2019-01-07 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86891

Richard Earnshaw  changed:

   What|Removed |Added

   Assignee|wilco at gcc dot gnu.org   |rearnsha at gcc dot gnu.org

--- Comment #7 from Richard Earnshaw  ---
(In reply to Jakub Jelinek from comment #6)
> Not everything is optimized during GIMPLE optimizations, we have
> simplify-rtx.c and combine.c for a reason.  So, if it is possible to
> accurately describe the behavior of the instruction without UNSPECs, it is
> generally preferable to do so.

It is possible, and I'm just polishing up a patch that does so.

[Bug target/86891] [9 Regression] __builtin_sub_overflow incorrect for unsigned types

2019-01-03 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86891

--- Comment #4 from Richard Earnshaw  ---
Yes, the extension should be zero-extend, not sign extend.  The plus operation
is correct, however, since decrementing the first operand could lead to
underflow if it was zero.  So the correct rtl would be 

  (compare ((zero_x(a)) (plus (zero_x(b) (ltu(cc, 0)
  (minus (...))
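
In C terms, the comparison described above looks roughly like the sketch
below.  The function names are hypothetical, and 32-bit operands with a 64-bit
wide type stand in for whatever modes the real pattern uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Borrow out of a - b - borrow_in: zero-extend both operands and add the
   incoming borrow to the second one, so nothing can wrap.  */
static bool
sbc_borrow_out (uint32_t a, uint32_t b, bool borrow_in)
{
  return (uint64_t) a < (uint64_t) b + borrow_in;
}

/* Decrementing the first operand instead goes wrong when a is zero,
   because the decrement itself underflows.  */
static bool
sbc_borrow_out_decrement (uint32_t a, uint32_t b, bool borrow_in)
{
  return (uint64_t) a - borrow_in < (uint64_t) b;
}
```

For a = 0, b = 0 with an incoming borrow, the first form reports a borrow and
the second does not, which is the underflow case mentioned above.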

[Bug target/86891] [9 Regression] wrong code with -O -frerun-cse-after-loop -fno-tree-dominator-opts -fno-tree-fre

2018-12-20 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86891

--- Comment #2 from Richard Earnshaw  ---
The abort goes away after r266897.  It might have just gone latent, however.

[Bug rtl-optimization/15482] can't find a register in class `GENERAL_REGS' while reloading `asm'

2018-12-20 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=15482

Richard Earnshaw  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED
   Target Milestone|--- |8.0
  Known to fail||

--- Comment #7 from Richard Earnshaw  ---
This seems to have been fixed in gcc-8.  Whippee!

[Bug c++/85396] _M_t._M_emplace_hint_unique

2018-12-20 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85396

Richard Earnshaw  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |INVALID

--- Comment #3 from Richard Earnshaw  ---
No feedback in 6 months.  Please reopen if this is an issue that can be
reproduced on a direct build of the FSF releases.

[Bug target/82258] [8/9 regression] allocate_zerosize_3.f fails since r251949

2018-12-20 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82258

Richard Earnshaw  changed:

   What|Removed |Added

 Status|WAITING |UNCONFIRMED
 Ever confirmed|1   |0

--- Comment #19 from Richard Earnshaw  ---
Feedback provided

[Bug target/82162] Internal compiler error in Raspbian

2018-12-20 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82162

Richard Earnshaw  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |INVALID

--- Comment #2 from Richard Earnshaw  ---
No feedback

[Bug target/82034] SMMLAR pattern not detected on ARMv7-M

2018-12-20 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82034

Richard Earnshaw  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |WORKSFORME

--- Comment #4 from Richard Earnshaw  ---
Your example command line has no optimization flags on it, so it's not
surprising that you don't see the idiom reduced.  With gcc-7 and -O3 I get:

sub sp, sp, #16
movs r1, #1
movs r2, #2
movs r3, #3
str r1, [sp, #4]
str r2, [sp, #8]
str r3, [sp, #12]
ldr r0, [sp, #4]
ldr r3, [sp, #8]
ldr r2, [sp, #12]
.syntax unified
@ 15 "test.c" 1
smmlar r3, r3, r2, r0
@ 0 "" 2
.thumb
.syntax unified
add sp, sp, #16
@ sp needed
bx  lr
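
For reference, a C model of what SMMLAR computes (the most significant word of
a 32x32 signed multiply-accumulate, with rounding).  The reporter's test case
is not shown in this thread, so the function name and argument order below are
assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical reference model: returns bits [63:32] of
   (acc << 32) + x * y + 0x80000000.  Unsigned 64-bit arithmetic is used
   so that intermediate wrap-around is well defined; the final conversion
   to int32_t relies on GCC's modulo behaviour.  */
static int32_t
smmlar_ref (int32_t acc, int32_t x, int32_t y)
{
  uint64_t wide = (uint64_t) (uint32_t) acc << 32;

  wide += (uint64_t) ((int64_t) x * (int64_t) y);
  wide += UINT64_C (0x80000000);	/* Rounding constant.  */
  return (int32_t) (uint32_t) (wide >> 32);
}
```

Whether the compiler turns an expression like this into a single SMMLAR
depends on the optimization level and on the target's patterns.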

[Bug libgomp/79784] Synchronization overhead is thrashing on Aarch64

2018-12-20 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79784

Richard Earnshaw  changed:

   What|Removed |Added

 Status|WAITING |UNCONFIRMED
 Ever confirmed|1   |0

--- Comment #11 from Richard Earnshaw  ---
testcase added

[Bug rtl-optimization/78932] [ARM] -O2 generates wrong code

2018-12-20 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78932

Richard Earnshaw  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |INVALID

--- Comment #4 from Richard Earnshaw  ---
No feedback in almost 2 years.

[Bug middle-end/78233] compute_idf fails quick_push size check when compiling libgcc for Debian armel with qemu-arm-static

2018-12-20 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78233

Richard Earnshaw  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |INVALID

--- Comment #3 from Richard Earnshaw  ---
No feedback in 18 months.  Please reopen and add the requested details if this
is still an issue.

[Bug middle-end/77996] Miscompilation due to LTO on aarch64

2018-12-20 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77996

Richard Earnshaw  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |INVALID

--- Comment #13 from Richard Earnshaw  ---
No further feedback in 2 years.  Presumed invalid aliasing of data types in
LLVM

[Bug c++/77662] arm-linux-gnueabihf-g++: internal compiler error: Killed (program cgcc)

2018-12-20 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77662

Richard Earnshaw  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |INVALID

--- Comment #2 from Richard Earnshaw  ---
No feedback in 2 years.  Presumed invalid

[Bug rtl-optimization/70223] [ARM] Optimization level -O2 results in wrong code

2018-12-20 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70223

Richard Earnshaw  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Richard Earnshaw  ---
No feedback in 2 years.  Changes have been made in the past to address issues
like this, so presuming fixed.

[Bug rtl-optimization/70030] [LRA]ICE when reload insn with output scratch operand

2018-12-20 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70030

--- Comment #9 from Richard Earnshaw  ---
Did the need for this patch go away?

[Bug target/68494] [ARM] Use vector multiply by lane

2018-12-20 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68494

Richard Earnshaw  changed:

   What|Removed |Added

 Status|WAITING |NEW

--- Comment #4 from Richard Earnshaw  ---
Still looks like we are using vdup for the testcase.

[Bug sanitizer/68100] runtime segfault ARM boost::regex_replace -fsanitize=undefined member access within misaligned address

2018-12-20 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68100

Richard Earnshaw  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |INVALID

--- Comment #3 from Richard Earnshaw  ---
No testcase supplied

[Bug target/65325] float/interger operation needs cast with 02 switch.

2018-12-20 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65325

Richard Earnshaw  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |INVALID

--- Comment #11 from Richard Earnshaw  ---
No testcase supplied

[Bug target/58490] __sync_bool_compare_and_swap sign bit failure

2018-12-20 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58490

Richard Earnshaw  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |WONTFIX

--- Comment #4 from Richard Earnshaw  ---
No progress in 5 years and gcc-4.7 is no longer maintained.  If the problem
persists in a currently maintained version of gcc, please can you open a new
bug report with full details of the issue and how to reproduce.

[Bug target/57911] alignment of arrays allocated stack on ARM : 4 bytes ?

2018-12-20 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57911

Richard Earnshaw  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |INVALID

--- Comment #2 from Richard Earnshaw  ---
No testcase forthcoming

[Bug bootstrap/56116] failed to build ARM native compiler by ARM cross compiler

2018-12-20 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56116

Richard Earnshaw  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |WORKSFORME

--- Comment #8 from Richard Earnshaw  ---
This bug report is very old and has gone stale.  If the problem persists with a
currently maintained version of gcc, please file a new bug report with full
details of how to reproduce the problem.
