[AArch64] Upgrade integer MLA intrinsics to GCC vector extensions

2020-08-12 Thread James Greenhalgh
Hi, As subject, this patch rewrites the mla intrinsics to use a + b * c rather than inline assembler, thereby opening them to CSE, scheduling, etc. Bootstrapped and tested on aarch64-none-linux-gnu. OK? Thanks, James --- gcc/Changelog: 2020-08-11 James Greenhalgh config/aarch64
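
As a rough illustration of the a + b * c form mentioned above, here is a minimal sketch using GCC's generic vector extensions. The type name v4si_t and the helper mla_v4si are made up for this example and are not the names used in arm_neon.h.

```c
#include <stdint.h>

/* GCC vector extension type: four 32-bit signed lanes (128 bits).
   The name v4si_t is illustrative, not the typedef used in arm_neon.h.  */
typedef int32_t v4si_t __attribute__ ((vector_size (16)));

/* Multiply-accumulate written as ordinary lane-wise vector arithmetic.
   Because the compiler sees a + b * c rather than an opaque asm block,
   it can apply CSE, scheduling and similar optimisations to it.  */
static inline v4si_t
mla_v4si (v4si_t a, v4si_t b, v4si_t c)
{
  return a + b * c;
}
```

On an AArch64 target one would expect the compiler to be able to select an MLA instruction for this expression, but that depends on the optimisation level and is not guaranteed by the source form alone.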

[AArch64] Move vmull_* to intrinsics

2020-02-18 Thread James Greenhalgh
Hi, As title, move some arm_neon.h functions which currently use assembly over to intrinsics. Bootstrapped and tested on aarch64-none-linux-gnu. OK, if so can someone please apply on my behalf? Thanks, James --- gcc/ 2020-02-18 James Greenhalgh * config/aarch64/aarch64-simd

Re: [AArch64] Fix cost of (plus ... (const_int -C))

2019-09-25 Thread James Greenhalgh
On Mon, Sep 23, 2019 at 10:45:29AM +0100, Richard Sandiford wrote: > The PLUS handling in aarch64_rtx_costs only checked for nonnegative > constants, meaning that simple immediate subtractions like: > > (set (reg R1) (plus (reg R2) (const_int -8))) > > had a cost of two instructions. > >

Re: [PATCH][AArch64] Don't split 64-bit constant stores to volatile location

2019-09-25 Thread James Greenhalgh
On Tue, Sep 24, 2019 at 02:40:20PM +0100, Kyrill Tkachov wrote: > Hi all, > > On 8/22/19 10:16 AM, Kyrill Tkachov wrote: > > Hi all, > > > > The optimisation to optimise: > >    typedef unsigned long long u64; > > > >    void bar(u64 *x) > >    { > > *x = 0xabcdef10abcdef10; > >    } > > > >

Re: [AArch64] Split built-in function codes into major and minor codes

2019-09-25 Thread James Greenhalgh
On Wed, Aug 07, 2019 at 08:28:50PM +0100, Richard Sandiford wrote: > It was easier to add the SVE ACLE support without enumerating every > function at build time. This in turn meant that it was easier if the > SVE builtins occupied a distinct numberspace from the existing AArch64 > ones, which

Re: [PATCH][AArch64] Add support for missing CPUs

2019-09-02 Thread James Greenhalgh
On Thu, Aug 22, 2019 at 12:03:33PM +0100, Kyrill Tkachov wrote: > Hi Dennis, > > On 8/21/19 10:27 AM, Dennis Zhang wrote: > > Hi all, > > > > This patch adds '-mcpu' options for following CPUs: > > Cortex-A77, Cortex-A76AE, Cortex-A65, Cortex-A65AE, and Cortex-A34. > > > > Related specifications

Re: [PATCH][AArch64] Add Linux hwcap strings for some extensions

2019-09-02 Thread James Greenhalgh
On Fri, Aug 23, 2019 at 05:42:30PM +0100, Kyrill Tkachov wrote: > Hi all, > > This patch adds feature strings for some of the extensions. This string > is what is read from /proc/cpuinfo on Linux systems > and used during -march=native detection. > > The strings are taken from the kernel source

Re: [PATCH][AArch64] Add support for __jcvt intrinsic

2019-09-02 Thread James Greenhalgh
On Mon, Sep 02, 2019 at 01:16:32PM +0100, Kyrill Tkachov wrote: > Hi all, > > This patch implements the __jcvt ACLE intrinsic [1] that maps down to > the FJCVTZS [2] instruction from Armv8.3-a. > No fancy mode iterators or nothing. Just a single builtin, UNSPEC and > define_insn and the

Re: [PATCH][AArch64] Expand DImode constant stores to two SImode stores when profitable

2019-08-21 Thread James Greenhalgh
On Mon, Oct 24, 2016 at 03:27:10PM +0100, Kyrill Tkachov wrote: > Hi all, > > When storing a 64-bit immediate that has equal bottom and top halves we > currently > synthesize the repeating 32-bit pattern twice and perform a single X-store. > With this patch we synthesize the 32-bit pattern once
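
The idea described above can be modelled at the C level as follows. This is only a sketch of the concept (detect equal halves, then write the 32-bit pattern twice); it is not the RTL expansion from the patch, and halves_equal_p and store_u64 are hypothetical helper names.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* True when the top and bottom 32 bits of VAL are identical.  */
static bool
halves_equal_p (uint64_t val)
{
  return (uint32_t) val == (uint32_t) (val >> 32);
}

/* Store VAL to DST and return the number of stores performed.
   When both halves match, the 32-bit pattern is materialised once and
   written twice; otherwise a single 64-bit store is used.  */
static int
store_u64 (void *dst, uint64_t val)
{
  if (halves_equal_p (val))
    {
      uint32_t half = (uint32_t) val;
      memcpy (dst, &half, sizeof half);
      memcpy ((char *) dst + sizeof half, &half, sizeof half);
      return 2;
    }
  memcpy (dst, &val, sizeof val);
  return 1;
}
```

Because both halves are equal in the two-store path, the result does not depend on byte order.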

Re: [AArch64] Tweak handling of fp moves via int registers

2019-08-19 Thread James Greenhalgh
On Wed, Aug 07, 2019 at 07:12:19PM +0100, Richard Sandiford wrote: > The AArch64 port uses define_splits to prefer moving certain float > constants via integer registers over loading them from memory. E.g.: > > (set (reg:SF X) (const_double:SF C)) > > splits to: > > (set (reg:SI tmp)

Re: PR90724 - ICE with __sync_bool_compare_and_swap with -march=armv8.2-a

2019-08-19 Thread James Greenhalgh
On Thu, Aug 15, 2019 at 02:11:25PM +0100, Prathamesh Kulkarni wrote: > On Thu, 8 Aug 2019 at 11:22, Prathamesh Kulkarni > wrote: > > > > On Thu, 1 Aug 2019 at 15:34, Prathamesh Kulkarni > > wrote: > > > > > > On Thu, 25 Jul 2019 at 11:56, Prathamesh Kulkarni > > > wrote: > > > > > > > > On Wed,

Re: [PING][AArch64] Use scvtf fbits option where appropriate

2019-08-19 Thread James Greenhalgh
On Mon, Jul 08, 2019 at 04:41:06PM +0100, Joel Hutton wrote: > On 01/07/2019 18:03, James Greenhalgh wrote: > > >> gcc/testsuite/ChangeLog: > >> > >> 2019-06-12 Joel Hutton > >> > >> * gcc.target/aarch64/fmul_scvtf_1.c: New test.

Re: [patch][aarch64]: add intrinsics for vld1(q)_x4 and vst1(q)_x4

2019-08-19 Thread James Greenhalgh
On Thu, Aug 15, 2019 at 12:28:27PM +0100, Kyrill Tkachov wrote: > Hi all, > > On 8/6/19 10:51 AM, Richard Earnshaw (lists) wrote: > On 18/07/2019 18:18, James Greenhalgh wrote: > > On Mon, Jun 10, 2019 at 06:21:05PM +0100, Sylvia Taylor wrote: > >> Greetings

Re: [PATCH][AArch64] Increase default function alignment

2019-08-12 Thread James Greenhalgh
On Fri, May 31, 2019 at 12:52:32PM +0100, Wilco Dijkstra wrote: > With -mcpu=generic the function alignment is currently 8, however almost all > supported cores prefer 16 or higher, so increase the default to 16:12. > This gives ~0.2% performance increase on SPECINT2017, while codesize is 0.12% >

Re: [PATCH][aarch64] Use neoversen1 tuning struct for -mcpu=cortex-a76

2019-08-12 Thread James Greenhalgh
On Tue, Jul 30, 2019 at 05:59:15PM +0100, Kyrill Tkachov wrote: > Hi all, > > The neoversen1 tuning struct gives better performance on the Cortex-A76, > so use that. > The only difference from the current tuning is the function and label > alignment settings. > > This gives about 1.3%

Re: [PATCH][AArch64] Fix PR81800

2019-08-12 Thread James Greenhalgh
On Tue, May 28, 2019 at 06:11:29PM +0100, Wilco Dijkstra wrote: > PR81800 is about the lrint inline giving spurious FE_INEXACT exceptions. > The previous change for PR81800 didn't fix this: when lrint is disabled > in the backend, the midend will simply use llrint. This actually makes > things

Re: [AArch64] Add a "y" constraint for V0-V7

2019-08-12 Thread James Greenhalgh
On Wed, Aug 07, 2019 at 07:19:12PM +0100, Richard Sandiford wrote: > Some indexed SVE FCMLA operations have a 3-bit register field that > requires one of Z0-Z7. This patch adds a public "y" constraint for that. > > The patch also documents "x", which is again intended to be a public >

Re: [AArch64] Make aarch64_classify_vector_mode use a switch statement

2019-08-12 Thread James Greenhalgh
On Wed, Aug 07, 2019 at 07:24:18PM +0100, Richard Sandiford wrote: > aarch64_classify_vector_mode used properties of a mode to test whether > the mode was a single Advanced SIMD vector, a single SVE vector, or a > tuple of SVE vectors. That works well for current trunk and is simpler > than

Re: [AArch64] Make the complete mnemonic

2019-08-12 Thread James Greenhalgh
On Wed, Aug 07, 2019 at 08:23:48PM +0100, Richard Sandiford wrote: > The Advanced SIMD and SVE permute patterns both split the permute > operation into a base name and a hilo suffix. That works well, but it > means that for "@" patterns, we need to pass the permute code twice, > once for the base

Re: [PATCH, GCC, AArch64] Enable Transactional Memory Extension

2019-07-22 Thread James Greenhalgh
On Wed, Jul 10, 2019 at 07:55:42PM +0100, Sudakshina Das wrote: > Hi > > This patch enables the new Transactional Memory Extension announced > recently as part of Arm's new architecture technologies. > We introduce a new optional extension "tme" to enable this. The > following instructions are

Re: [patch][aarch64]: add usra and ssra combine patterns

2019-07-22 Thread James Greenhalgh
egarding aarch64_sra_n, this patch shouldn't affect it. > > I am also not aware of any way of enabling this combine inside the pattern > used for those intrinsics, so I kept them separate. > > Cheers, > Syl > > -Original Message- > From: James Greenhalgh > Sent:

Re: [patch][aarch64]: add intrinsics for vld1(q)_x4 and vst1(q)_x4

2019-07-18 Thread James Greenhalgh
On Mon, Jun 10, 2019 at 06:21:05PM +0100, Sylvia Taylor wrote: > Greetings, > > This patch adds the intrinsic functions for: > - vld1__x4 > - vst1__x4 > - vld1q__x4 > - vst1q__x4 > > Bootstrapped and tested on aarch64-none-linux-gnu. > > Ok for trunk? If yes, I don't have any commit rights, so

Re: [PATCH][GCC][AArch64] Make processing less fragile in config.gcc

2019-07-08 Thread James Greenhalgh
On Tue, Jun 25, 2019 at 09:30:30AM +0100, Tamar Christina wrote: > Hi All, > > This is an update to the patch rebased to after the SVE2 options have been > merged. > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > > Ok for trunk? OK. Thanks, James > > Thanks, > Tamar >

Re: [patch 1/2][aarch64]: redefine aes patterns

2019-07-08 Thread James Greenhalgh
On Fri, Jul 05, 2019 at 12:24:42PM +0100, Sylvia Taylor wrote: > Greetings, > > This first patch removes aarch64 usage of the aese/aesmc and aesd/aesimc > fusions (i.e. aes fusion) implemented in the scheduler due to unpredictable > behaviour observed in cases such as: > - when register

Re: [PING][AArch64] Use scvtf fbits option where appropriate

2019-07-01 Thread James Greenhalgh
On Wed, Jun 26, 2019 at 10:35:00AM +0100, Joel Hutton wrote: > Ping, plus minor rework (mostly non-functional changes) > > gcc/ChangeLog: > > 2019-06-12 Joel Hutton > > * config/aarch64/aarch64-protos.h (aarch64_fpconst_pow2_recip): New > prototype > *

Re: [PATCH][AArch64] Remove constraint strings from define_expand constructs in the back end

2019-07-01 Thread James Greenhalgh
On Mon, Jun 24, 2019 at 04:33:40PM +0100, Dennis Zhang wrote: > Hi, > > A number of AArch64 define_expand patterns have specified constraints > for their operands. But the constraint strings are ignored at expand > time and are therefore redundant/useless. We now avoid specifying > constraints

Re: [PATCH][arm/AArch64] Assume unhandled NEON types are neon_arith_basic types when scheduling for Cortex-A5

2019-07-01 Thread James Greenhalgh
On Mon, Jul 01, 2019 at 04:13:40PM +0100, Kyrill Tkachov wrote: > Hi all, > > Some scheduling descriptions, like the Cortex-A57 one, are reused for > multiple -mcpu options. > Sometimes those other -mcpu cores support more architecture features > than the Armv8-A Cortex-A57. > For example, the

Re: [PATCH] aarch64: fix asm visibility for extern symbols

2019-06-04 Thread James Greenhalgh
On Tue, Jun 04, 2019 at 03:58:07PM +0100, Szabolcs Nagy wrote: > Commit r271869 broke visibility declarations in asm for extern symbols, > because > the new ASM_OUTPUT_EXTERNAL hook failed to call the default hook for elf. OK. In future, you can consider a patch like this to fall under the

Re: [PATCH][GCC][AArch64] Add support for hint intrinsics: __yield, __wfe, __wfi, __sev and __sevl.

2019-06-03 Thread James Greenhalgh
On Wed, May 29, 2019 at 03:48:29PM +0100, Srinath Parvathaneni wrote: > Hi All, > > This patch implements the __yield(), __wfe(), __wfi(), __sev() and > __sevl() ACLE (hint) intrinsics for AArch64 as yield, wfe, wfi, sev and > sevl (hint) instructions respectively. > > The instructions are

Re: [PATCH v2] aarch64: emit .variant_pcs for aarch64_vector_pcs symbol references

2019-06-03 Thread James Greenhalgh
On Wed, May 29, 2019 at 11:00:46AM +0100, Richard Sandiford wrote: > Szabolcs Nagy writes: > > v2: > > - use aarch64_simd_decl_p to check for aarch64_vector_pcs. > > - emit the .variant_pcs directive even for local functions. > > - don't require .variant_pcs asm support in compile only tests. > >

Re: [PATCH][AArch64] Emit TARGET_DOTPROD-specific sequence for sadv16qi

2019-06-03 Thread James Greenhalgh
On Mon, May 13, 2019 at 12:18:25PM +0100, Kyrill Tkachov wrote: > Hi Richard, > > On 5/9/19 9:06 AM, Richard Sandiford wrote: > > Kyrill Tkachov writes: > >> +;; Helper expander for aarch64_abd_3 to save the callers > >> +;; the hassle of constructing the other arm of the MINUS. > >>

Re: [PATCH] AARCH64: ILP32: Fix aarch64_asan_shadow_offset

2019-06-03 Thread James Greenhalgh
On Thu, May 23, 2019 at 04:54:30AM +0100, Andrew Pinski wrote: > aarch64_asan_shadow_offset is using the wrong > offset for ILP32. Change it to be a decent one. > > OK? Bootstrapped and tested on aarch64-linux-gnu > with no regressions, OK. Thanks, James > > Thanks, > Andrew Pinski > >

Re: [PATCH][AArch64] PR tree-optimization/90332: Implement vec_init where N is a vector mode

2019-06-03 Thread James Greenhalgh
On Fri, May 10, 2019 at 10:32:22AM +0100, Kyrill Tkachov wrote: > Hi all, > > This patch fixes the failing gcc.dg/vect/slp-reduc-sad-2.c testcase on > aarch64 > by implementing a vec_init optab that can handle two half-width vectors > producing a full-width one > by concatenating them. > > In

Re: [patch][aarch64]: add usra and ssra combine patterns

2019-06-03 Thread James Greenhalgh
On Thu, May 30, 2019 at 03:25:19PM +0100, Sylvia Taylor wrote: > Greetings, > > This patch adds support to combine: > > 1) ushr and add into usra, example: > > ushr v0.16b, v0.16b, 2 > add v0.16b, v0.16b, v2.16b > --- > usra v2.16b, v0.16b, 2 > > 2) sshr and add into ssra, example: > >
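
To make the ushr + add to usra combination concrete, here is a small lane-by-lane model in C. It assumes 16 unsigned byte lanes (the .16b arrangement in the example above); the function names are invented for the sketch and do not correspond to GCC patterns or intrinsics.

```c
#include <stddef.h>
#include <stdint.h>

#define LANES 16

/* ushr: logical shift right of every unsigned byte lane.  */
static void
ushr_u8 (uint8_t *dst, const uint8_t *src, unsigned shift)
{
  for (size_t i = 0; i < LANES; i++)
    dst[i] = (uint8_t) (src[i] >> shift);
}

/* add: lane-wise addition, wrapping modulo 256.  */
static void
add_u8 (uint8_t *dst, const uint8_t *a, const uint8_t *b)
{
  for (size_t i = 0; i < LANES; i++)
    dst[i] = (uint8_t) (a[i] + b[i]);
}

/* usra: shift right and accumulate into ACC in a single step.  */
static void
usra_u8 (uint8_t *acc, const uint8_t *src, unsigned shift)
{
  for (size_t i = 0; i < LANES; i++)
    acc[i] = (uint8_t) (acc[i] + (src[i] >> shift));
}
```

Running ushr_u8 followed by add_u8 gives the same lanes as one usra_u8 call, which is the equivalence the combine pattern relies on.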

Re: [PATCH, GCC, AARCH64] Add GNU note section with BTI and PAC.

2019-04-18 Thread James Greenhalgh
On Thu, Apr 04, 2019 at 05:01:06PM +0100, Sudakshina Das wrote: > Hi Richard > > On 03/04/2019 11:28, Richard Henderson wrote: > > On 4/3/19 5:19 PM, Sudakshina Das wrote: > >> + /* PT_NOTE header: namesz, descsz, type. > >> + namesz = 4 ("GNU\0") > >> + descsz = 16 (Size of the program

Re: Re : add tsv110 pipeline scheduling

2019-04-08 Thread James Greenhalgh
* config/aarch64/tsv110.md: New file. > > Thanks, > wuyuan > > -Original Message- > From: James Greenhalgh [mailto:james.greenha...@arm.com] > Sent: 4 April 2019

Re: Re : add tsv110 pipeline scheduling

2019-04-03 Thread James Greenhalgh
is OK for trunk. Thank you for your many clarifications. Will you need one of us to apply this to trunk on your behalf? If you would like me to apply your patch, please provide the full ChangeLog with author information, like so: 2019-04-03 James Greenhalgh Second Author

Re: [Patch, aarch64] PR 89628 - fix register allocation in SIMD functions

2019-03-22 Thread James Greenhalgh
On Fri, Mar 22, 2019 at 05:35:02PM +, James Greenhalgh wrote: > On Mon, Mar 11, 2019 at 04:10:15PM +, Steve Ellcey wrote: > > Richard, > > > > I don't necessarily disagree with anything in your comments and long > > term I think that is the right direction,

Re: [PATCH/AARCH64] Fix zero_extendsidi2_aarch64 type attribute

2019-03-22 Thread James Greenhalgh
On Sun, Mar 10, 2019 at 06:26:07PM +, Andrew Pinski wrote: > Hi, > "uxtw x0, w1" is an alias for "mov w0, w1" but currently the > back-end marks it as extend type rather than mov_reg. This patch > fixes that. For most schedule models, this does not matter; I am > adding one where mov

Re: [Patch, aarch64] PR 89628 - fix register allocation in SIMD functions

2019-03-22 Thread James Greenhalgh
On Mon, Mar 11, 2019 at 04:10:15PM +, Steve Ellcey wrote: > Richard, > > I don't necessarily disagree with anything in your comments and long > term I think that is the right direction, but I wonder if that level of > change is appropriate for GCC Stage 4 which is where we are now. Your >

Re: [PATCH, wwwdocs] Mention -march=armv8.5-a and other new command line options for AArch64 and Arm for GCC 9

2019-03-22 Thread James Greenhalgh
On Wed, Mar 20, 2019 at 10:17:41AM +, Sudakshina Das wrote: > Hi Kyrill > > On 12/03/2019 12:03, Kyrill Tkachov wrote: > > Hi Sudi, > > > > On 2/22/19 10:45 AM, Sudakshina Das wrote: > >> Hi > >> > >> This patch documents the addition of the new Armv8.5-A and corresponding > >> extensions in

Re: Re : add tsv110 pipeline scheduling

2019-03-14 Thread James Greenhalgh
On Sat, Feb 23, 2019 at 01:28:22PM +, wuyuan (E) wrote: > Hi ,James: > Sorry for not responding to your email in time because of Chinese New Year’s > holiday and urgent work. The three questions you mentioned last email are due > to my misunderstanding of pipeline. > the first question,

Re: [PATCH][GCC][AArch64] Have empty HWCAPs string ignored during native feature detection

2019-02-27 Thread James Greenhalgh
On Thu, Feb 07, 2019 at 04:43:24AM -0600, Tamar Christina wrote: > Hi All, > > Since this hasn't been reviewed yet anyway I've updated this patch to also > fix the memory leaks etc. > > -- > > This patch makes the feature detection code for AArch64 GCC not add features > automatically when the

Re: [PATCH] Improve arm and aarch64 casesi (PR target/70341)

2019-02-27 Thread James Greenhalgh
On Fri, Feb 22, 2019 at 06:20:51PM -0600, Jakub Jelinek wrote: > Hi! > > The testcase in the PR doesn't hoist any memory loads from the large switch > before the switch on aarch64 and arm (unlike e.g. x86), because the > arm/aarch64 casesi patterns don't properly annotate the memory load from the

Re: [Patch] [aarch64] PR target/89324 Handle stack pointer for SUBS/ADDS instructions

2019-02-22 Thread James Greenhalgh
On Fri, Feb 22, 2019 at 09:39:59AM -0600, Matthew Malcomson wrote: > Hi James, > > On 22/02/19 00:09, James Greenhalgh wrote: > > On Mon, Feb 18, 2019 at 08:40:12AM -0600, Matthew Malcomson wrote: > >> > >> Additionally, this patch contains two tidy-ups (

Re: [Patch] [aarch64] PR target/89324 Handle stack pointer for SUBS/ADDS instructions

2019-02-21 Thread James Greenhalgh
On Mon, Feb 18, 2019 at 08:40:12AM -0600, Matthew Malcomson wrote: > Handle stack pointer with SUBS/ADDS instructions. > > In general the stack pointer was not handled for many SUBS/ADDS patterns in > aarch64.md. > Both the "extended register" and "immediate" forms allow the stack pointer to >

Re: [PATCH, GCC, AArch64] Fix a couple of bugs in BTI

2019-02-21 Thread James Greenhalgh
On Thu, Feb 21, 2019 at 06:19:10AM -0600, Sudakshina Das wrote: > Hi > > While doing more testing I found a couple of issues with my BTI patches. > This patch fixes them: > 1) Remove a reference to return address key. The original patch was > written based on a different not yet committed patch

Re: [PATCH 1/2][GCC][AArch64] Update Armv8.4-a's FP16 FML intrinsics

2019-02-21 Thread James Greenhalgh
On Wed, Feb 20, 2019 at 08:00:13AM -0600, Tamar Christina wrote: > Hi All, > > This patch updates the Armv8.4-a FP16 FML intrinsics's suffixes from u32 to > f16 > to be more consistent with the naming convention for intrinsics. > > The specifications for these intrinsics have not been published

Re: [PATCH][AArch64] Add support for Neoverse E1

2019-02-21 Thread James Greenhalgh
On Thu, Feb 21, 2019 at 11:43:08AM -0600, Kyrill Tkachov wrote: > Hi all, > > This patch adds -mcpu and -mtune support for the Neoverse E1 CPU [1]. > The new option is -mcpu=neoverse-e1. > Bootstrapped and tested on aarch64-none-linux-gnu. OK. Thanks, James > [1] >

Re: [PATCH][AArch64] Add support for Neoverse N1

2019-02-21 Thread James Greenhalgh
On Thu, Feb 21, 2019 at 11:42:56AM -0600, Kyrill Tkachov wrote: > Hi all, > > This patch adds support for the Neoverse N1 CPU [1]. This was supported > in GCC earlier through the codename Ares, > which it now replaces. -mcpu=ares is still accepted as there's been a > binutils release supporting

Re: [PATCH][GCC][AArch64] Fix command line options canonicalization version #2. (PR target/88530)

2019-02-21 Thread James Greenhalgh
On Wed, Feb 20, 2019 at 08:00:38AM -0600, Tamar Christina wrote: > Hi All, > > Commandline options on AArch64 don't get canonicalized into the smallest > possible set before output to the assembler. This means that overlapping > feature > sets are emitted with superfluous parts. > > Normally

Re: [Aarch64][SVE] Vectorise sum-of-absolute-differences

2019-02-06 Thread James Greenhalgh
On Mon, Feb 04, 2019 at 07:34:05AM -0600, Alejandro Martinez Vicente wrote: > Hi, > > This patch adds support to vectorize sum of absolute differences (SAD_EXPR) > using SVE. It also uses the new functionality to ensure that the resulting > loop > is masked. Therefore, it depends on > >

Re: [PATCH][AArch64] Change representation of SABD in RTL

2019-02-06 Thread James Greenhalgh
On Mon, Feb 04, 2019 at 04:23:32AM -0600, Kyrill Tkachov wrote: > Hi all, > > Richard raised a concern about the RTL we use to represent the AdvSIMD SABD > (vector signed absolute difference) instruction. > We currently represent it as ABS (MINUS op1 op2). > > This isn't exactly what SABD does.
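
One way to see the mismatch is with 8-bit lanes. The sketch below assumes that SABD computes the signed difference at full precision and then truncates the absolute value to the element size, which is my reading of the architecture description rather than something stated in the message; both function names are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Model of ABS (MINUS a b) evaluated in an 8-bit mode: the subtraction
   wraps to 8 bits first, then the absolute value of the wrapped result
   is taken.  */
static uint8_t
abs_of_wrapped_minus (int8_t a, int8_t b)
{
  uint8_t diff = (uint8_t) ((uint8_t) a - (uint8_t) b); /* modulo 256 */
  int wrapped = diff >= 128 ? (int) diff - 256 : (int) diff;
  return (uint8_t) abs (wrapped);
}

/* Model of a signed absolute difference computed at full precision and
   only then truncated to 8 bits (the assumed SABD behaviour).  */
static uint8_t
sabd_model (int8_t a, int8_t b)
{
  int diff = (int) a - (int) b;
  return (uint8_t) abs (diff);
}
```

For most inputs the two agree, but for a = 127 and b = -128 the first model yields 1 while the second yields 255, so the two forms are not interchangeable.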

Re: [PATCH][AArch64] Use neon_dot_q type for 128-bit [US]DOT instructions where appropriate

2019-02-06 Thread James Greenhalgh
On Tue, Feb 05, 2019 at 11:52:10AM -0600, Kyrill Tkachov wrote: > Hi all, > > For the Dot Product instructions we have the scheduling types neon_dot and > neon_dot_q for the 128-bit versions. > It seems that we're only using the former though, not assigning the > neon_dot_q type anywhere. > >

Re: [PATCH][AArch64] Use implementation namespace consistently in arm_neon.h

2019-02-06 Thread James Greenhalgh
On Wed, Feb 06, 2019 at 07:52:42AM -0600, Kyrill Tkachov wrote: > [resending with patch compressed] > > Hi all, > > We're somewhat inconsistent in arm_neon.h when it comes to using the > implementation namespace for local > identifiers. This means things like: > #define hash_abcd 0 > #define

Re: [PATCH][wwwdocs][Arm][AArch64] Update changes with new features and flags.

2019-01-30 Thread James Greenhalgh
On Wed, Jan 23, 2019 at 04:43:02AM -0600, Tamar Christina wrote: > Hi All, > > This patch adds the documentation for Stack clash protection and Armv8.3-a > support to > changes.html for GCC 9. > I have validated the html using the W3C validator. > > Ok for cvs? Almost OK by me. > > Thanks, >

Re: add tsv110 pipeline scheduling

2019-01-17 Thread James Greenhalgh
; "cortex_a57_neon_load_d" 11 > (and (eq_attr "tune" "cortexa57") >(eq_attr "cortex_a57_neon_type" "neon_load_d")) > "ca57_cx1_issue+ca57_cx2_issue, >ca57_ls_issue+ca57_ls_issue,ca57_ldr*2") This model says you wi

Re: [PATCH] Fix arm_neon.h #pragma GCC target syntax (PR target/88734)

2019-01-17 Thread James Greenhalgh
On Thu, Jan 17, 2019 at 07:47:32AM -0600, Jakub Jelinek wrote: > Hi! > > arm_neon.h on both targets contained a couple of spots with invalid > #pragma GCC target syntax. This doesn't result in errors, just warnings and > those warnings are suppressed in system headers, so are visible with >

Re: [PATCH] PR target/85596 Add --with-multilib-list doc for aarch64

2019-01-17 Thread James Greenhalgh
On Mon, Jan 07, 2019 at 09:07:35AM -0600, Christophe Lyon wrote: > Hi, > > This small patch adds a short description of --with-multilib-list for aarch64. > OK? OK. Thanks, James > > Thanks, > > Christophe > 2019-01-07 Christophe Lyon > > PR target/85596 > * doc/install.texi

Re: [PATCH][AArch64] Initial -mcpu=ares tuning

2019-01-16 Thread James Greenhalgh
On Tue, Jan 15, 2019 at 09:29:46AM -0600, Kyrill Tkachov wrote: > Hi all, > > This patch adds a tuning struct for the Arm Ares CPU and uses it for > -m{cpu,tune}=ares. > The tunings are an initial attempt and may be improved upon in the future, > but they serve > as a decent starting point for

Re: [PATCH][GCC][AArch64] Rename stack-clash CFA register to avoid clash.

2019-01-16 Thread James Greenhalgh
On Wed, Jan 16, 2019 at 11:03:41AM -0600, Tamar Christina wrote: > Hi All, > > We had multiple patches in flight that required use of scratch registers in > frame layout code. As it happens two of these features picked the same > register > and landed at around the same time. As such there is

Re: [PATCH][GCC][AArch64] Fix big-endian neon-intrinsics ICEs

2019-01-16 Thread James Greenhalgh
On Mon, Jan 14, 2019 at 08:01:47AM -0600, Tamar Christina wrote: > Hi All, > > > This patch fixes some ICEs when the fcmla_lane intrinsics are used on > big endian by correcting the lane indices and removing the hardcoded byte > offset from subreg calls and instead use subreg_lowpart_offset.

Re: [RFC][AArch64] Add support for system register based stack protector canary access

2019-01-10 Thread James Greenhalgh
On Mon, Dec 03, 2018 at 03:55:36AM -0600, Ramana Radhakrishnan wrote: > For quite sometime the kernel guys, (more specifically Ard) have been > talking about using a system register (sp_el0) and an offset from that > for a canary based access. This patchset adds support for a new set of >

Re: [PATCH][AArch64] Use Q-reg loads/stores in movmem expansion

2019-01-09 Thread James Greenhalgh
On Fri, Dec 21, 2018 at 06:30:49AM -0600, Kyrill Tkachov wrote: > Hi all, > > Our movmem expansion currently emits TImode loads and stores when copying > 128-bit chunks. > This generates X-register LDP/STP sequences as these are the most preferred > registers for that mode. > > For the purpose

Re: [PATCH 6/9][GCC][AArch64] Add Armv8.3-a complex intrinsics

2019-01-09 Thread James Greenhalgh
On Fri, Dec 21, 2018 at 11:57:55AM -0600, Tamar Christina wrote: > Hi All, > > This updated patch adds NEON intrinsics and tests for the Armv8.3-a complex > multiplication and add instructions with a rotate along the Argand plane. > > The instructions are documented in the ArmARM[1] and the

Re: [PATCH 3/3][GCC][AARCH64] Add support for pointer authentication B key

2019-01-07 Thread James Greenhalgh
On Fri, Dec 21, 2018 at 09:00:10AM -0600, Sam Tebbs wrote: > On 11/9/18 11:04 AM, Sam Tebbs wrote: > Attached is an improved patch with "hint" removed from the test scans, > pauth_hint_num_a and pauth_hint_num_b merged into pauth_hint_num and the > "gcc_assert

Re: [PATCH 2/3][GCC][AARCH64] Add new -mbranch-protection option to combine pointer signing and BTI

2019-01-07 Thread James Greenhalgh
On Thu, Dec 20, 2018 at 10:38:42AM -0600, Sam Tebbs wrote: > On 11/22/18 4:54 PM, Sam Tebbs wrote: > > Hi all, > > Attached is an updated patch with branch_protec_type renamed to > branch_protect_type, some unneeded ATTRIBUTE_USED removed and an added > use of ARRAY_SIZE. > > Below is the

Re: [PATCH, GCC, AARCH64, 5/6] Enable BTI : Add new pass for BTI.

2018-12-19 Thread James Greenhalgh
On Fri, Dec 14, 2018 at 10:09:03AM -0600, Sudakshina Das wrote: > I have updated the patch according to our discussions offline. > The md pattern is now split into 4 patterns and i have added a new > test for the setjmp case along with some comments where missing. This is OK for trunk.

Re: [ping] allow target (OS) SUBTARGET_OVERRIDE_OPTIONS on aarch64

2018-12-12 Thread James Greenhalgh
On Wed, Dec 12, 2018 at 09:42:05AM -0600, Olivier Hainque wrote: > Ping for one of the changes last proposed here: > > https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00761.html > > Submitted separately as an attempt to facilitate the review > process. > > This one proposes the possibility for

Re: [PATCH, GCC, AARCH64, 3/6] Restrict indirect tail calls to x16 and x17

2018-12-07 Thread James Greenhalgh
On Thu, Nov 29, 2018 at 10:56:46AM -0600, Sudakshina Das wrote: > Hi > > On 02/11/18 18:37, Sudakshina Das wrote: > > Hi > > > > This patch is part of a series that enables ARMv8.5-A in GCC and > > adds Branch Target Identification Mechanism. > >

Re: [PATCH 5/9][GCC][AArch64/Arm] Add auto-vectorization tests.

2018-11-28 Thread James Greenhalgh
On Sun, Nov 11, 2018 at 04:27:33AM -0600, Tamar Christina wrote: > Hi All, > > This patch adds tests for AArch64 and Arm to test the autovectorization > of complex numbers using the Armv8.3-a instructions. > > This patch enables them only for AArch64 at this point. > > Bootstrapped Regtested on

Re: [PATCH 4/9][GCC][AArch64/Arm] Add new testsuite directives to check complex instructions.

2018-11-28 Thread James Greenhalgh
On Sun, Nov 11, 2018 at 04:27:04AM -0600, Tamar Christina wrote: > Hi All, > > This patch adds new testsuite directive for both Arm and AArch64 to support > testing of the Complex Arithmetic operations from Armv8.3-a. > > Bootstrap and Regtest on aarch64-none-linux-gnu, arm-none-gnueabihf and >

Re: [PATCH 3/9][GCC][AArch64] Add autovectorization support for Complex instructions

2018-11-28 Thread James Greenhalgh
On Mon, Nov 12, 2018 at 06:31:45AM -0600, Tamar Christina wrote: > Hi Kyrill, > > > Hi Tamar, > > > > On 11/11/18 10:26, Tamar Christina wrote: > > > Hi All, > > > > > > This patch adds the expander support for supporting autovectorization of > > > complex number operations > > > such as

Re: [PATCH][GCC][AARCH64] Replace calls to strtok with strtok_r in aarch64 attribute handling code

2018-11-28 Thread James Greenhalgh
On Fri, Nov 23, 2018 at 08:22:49AM -0600, Sam Tebbs wrote: > Hi all, > > The AArch64 general attribute handling code uses the strtok function to > separate comma-delimited attributes in a string. This causes problems for and > interferes with attribute-specific handling code that also uses strtok
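
The problem with nesting strtok calls is that strtok keeps a single hidden cursor, so an inner tokenisation loop overwrites the state of the outer one. A minimal sketch of the strtok_r alternative is shown below; the function name and the ',' and '+' separators are chosen for illustration and are not taken from the GCC sources.

```c
/* Request POSIX declarations (strtok_r) even in strict ISO C mode.  */
#define _POSIX_C_SOURCE 200809L
#include <string.h>

/* Count the '+'-separated items inside every ','-separated field of STR.
   STR is modified in place.  Each loop owns its own save pointer, so the
   inner loop cannot disturb the outer loop's position.  */
static int
count_nested_tokens (char *str)
{
  int count = 0;
  char *outer_save = NULL;

  for (char *field = strtok_r (str, ",", &outer_save); field != NULL;
       field = strtok_r (NULL, ",", &outer_save))
    {
      char *inner_save = NULL;

      for (char *item = strtok_r (field, "+", &inner_save); item != NULL;
           item = strtok_r (NULL, "+", &inner_save))
        count++;
    }
  return count;
}
```

Writing the same nesting with plain strtok would end the outer loop after the first field, because the inner calls reset the shared cursor.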

Re: [PATCH][AArch64][2/3] Correct type attribute for mul and mneg instructions

2018-11-28 Thread James Greenhalgh
On Mon, Nov 26, 2018 at 11:36:43AM -0600, Kyrill Tkachov wrote: > Hi all, > > In the AArch64 ISA the MUL and MNEG instructions are actually aliases of > MADD and MSUB. > Therefore they should have the type attribute mla, rather than mul, which > should only be used > for AArch32 32-bit

Re: [PATCH][AArch64][3/3] Introduce mla64 type

2018-11-28 Thread James Greenhalgh
On Mon, Nov 26, 2018 at 11:36:47AM -0600, Kyrill Tkachov wrote: > Hi all, > > On some cores the X-register MADD/MSUB (and hence MUL and MNEG) instructions > may behave differently > than the W-register forms and the scheduling models may want to reflect that. > That is currently not possible

Re: [PATCH v3] [aarch64] Correct the maximum shift amount for shifted operands.

2018-11-28 Thread James Greenhalgh
On Wed, Nov 28, 2018 at 07:08:02AM -0600, Philipp Tomsich wrote: > > > On 28.11.2018, at 13:10, Richard Earnshaw (lists) > mailto:richard.earns...@arm.com>> wrote: > > On 26/11/2018 19:50, Christoph Muellner wrote: > The aarch64 ISA specification allows a left shift amount to be applied >

Re: Patch ping (was Re: [PATCH] Fix aarch64_compare_and_swap* constraints (PR target/87839))

2018-11-21 Thread James Greenhalgh
On Tue, Nov 20, 2018 at 11:04:46AM -0600, Jakub Jelinek wrote: > Hi! > > On Tue, Nov 13, 2018 at 10:28:16AM +0100, Jakub Jelinek wrote: > > 2018-11-13 Jakub Jelinek > > > > PR target/87839 > > * config/aarch64/atomics.md (@aarch64_compare_and_swap): Use > > rIJ constraint for

Re: [PATCH][AArch64] PR79262: Adjust vector cost

2018-11-09 Thread James Greenhalgh
On Mon, Jan 22, 2018 at 09:22:27AM -0600, Richard Biener wrote: > On Mon, Jan 22, 2018 at 4:01 PM, Wilco Dijkstra > wrote: > > PR79262 has been fixed for almost all AArch64 cpus, however the example is > > still > > vectorized in a few cases, resulting in lower performance. Increase the > >

Re: [PATCH][AArch64] PR79262: Adjust vector cost

2018-11-09 Thread James Greenhalgh
On Fri, Nov 09, 2018 at 08:14:27AM -0600, Wilco Dijkstra wrote: > PR79262 has been fixed for almost all AArch64 cpus, however the example is > still > vectorized in a few cases, resulting in lower performance.  Increase the cost > of > vector-to-scalar moves so it is more similar to the other

Re: [PATCH] Remove extra memory allocation of strings.

2018-11-08 Thread James Greenhalgh
On Tue, Oct 23, 2018 at 08:17:43AM -0500, Martin Liška wrote: > Hello. > > As a follow up patch I would like to remove redundant string allocation > on string which is not needed in my opinion. > > That bootstrap on aarch64-linux. OK, Thanks, James > From

Re: [PATCH, GCC, AARCH64, 6/6] Enable BTI: Add configure option for BTI and PAC-RET

2018-11-07 Thread James Greenhalgh
On Fri, Nov 02, 2018 at 01:38:46PM -0500, Sudakshina Das wrote: > Hi > > This patch is part of a series that enables ARMv8.5-A in GCC and > adds Branch Target Identification Mechanism. > (https://developer.arm.com/products/architecture/cpu-architecture/a-profile/exploration-tools) > > This patch

Re: [PATCH, GCC, AARCH64, 4/6] Enable BTI: Add new to -mbranch-protection.

2018-11-07 Thread James Greenhalgh
On Fri, Nov 02, 2018 at 01:38:25PM -0500, Sudakshina Das wrote: > Hi > > This patch is part of a series that enables ARMv8.5-A in GCC and > adds Branch Target Identification Mechanism. > (https://developer.arm.com/products/architecture/cpu-architecture/a-profile/exploration-tools) > > NOTE: This

Re: [PATCH, GCC, AARCH64, 2/6] Add new arch command line feaures from ARMv8.5-A

2018-11-07 Thread James Greenhalgh
On Fri, Nov 02, 2018 at 01:37:41PM -0500, Sudakshina Das wrote: > Hi > > This patch is part of a series that enables ARMv8.5-A in GCC and > adds Branch Target Identification Mechanism. > (https://developer.arm.com/products/architecture/cpu-architecture/a-profile/exploration-tools) > > This patch

Re: [PATCH, GCC, AARCH64, 1/6] Enable ARMv8.5-A in gcc

2018-11-07 Thread James Greenhalgh
On Fri, Nov 02, 2018 at 01:37:33PM -0500, Sudakshina Das wrote: > Hi > > This patch is part of a series that enables ARMv8.5-A in GCC and > adds Branch Target Identification Mechanism. > (https://developer.arm.com/products/architecture/cpu-architecture/a-profile/exploration-tools) > > This patch

Re: [PATCH, AArch64 v2 05/11] aarch64: Emit LSE st instructions

2018-10-31 Thread James Greenhalgh
On Wed, Oct 31, 2018 at 04:55:26PM -0500, Richard Henderson wrote: > On 10/31/18 5:51 PM, Will Deacon wrote: > > Aha, maybe this is the problem. An acquire fence on AArch64 is implemented > > using a DMB LD instruction, which orders prior reads against subsequent > > reads and writes. However, the

Re: [PATCH, AArch64 v2 06/11] Add visibility to libfunc constructors

2018-10-30 Thread James Greenhalgh
This one needs some other reviewers copied in, who may have missed that it is not an AArch64-only patch (it looks fine to me). James On Tue, Oct 02, 2018 at 11:19:10AM -0500, Richard Henderson wrote: > * optabs-libfuncs.c (build_libfunc_function_visibility): > New, split out from...

Re: [PATCH, AArch64 v2 09/11] aarch64: Force TImode values into even registers

2018-10-30 Thread James Greenhalgh
On Tue, Oct 02, 2018 at 11:19:13AM -0500, Richard Henderson wrote: > The LSE CASP instruction requires values to be placed in even > register pairs. A solution involving two additional register > classes was rejected in favor of the much simpler solution of > simply requiring all TImode values to

Re: [PATCH, AArch64 v2 05/11] aarch64: Emit LSE st instructions

2018-10-30 Thread James Greenhalgh
On Tue, Oct 02, 2018 at 11:19:09AM -0500, Richard Henderson wrote: > When the result of an operation is not used, we can ignore the > result by storing to XZR. For two of the memory models, using > XZR with LD has a preferred assembler alias, ST. ST has different semantics to LD, in particular,

Re: [PATCH, AArch64 v2 04/11] aarch64: Improve atomic-op lse generation

2018-10-30 Thread James Greenhalgh
On Tue, Oct 02, 2018 at 11:19:08AM -0500, Richard Henderson wrote: > Fix constraints; avoid unnecessary split. Drop the use of the atomic_op > iterator in favor of the ATOMIC_LDOP iterator; this is simpler and more > logical for ldclr aka bic. OK. Thanks, James > > *

Re: [PATCH, AArch64 v2 03/11] aarch64: Improve swp generation

2018-10-30 Thread James Greenhalgh
On Tue, Oct 02, 2018 at 11:19:07AM -0500, Richard Henderson wrote: > Allow zero as an input; fix constraints; avoid unnecessary split. OK. James > > * config/aarch64/aarch64.c (aarch64_emit_atomic_swap): Remove. > (aarch64_gen_atomic_ldop): Don't call it. > *

Re: [PATCH, AArch64 v2 02/11] aarch64: Improve cas generation

2018-10-30 Thread James Greenhalgh
On Tue, Oct 02, 2018 at 11:19:06AM -0500, Richard Henderson wrote: > Do not zero-extend the input to the cas for subword operations; > instead, use the appropriate zero-extending compare insns. > Correct the predicates and constraints for immediate expected operand. OK, modulo two very dull style

Re: [PATCH, AArch64 v2 01/11] aarch64: Simplify LSE cas generation

2018-10-30 Thread James Greenhalgh
On Tue, Oct 02, 2018 at 11:19:05AM -0500, Richard Henderson wrote: > The cas insn is a single insn, and if expanded properly need not > be split after reload. Use the proper inputs for the insn. OK. Thanks, James > > * config/aarch64/aarch64.c (aarch64_expand_compare_and_swap): >

Re: [PATCH] Provide extension hint for aarch64 target (PR driver/83193).

2018-10-30 Thread James Greenhalgh
On Thu, Oct 25, 2018 at 05:53:22AM -0500, Martin Liška wrote: > On 10/24/18 7:48 PM, Martin Sebor wrote: > > On 10/24/2018 03:52 AM, Martin Liška wrote: > >> On 10/23/18 6:31 PM, Martin Sebor wrote: > >>> On 10/22/2018 07:05 AM, Martin Liška wrote: > >>

Re: [AArch64] Add Saphira pipeline description.

2018-10-30 Thread James Greenhalgh
On Tue, Oct 30, 2018 at 05:12:58AM -0500, Sameera Deshpande wrote: > On Fri, 26 Oct 2018 at 13:33, Sameera Deshpande > wrote: > > > > Hi! > > > > Please find attached the patch to add a pipeline description for the > > Qualcomm Saphira core. It is tested with a bootstrap and make check, > > with

Re: [PATCH] Provide extension hint for aarch64 target (PR driver/83193).

2018-10-16 Thread James Greenhalgh
On Mon, Oct 08, 2018 at 05:34:52AM -0500, Martin Liška wrote: > Hi. > > I'm attaching updated version of the patch. Can't say I'm thrilled by the allocation/free (aarch64_parse_extension allocates, everyone else has to free) responsibilities here. If you can clean that up I'd be much happier.

Re: [PATCH v4] [aarch64] Add HiSilicon tsv110 CPU support

2018-09-20 Thread James Greenhalgh
On Wed, Sep 19, 2018 at 04:53:52AM -0500, Shaokun Zhang wrote: > This patch adds HiSilicon's an mcpu: tsv110, which supports v8_4A. > It has been tested on aarch64 and no regressions from this patch. This patch is OK for Trunk. Do you need someone to commit it on your behalf? Thanks, James >

Re: [PATCH][AAarch64][v3] Add support for TARGET_COMPUTE_FRAME_LAYOUT

2018-09-12 Thread James Greenhalgh
On Wed, Sep 12, 2018 at 08:07:41AM -0500, Vlad Lazar wrote: > On 11/09/18 17:53, James Greenhalgh wrote: > > On Mon, Aug 06, 2018 at 11:14:17AM -0500, Vlad Lazar wrote: > >> Hi, > >> > >> The patch adds support for the TARGET_COMPUTE_FRAME_LAYOUT hook on AAr

Re: [PATCH][AAarch64][v3] Add support for TARGET_COMPUTE_FRAME_LAYOUT

2018-09-11 Thread James Greenhalgh
On Mon, Aug 06, 2018 at 11:14:17AM -0500, Vlad Lazar wrote: > Hi, > > The patch adds support for the TARGET_COMPUTE_FRAME_LAYOUT hook on AArch64 > and removes unneeded frame layout recalculation. > > The removed aarch64_layout_frame calls are unnecessary because the functions > in which > they

Re: [PATCH][GCC][AArch64] Updated stack-clash implementation supporting 64k probes. [patch (1/7)]

2018-09-11 Thread James Greenhalgh
On Fri, Sep 07, 2018 at 11:03:28AM -0500, Tamar Christina wrote: > Hi Richard, > > The 08/28/2018 21:58, Richard Sandiford wrote: > > Tamar Christina writes: > > > + HOST_WIDE_INT guard_used_by_caller = STACK_CLASH_CALLER_GUARD; > > > + /* When doing the final adjustment for the outgoing
