Re: [PATCH] Fix PR68067

2015-11-06 Thread Alan Lawrence
On 06/11/15 10:39, Richard Biener wrote: ../spec2000/benchspec/CINT2000/254.gap/src/polynom.c:358:11: error: location references block not in block tree l1_279 = PHI <1(28), l1_299(33)> ^^^ this is the error to look at! It means that the GC heap will be corrupted quite easily. Thanks, I'll

Re: [PATCH 1/2][ARM] PR/65956 AAPCS update for alignment attribute

2015-11-06 Thread Alan Lawrence
On 04/11/15 13:13, Jakub Jelinek wrote: On Mon, Jul 06, 2015 at 05:38:35PM +0100, Alan Lawrence wrote: Trying to push these now (svn!), patch 2 is going first. I realize my second iteration of patch 1/2, dropped the testcases from the first version. Okay to include those as per https

Re: [PATCH 5/6]tree-sra.c: Fix completely_scalarize for negative array indices

2015-11-05 Thread Alan Lawrence
On 30/10/15 10:54, Eric Botcazou wrote: > On 30/10/15 10:44, Richard Biener wrote: >> >> I think you want to use wide-ints here and >> >> wide_int idx = wi::from (minidx, TYPE_PRECISION (TYPE_DOMAIN >> (...)), TYPE_SIGN (TYPE_DOMAIN (..))); >> wide_int maxidx = ... >> >> you can then

Re: [PATCH 6/6] Make SRA replace constant-pool loads

2015-11-05 Thread Alan Lawrence
On 3 November 2015 at 14:01, Richard Biener wrote: > > Hum. I still wonder why we need all this complication ... Well, certainly I'd love to make it simpler, and if the complication is because I've gone about trying to deal with especially Ada in the wrong way... >

Re: [PATCH] PR/67682, break SLP groups up if only some elements match

2015-11-05 Thread Alan Lawrence
On 03/11/15 13:39, Richard Biener wrote: > On Tue, Oct 27, 2015 at 6:38 PM, Alan Lawrence <alan.lawre...@arm.com> wrote: >> >> Say I...P are consecutive, the input would have gaps 0 1 1 1 1 1 1 1. If we >> split the load group, we would want subgroups with gaps 0 1 1

Re: [PATCH] tree-scalar-evolution.c: Handle LSHIFT by constant

2015-11-05 Thread Alan Lawrence
On 3 November 2015 at 11:35, Richard Biener wrote: > > I think this should simply re-write A << B to (type) (unsigned-type) A > * (1U << B). > > Does that then still vectorize the signed case? I didn't realize our representation of chrec's could express that. Yes, it

Re: [PATCH 3/6] Share code from fold_array_ctor_reference with fold.

2015-11-04 Thread Alan Lawrence
> s/explicitely/explicitly/ And remove the '*' from the 2nd and 3rd lines > of the comment. > > It looks like get_ctor_element_at_index has numerous formatting > problems. In particular you didn't indent the braces across the board > properly. Also check for tabs vs spaces issues please. Yes,

Re: [PATCH 1/6]tree-ssa-dom.c: Normalize exprs, starting with ARRAY_REF to MEM_REF

2015-11-03 Thread Alan Lawrence
On 30/10/15 05:35, Jeff Law wrote: > On 10/29/2015 01:18 PM, Alan Lawrence wrote: >> This patch just teaches DOM that ARRAY_REFs can be equivalent to MEM_REFs >> (with >> pointer type to the array element type). >> >> gcc/ChangeLog: >> >> * t

Re: [PATCH 1/6]tree-ssa-dom.c: Normalize exprs, starting with ARRAY_REF to MEM_REF

2015-11-03 Thread Alan Lawrence
On 3 November 2015 at 10:27, Alan Lawrence <alan.lawre...@arm.com> wrote: > That is, ssa-dom-cse-7.c passes (and the patch series solves PR/63679) if > instead of my patch 2 (normalization of MEM_REFs) we have this: > > diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c > index 43

[PATCH/RFTesting][MIPS] Migrate reduction optabs in mips-ps-3d.md

2015-11-03 Thread Alan Lawrence
There are still a few uses of the old reduc_[us](plus|min|max)_ optabs remaining. This migrates the instances in mips-ps-3d.md. This seemed straightforward, as mips-ps-3d.md also provides a vec_extractv2sf. I tried to be conservative and handle all the possible cases for endianness, this may be

[PATCH][i386]Migrate reduction optabs to reduc__scal

2015-11-03 Thread Alan Lawrence
This migrates the various reduction optabs in sse.md to use the reduce-to-scalar form. I took the straightforward approach (equivalent to the migration code in expr.c/optabs.c) of generating a vector temporary, using the existing code to reduce to that, and extracting lane 0, in each pattern.
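
For readers unfamiliar with the two optab forms, here is a rough model in plain C of the approach described above: reduce into a vector temporary, then extract lane 0. This is only a sketch. The type `v4si_model` and both function names are made up for illustration and do not appear in sse.md or anywhere else in GCC.

```c
#include <assert.h>
#include <stddef.h>

#define NLANES 4

/* Toy stand-in for a 4-lane integer vector; not a real GCC or SSE type.  */
typedef struct { int lane[NLANES]; } v4si_model;

/* Models an old-style reduce-to-vector pattern: the sum ends up in
   lane 0 and the other lanes carry no meaningful value (zeroed here).  */
static v4si_model
reduc_plus_vec_model (const v4si_model *v)
{
  assert (v != NULL);
  v4si_model out = { { 0 } };
  for (int i = 0; i < NLANES; i++)
    out.lane[0] += v->lane[i];
  return out;
}

/* Models the reduce-to-scalar form: reduce into a vector temporary
   using the existing code, then extract lane 0.  */
static int
reduc_plus_scal_model (const v4si_model *v)
{
  v4si_model tmp = reduc_plus_vec_model (v);
  return tmp.lane[0];
}
```

Calling `reduc_plus_scal_model` on a vector holding 1, 2, 3, 4 gives 10; the caller never sees the temporary.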

Re: [PATCH] tree-scalar-evolution.c: Handle LSHIFT by constant

2015-11-03 Thread Alan Lawrence
On 27/10/15 22:27, H.J. Lu wrote: > > It caused: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68112 Bah :(. So yes, in general case, we can't rewrite (a << 1) to (a * 2) as for signed types (0x7f...f) << 1 == -2 whereas (0x7f...f * 2) is undefined behaviour. Oh well :(... I don't have a
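
To make the signed-overflow point concrete, below is a minimal sketch in C of doing the multiplication in the unsigned type, where wraparound is well defined. The helper name `shl_via_unsigned` is hypothetical and is not GCC code. Converting the result back to `int` is implementation-defined in ISO C; GCC documents it as reduction modulo 2^N, which the sketch relies on.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper, not part of GCC: computes a << b for a signed
   int by multiplying in the unsigned type, where overflow wraps rather
   than being undefined.  The conversion back to int is
   implementation-defined; GCC defines it as modulo 2^N.  */
static int
shl_via_unsigned (int a, unsigned b)
{
  assert (b < sizeof (int) * CHAR_BIT);
  return (int) ((unsigned) a * (1u << b));
}
```

Under those assumptions `shl_via_unsigned (INT_MAX, 1)` yields -2, whereas the plain signed expression `INT_MAX * 2` overflows and is undefined.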

Re: [PATCH 3/4] bb-reorder: Add -freorder-blocks-algorithm= and wire it up

2015-11-02 Thread Alan Lawrence
On 02/11/15 14:38, Alan Lawrence wrote: > I'm a bit puzzled as to why nobody else has been seeing this, as it's been happening to me as part of building gcc on x86_64, but since this patch I've been seeing an ICE in vec::operator[] in reorder_basic_blocks_simple, building libitm/beginend

Re: [PATCH][AArch64] Fix ICE on (const_double:HF 0.0)

2015-11-02 Thread Alan Lawrence
On 26/10/15 16:26, Alan Lawrence wrote: The included testcase demonstrates the ICE: aarch64_valid_floating_const (via aarch64_float_const_representable_p) disables HFmode immediates, but allows 0.0. However, *movhf_aarch64 does not allow this insn: (insn 7 6 10 2 (set (mem:HF (reg/f:DI 73) [0

[PATCH 1/6]tree-ssa-dom.c: Normalize exprs, starting with ARRAY_REF to MEM_REF

2015-10-29 Thread Alan Lawrence
This patch just teaches DOM that ARRAY_REFs can be equivalent to MEM_REFs (with pointer type to the array element type). gcc/ChangeLog: * tree-ssa-dom.c (dom_normalize_single_rhs): New. (dom_normalize_gimple_stmt): New. (lookup_avail_expr): Call dom_normalize_gimple_stmt.

[PATCH 0/6 v2] PR/63679 Make SRA scalarize constant-pool loads

2015-10-29 Thread Alan Lawrence
This is a revision of previous series at https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01485.html , and follows on from the first two patches of that series, which have been pushed already. A few things have happened since. The previous patch 3, making SRA generate ARRAY_REFS, is removed. As

[PATCH 4/6][Trivial] tree-sra.c: A few comment fixes/additions.

2015-10-29 Thread Alan Lawrence
gcc/ChangeLog: * tree-sra.c (scalarizable_type_p): Comment variable-length arrays. (completely_scalarize): Comment zero-length arrays. (get_access_replacement): Correct comment re. precondition. --- gcc/tree-sra.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)

[PATCH 3/6] Share code from fold_array_ctor_reference with fold.

2015-10-29 Thread Alan Lawrence
This is in response to https://gcc.gnu.org/ml/gcc/2015-10/msg00097.html, where Richi points out that CONSTRUCTOR elements are not necessarily ordered. I wasn't sure of a good naming convention for the new get_ctor_element_at_index, other suggestions welcome. gcc/ChangeLog: *

[PATCH 6/6] Make SRA replace constant-pool loads

2015-10-29 Thread Alan Lawrence
This has changed quite a bit since the previous revision (https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01484.html), mostly due to Ada and specifically Ada on ARM. I didn't find a good alternative to scanning for constant-pool accesses "as we go" through the function, and although I didn't find

[PATCH 5/6]tree-sra.c: Fix completely_scalarize for negative array indices

2015-10-29 Thread Alan Lawrence
The code I added to completely_scalarize for arrays isn't right in some cases of negative array indices (e.g. arrays with indices from -1 to 1 in the Ada testsuite). On ARM, this prevents a failure bootstrapping Ada with the next patch, as well as a few ACATS tests (e.g. c64106a). Some discussion

[PATCH 2/6] tree-ssa-dom.c: Normalize data types in MEM_REFs.

2015-10-29 Thread Alan Lawrence
This makes dom2 identify e.g. MEM[(int[8] *)...] with MEM[(int *)...]. These are not generally equivalent as they have different aliasing behaviour but they have the same value as far as dom is concerned and so this helps find more equivalences. There is some question over the best policy here,

Re: [PATCH] tree-scalar-evolution.c: Handle LSHIFT by constant

2015-10-27 Thread Alan Lawrence
--in-reply-to <cafiyyc3tepgber2jqc8-x_ij4ghtjjoxfzffcnyzhxhgqbe...@mail.gmail.com> On 26/10/15 08:58, Richard Biener wrote: > > On Fri, Oct 23, 2015 at 5:15 PM, Alan Lawrence <alan.lawre...@arm.com> wrote: >> + chrec2 = fold_build2 (LSHI

Re: [PATCH] PR/67682, break SLP groups up if only some elements match

2015-10-27 Thread Alan Lawrence
On 26/10/15 15:04, Richard Biener wrote: apart from the fact that you'll post a new version you need to adjust GROUP_GAP. You also seem to somewhat "confuse" "first I stmts" and "a group of size I", those are not the same when the group has gaps. I'd say "a group of size i" makes the most

[PATCH][AArch64] Fix ICE on (const_double:HF 0.0)

2015-10-26 Thread Alan Lawrence
The included testcase demonstrates the ICE: aarch64_valid_floating_const (via aarch64_float_const_representable_p) disables HFmode immediates, but allows 0.0. However, *movhf_aarch64 does not allow this insn: (insn 7 6 10 2 (set (mem:HF (reg/f:DI 73) [0 *f_2(D)+0 S2 A16]) (const_double:HF

Re: [PATCH] PR/67682, break SLP groups up if only some elements match

2015-10-25 Thread Alan Lawrence
On 23 October 2015 at 16:20, Alan Lawrence <alan.lawre...@arm.com> wrote: > diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-7.c > b/gcc/testsuite/gcc.dg/vect/bb-slp-7.c > index ab54a48..b012d78 100644 > --- a/gcc/testsuite/gcc.dg/vect/bb-slp-7.c > +++ b/gcc/testsuite/g

[PATCH] PR/67682, break SLP groups up if only some elements match

2015-10-23 Thread Alan Lawrence
vect_analyze_slp_instance currently only creates an slp_instance if _all_ stores in a group fitted the same pattern. This patch splits non-matching groups up on vector boundaries, allowing only part of the group to be SLP'd, or multiple subgroups to be SLP'd differently. The algorithm could be
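
As a rough illustration of splitting on vector boundaries, here is a toy model in C. It is not the code from the patch: the function name, the `matches` array and the rule that a chunk is usable only when every element in it matches are all assumptions made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model, not GCC code: matches[i] says whether store i of the
   group fits the pattern.  The group is cut at multiples of vec_size,
   and a chunk counts as usable only if it is complete and every
   element in it matches.  Returns the number of usable chunks.  */
static size_t
count_slp_chunks (const bool *matches, size_t group_size, size_t vec_size)
{
  assert (matches != NULL && vec_size > 0);
  size_t usable = 0;
  for (size_t start = 0; start + vec_size <= group_size; start += vec_size)
    {
      bool all_match = true;
      for (size_t i = start; i < start + vec_size; i++)
        all_match = all_match && matches[i];
      if (all_match)
        usable++;
    }
  return usable;
}
```

With eight stores, a vector size of four and a single mismatch in the second half, the model still reports one usable chunk, where an all-or-nothing rule would report none.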

Re: [PATCH] tree-scalar-evolution.c: Handle LSHIFT by constant

2015-10-23 Thread Alan Lawrence
On 19/10/15 12:49, Richard Biener wrote: > Err, you should always do the shift in the type of rhs1. You should also > avoid the chrec_convert of rhs2 above for shifts. Err, yes, indeed. Needed to keep the chrec_convert before the chrec_fold_multiply, and the rest followed. How's this?

Re: [PATCH] vectorizing conditional expressions (PR tree-optimization/65947)

2015-10-22 Thread Alan Lawrence
Just one very small point... On 19/10/15 09:17, Alan Hayward wrote: > - if (check_reduction > - && (!commutative_tree_code (code) || !associative_tree_code (code))) > + if (check_reduction) > { > - if (dump_enabled_p ()) > -report_vect_op (MSG_MISSED_OPTIMIZATION,

Re: [PATCH, MIPS, PR/61114] Migrate to reduc_..._scal optabs.

2015-10-22 Thread Alan Lawrence
On closer inspection I think you can also remove this guy (from loongson.md): (define_insn "reduc_uplus_v8qi" [(set (match_operand:V8QI 0 "register_operand" "=f") (unspec:V8QI [(match_operand:V8QI 1 "register_operand" "f")] UNSPEC_LOONGSON_BIADD))]

[PATCH][Testsuite] Add --param sra-max-scalarization-size-Ospeed to sra-12.c

2015-10-21 Thread Alan Lawrence
gcc.dg/tree-ssa/sra-12.c is skipped on a bunch of targets, including AArch64, because the default max-scalarization-size depends on MOVE_RATIO, and on those targets thus ends up being too small for SRA to optimize the testcase. Recently I noticed that the test has been failing for some time on ARM

[PATCH][AArch64 Testsuite][Trivial?] Remove divisions-to-produce-NaN from vdiv_f.c

2015-10-20 Thread Alan Lawrence
The test vdiv_f.c #define's NAN to (0.0 / 0.0). This produces extra scalar fdiv's, which complicate the scan-assembler testing. We can remove these by using __builtin_nan instead. Tested on AArch64 Linux. gcc/testsuite/ChangeLog: * gcc.target/aarch64/vdiv_f.c: Use __builtin_nan. ---
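
A small sketch of the two ways of spelling a NaN, for comparison. The macro names are invented here to avoid clashing with `NAN` from math.h, and the empty-string argument to `__builtin_nan` is an assumption, since the snippet does not show the exact form used in the patch.

```c
#include <assert.h>
#include <math.h>

/* Division form: may be evaluated with a real divide at run time.  */
#define NAN_BY_DIVISION (0.0 / 0.0)

/* Builtin form (GCC extension): a quiet NaN constant, no division.
   The "" argument is an assumption, not taken from the patch.  */
#define NAN_BY_BUILTIN __builtin_nan ("")

/* Both spellings should produce a value that isnan accepts.  */
static void
check_nan_macros (void)
{
  assert (isnan (NAN_BY_DIVISION));
  assert (isnan (NAN_BY_BUILTIN));
}
```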

Re: [PATCH 1/3] [ARM] PR63870 Add qualifiers for NEON builtins

2015-10-19 Thread Alan Lawrence
On 14/10/15 23:02, Charles Baylis wrote: On 12 October 2015 at 11:58, Alan Lawrence <alan.lawre...@arm.com> wrote: > Given we are making changes here to how this all works on bigendian, have you tested armeb at all? I tested on big endian, and it passes, except Well, I aske

[PATCH][Testsuite] Turn on 64-bit-vector tests for AArch64

2015-10-16 Thread Alan Lawrence
This enables tests bb-slp-11.c and bb-slp-26.c for AArch64. Both of these are currently passing on little- and big-endian. (Tested on aarch64-none-linux-gnu and aarch64_be-none-elf). OK for trunk? gcc/testsuite/ChangeLog: * lib/target-supports.exp (check_effective_target_vect64): Add

[PATCH] tree-scalar-evolution.c: Handle LSHIFT by constant

2015-10-16 Thread Alan Lawrence
This lets the vectorizer handle some simple strides expressed using left-shift rather than mul, e.g. a[i << 1] (whereas previously only a[i * 2] would have been handled). This patch does *not* handle the general case of shifts - neither a[i << j] nor a[1 << i] will be handled; that would be a
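
For reference, a minimal pair of loops showing the two spellings of the same stride-2 access. The function names are made up for this sketch, and whether either loop is actually vectorized depends on the target and options, which the sketch does not claim to show.

```c
#include <assert.h>
#include <stddef.h>

/* Stride written with a left shift: a[i << 1].  */
static int
sum_stride2_shift (const int *a, size_t n)
{
  assert (a != NULL);
  int sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += a[i << 1];
  return sum;
}

/* The same stride written with a multiply: a[i * 2].  */
static int
sum_stride2_mul (const int *a, size_t n)
{
  assert (a != NULL);
  int sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += a[i * 2];
  return sum;
}
```

Both functions read elements 0, 2, 4, ... of `a`, so the array must hold at least 2 * n - 1 elements.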

Re: [PATCH 1/3] [ARM] PR63870 Add qualifiers for NEON builtins

2015-10-12 Thread Alan Lawrence
On 07/10/15 00:59, charles.bay...@linaro.org wrote: diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c ... case NEON_ARG_MEMORY: /* Check if expand failed. */ if (op[argc] == const0_rtx) { - va_end

Re: [[Boolean Vector, patch 5/5] Support boolean vectors in vector lowering

2015-10-12 Thread Alan Lawrence
On 09/10/15 22:01, Jeff Law wrote: So my question for the series as a whole is whether or not we need to do something for the other languages, particularly Fortran. I was a bit surprised to see this stuff bleed into the C/C++ front-ends and obviously wonder if it's bled into Fortran, Ada,

Re: [PATCH 2/3] [ARM] PR63870 Mark lane indices of vldN/vstN with appropriate qualifier

2015-10-12 Thread Alan Lawrence
On 07/10/15 00:59, charles.bay...@linaro.org wrote: diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md index 2667866..251afdc 100644 --- a/gcc/config/arm/neon.md +++ b/gcc/config/arm/neon.md @@ -4261,8 +4261,9 @@ if (BYTES_BIG_ENDIAN) UNSPEC_VLD1_LANE))]

Re: [PATCH, MIPS, PR/61114] Migrate to reduc_..._scal optabs.

2015-10-07 Thread Alan Lawrence
On 07/10/15 11:50, Simon Dardis wrote: On the change from smin/smax it was a deliberate change as I managed to confuse myself of the mode patterns, correct version follows. Reverted back to VWHB for smax/smin. Stylistic point addressed. No new regression, ok for commit? Well, I'm not a MIPS

Re: [PATCH, MIPS, PR/61114] Migrate to reduc_..._scal optabs.

2015-10-06 Thread Alan Lawrence
Thanks for working on this, Simon! On 01/10/15 15:43, Simon Dardis wrote: -(define_expand "reduc_smax_" - [(match_operand:VWHB 0 "register_operand" "") - (match_operand:VWHB 1 "register_operand" "")] +(define_expand "reduc_smax_scal_" + [(match_operand:HI 0 "register_operand" "") +

Re: [AArch64] Fix vcvt_high_f64_f32 and vcvt_figh_f32_f64 intrinsics.

2015-09-21 Thread Alan Lawrence
On 21/09/15 15:38, James Greenhalgh wrote: On Mon, Sep 21, 2015 at 10:44:32AM +0100, Alan Lawrence wrote: [Resending in plain text] This makes sense to me now, although I find your comment slightly confusing: [] in that +;; the meaning of HI and LO is always taken with a little-endian

Re: [AArch64] Fix vcvt_high_f64_f32 and vcvt_figh_f32_f64 intrinsics.

2015-09-21 Thread Alan Lawrence
[Resending in plain text] This makes sense to me now, although I find your comment slightly confusing: [] in that +;; the meaning of HI and LO is always taken with a little-endian view of +;; the vector You mean vec_unpacks_{hi,lo} (which seems to go against the *architectural* bit after

Re: [PATCH] vectorizing conditional expressions (PR tree-optimization/65947)

2015-09-18 Thread Alan Lawrence
On 18/09/15 13:17, Richard Biener wrote: Ok, I see. That this case is already vectorized is because it implements MAX_EXPR, modifying it slightly to int foo (int *a) { int val = 0; for (int i = 0; i < 1024; ++i) if (a[i] > val) val = a[i] + 1; return val; } makes it no
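
The loop quoted above, laid out as compilable C. `foo` follows the quoted text; `foo_max` is an assumed form of the unmodified "plain maximum" loop, which the snippet does not show, and is included only for contrast.

```c
#include <assert.h>
#include <stddef.h>

#define N 1024

/* Assumed form of the plain maximum loop (not shown in the snippet):
   the conditional update is exactly a MAX of val and a[i].  */
static int
foo_max (const int *a)
{
  assert (a != NULL);
  int val = 0;
  for (int i = 0; i < N; ++i)
    if (a[i] > val)
      val = a[i];
  return val;
}

/* The modified loop from the quoted message: the stored value is
   a[i] + 1, so the update is no longer a plain MAX.  */
static int
foo (const int *a)
{
  assert (a != NULL);
  int val = 0;
  for (int i = 0; i < N; ++i)
    if (a[i] > val)
      val = a[i] + 1;
  return val;
}
```

Note that in `foo` each update raises the bar for later elements by one, so an element equal to the current `val` no longer triggers an update.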

Re: [PATCH, rs6000] Add expansions for min/max vector reductions

2015-09-18 Thread Alan Lawrence
On 18/09/15 09:35, Richard Biener wrote: Btw, we ditched the original reduce-to-vector variant due to its endianness issues (it only had _one_ element of the vector contain the reduction result). Re-introducing reduce-to-vector but with the reduction result in all elements wouldn't have any

[PATCH][RS6000] Migrate from reduc_xxx to reduc_xxx_scal optabs

2015-09-18 Thread Alan Lawrence
This is a respin of https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01024.html after discovering that patch was broken on power64le - thanks to Bill Schmidt for pointing out that gcc112 is the opposite endianness to gcc110... This time I decided to avoid any funny business with making RTL match

Re: [PR64164] drop copyrename, integrate into expand

2015-09-18 Thread Alan Lawrence
On 02/09/15 23:12, Alexandre Oliva wrote: On Sep 2, 2015, Alan Lawrence <alan.lawre...@arm.com> wrote: One more failure to report, I'm afraid. On AArch64 Bigendian, aapcs64/func-ret-4.c ICEs in simplify_subreg (line refs here are from r227348): Thanks. The failure mode was dif

Re: [PATCH 2/5] completely_scalarize arrays as well as records.

2015-09-17 Thread Alan Lawrence
On 15/09/15 08:43, Richard Biener wrote: > > Sorry for chiming in so late... Not at all, TYVM for your help! > TREE_CONSTANT isn't the correct thing to test. You should use > TREE_CODE () == INTEGER_CST instead. Done (in some cases, via tree_fits_shwi_p). > Also you need to handle > NULL_TREE

Re: [PATCH, rs6000] Add expansions for min/max vector reductions

2015-09-16 Thread Alan Lawrence
On 16/09/15 15:28, Bill Schmidt wrote: 2015-09-16 Bill Schmidt * config/rs6000/altivec.md (UNSPEC_REDUC_SMAX, UNSPEC_REDUC_SMIN, UNSPEC_REDUC_UMAX, UNSPEC_REDUC_UMIN, UNSPEC_REDUC_SMAX_SCAL, UNSPEC_REDUC_SMIN_SCAL,

Re: [PATCH, rs6000] Add expansions for min/max vector reductions

2015-09-16 Thread Alan Lawrence
On 16/09/15 17:10, Bill Schmidt wrote: On Wed, 2015-09-16 at 16:29 +0100, Alan Lawrence wrote: On 16/09/15 15:28, Bill Schmidt wrote: 2015-09-16 Bill Schmidt <wschm...@linux.vnet.ibm.com> * config/rs6000/altivec.md (UNSPEC_REDUC_SMAX, UNSPEC_REDU

Re: [PATCH, rs6000] Add expansions for min/max vector reductions

2015-09-16 Thread Alan Lawrence
On 16/09/15 17:19, Bill Schmidt wrote: On Wed, 2015-09-16 at 16:29 +0100, Alan Lawrence wrote: I proposed a patch to migrate PPC off the old patterns, but have forgotten to ping it recently - last at https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01024.html ... (ping?!) Hi Alan, Thanks

Re: [PATCH][AArch64 0/8] Add D-registers to TARGET_ARRAY_MODE_SUPPORTED_P

2015-09-15 Thread Alan Lawrence
Here's a rebased version, which fixes conflicts with float16 and Christophe's fixes for bigendian lane indices. Also fiddled around with whitespace in aarch64-simd.md

[PATCH][AArch64 array_mode 1/8] Rename vec_store_lanes_lane to aarch64_vec_store_lanes_lane

2015-09-15 Thread Alan Lawrence
vec_store_lanes{oi,ci,xi}_lane are not standard pattern names, so using them in aarch64-simd.md is misleading. This adds an aarch64_ prefix to those pattern names, paralleling aarch64_vec_load_lanes_lane. bootstrapped and check-gcc on aarch64-none-linux-gnu gcc/ChangeLog: *

[PATCH][AArch64 array_mode 5/8] Remove V_FOUR_ELEM, again using BLKmode + set_mem_size.

2015-09-15 Thread Alan Lawrence
This removes V_FOUR_ELEM in the same way that patch 3 removed V_THREE_ELEM, again using BLKmode + set_mem_size. (This makes the four-lane expanders very similar to the three-lane expanders, and they will be combined in patch 7.) bootstrapped and check-gcc on aarch64-none-linux-gnu gcc/ChangeLog:

[PATCH][AArch64 array_mode 7/8] Combine the expanders using VSTRUCT:nregs

2015-09-15 Thread Alan Lawrence
The previous patches leave ld[234]_lane, st[234]_lane, and ld[234]r expanders all nearly identical, so we can easily parameterize across the number of lanes and combine them. For the ld_lane pattern, I switched from the VCONQ attribute to just using the MODE attribute, this is identical for

[PATCH][AArch64 array_mode 4/8] Remove EImode

2015-09-15 Thread Alan Lawrence
This removes EImode from the (AArch64) compiler, and all mention of or support for it. bootstrapped and check-gcc on aarch64-none-linux-gnu gcc/ChangeLog: * config/aarch64/aarch64.c (aarch64_simd_attr_length_rglist): Update comment. * config/aarch64/aarch64-builtins.c

[PATCH][AArch64 array_mode 3/8] Stop using EImode in aarch64-simd.md and iterators.md

2015-09-15 Thread Alan Lawrence
The V_THREE_ELEM attribute used BLKmode for most sizes, but occasionally EImode. This patch changes to BLKmode in all cases, explicitly setting memory size (thus, preserving size for the cases that were EImode, and setting size for the first time for cases that were already BLKmode). The patterns

[PATCH][AArch64 array_mode 2/8] Remove VSTRUCT_DREG, use BLKmode for d-reg aarch64_st/ld expands

2015-09-15 Thread Alan Lawrence
aarch64_st and aarch64_ld expanders back onto 12 insns aarch64_{ld,st}{2,3,4}_dreg (for VD and DX modes), using the VSTRUCT_DREG iterator over TI/EI/OI modes to represent the block of memory transferred. Instead, use BLKmode for all memory transfers, explicitly setting mem_size. Bootstrapped and

[PATCH][AArch64 array_mode 6/8] Remove V_TWO_ELEM, again using BLKmode + set_mem_size.

2015-09-15 Thread Alan Lawrence
Same logic as previous; this makes the 2-, 3-, and 4-lane expanders all follow the same pattern. bootstrapped and check-gcc on aarch64-none-linux-gnu. gcc/ChangeLog: * config/aarch64/aarch64-simd.md (aarch64_simd_ld2r, aarch64_vec_load_lanesoi_lane,

[PATCH][AArch64 array_mode 8/8] Add d-registers to TARGET_ARRAY_MODE_SUPPORTED_P

2015-09-15 Thread Alan Lawrence
This adds an AARCH64_VALID_SIMD_DREG_MODE exactly paralleling the existing ...QREG... macro. The new test now compiles (at -O3) to: test_1: add v1.2s, v1.2s, v5.2s add v2.2s, v2.2s, v6.2s add v3.2s, v3.2s, v7.2s add v0.2s, v0.2s, v4.2s ret

Re: [PATCH][AArch64 array_mode 7/8] Combine the expanders using VSTRUCT:nregs

2015-09-15 Thread Alan Lawrence
On 15/09/15 10:43, James Greenhalgh wrote: > > It is convenient that this falls out, but likely surprising for nregs. > Please add a comment to nregs explaining the dual use of nregs to represent > both the number of Q registers used for the type, and the number of elements > touched by the

Re: [PATCH 2/5] completely_scalarize arrays as well as records.

2015-09-14 Thread Alan Lawrence
Ping. (Rerevert with 5 lines extra paranoia in scalarizable_type_p). Thanks, Alan On 08/09/15 13:43, Martin Jambor wrote: Hi, On Mon, Sep 07, 2015 at 02:15:45PM +0100, Alan Lawrence wrote: In-Reply-To: <55e0697d.2010...@arm.com> On 28/08/15 16:08, Alan Lawrence wrote: Alan Lawrence

Re: [PATCH] vectorizing conditional expressions (PR tree-optimization/65947)

2015-09-14 Thread Alan Lawrence
On 11/09/15 14:19, Bill Schmidt wrote: A secondary concern for powerpc is that REDUC_MAX_EXPR produces a scalar that has to be broadcast back to a vector, and the best way to implement it for us already has the max value in all positions of a vector. But that is something we should be able to

Re: [AArch64] Fix vcvt_high_f64_f32 and vcvt_figh_f32_f64 intrinsics.

2015-09-10 Thread Alan Lawrence
On 09/09/15 11:31, Alan Lawrence wrote: Hmmm, hang on. I'm not quite sure what the actual issue/bug is here, but is this the same issue as my patch 12 "with BE RTL fix"? (https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01482.html, explanation last at https://gcc.gnu.org/ml/gcc-patch

Re: [AArch64] Fix vcvt_high_f64_f32 and vcvt_figh_f32_f64 intrinsics.

2015-09-09 Thread Alan Lawrence
Hmmm, hang on. I'm not quite sure what the actual issue/bug is here, but is this the same issue as my patch 12 "with BE RTL fix"? (https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01482.html, explanation last at https://gcc.gnu.org/ml/gcc-patches/2015-07/msg02365.html) I pushed this as r227551

Re: [PATCH 14/15][ARM/AArch64 Testsuite]Add test of vcvt{,_high}_i{f32_f16,f16_f32}

2015-09-08 Thread Alan Lawrence
Ping. (Thanks, Christophe!) Correct version here: https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01501.html Cheers, Alan On 25/08/15 15:21, Christophe Lyon wrote: On 25 August 2015 at 15:57, Alan Lawrence <alan.lawre...@arm.com> wrote: Sorry - wrong version posted. Th

Re: [PATCH 13/15][ARM/AArch64 Testsuite] Add float16 tests to advsimd-intrinsics testsuite

2015-09-08 Thread Alan Lawrence
Ping. (Thanks, Christophe!). Original message: https://gcc.gnu.org/ml/gcc-patches/2015-07/msg02366.html On 25/08/15 14:28, Alan Lawrence wrote: Christophe Lyon wrote: On 28 July 2015 at 13:26, Alan Lawrence <alan.lawre...@arm.com> wrote: This is a respin of https://gcc.gnu.org/ml/gcc-p

Re: [PATCH][AArch64] Improve code generation for float16 vector code

2015-09-08 Thread Alan Lawrence
On 08/09/15 09:26, James Greenhalgh wrote: On Tue, Sep 08, 2015 at 09:21:08AM +0100, James Greenhalgh wrote: On Mon, Sep 07, 2015 at 02:09:01PM +0100, Alan Lawrence wrote: On 04/09/15 13:32, James Greenhalgh wrote: In that case, these should be implemented as inline assembly blocks

Re: [PATCH 15/15][ARM] Update sourcebuild.texi with testsuite/effective-target hooks

2015-09-08 Thread Alan Lawrence
Original message here: https://gcc.gnu.org/ml/gcc-patches/2015-07/msg02363.html On 28/07/15 12:27, Alan Lawrence wrote: > This documents the change to arm_neon_fp16_ok in the first patch; the addition > of arm_neon_fp16_hw_ok in the last patch; and corrects a cross-reference. > > (I

Re: [PATCH 2/5] completely_scalarize arrays as well as records.

2015-09-07 Thread Alan Lawrence
In-Reply-To: <55e0697d.2010...@arm.com> On 28/08/15 16:08, Alan Lawrence wrote: > Alan Lawrence wrote: >> >> Right. I think VLA's are the problem with pr64312.C also. I'm testing a fix >> (that declares arrays with any of these properties as unscalarizable). > ... &

[PATCH][AArch64] Improve code generation for float16 vector code

2015-09-07 Thread Alan Lawrence
On 04/09/15 13:32, James Greenhalgh wrote: > In that case, these should be implemented as inline assembly blocks. As it > stands, the code generation for these intrinsics will be very poor with this > patch applied. > > I'm going to hold off OKing this until I see a follow-up to fix the code >

Re: [PR64164] drop copyrename, integrate into expand

2015-09-03 Thread Alan Lawrence
On 02/09/15 23:12, Alexandre Oliva wrote: On Sep 2, 2015, Alan Lawrence <alan.lawre...@arm.com> wrote: One more failure to report, I'm afraid. On AArch64 Bigendian, aapcs64/func-ret-4.c ICEs in simplify_subreg (line refs here are from r227348): Thanks. The failure mode was dif

Re: [PR64164] drop copyrename, integrate into expand

2015-09-02 Thread Alan Lawrence
On 14/08/15 19:57, Alexandre Oliva wrote: I'm glad it appears to be working to everyone's satisfaction now. I've just committed it as r226901, with only a context adjustment to account for a change in use_register_for_decl in function.c. /me crosses fingers :-) Here's the patch as checked

Re: [testsuite] Don't xfail gcc.dg/vect/no-scevccp-outer-11.c

2015-09-01 Thread Alan Lawrence
Rainer Orth wrote: It seems that since 20150717, gcc.dg/vect/no-scevccp-outer-11.c XPASSes everywhere: XPASS: gcc.dg/vect/no-scevccp-outer-11.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED." 1 To reduce testsuite noise, I'd like to remove the xfail as follows. Tested with the appropriate

Re: [PATCH 2/5] completely_scalarize arrays as well as records

2015-08-28 Thread Alan Lawrence
Christophe Lyon wrote: I asked because I assumed that Alan saw it pass in his configuration. Bah. No - I now discover a problem in my C++ testsuite setup that was causing a large number of tests to not be executed. I see the problem too now, investigating --Alan

Re: [PATCH 2/5] completely_scalarize arrays as well as records

2015-08-28 Thread Alan Lawrence
Richard Biener wrote: On Fri, 28 Aug 2015, Alan Lawrence wrote: Christophe Lyon wrote: I asked because I assumed that Alan saw it pass in his configuration. Bah. No - I now discover a problem in my C++ testsuite setup that was causing a large number of tests to not be executed. I see

[PATCH] Tidy tree-ssa-dom.c: Use dom_valueize more.

2015-08-28 Thread Alan Lawrence
The code in the dom_valueize function is duplicated a number of times; so, call the function. Also remove a comment in lookup_avail_expr re const_and_copies, describing one of said duplicates, that looks like it was superseded in r87787. Bootstrapped + check-gcc on x86-none-linux-gnu.

Re: [PATCH 2/5] completely_scalarize arrays as well as records

2015-08-28 Thread Alan Lawrence
Alan Lawrence wrote: Right. I think VLA's are the problem with pr64312.C also. I'm testing a fix (that declares arrays with any of these properties as unscalarizable). Monday is a bank holiday in UK and so I expect to get back to you on Tuesday. --Alan In the meantime I've reverted

Fixing sra-12.c (was: Re: [PATCH 2/5] completely_scalarize arrays as well as records)

2015-08-27 Thread Alan Lawrence
Jeff Law wrote: diff --git a/gcc/testsuite/gcc.dg/tree-ssa/sra-15.c b/gcc/testsuite/gcc.dg/tree-ssa/sra-15.c new file mode 100644 index 000..e251058 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/sra-15.c @@ -0,0 +1,38 @@ +/* Verify that SRA total scalarization works on records

Re: [PATCH 1/5] Refactor completely_scalarize_var

2015-08-27 Thread Alan Lawrence
Martin Jambor wrote: If you change what the function does, you have to change the comment too. If I am not mistaken, even with the whole patch set applied, the first sentence would still be: Create total_scalarization accesses for all scalar type fields in VAR and for VAR as a whole. And

Re: [PATCH 2/5] completely_scalarize arrays as well as records

2015-08-27 Thread Alan Lawrence
Martin Jambor wrote: First, I would be much happier if you added a proper comment to scalarize_elem function which you forgot completely. The name is not very descriptive and it has quite few parameters too. Second, this patch should also fix PR 67283. It would be great if you could

Re: [PATCH 2/5] completely_scalarize arrays as well as records

2015-08-26 Thread Alan Lawrence
Richard Biener wrote: One extra question is does the way we limit total scalarization work well for arrays? I suppose we have either sth like the maximum size of an aggregate we scalarize or the maximum number of component accesses we create? Only the former and that would be kept intact.

[PATCH][AArch64 array_mode 8/8] Add d-registers to TARGET_ARRAY_MODE_SUPPORTED_P

2015-08-26 Thread Alan Lawrence
This adds an AARCH64_VALID_SIMD_DREG_MODE exactly paralleling the existing ...QREG... macro, and as a driveby fixes mode-(MODE) in the latter. The new test now compiles (at -O3) to: test_1: add v1.2s, v1.2s, v5.2s add v2.2s, v2.2s, v6.2s add v3.2s, v3.2s,

[PATCH][AArch64 array_mode 3/8] Stop using EImode in aarch64-simd.md and iterators.md

2015-08-26 Thread Alan Lawrence
The V_THREE_ELEM attribute used BLKmode for most sizes, but occasionally EImode. This patch changes to BLKmode in all cases, explicitly setting memory size (thus, preserving size for the cases that were EImode, and setting size for the first time for cases that were already BLKmode). The patterns

[PATCH][AArch64 array_mode 5/8] Remove V_FOUR_ELEM, again using BLKmode + set_mem_size.

2015-08-26 Thread Alan Lawrence
This removes V_FOUR_ELEM in the same way that patch 3 removed V_THREE_ELEM, again using BLKmode + set_mem_size. (This makes the four-lane expanders very similar to the three-lane expanders, and they will be combined in patch 7.) bootstrapped and check-gcc on aarch64-none-linux-gnu gcc/ChangeLog:

[PATCH][AArch64 array_mode 4/8] Remove EImode

2015-08-26 Thread Alan Lawrence
This removes EImode from the (AArch64) compiler, and all mention of or support for it. bootstrapped and check-gcc on aarch64-none-linux-gnu gcc/ChangeLog: * config/aarch64/aarch64.c (aarch64_simd_attr_length_rglist): Update comment. * config/aarch64/aarch64-builtins.c

[PATCH][AArch64 array_mode 2/8] Remove VSTRUCT_DREG, use BLKmode for d-reg aarch64_st/ld expands

2015-08-26 Thread Alan Lawrence
aarch64_st<VSTRUCT:nregs><VDC:mode> and aarch64_ld<VSTRUCT:nregs><VDC:mode> expanders back onto 12 insns aarch64_{ld,st}{2,3,4}<mode>_dreg (for VD and DX modes), using the VSTRUCT_DREG iterator over TI/EI/OI modes to represent the block of memory transferred. Instead, use BLKmode for all memory transfers,

[PATCH][AArch64 array_mode 6/8] Remove V_TWO_ELEM, again using BLKmode + set_mem_size.

2015-08-26 Thread Alan Lawrence
Same logic as previous; this makes the 2-, 3-, and 4-lane expanders all follow the same pattern. bootstrapped and check-gcc on aarch64-none-linux-gnu. gcc/ChangeLog: * config/aarch64/aarch64-simd.md (aarch64_simd_ld2r<mode>, aarch64_vec_load_lanesoi_lane<mode>,

[PATCH][AArch64 0/8] Add D-registers to TARGET_ARRAY_MODE_SUPPORTED_P

2015-08-26 Thread Alan Lawrence
The end goal of this series of patches is to enable 64bit vector modes for TARGET_ARRAY_MODE_SUPPORTED_P, achieved in the last patch. At present, doing so causes ICEs with illegal subregs (e.g. returning the middle bits from a large int mode covering 3 vectors); the patchset avoids these by first

[PATCH][AArch64 array_mode 1/8] Rename vec_store_lanes<mode>_lane to aarch64_vec_store_lanes<mode>_lane

2015-08-26 Thread Alan Lawrence
vec_store_lanes{oi,ci,xi}_lane are not standard pattern names, so using them in aarch64-simd.md is misleading. This adds an aarch64_ prefix to those pattern names, paralleling aarch64_vec_load_lanes<mode>_lane. bootstrapped and check-gcc on aarch64-none-linux-gnu gcc/ChangeLog: *

[PATCH][AArch64 array_mode 7/8] Combine the expanders using VSTRUCT:nregs

2015-08-26 Thread Alan Lawrence
The previous patches leave ld[234]_lane, st[234]_lane, and ld[234]r expanders all nearly identical, so we can easily parameterize across the number of lanes and combine them. For the ld<VSTRUCT:nregs>_lane pattern, I switched from the VCONQ attribute to just using the MODE attribute, this is

Re: [RFC 4/5] Handle constant-pool entries

2015-08-26 Thread Alan Lawrence
Jeff Law wrote: The question I have is why this differs from the effects of patch #5. That would seem to indicate that there's things we're not getting into the candidate tables with this approach?!? I'll answer this first, as I think (Richard and) Martin have identified enough other

Re: [PATCH 14/15][ARM/AArch64 Testsuite]Add test of vcvt{,_high}_i{f32_f16,f16_f32}

2015-08-25 Thread Alan Lawrence
Sorry - wrong version posted. The hunk for add_options_for_arm_neon_fp16 has moved to the previous patch! This version also fixes some whitespace issues. gcc/testsuite/ChangeLog: * gcc.target/aarch64/advsimd-intrinsics/vcvt_f16.c: New. * lib/target-supports.exp

Re: [PATCH 14/15][ARM/AArch64 Testsuite]Add test of vcvt{,_high}_{f16_f32,f32_f16}

2015-08-25 Thread Alan Lawrence
Christophe Lyon wrote: On 28 July 2015 at 13:27, Alan Lawrence <alan.lawre...@arm.com> wrote: gcc/testsuite/ChangeLog: * gcc.target/aarch64/advsimd-intrinsics/advsimd-intrinsics.exp: set additional flags for neon-fp16 support. * gcc.target/aarch64/advsimd-intrinsics

Re: [PATCH 12/15][AArch64] Add vcvt(_high)?_f32_f16 intrinsics, with BE RTL fix

2015-08-25 Thread Alan Lawrence
James Greenhalgh wrote: - VAR1 (UNOP, vec_unpacks_hi_, 10, v4sf) + VAR2 (UNOP, vec_unpacks_hi_, 10, v4sf, v8hf) Should this not use the appropriate BUILTIN_... iterator? Indeed; BUILTIN_VQ_HSF it is. VAR1 (BINOP, float_truncate_hi_, 0, v4sf) VAR1 (BINOP, float_truncate_hi_, 0,

[PATCH 0/5][tree-sra.c] PR/63679 Make SRA replace constant pool loads

2015-08-25 Thread Alan Lawrence
ssa-dom-cse-2.c fails on a number of platforms because the input array is pushed out to the constant pool, preventing later stages from folding away the entire computation. This patch series fixes the failure by extending SRA to pull the constants back in. This is my first patch(set) to SRA and
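
For illustration, the kind of code at issue can be sketched as below. This is a minimal sketch in the spirit of the failing test, not a copy of ssa-dom-cse-2.c, and the function name sum_const_array is hypothetical. A compiler may materialise the initializer of `a` as a load from the constant pool; unless the individual element values become visible again, the sum cannot be folded to a constant.

```c
/* Hypothetical example: a local array with a constant initializer.
   On some targets the initializer may be emitted as a block copy
   from the constant pool rather than as eight scalar stores.  */
int
sum_const_array (void)
{
  const int a[8] = { 0, 1, 2, 3, 4, 5, 6, 7 };
  int sum = 0;

  /* If every a[i] is known to be a constant, the whole loop can in
     principle be folded to a single constant return value.  */
  for (unsigned i = 0; i < sizeof (a) / sizeof (a[0]); i++)
    sum += a[i];

  return sum;
}
```

Whether the folding actually happens depends on the target and optimization level; the sketch only shows the shape of the input.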

[RFC 4/5] Handle constant-pool entries

2015-08-25 Thread Alan Lawrence
This makes SRA replace loads of records/arrays from constant pool entries, with elementwise assignments of the constant values, hence, overcoming the fundamental problem in PR/63679. As a first pass, the approach I took was to look for constant-pool loads as we scanned through other accesses, and

[RFC 5/5] Always completely replace constant pool entries

2015-08-25 Thread Alan Lawrence
I used this as a means of better-testing the previous changes, as it exercises the constant replacement code a whole lot more. Indeed, quite a few tests are now optimized away to nothing on AArch64... Always pulling in constants is almost certainly not what we want, but we may nonetheless want

[PATCH 2/5] completely_scalarize arrays as well as records

2015-08-25 Thread Alan Lawrence
This changes the completely_scalarize_record path to also work on arrays (thus allowing records containing arrays, etc.). This just required extending the existing type_consists_of_records_p and completely_scalarize_record methods to handle things of ARRAY_TYPE as well as RECORD_TYPE. Hence, I
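
The idea of extending a "consists only of records" check so that it also recurses into arrays can be sketched on a toy type model. Everything here is an assumption made for illustration: toy_type, toy_kind, toy_scalarizable_p and toy_record_with_array_ok are hypothetical names, and the code does not use GCC's tree representation or reproduce tree-sra.c.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a compiler's type nodes (not GCC's tree API).  */
enum toy_kind { TOY_SCALAR, TOY_RECORD, TOY_ARRAY };

struct toy_type
{
  enum toy_kind kind;
  const struct toy_type *const *fields; /* TOY_RECORD: field types.  */
  size_t n_fields;
  const struct toy_type *elem;          /* TOY_ARRAY: element type.  */
};

/* Return true if T is built only from scalars, records and arrays,
   so that it could be split into individual scalar pieces.  */
bool
toy_scalarizable_p (const struct toy_type *t)
{
  switch (t->kind)
    {
    case TOY_SCALAR:
      return true;
    case TOY_RECORD:
      for (size_t i = 0; i < t->n_fields; i++)
        if (!toy_scalarizable_p (t->fields[i]))
          return false;
      return true;
    case TOY_ARRAY:
      /* The extension: arrays are handled by recursing on the element.  */
      return toy_scalarizable_p (t->elem);
    }
  return false;
}

/* Usage: a record containing a scalar and an array of scalars.  */
bool
toy_record_with_array_ok (void)
{
  const struct toy_type scalar = { TOY_SCALAR, NULL, 0, NULL };
  const struct toy_type array = { TOY_ARRAY, NULL, 0, &scalar };
  const struct toy_type *const fields[] = { &scalar, &array };
  const struct toy_type record = { TOY_RECORD, fields, 2, NULL };

  return toy_scalarizable_p (&record);
}
```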

[PATCH 3/5] Build ARRAY_REFs when the base is of ARRAY_TYPE.

2015-08-25 Thread Alan Lawrence
When SRA completely scalarizes an array, this patch changes the generated accesses from e.g. MEM[(int[8] *)a + 4B] = 1; to a[1] = 1; This overcomes a limitation in dom2, that accesses to equivalent chunks of e.g. MEM[(int[8] *)a] are not hashable_expr_equal_p with accesses to e.g.
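
At the C level, the two access forms name the same element. The sketch below is only an illustration; the function name byte_offset_matches_index is hypothetical, and it uses sizeof (int) where the dump above shows 4B, which matches when int is 4 bytes wide.

```c
#include <stdbool.h>

/* Store through a byte offset into one array and through an index
   into another, then check that the same element was written.  */
bool
byte_offset_matches_index (void)
{
  int a[8] = { 0 };
  int b[8] = { 0 };

  /* Offset form, roughly what "MEM[(int[8] *)a + 4B] = 1" denotes
     when sizeof (int) == 4: a store sizeof (int) bytes past the start.  */
  *(int *) ((char *) a + sizeof (int)) = 1;

  /* Indexed form, "b[1] = 1".  */
  b[1] = 1;

  for (int i = 0; i < 8; i++)
    if (a[i] != b[i])
      return false;
  return true;
}
```

The stores are equivalent at run time; the difference described above is in how the compiler represents them internally, which affects whether a later pass recognises two accesses as equal.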

[PATCH 1/5] Refactor completely_scalarize_var

2015-08-25 Thread Alan Lawrence
This is a small refactoring/renaming patch; it just moves the call to completely_scalarize_record out from completely_scalarize_var, and renames the latter to create_total_scalarization_access. This is because the next patch needs to drop the _record suffix and I felt it would be confusing to

Re: [PATCH 0/15][ARM/AArch64] Add support for float16_t vectors (v3)

2015-08-25 Thread Alan Lawrence
Alan Lawrence wrote: All AArch64 patches are unchanged from previous version. However, in response to discussion, the ARM patches are changed (much as I suggested https://gcc.gnu.org/ml/gcc-patches/2015-07/msg02249.html); this version: * Hides the existing vcvt_f16_f32 and vcvt_f32_f16
