Re: [PATCH][ARM] NEON DImode immediate constants
On 27/04/12 16:17, Richard Earnshaw wrote: This is OK. Thanks, now committed. It would be good to merge all the target32 movdi variants into one pattern and then use alternative enabling to deal with the different valid alternatives. Yes, I'll take a look. Andrew
Re: [PATCH][ARM] NEON DImode neg
On 16/04/12 13:41, Richard Earnshaw wrote: P.S. This patch can't actually be committed until my NEON DImode immediate constants patch is approved and committed. (Without that the load #0 needs a constant pool, and loading constants this late has a bug at -O0.) neon-neg64.patch 2012-04-12 Andrew Stubbsa...@codesourcery.com gcc/ * config/arm/arm.md (negdi2): Use gen_negdi2_neon. * config/arm/neon.md (negdi2_neon): New insn. Also add splitters for core and NEON registers. OK Thanks, now that the other patch is committed, I've committed this one also. Andrew
Re: [PATCH][ARM] NEON DImode immediate constants
Ping. On 10/04/12 14:00, Andrew Stubbs wrote: Ping. On 30/03/12 12:15, Andrew Stubbs wrote: On 28/02/12 16:20, Andrew Stubbs wrote: Hi all, This patch implements 64-bit immediate constant loads in NEON. The current state is that you can load const_vector, but not const_int. This is clearly not ideal. The result is a constant pool entry when it's not necessary. The patch disables the movdi_vfp patterns for loading DImode values, if the operand is const_int and NEON is enabled, and extends the neon_mov pattern to include DImode const_int, as well as the const_vector operands. I've modified neon_valid_immediate only enough to accept const_int input - the logic remains untouched. That patch failed to bootstrap successfully, but this updated patch bootstraps and tests with no regressions. OK? Andrew
Re: [PATCH][ARM] NEON DImode immediate constants
On 30/03/12 12:15, Andrew Stubbs wrote: On 28/02/12 16:20, Andrew Stubbs wrote: Hi all, This patch implements 64-bit immediate constant loads in NEON. The current state is that you can load const_vector, but not const_int. This is clearly not ideal. The result is a constant pool entry when it's not necessary. The patch disables the movdi_vfp patterns for loading DImode values, if the operand is const_int and NEON is enabled, and extends the neon_mov pattern to include DImode const_int, as well as the const_vector operands. I've modified neon_valid_immediate only enough to accept const_int input - the logic remains untouched. That patch failed to bootstrap successfully, but this updated patch bootstraps and tests with no regressions. OK? Sorry for the delay. This is OK. It would be good to merge all the target32 movdi variants into one pattern and then use alternative enabling to deal with the different valid alternatives. R. Andrew neon-loadimm64.patch 2012-03-27 Andrew Stubbs a...@codesourcery.com gcc/ * config/arm/arm.c (neon_valid_immediate): Allow const_int. (arm_print_operand): Add 'x' format. * config/arm/constraints.md (Dn): Allow const_int. * config/arm/neon.md (neon_movmode): Use VDX to allow DImode. Use 'x' format to print constants. * config/arm/predicates.md (imm_for_neon_mov_operand): Allow const_int. * config/arm/vfp.md (movdi_vfp): Disable for const_int when neon is enabled. (movdi_vfp_cortexa8): Likewise. diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 0bded8d..492ddde 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -8873,11 +8873,25 @@ neon_valid_immediate (rtx op, enum machine_mode mode, int inverse, break; \ } - unsigned int i, elsize = 0, idx = 0, n_elts = CONST_VECTOR_NUNITS (op); - unsigned int innersize = GET_MODE_SIZE (GET_MODE_INNER (mode)); + unsigned int i, elsize = 0, idx = 0, n_elts; + unsigned int innersize; unsigned char bytes[16]; int immtype = -1, matches; unsigned int invmask = inverse ? 0xff : 0; + bool vector = GET_CODE (op) == CONST_VECTOR; + + if (vector) +{ + n_elts = CONST_VECTOR_NUNITS (op); + innersize = GET_MODE_SIZE (GET_MODE_INNER (mode)); +} + else +{ + n_elts = 1; + if (mode == VOIDmode) + mode = DImode; + innersize = GET_MODE_SIZE (mode); +} /* Vectors of float constants. */ if (GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT) @@ -8913,7 +8927,7 @@ neon_valid_immediate (rtx op, enum machine_mode mode, int inverse, /* Splat vector constant out into a byte vector. */ for (i = 0; i n_elts; i++) { - rtx el = CONST_VECTOR_ELT (op, i); + rtx el = vector ? CONST_VECTOR_ELT (op, i) : op; unsigned HOST_WIDE_INT elpart; unsigned int part, parts; @@ -17230,6 +17244,19 @@ arm_print_operand (FILE *stream, rtx x, int code) } return; +/* An integer that we want to print in HEX. */ +case 'x': + switch (GET_CODE (x)) + { + case CONST_INT: + fprintf (stream, # HOST_WIDE_INT_PRINT_HEX, INTVAL (x)); + break; + + default: + output_operand_lossage (Unsupported operand for code '%c', code); + } + return; + case 'B': if (GET_CODE (x) == CONST_INT) { diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md index 7d0269a..68979c1 100644 --- a/gcc/config/arm/constraints.md +++ b/gcc/config/arm/constraints.md @@ -255,9 +255,9 @@ (define_constraint Dn @internal - In ARM/Thumb-2 state a const_vector which can be loaded with a Neon vmov - immediate instruction. - (and (match_code const_vector) + In ARM/Thumb-2 state a const_vector or const_int which can be loaded with a + Neon vmov immediate instruction. + (and (match_code const_vector,const_int) (match_test TARGET_32BIT imm_for_neon_mov_operand (op, GET_MODE (op) diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md index d7caa37..3c88568 100644 --- a/gcc/config/arm/neon.md +++ b/gcc/config/arm/neon.md @@ -152,9 +152,9 @@ (define_attr vqh_mnem vadd,vmin,vmax (const_string vadd)) (define_insn *neon_movmode - [(set (match_operand:VD 0 nonimmediate_operand + [(set (match_operand:VDX 0 nonimmediate_operand =w,Uv,w, w, ?r,?w,?r,?r, ?Us) - (match_operand:VD 1 general_operand + (match_operand:VDX 1 general_operand w,w, Dn,Uvi, w, r, r, Usi,r))] TARGET_NEON (register_operand (operands[0], MODEmode) @@ -173,7 +173,7 @@ if (width == 0) return vmov.f32\t%P0, %1 @ mode; else -sprintf (templ, vmov.i%d\t%%P0, %%1 @ mode, width); +sprintf (templ, vmov.i%d\t%%P0, %%x1 @ mode, width); return templ; } diff --git
Re: [PATCH][ARM] NEON DImode neg
On 14/04/12 14:11, Andrew Stubbs wrote: And now with the patch. :( On 14/04/12 13:48, Andrew Stubbs wrote: On 12/04/12 16:48, Richard Earnshaw wrote: If negation in Neon needs a scratch register, it seems to me to be somewhat odd that we're disparaging the ARM version. Also, wouldn't it be sensible to support a variant that was early-clobber on operand 0, but loaded immediate zero into that value first: vmov Dd, #0 vsub Dd, Dd, Dm That way you'll never need more than two registers, whereas today you want three. This patch implements the changes you suggested. I've done a full bootstrap and test and found no regressions. OK? Andrew P.S. This patch can't actually be committed until my NEON DImode immediate constants patch is approved and committed. (Without that the load #0 needs a constant pool, and loading constants this late has a bug at -O0.) neon-neg64.patch 2012-04-12 Andrew Stubbs a...@codesourcery.com gcc/ * config/arm/arm.md (negdi2): Use gen_negdi2_neon. * config/arm/neon.md (negdi2_neon): New insn. Also add splitters for core and NEON registers. OK R.
Re: [PATCH][ARM] NEON DImode neg
On 12/04/12 16:48, Richard Earnshaw wrote: If negation in Neon needs a scratch register, it seems to me to be somewhat odd that we're disparaging the ARM version. Also, wouldn't it be sensible to support a variant that was early-clobber on operand 0, but loaded immediate zero into that value first: vmovDd, #0 vsubDd, Dd, Dm That way you'll never need more than two registers, whereas today you want three. This patch implements the changes you suggested. I've done a full bootstrap and test and found no regressions. OK? Andrew P.S. This patch can't actually be committed until my NEON DImode immediate constants patch is approved and committed. (Without that the load #0 needs a constant pool, and loading constants this late has a bug at -O0.)
Re: [PATCH][ARM] NEON DImode neg
And now with the patch. :( On 14/04/12 13:48, Andrew Stubbs wrote: On 12/04/12 16:48, Richard Earnshaw wrote: If negation in Neon needs a scratch register, it seems to me to be somewhat odd that we're disparaging the ARM version. Also, wouldn't it be sensible to support a variant that was early-clobber on operand 0, but loaded immediate zero into that value first: vmov Dd, #0 vsub Dd, Dd, Dm That way you'll never need more than two registers, whereas today you want three. This patch implements the changes you suggested. I've done a full bootstrap and test and found no regressions. OK? Andrew P.S. This patch can't actually be committed until my NEON DImode immediate constants patch is approved and committed. (Without that the load #0 needs a constant pool, and loading constants this late has a bug at -O0.) 2012-04-12 Andrew Stubbs a...@codesourcery.com gcc/ * config/arm/arm.md (negdi2): Use gen_negdi2_neon. * config/arm/neon.md (negdi2_neon): New insn. Also add splitters for core and NEON registers. diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index 751997f..f1dbbf7 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -4048,7 +4048,13 @@ (neg:DI (match_operand:DI 1 s_register_operand ))) (clobber (reg:CC CC_REGNUM))])] TARGET_EITHER - + { +if (TARGET_NEON) + { +emit_insn (gen_negdi2_neon (operands[0], operands[1])); + DONE; + } + } ) ;; The constraints here are to prevent a *partial* overlap (where %Q0 == %R1). diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md index 3c88568..8c8b02d 100644 --- a/gcc/config/arm/neon.md +++ b/gcc/config/arm/neon.md @@ -922,6 +922,45 @@ (const_string neon_int_3)))] ) +(define_insn negdi2_neon + [(set (match_operand:DI 0 s_register_operand =w, w,r,r) + (neg:DI (match_operand:DI 1 s_register_operand w, w,0, r))) + (clobber (match_scratch:DI 2 = X,w,X, X)) + (clobber (reg:CC CC_REGNUM))] + TARGET_NEON + # + [(set_attr length 8)] +) + +; Split negdi2_neon for vfp registers +(define_split + [(set (match_operand:DI 0 s_register_operand ) + (neg:DI (match_operand:DI 1 s_register_operand ))) + (clobber (match_scratch:DI 2 )) + (clobber (reg:CC CC_REGNUM))] + TARGET_NEON reload_completed IS_VFP_REGNUM (REGNO (operands[0])) + [(set (match_dup 2) (const_int 0)) + (parallel [(set (match_dup 0) (minus:DI (match_dup 2) (match_dup 1))) + (clobber (reg:CC CC_REGNUM))])] + { +if (!REG_P (operands[2])) + operands[2] = operands[0]; + } +) + +; Split negdi2_neon for core registers +(define_split + [(set (match_operand:DI 0 s_register_operand ) + (neg:DI (match_operand:DI 1 s_register_operand ))) + (clobber (match_scratch:DI 2 )) + (clobber (reg:CC CC_REGNUM))] + TARGET_32BIT reload_completed +arm_general_register_operand (operands[0], DImode) + [(parallel [(set (match_dup 0) (neg:DI (match_dup 1))) + (clobber (reg:CC CC_REGNUM))])] + +) + (define_insn *uminmode3_neon [(set (match_operand:VDQIW 0 s_register_operand =w) (umin:VDQIW (match_operand:VDQIW 1 s_register_operand w)
Re: [PATCH][ARM] NEON DImode neg
Ping. On 26/03/12 11:14, Andrew Stubbs wrote: On 28/02/12 17:45, Andrew Stubbs wrote: Hi all, This patch adds a DImode negate pattern for NEON. Unfortunately, the NEON vneg instruction only supports vectors, not singletons, so there's no direct way to do it in DImode, and the compiler ends up moving the value back to core registers, negating it, and returning to NEON afterwards: fmrrd r2, r3, d16 @ int negs r2, r2 sbc r3, r3, r3, lsl #1 fmdrr d16, r2, r3 @ int The new patch does it entirely in NEON: vmov.i32 d17, #0 @ di vsub.i64 d16, d17, d16 (Note that this is the result when combined with my recent patch for NEON DImode immediates. Without that you get a constant pool load.) This updates fixes a bootstrap failure caused by an early clobber error. I've also got a native regression test running now. OK? Andrew
Re: [PATCH][ARM] NEON DImode neg
On 26/03/12 11:14, Andrew Stubbs wrote: On 28/02/12 17:45, Andrew Stubbs wrote: Hi all, This patch adds a DImode negate pattern for NEON. Unfortunately, the NEON vneg instruction only supports vectors, not singletons, so there's no direct way to do it in DImode, and the compiler ends up moving the value back to core registers, negating it, and returning to NEON afterwards: fmrrd r2, r3, d16 @ int negs r2, r2 sbc r3, r3, r3, lsl #1 fmdrr d16, r2, r3 @ int The new patch does it entirely in NEON: vmov.i32 d17, #0 @ di vsub.i64 d16, d17, d16 (Note that this is the result when combined with my recent patch for NEON DImode immediates. Without that you get a constant pool load.) This updates fixes a bootstrap failure caused by an early clobber error. I've also got a native regression test running now. OK? Andrew neon-neg64.patch 2012-03-26 Andrew Stubbs a...@codesourcery.com gcc/ * config/arm/arm.md (negdi2): Use gen_negdi2_neon. * config/arm/neon.md (negdi2_neon): New insn. Also add splitters for core and NEON registers. --- gcc/config/arm/arm.md |8 +++- gcc/config/arm/neon.md | 37 + 2 files changed, 44 insertions(+), 1 deletions(-) diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index 751997f..f1dbbf7 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -4048,7 +4048,13 @@ (neg:DI (match_operand:DI 1 s_register_operand ))) (clobber (reg:CC CC_REGNUM))])] TARGET_EITHER - + { +if (TARGET_NEON) + { +emit_insn (gen_negdi2_neon (operands[0], operands[1])); + DONE; + } + } ) ;; The constraints here are to prevent a *partial* overlap (where %Q0 == %R1). diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md index 3c88568..bf229a7 100644 --- a/gcc/config/arm/neon.md +++ b/gcc/config/arm/neon.md @@ -922,6 +922,43 @@ (const_string neon_int_3)))] ) +(define_insn negdi2_neon + [(set (match_operand:DI 0 s_register_operand = w,?r,?r,?w) + (neg:DI (match_operand:DI 1 s_register_operand w, 0, r, w))) + (clobber (match_scratch:DI 2 =w, X, X,w)) + (clobber (reg:CC CC_REGNUM))] + TARGET_NEON + # + [(set_attr length 8) + (set_attr arch nota8,*,*,onlya8)] +) + If negation in Neon needs a scratch register, it seems to me to be somewhat odd that we're disparaging the ARM version. Also, wouldn't it be sensible to support a variant that was early-clobber on operand 0, but loaded immediate zero into that value first: vmovDd, #0 vsubDd, Dd, Dm That way you'll never need more than two registers, whereas today you want three. R.
Re: [PATCH][ARM] NEON DImode neg
On 12/04/12 16:48, Richard Earnshaw wrote: If negation in Neon needs a scratch register, it seems to me to be somewhat odd that we're disparaging the ARM version. Good point, I'm really meaning to provide missing functionality in NEON to prevent unnecessary moves, rather than cause them, so I should remove the '?'s. It probably doesn't need disparaging on A8, etiher? Also, wouldn't it be sensible to support a variant that was early-clobber on operand 0, but loaded immediate zero into that value first: vmovDd, #0 vsubDd, Dd, Dm That way you'll never need more than two registers, whereas today you want three. That does seem like a good option. I'll have a go and see what happens. Andrew
Re: [PATCH][ARM] NEON DImode neg
On 12/04/12 17:16, Andrew Stubbs wrote: On 12/04/12 16:48, Richard Earnshaw wrote: If negation in Neon needs a scratch register, it seems to me to be somewhat odd that we're disparaging the ARM version. Good point, I'm really meaning to provide missing functionality in NEON to prevent unnecessary moves, rather than cause them, so I should remove the '?'s. It probably doesn't need disparaging on A8, etiher? Possibly not. Which then begs the question as to whether we need an A8 variant at all. Also, wouldn't it be sensible to support a variant that was early-clobber on operand 0, but loaded immediate zero into that value first: vmovDd, #0 vsubDd, Dd, Dm That way you'll never need more than two registers, whereas today you want three. That does seem like a good option. I'll have a go and see what happens. Andrew
Re: [PATCH][ARM] NEON DImode immediate constants
Ping. On 30/03/12 12:15, Andrew Stubbs wrote: On 28/02/12 16:20, Andrew Stubbs wrote: Hi all, This patch implements 64-bit immediate constant loads in NEON. The current state is that you can load const_vector, but not const_int. This is clearly not ideal. The result is a constant pool entry when it's not necessary. The patch disables the movdi_vfp patterns for loading DImode values, if the operand is const_int and NEON is enabled, and extends the neon_mov pattern to include DImode const_int, as well as the const_vector operands. I've modified neon_valid_immediate only enough to accept const_int input - the logic remains untouched. That patch failed to bootstrap successfully, but this updated patch bootstraps and tests with no regressions. OK? Andrew
Re: [PATCH][ARM] NEON DImode immediate constants
On 28/02/12 16:20, Andrew Stubbs wrote: Hi all, This patch implements 64-bit immediate constant loads in NEON. The current state is that you can load const_vector, but not const_int. This is clearly not ideal. The result is a constant pool entry when it's not necessary. The patch disables the movdi_vfp patterns for loading DImode values, if the operand is const_int and NEON is enabled, and extends the neon_mov pattern to include DImode const_int, as well as the const_vector operands. I've modified neon_valid_immediate only enough to accept const_int input - the logic remains untouched. That patch failed to bootstrap successfully, but this updated patch bootstraps and tests with no regressions. OK? Andrew 2012-03-27 Andrew Stubbs a...@codesourcery.com gcc/ * config/arm/arm.c (neon_valid_immediate): Allow const_int. (arm_print_operand): Add 'x' format. * config/arm/constraints.md (Dn): Allow const_int. * config/arm/neon.md (neon_movmode): Use VDX to allow DImode. Use 'x' format to print constants. * config/arm/predicates.md (imm_for_neon_mov_operand): Allow const_int. * config/arm/vfp.md (movdi_vfp): Disable for const_int when neon is enabled. (movdi_vfp_cortexa8): Likewise. diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 0bded8d..492ddde 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -8873,11 +8873,25 @@ neon_valid_immediate (rtx op, enum machine_mode mode, int inverse, break; \ } - unsigned int i, elsize = 0, idx = 0, n_elts = CONST_VECTOR_NUNITS (op); - unsigned int innersize = GET_MODE_SIZE (GET_MODE_INNER (mode)); + unsigned int i, elsize = 0, idx = 0, n_elts; + unsigned int innersize; unsigned char bytes[16]; int immtype = -1, matches; unsigned int invmask = inverse ? 0xff : 0; + bool vector = GET_CODE (op) == CONST_VECTOR; + + if (vector) +{ + n_elts = CONST_VECTOR_NUNITS (op); + innersize = GET_MODE_SIZE (GET_MODE_INNER (mode)); +} + else +{ + n_elts = 1; + if (mode == VOIDmode) + mode = DImode; + innersize = GET_MODE_SIZE (mode); +} /* Vectors of float constants. */ if (GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT) @@ -8913,7 +8927,7 @@ neon_valid_immediate (rtx op, enum machine_mode mode, int inverse, /* Splat vector constant out into a byte vector. */ for (i = 0; i n_elts; i++) { - rtx el = CONST_VECTOR_ELT (op, i); + rtx el = vector ? CONST_VECTOR_ELT (op, i) : op; unsigned HOST_WIDE_INT elpart; unsigned int part, parts; @@ -17230,6 +17244,19 @@ arm_print_operand (FILE *stream, rtx x, int code) } return; +/* An integer that we want to print in HEX. */ +case 'x': + switch (GET_CODE (x)) + { + case CONST_INT: + fprintf (stream, # HOST_WIDE_INT_PRINT_HEX, INTVAL (x)); + break; + + default: + output_operand_lossage (Unsupported operand for code '%c', code); + } + return; + case 'B': if (GET_CODE (x) == CONST_INT) { diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md index 7d0269a..68979c1 100644 --- a/gcc/config/arm/constraints.md +++ b/gcc/config/arm/constraints.md @@ -255,9 +255,9 @@ (define_constraint Dn @internal - In ARM/Thumb-2 state a const_vector which can be loaded with a Neon vmov - immediate instruction. - (and (match_code const_vector) + In ARM/Thumb-2 state a const_vector or const_int which can be loaded with a + Neon vmov immediate instruction. + (and (match_code const_vector,const_int) (match_test TARGET_32BIT imm_for_neon_mov_operand (op, GET_MODE (op) diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md index d7caa37..3c88568 100644 --- a/gcc/config/arm/neon.md +++ b/gcc/config/arm/neon.md @@ -152,9 +152,9 @@ (define_attr vqh_mnem vadd,vmin,vmax (const_string vadd)) (define_insn *neon_movmode - [(set (match_operand:VD 0 nonimmediate_operand + [(set (match_operand:VDX 0 nonimmediate_operand =w,Uv,w, w, ?r,?w,?r,?r, ?Us) - (match_operand:VD 1 general_operand + (match_operand:VDX 1 general_operand w,w, Dn,Uvi, w, r, r, Usi,r))] TARGET_NEON (register_operand (operands[0], MODEmode) @@ -173,7 +173,7 @@ if (width == 0) return vmov.f32\t%P0, %1 @ mode; else -sprintf (templ, vmov.i%d\t%%P0, %%1 @ mode, width); +sprintf (templ, vmov.i%d\t%%P0, %%x1 @ mode, width); return templ; } diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md index b535335..8a8a1f1 100644 --- a/gcc/config/arm/predicates.md +++ b/gcc/config/arm/predicates.md @@ -630,7 +630,7 @@ }) (define_predicate imm_for_neon_mov_operand - (match_code const_vector) + (match_code const_vector,const_int) { return neon_immediate_valid_for_move (op, mode, NULL, NULL); }) diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md index 6530570..2061414 100644 --- a/gcc/config/arm/vfp.md +++ b/gcc/config/arm/vfp.md @@ -138,7 +138,9 @@
Re: [PATCH][ARM] NEON DImode not
On 08/03/12 18:03, Richard Henderson wrote: On 03/08/12 08:19, Andrew Stubbs wrote: + (set_attr arch nota8,*,*,onlya8) + (set_attr_alternative insn_enabled + [(if_then_else (match_test TARGET_NEON) + (const_string yes) (const_string no)) +(const_string yes) +(const_string yes) +(if_then_else (match_test TARGET_NEON) + (const_string yes) (const_string no))])] ) While this works, it might be better to add neon/neon_na8/neon_oa8 alternatives to the arch attribute, and adjust arch_enabled to match. Obviously this opinion is non-binding; Richard E might have other plans... No reply from Richard so far ... so here's an update. OK now? Andrew 2012-03-27 Andrew Stubbs a...@codesourcery.com gcc/ * config/arm/arm.md (arch): Add neon_onlya8 and neon_nota8. (arch_enabled): Handle new arch types. (one_cmpldi2): Add NEON support. diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index 751997f..6669329 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -207,7 +207,7 @@ ; for ARM or Thumb-2 with arm_arch6, and nov6 for ARM without ; arm_arch6. This attribute is used to compute attribute enabled, ; use type any to enable an alternative in all cases. -(define_attr arch any,a,t,32,t1,t2,v6,nov6,onlya8,nota8 +(define_attr arch any,a,t,32,t1,t2,v6,nov6,onlya8,neon_onlya8,nota8,neon_nota8 (const_string any)) (define_attr arch_enabled no,yes @@ -246,8 +246,18 @@ (eq_attr tune cortexa8)) (const_string yes) + (and (eq_attr arch neon_onlya8) + (eq_attr tune cortexa8) + (match_test TARGET_NEON)) + (const_string yes) + (and (eq_attr arch nota8) (not (eq_attr tune cortexa8))) + (const_string yes) + + (and (eq_attr arch neon_nota8) + (not (eq_attr tune cortexa8)) + (match_test TARGET_NEON)) (const_string yes)] (const_string no))) @@ -4207,11 +4217,16 @@ ) (define_insn_and_split one_cmpldi2 - [(set (match_operand:DI 0 s_register_operand =r,r) - (not:DI (match_operand:DI 1 s_register_operand 0,r)))] + [(set (match_operand:DI 0 s_register_operand =w,r,r,?w) + (not:DI (match_operand:DI 1 s_register_operand w, 0, r, w)))] TARGET_32BIT - # - TARGET_32BIT reload_completed + @ + vmvn\t%P0, %P1 + # + # + vmvn\t%P0, %P1 + TARGET_32BIT reload_completed +arm_general_register_operand (operands[0], DImode) [(set (match_dup 0) (not:SI (match_dup 1))) (set (match_dup 2) (not:SI (match_dup 3)))] @@ -4221,8 +4236,10 @@ operands[3] = gen_highpart (SImode, operands[1]); operands[1] = gen_lowpart (SImode, operands[1]); } - [(set_attr length 8) - (set_attr predicable yes)] + [(set_attr length *,8,8,*) + (set_attr predicable yes) + (set_attr neon_type neon_int_1,*,*,neon_int_1) + (set_attr arch neon_nota8,*,*,neon_onlya8)] ) (define_expand one_cmplsi2
Re: [PATCH][ARM] NEON DImode not
On 03/27/12 13:23, Andrew Stubbs wrote: gcc/ * config/arm/arm.md (arch): Add neon_onlya8 and neon_nota8. (arch_enabled): Handle new arch types. (one_cmpldi2): Add NEON support. Looks good to me. r~
Re: [PATCH][ARM] NEON DImode neg
On 28/02/12 17:45, Andrew Stubbs wrote: Hi all, This patch adds a DImode negate pattern for NEON. Unfortunately, the NEON vneg instruction only supports vectors, not singletons, so there's no direct way to do it in DImode, and the compiler ends up moving the value back to core registers, negating it, and returning to NEON afterwards: fmrrd r2, r3, d16 @ int negs r2, r2 sbc r3, r3, r3, lsl #1 fmdrr d16, r2, r3 @ int The new patch does it entirely in NEON: vmov.i32 d17, #0 @ di vsub.i64 d16, d17, d16 (Note that this is the result when combined with my recent patch for NEON DImode immediates. Without that you get a constant pool load.) This updates fixes a bootstrap failure caused by an early clobber error. I've also got a native regression test running now. OK? Andrew 2012-03-26 Andrew Stubbs a...@codesourcery.com gcc/ * config/arm/arm.md (negdi2): Use gen_negdi2_neon. * config/arm/neon.md (negdi2_neon): New insn. Also add splitters for core and NEON registers. --- gcc/config/arm/arm.md |8 +++- gcc/config/arm/neon.md | 37 + 2 files changed, 44 insertions(+), 1 deletions(-) diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index 751997f..f1dbbf7 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -4048,7 +4048,13 @@ (neg:DI (match_operand:DI 1 s_register_operand ))) (clobber (reg:CC CC_REGNUM))])] TARGET_EITHER - + { +if (TARGET_NEON) + { +emit_insn (gen_negdi2_neon (operands[0], operands[1])); + DONE; + } + } ) ;; The constraints here are to prevent a *partial* overlap (where %Q0 == %R1). diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md index 3c88568..bf229a7 100644 --- a/gcc/config/arm/neon.md +++ b/gcc/config/arm/neon.md @@ -922,6 +922,43 @@ (const_string neon_int_3)))] ) +(define_insn negdi2_neon + [(set (match_operand:DI 0 s_register_operand = w,?r,?r,?w) + (neg:DI (match_operand:DI 1 s_register_operand w, 0, r, w))) + (clobber (match_scratch:DI 2 =w, X, X,w)) + (clobber (reg:CC CC_REGNUM))] + TARGET_NEON + # + [(set_attr length 8) + (set_attr arch nota8,*,*,onlya8)] +) + +; Split negdi2_neon for vfp registers +(define_split + [(set (match_operand:DI 0 s_register_operand ) + (neg:DI (match_operand:DI 1 s_register_operand ))) + (clobber (match_scratch:DI 2 )) + (clobber (reg:CC CC_REGNUM))] + TARGET_NEON reload_completed IS_VFP_REGNUM (REGNO (operands[0])) + [(set (match_dup 2) (const_int 0)) + (parallel [(set (match_dup 0) (minus:DI (match_dup 2) (match_dup 1))) + (clobber (reg:CC CC_REGNUM))])] + +) + +; Split negdi2_neon for core registers +(define_split + [(set (match_operand:DI 0 s_register_operand ) + (neg:DI (match_operand:DI 1 s_register_operand ))) + (clobber (match_scratch:DI 2 )) + (clobber (reg:CC CC_REGNUM))] + TARGET_32BIT reload_completed +arm_general_register_operand (operands[0], DImode) + [(parallel [(set (match_dup 0) (neg:DI (match_dup 1))) + (clobber (reg:CC CC_REGNUM))])] + +) + (define_insn *uminmode3_neon [(set (match_operand:VDQIW 0 s_register_operand =w) (umin:VDQIW (match_operand:VDQIW 1 s_register_operand w)
Re: [PATCH][ARM] NEON DImode not
On 01/03/12 17:07, Richard Earnshaw wrote: The RTL part of one_cmpldi2_internal and one_cmpldi2_neon are the same. Given that we now have controls to determine when an alternative is enabled it's generally better to have just one pattern here and turn on the alternatives that are suitable rather than having multiple patterns. You're already half doing this with the nota8 and onlya8 controls. Ok, this patch unifies the two and emits the NEON instructions directly from the arm.md pattern. I was under the impression that it is desirable to keep the neon stuff in neon.md as far as possible. OK? Andrew P.S. The insn_enabled code is not ideal, but the 'arch' attribute that might normally deal with this is already in use. Alternative ways might be to have a variation on the 'w' constraint ('W'?) that is conditional on TARGET_NEON, or to have compound arch values (e.g. onlya8_neon)? 2012-03-08 Andrew Stubbs a...@codesourcery.com gcc/ * config/arm/arm.md (one_cmpldi2): Add NEON support. --- gcc/config/arm/arm.md | 26 -- 1 files changed, 20 insertions(+), 6 deletions(-) diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index 751997f..27a0f81 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -4207,11 +4207,16 @@ ) (define_insn_and_split one_cmpldi2 - [(set (match_operand:DI 0 s_register_operand =r,r) - (not:DI (match_operand:DI 1 s_register_operand 0,r)))] + [(set (match_operand:DI 0 s_register_operand =w,r,r,?w) + (not:DI (match_operand:DI 1 s_register_operand w, 0, r, w)))] TARGET_32BIT - # - TARGET_32BIT reload_completed + @ + vmvn\t%P0, %P1 + # + # + vmvn\t%P0, %P1 + TARGET_32BIT reload_completed +arm_general_register_operand (operands[0], DImode) [(set (match_dup 0) (not:SI (match_dup 1))) (set (match_dup 2) (not:SI (match_dup 3)))] @@ -4221,8 +4226,17 @@ operands[3] = gen_highpart (SImode, operands[1]); operands[1] = gen_lowpart (SImode, operands[1]); } - [(set_attr length 8) - (set_attr predicable yes)] + [(set_attr length *,8,8,*) + (set_attr predicable yes) + (set_attr neon_type neon_int_1,*,*,neon_int_1) + (set_attr arch nota8,*,*,onlya8) + (set_attr_alternative insn_enabled + [(if_then_else (match_test TARGET_NEON) + (const_string yes) (const_string no)) + (const_string yes) + (const_string yes) + (if_then_else (match_test TARGET_NEON) + (const_string yes) (const_string no))])] ) (define_expand one_cmplsi2
Re: [PATCH][ARM] NEON DImode not
On 03/08/12 08:19, Andrew Stubbs wrote: + (set_attr arch nota8,*,*,onlya8) + (set_attr_alternative insn_enabled + [(if_then_else (match_test TARGET_NEON) +(const_string yes) (const_string no)) + (const_string yes) + (const_string yes) + (if_then_else (match_test TARGET_NEON) +(const_string yes) (const_string no))])] ) While this works, it might be better to add neon/neon_na8/neon_oa8 alternatives to the arch attribute, and adjust arch_enabled to match. Obviously this opinion is non-binding; Richard E might have other plans... r~
Re: [PATCH][ARM] NEON DImode not
On Wed 29 Feb 2012 18:00:19 GMT, Richard Earnshaw wrote: Why can't we have a single insn that deals with the have-neon and dont-have-neon cases? Sorry, I'm not sure I follow? There's one insn for the have-neon case, and one for the don't-have-neon. The expander is necessary to prevent gen_one_cmpldi2 locking recog to a disabled pattern (it caches the recog result, I think). It would be possible to have the arm.md insn emit NEON instructions, but that's not the usual practice, I think? The neon.md isns could also emit arm/thumb2 instructions, but there's no real point since there are already two different splitters for that. Andrew
Re: [PATCH][ARM] NEON DImode not
On 01/03/12 12:57, Andrew Stubbs wrote: On Wed 29 Feb 2012 18:00:19 GMT, Richard Earnshaw wrote: Why can't we have a single insn that deals with the have-neon and dont-have-neon cases? Sorry, I'm not sure I follow? There's one insn for the have-neon case, and one for the don't-have-neon. The expander is necessary to prevent gen_one_cmpldi2 locking recog to a disabled pattern (it caches the recog result, I think). It would be possible to have the arm.md insn emit NEON instructions, but that's not the usual practice, I think? The neon.md isns could also emit arm/thumb2 instructions, but there's no real point since there are already two different splitters for that. Andrew The RTL part of one_cmpldi2_internal and one_cmpldi2_neon are the same. Given that we now have controls to determine when an alternative is enabled it's generally better to have just one pattern here and turn on the alternatives that are suitable rather than having multiple patterns. You're already half doing this with the nota8 and onlya8 controls. R.
Re: [PATCH][ARM] NEON DImode neg
On 28/02/12 17:45, Andrew Stubbs wrote: This patch adds a DImode negate pattern for NEON. Oops, that patch completely failed to allow for the fallback to instructions that work in core registers. It also forgot to mention that the CC register was clobbered. This patch is the same except that it addresses those deficiencies. The outputs are the same. Ok for 4.8? Andrew 2012-02-29 Andrew Stubbs a...@codesourcery.com gcc/ * config/arm/arm.md (negdi2): Use gen_negdi2_neon. * config/arm/neon.md (negdi2_neon): New insn. Also add splitters for core and NEON registers. --- gcc/config/arm/arm.md |8 +++- gcc/config/arm/neon.md | 37 + 2 files changed, 44 insertions(+), 1 deletions(-) diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index 751997f..f1dbbf7 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -4048,7 +4048,13 @@ (neg:DI (match_operand:DI 1 s_register_operand ))) (clobber (reg:CC CC_REGNUM))])] TARGET_EITHER - + { +if (TARGET_NEON) + { +emit_insn (gen_negdi2_neon (operands[0], operands[1])); + DONE; + } + } ) ;; The constraints here are to prevent a *partial* overlap (where %Q0 == %R1). diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md index d7caa37..b040ab1 100644 --- a/gcc/config/arm/neon.md +++ b/gcc/config/arm/neon.md @@ -922,6 +922,43 @@ (const_string neon_int_3)))] ) +(define_insn negdi2_neon + [(set (match_operand:DI 0 s_register_operand =w,?r,?r,?w) + (neg:DI (match_operand:DI 1 s_register_operand w, 0, r, w))) + (clobber (match_scratch:DI 2 =w, X, X, w)) + (clobber (reg:CC CC_REGNUM))] + TARGET_NEON + # + [(set_attr length 8) + (set_attr arch nota8,*,*,onlya8)] +) + +; Split negdi2_neon for vfp registers +(define_split + [(set (match_operand:DI 0 s_register_operand ) + (neg:DI (match_operand:DI 1 s_register_operand ))) + (clobber (match_scratch:DI 2 )) + (clobber (reg:CC CC_REGNUM))] + TARGET_NEON reload_completed IS_VFP_REGNUM (REGNO (operands[0])) + [(set (match_dup 2) (const_int 0)) + (parallel [(set (match_dup 0) (minus:DI (match_dup 2) (match_dup 1))) + (clobber (reg:CC CC_REGNUM))])] + +) + +; Split negdi2_neon for core registers +(define_split + [(set (match_operand:DI 0 s_register_operand ) + (neg:DI (match_operand:DI 1 s_register_operand ))) + (clobber (match_scratch:DI 2 )) + (clobber (reg:CC CC_REGNUM))] + TARGET_32BIT reload_completed +arm_general_register_operand (operands[0], DImode) + [(parallel [(set (match_dup 0) (neg:DI (match_dup 1))) + (clobber (reg:CC CC_REGNUM))])] + +) + (define_insn *uminmode3_neon [(set (match_operand:VDQIW 0 s_register_operand =w) (umin:VDQIW (match_operand:VDQIW 1 s_register_operand w)
Re: [PATCH][ARM] NEON DImode not
On 29/02/12 14:48, Andrew Stubbs wrote: Hi all, This patch adds support for the DImode not operation in NEON. Currently the compiler must move the value to core registers, invert it there, and move it back again. This is bonkers because the VMVN instruction will do the job perfectly. The patch adds a pattern to support VMVN in DImode (in addition to the vector modes already supported) and retains the support for doing bitwise not in core registers where appropriate. OK for 4.8? Andrew neon-not64.patch 2012-02-29 Andrew Stubbs a...@codesourcery.com gcc/ * config/arm/arm.md (one_cmpldi2): Rename to one_cmpldi2_internal and replace with a new define_expand. (one_cmpldi2_internal): Exclude splitting for VFP registers. * config/arm/neon.md (one_cmpldi2_neon): New pattern. --- gcc/config/arm/arm.md | 13 ++--- gcc/config/arm/neon.md | 14 ++ 2 files changed, 24 insertions(+), 3 deletions(-) diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index 751997f..93fde58 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -4206,12 +4206,19 @@ TARGET_32BIT TARGET_HARD_FLOAT (TARGET_FPA || TARGET_VFP_DOUBLE) ) -(define_insn_and_split one_cmpldi2 +(define_expand one_cmpldi2 + [(set (match_operand:DI 0 s_register_operand ) + (not:DI (match_operand:DI 1 s_register_operand )))] + TARGET_32BIT + ) + +(define_insn_and_split *one_cmpldi2_internal [(set (match_operand:DI 0 s_register_operand =r,r) (not:DI (match_operand:DI 1 s_register_operand 0,r)))] - TARGET_32BIT + TARGET_32BIT !TARGET_NEON # - TARGET_32BIT reload_completed + TARGET_32BIT reload_completed +arm_general_register_operand (operands[0], DImode) [(set (match_dup 0) (not:SI (match_dup 1))) (set (match_dup 2) (not:SI (match_dup 3)))] diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md index d7caa37..f34d266 100644 --- a/gcc/config/arm/neon.md +++ b/gcc/config/arm/neon.md @@ -896,6 +896,20 @@ [(set_attr neon_type neon_int_1)] ) +(define_insn *one_cmpldi2_neon + [(set (match_operand:DI 0 s_register_operand =w,?r,?r,?w) +(not:DI (match_operand:DI 1 s_register_operand w, 0, r, w)))] + TARGET_NEON + @ + vmvn\t%P0, %P1 + # + # + vmvn\t%P0, %P1 + [(set_attr neon_type neon_int_1,*,*,neon_int_1) + (set_attr arch nota8,*,*,onlya8) + (set_attr length *,8,8,*)] +) + (define_insn absmode2 [(set (match_operand:VDQW 0 s_register_operand =w) (abs:VDQW (match_operand:VDQW 1 s_register_operand w)))] Why can't we have a single insn that deals with the have-neon and dont-have-neon cases? R.