Re: [AArch64] Handle HFAs of float16 types properly

2016-08-05 Thread Richard Earnshaw (lists)
On 26/07/16 14:55, James Greenhalgh wrote:
> 
> Hi,
> 
> It looks like we've not been handling structures of 16-bit floating-point
> data correctly for AArch64. For some reason we end up passing them
> packed in to integer registers. That is to say, on trunk and GCC 6, for:
> 
>   struct x {
> __fp16 x[4];
>   };
> 
>   __fp16
>   foo1 (struct x x)
>   {
> return x.x[1];
>   }
> 
> We generate:
> 
>   foo1:
>   sbfxx0, x0, 16, 16
>   mov v0.h[0], w0
>   ret
> 
> Which is wrong.
> 
> This patch fixes that, so now we generate:
> 
>   foo1:
>   umovw0, v1.h[0]
>   sxthx0, w0
>   mov v0.h[0], w0
>   ret
> 
> Far from optimal (I'll work on that...) but at least getting the data from
> the right register bank!
> 
> To do this we need to keep around a reference to the fp16 type after we
> construct it. I've moved this initialisation to a new function
> aarch64_init_fp16_types in aarch64-builtins.c and made the references
> available through arm_neon.h.
> 
> After that, we want to remove the #if 0 wrapping HFmode support in
> aarch64_gimplify_va_arg_expr in aarch64.c, and add HFmode to the
> REAL_TYPE and COMPLEX_TYPE support in aapcs_vfp_sub_candidate.
> 
> Strictly speaking, we don't need the hunk regarding COMPLEX_TYPE.
> We can't build complex forms of __fp16. But, were we ever to support the
> _Float16 type we'd need this. Rather than leave the chance it will be
> forgotten about, I've just added it here. If the maintainers would prefer,
> I can change this to a TODO and put a sticky-note somewhere near my desk.
> 
> With those simple changes, we fix the argument passing. The rest of the
> patch is an update to the various testcases in aapcs64.exp to fully cover
> various __fp16 cases (both naked, and within an HFA).
> 
> Bootstrapped on aarch64-none-linux-gnu and tested with no issues. Also
> tested on aarch64_be-none-elf. All test came back clean.
> 
> OK? As this is an ABI break, I'm not proposing for it to go back to GCC 6,
> though it will apply cleanly there if the maintainers support that.
> 

Can you please file a PR for this and use that when committing.  As
previously discussed, since this was new for 6.1 having a PR makes it
easier if we do decide to have a back-port.

OK on that basis.

R.

> Thanks,
> James
> 
> ---
> 
> gcc/
> 
> 2016-07-26  James Greenhalgh  
> 
>   * config/aarch64/aarch64.h (aarch64_fp16_type_node): Declare.
>   (aarch64_fp16_ptr_type_node): Likewise.
>   * config/aarch64/aarch64-simd-builtins.c
>   (aarch64_fp16_ptr_type_node): Define.
>   (aarch64_init_fp16_types): New, refactored out of...
>   (aarch64_init_builtins): ...here, update to call
>   aarch64_init_fp16_types.
>   * config/aarch64/aarch64.c (aarch64_gimplify_va_arg_expr): Handle
>   HFmode.
>   (aapcs_vfp_sub_candidate): Likewise.
> 
> gcc/testsuite/
> 
> 2016-07-26  James Greenhalgh  
> 
>   * gcc.target/aarch64/aapcs64/abitest-common.h: Define half-precision
>   registers.
>   * gcc.target/aarch64/aapcs64/abitest.S (dumpregs): Add assembly for
>   saving the half-precision registers.
>   * gcc.target/aarch64/aapcs64/func-ret-1.c: Test that an __fp16
>   value is returned in h0.
>   * gcc.target/aarch64/aapcs64/test_2.c: Check that __FP16 arguments
>   are passed in FP/SIMD registers.
>   * gcc.target/aarch64/aapcs64/test_27.c: New, test that __fp16 HFA
>   passing works corrcetly.
>   * gcc.target/aarch64/aapcs64/type-def.h (hfa_f16x1_t): New.
>   (hfa_f16x2_t): Likewise.
>   (hfa_f16x3_t): Likewise.
>   * gcc.target/aarch64/aapcs64/va_arg-1.c: Check that __fp16 values
>   are promoted to double and passed in a double register.
>   * gcc.target/aarch64/aapcs64/va_arg-2.c: Check that __fp16 values
>   are promoted to double and stacked.
>   * gcc.target/aarch64/aapcs64/va_arg-4.c: Check stacking of HFA of
>   __fp16 data types.
>   * gcc.target/aarch64/aapcs64/va_arg-5.c: Likewise.
>   * gcc.target/aarch64/aapcs64/va_arg-16.c: New, check HFAs of
>   __fp16 first get passed in FP/SIMD registers, then stacked.
> 
> 
> 0001-AArch64-Handle-HFAs-of-float16-types-properly.patch
> 
> 
> diff --git a/gcc/config/aarch64/aarch64-builtins.c 
> b/gcc/config/aarch64/aarch64-builtins.c
> index ca91d91..1de325a 100644
> --- a/gcc/config/aarch64/aarch64-builtins.c
> +++ b/gcc/config/aarch64/aarch64-builtins.c
> @@ -443,13 +443,15 @@ static struct aarch64_simd_type_info aarch64_simd_types 
> [] = {
>  };
>  #undef ENTRY
>  
> -/* This type is not SIMD-specific; it is the user-visible __fp16.  */
> -static tree aarch64_fp16_type_node = NULL_TREE;
> -
>  static tree aarch64_simd_intOI_type_node = NULL_TREE;
>  static tree aarch64_simd_intCI_type_node = NULL_TREE;
>  static tree aarch64_simd_intXI_type_node = NULL_TREE;
>  
> +/* The user-visible __fp16 type, and a pointer to that type.  Used
> +   across the back-end.  */
> +tree aarch64_fp16_type_no

Re: [AArch64] Handle HFAs of float16 types properly

2016-08-05 Thread Richard Earnshaw (lists)
On 05/08/16 15:17, James Greenhalgh wrote:
> On Fri, Aug 05, 2016 at 11:15:24AM +0100, James Greenhalgh wrote:
>> On Fri, Aug 05, 2016 at 11:00:39AM +0100, Yao Qi wrote:
>>> On Tue, Jul 26, 2016 at 2:55 PM, James Greenhalgh
>>>  wrote:

 OK? As this is an ABI break, I'm not proposing for it to go back to GCC 6,
 though it will apply cleanly there if the maintainers support that.

>>>
>>> What do you mean by "ABI break"?  AFAICS, with this patch, it conforms to
>>> AAPCS.
>>
>> After this patch code generated for GCC 4.9/5/6 will not be ABI
> 
> Note that the __fp16 type was only added for AArch64 for GCC 6, so there
> would be no break going back to the earlier branches.
> 
> The only released compiler we would potentially have an ABI break against
> would be GCC 6.1 (and any vendor/distibution compilers that had backported
> the __fp16 support).
> 
> __fp16 is a fairly corner-case type anyway, so the actual impact of this
> break should be reasonably well limited. Especially if we backport the fix
> such that GCC 6.2 contains the fix.

I agree.  Given this was a new feature we should fix it and be done.
6.1 was buggy, 6.2 is it!

R.

> 
> Thanks,
> James
> 
>> compatible with code generated for GCC 7 for HFAs of __fp16. The new
>> generated code will conform to AAPCS64, but the old code didn't so there has
>> been an ABI change between the GCC versions. We don't like doing that for
>> minor releases, so the patch is not really suitable for backporting.
>>
>>> The subject leads me thinking about the handling of HVA of float16.
>>
>> These are handled like any other vector, the code looking at HVA's doesn't
>> care about the inner mode of the vector just the bitsize:
>>



Re: [AArch64] Handle HFAs of float16 types properly

2016-08-05 Thread James Greenhalgh
On Fri, Aug 05, 2016 at 11:15:24AM +0100, James Greenhalgh wrote:
> On Fri, Aug 05, 2016 at 11:00:39AM +0100, Yao Qi wrote:
> > On Tue, Jul 26, 2016 at 2:55 PM, James Greenhalgh
> >  wrote:
> > >
> > > OK? As this is an ABI break, I'm not proposing for it to go back to GCC 6,
> > > though it will apply cleanly there if the maintainers support that.
> > >
> > 
> > What do you mean by "ABI break"?  AFAICS, with this patch, it conforms to
> > AAPCS.
> 
> After this patch code generated for GCC 4.9/5/6 will not be ABI

Note that the __fp16 type was only added for AArch64 for GCC 6, so there
would be no break going back to the earlier branches.

The only released compiler we would potentially have an ABI break against
would be GCC 6.1 (and any vendor/distibution compilers that had backported
the __fp16 support).

__fp16 is a fairly corner-case type anyway, so the actual impact of this
break should be reasonably well limited. Especially if we backport the fix
such that GCC 6.2 contains the fix.

Thanks,
James

> compatible with code generated for GCC 7 for HFAs of __fp16. The new
> generated code will conform to AAPCS64, but the old code didn't so there has
> been an ABI change between the GCC versions. We don't like doing that for
> minor releases, so the patch is not really suitable for backporting.
> 
> > The subject leads me thinking about the handling of HVA of float16.
> 
> These are handled like any other vector, the code looking at HVA's doesn't
> care about the inner mode of the vector just the bitsize:
> 



Re: [AArch64] Handle HFAs of float16 types properly

2016-08-05 Thread James Greenhalgh
On Fri, Aug 05, 2016 at 11:00:39AM +0100, Yao Qi wrote:
> On Tue, Jul 26, 2016 at 2:55 PM, James Greenhalgh
>  wrote:
> >
> > OK? As this is an ABI break, I'm not proposing for it to go back to GCC 6,
> > though it will apply cleanly there if the maintainers support that.
> >
> 
> What do you mean by "ABI break"?  AFAICS, with this patch, it conforms to
> AAPCS.

After this patch code generated for GCC 4.9/5/6 will not be ABI
compatible with code generated for GCC 7 for HFAs of __fp16. The new
generated code will conform to AAPCS64, but the old code didn't so there has
been an ABI change between the GCC versions. We don't like doing that for
minor releases, so the patch is not really suitable for backporting.

> The subject leads me thinking about the handling of HVA of float16.

These are handled like any other vector, the code looking at HVA's doesn't
care about the inner mode of the vector just the bitsize:

  config/aarch64/aarch64.c::aapcs_vfp_sub_candidate

case VECTOR_TYPE:
  /* Use V2SImode and V4SImode as representatives of all 64-bit
 and 128-bit vector types.  */
  size = int_size_in_bytes (type);
  switch (size)
{
case 8:
  mode = V2SImode;
  break;
case 16:
  mode = V4SImode;
  break;
default:
  return -1;
}

  if (*modep == VOIDmode)
*modep = mode;

  /* Vector modes are considered to be opaque: two vectors are
 equivalent for the purposes of being homogeneous aggregates
 if they are the same size.  */
  if (*modep == mode)
return 1;

  break;

Thanks,
James



Re: [AArch64] Handle HFAs of float16 types properly

2016-08-05 Thread Yao Qi
On Tue, Jul 26, 2016 at 2:55 PM, James Greenhalgh
 wrote:
>
> OK? As this is an ABI break, I'm not proposing for it to go back to GCC 6,
> though it will apply cleanly there if the maintainers support that.
>

What do you mean by "ABI break"?  AFAICS, with this patch, it conforms to
AAPCS.

The subject leads me thinking about the handling of HVA of float16.

-- 
Yao (齐尧)


Re: [AArch64] Handle HFAs of float16 types properly

2016-08-04 Thread James Greenhalgh
On Tue, Jul 26, 2016 at 02:55:02PM +0100, James Greenhalgh wrote:
> 
> Hi,
> 
> It looks like we've not been handling structures of 16-bit floating-point
> data correctly for AArch64. For some reason we end up passing them
> packed in to integer registers. That is to say, on trunk and GCC 6, for:
> 
>   struct x {
> __fp16 x[4];
>   };
> 
>   __fp16
>   foo1 (struct x x)
>   {
> return x.x[1];
>   }
> 
> We generate:
> 
>   foo1:
>   sbfxx0, x0, 16, 16
>   mov v0.h[0], w0
>   ret
> 
> Which is wrong.
> 
> This patch fixes that, so now we generate:
> 
>   foo1:
>   umovw0, v1.h[0]
>   sxthx0, w0
>   mov v0.h[0], w0
>   ret
> 
> Far from optimal (I'll work on that...) but at least getting the data from
> the right register bank!
> 
> To do this we need to keep around a reference to the fp16 type after we
> construct it. I've moved this initialisation to a new function
> aarch64_init_fp16_types in aarch64-builtins.c and made the references
> available through arm_neon.h.
> 
> After that, we want to remove the #if 0 wrapping HFmode support in
> aarch64_gimplify_va_arg_expr in aarch64.c, and add HFmode to the
> REAL_TYPE and COMPLEX_TYPE support in aapcs_vfp_sub_candidate.
> 
> Strictly speaking, we don't need the hunk regarding COMPLEX_TYPE.
> We can't build complex forms of __fp16. But, were we ever to support the
> _Float16 type we'd need this. Rather than leave the chance it will be
> forgotten about, I've just added it here. If the maintainers would prefer,
> I can change this to a TODO and put a sticky-note somewhere near my desk.
> 
> With those simple changes, we fix the argument passing. The rest of the
> patch is an update to the various testcases in aapcs64.exp to fully cover
> various __fp16 cases (both naked, and within an HFA).
> 
> Bootstrapped on aarch64-none-linux-gnu and tested with no issues. Also
> tested on aarch64_be-none-elf. All test came back clean.
> 
> OK? As this is an ABI break, I'm not proposing for it to go back to GCC 6,
> though it will apply cleanly there if the maintainers support that.

*Ping*

https://gcc.gnu.org/ml/gcc-patches/2016-07/msg01720.html

Thanks,
James

> 
> gcc/
> 
> 2016-07-26  James Greenhalgh  
> 
>   * config/aarch64/aarch64.h (aarch64_fp16_type_node): Declare.
>   (aarch64_fp16_ptr_type_node): Likewise.
>   * config/aarch64/aarch64-simd-builtins.c
>   (aarch64_fp16_ptr_type_node): Define.
>   (aarch64_init_fp16_types): New, refactored out of...
>   (aarch64_init_builtins): ...here, update to call
>   aarch64_init_fp16_types.
>   * config/aarch64/aarch64.c (aarch64_gimplify_va_arg_expr): Handle
>   HFmode.
>   (aapcs_vfp_sub_candidate): Likewise.
> 
> gcc/testsuite/
> 
> 2016-07-26  James Greenhalgh  
> 
>   * gcc.target/aarch64/aapcs64/abitest-common.h: Define half-precision
>   registers.
>   * gcc.target/aarch64/aapcs64/abitest.S (dumpregs): Add assembly for
>   saving the half-precision registers.
>   * gcc.target/aarch64/aapcs64/func-ret-1.c: Test that an __fp16
>   value is returned in h0.
>   * gcc.target/aarch64/aapcs64/test_2.c: Check that __FP16 arguments
>   are passed in FP/SIMD registers.
>   * gcc.target/aarch64/aapcs64/test_27.c: New, test that __fp16 HFA
>   passing works corrcetly.
>   * gcc.target/aarch64/aapcs64/type-def.h (hfa_f16x1_t): New.
>   (hfa_f16x2_t): Likewise.
>   (hfa_f16x3_t): Likewise.
>   * gcc.target/aarch64/aapcs64/va_arg-1.c: Check that __fp16 values
>   are promoted to double and passed in a double register.
>   * gcc.target/aarch64/aapcs64/va_arg-2.c: Check that __fp16 values
>   are promoted to double and stacked.
>   * gcc.target/aarch64/aapcs64/va_arg-4.c: Check stacking of HFA of
>   __fp16 data types.
>   * gcc.target/aarch64/aapcs64/va_arg-5.c: Likewise.
>   * gcc.target/aarch64/aapcs64/va_arg-16.c: New, check HFAs of
>   __fp16 first get passed in FP/SIMD registers, then stacked.
>