On Tue, Jul 26, 2016 at 02:55:02PM +0100, James Greenhalgh wrote: > > Hi, > > It looks like we've not been handling structures of 16-bit floating-point > data correctly for AArch64. For some reason we end up passing them > packed in to integer registers. That is to say, on trunk and GCC 6, for: > > struct x { > __fp16 x[4]; > }; > > __fp16 > foo1 (struct x x) > { > return x.x[1]; > } > > We generate: > > foo1: > sbfx x0, x0, 16, 16 > mov v0.h[0], w0 > ret > > Which is wrong. > > This patch fixes that, so now we generate: > > foo1: > umov w0, v1.h[0] > sxth x0, w0 > mov v0.h[0], w0 > ret > > Far from optimal (I'll work on that...) but at least getting the data from > the right register bank! > > To do this we need to keep around a reference to the fp16 type after we > construct it. I've moved this initialisation to a new function > aarch64_init_fp16_types in aarch64-builtins.c and made the references > available through arm_neon.h. > > After that, we want to remove the #if 0 wrapping HFmode support in > aarch64_gimplify_va_arg_expr in aarch64.c, and add HFmode to the > REAL_TYPE and COMPLEX_TYPE support in aapcs_vfp_sub_candidate. > > Strictly speaking, we don't need the hunk regarding COMPLEX_TYPE. > We can't build complex forms of __fp16. But, were we ever to support the > _Float16 type we'd need this. Rather than leave the chance it will be > forgotten about, I've just added it here. If the maintainers would prefer, > I can change this to a TODO and put a sticky-note somewhere near my desk. > > With those simple changes, we fix the argument passing. The rest of the > patch is an update to the various testcases in aapcs64.exp to fully cover > various __fp16 cases (both naked, and within an HFA). > > Bootstrapped on aarch64-none-linux-gnu and tested with no issues. Also > tested on aarch64_be-none-elf. All test came back clean. > > OK? As this is an ABI break, I'm not proposing for it to go back to GCC 6, > though it will apply cleanly there if the maintainers support that.
*Ping* https://gcc.gnu.org/ml/gcc-patches/2016-07/msg01720.html Thanks, James > > gcc/ > > 2016-07-26 James Greenhalgh <james.greenha...@arm.com> > > * config/aarch64/aarch64.h (aarch64_fp16_type_node): Declare. > (aarch64_fp16_ptr_type_node): Likewise. > * config/aarch64/aarch64-simd-builtins.c > (aarch64_fp16_ptr_type_node): Define. > (aarch64_init_fp16_types): New, refactored out of... > (aarch64_init_builtins): ...here, update to call > aarch64_init_fp16_types. > * config/aarch64/aarch64.c (aarch64_gimplify_va_arg_expr): Handle > HFmode. > (aapcs_vfp_sub_candidate): Likewise. > > gcc/testsuite/ > > 2016-07-26 James Greenhalgh <james.greenha...@arm.com> > > * gcc.target/aarch64/aapcs64/abitest-common.h: Define half-precision > registers. > * gcc.target/aarch64/aapcs64/abitest.S (dumpregs): Add assembly for > saving the half-precision registers. > * gcc.target/aarch64/aapcs64/func-ret-1.c: Test that an __fp16 > value is returned in h0. > * gcc.target/aarch64/aapcs64/test_2.c: Check that __FP16 arguments > are passed in FP/SIMD registers. > * gcc.target/aarch64/aapcs64/test_27.c: New, test that __fp16 HFA > passing works corrcetly. > * gcc.target/aarch64/aapcs64/type-def.h (hfa_f16x1_t): New. > (hfa_f16x2_t): Likewise. > (hfa_f16x3_t): Likewise. > * gcc.target/aarch64/aapcs64/va_arg-1.c: Check that __fp16 values > are promoted to double and passed in a double register. > * gcc.target/aarch64/aapcs64/va_arg-2.c: Check that __fp16 values > are promoted to double and stacked. > * gcc.target/aarch64/aapcs64/va_arg-4.c: Check stacking of HFA of > __fp16 data types. > * gcc.target/aarch64/aapcs64/va_arg-5.c: Likewise. > * gcc.target/aarch64/aapcs64/va_arg-16.c: New, check HFAs of > __fp16 first get passed in FP/SIMD registers, then stacked. >