Hi Tim and reviewers,
I followed Tim's suggestion to use macros to handle the float16_t storage type. Notes:

1) Since the vget/vset_lane_f16 intrinsics only read and write 16-bit data (no FP arithmetic is performed), I simply reinterpreted float16_t and the vector of float16_t as i16 data. See the operators defined in NeonEmitter.
2) With this change, vget/vset_lane_f16 use the vget/vset_lane_i16 implementation.
3) I added an f16_to_f32 pattern because in the vset_lane case the i16 data is moved from a GPR to an FP register, hence the need for this pattern.
4) I added test cases that define a float16_t variable in the function body but do not return a value of that type, as that is not allowed. Let me know if these tests are satisfactory.
5) I did not try to enforce the recommended intrinsic->instruction mapping from the ARM document. To force those instructions I would have to use builtins and v1i16 type casts so that I can create the pattern. Even then, in some cases the UMOV and INS patterns defined earlier can take precedence.

Thanks,
Ana.
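For readers without the patch at hand, here is a rough, portable C sketch of the idea in notes 1 and 2: the f16 lane accessors only move 16-bit patterns, so they can delegate to the i16 versions. This is not the code from the patch or from arm_neon.h. The type names (f16_bits_t, f16x4_model_t, i16x4_model_t) and the function names are hypothetical stand-ins chosen for illustration, and the vectors are modelled as plain 4-element structs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only; these are NOT the real
 * arm_neon.h types. f16_bits_t holds the raw half-precision bit pattern. */
typedef uint16_t f16_bits_t;
typedef struct { f16_bits_t lane[4]; } f16x4_model_t; /* models float16x4_t */
typedef struct { int16_t lane[4]; } i16x4_model_t;    /* models int16x4_t   */

/* Plain 16-bit integer lane read. */
static int16_t get_lane_i16(i16x4_model_t v, unsigned lane) {
    assert(lane < 4);
    return v.lane[lane];
}

/* Plain 16-bit integer lane write; returns the updated vector. */
static i16x4_model_t set_lane_i16(int16_t x, i16x4_model_t v, unsigned lane) {
    assert(lane < 4);
    v.lane[lane] = x;
    return v;
}

/* f16 lane read: reinterpret the vector as i16, reuse the i16 accessor,
 * then reinterpret the 16-bit result back. No FP arithmetic is involved. */
static f16_bits_t get_lane_f16(f16x4_model_t v, unsigned lane) {
    i16x4_model_t iv;
    f16_bits_t out;
    int16_t r;
    memcpy(&iv, &v, sizeof iv);
    r = get_lane_i16(iv, lane);
    memcpy(&out, &r, sizeof out);
    return out;
}

/* f16 lane write: same bit-level reinterpretation in both directions. */
static f16x4_model_t set_lane_f16(f16_bits_t x, f16x4_model_t v, unsigned lane) {
    i16x4_model_t iv;
    int16_t ix;
    memcpy(&iv, &v, sizeof iv);
    memcpy(&ix, &x, sizeof ix);
    iv = set_lane_i16(ix, iv, lane);
    memcpy(&v, &iv, sizeof v);
    return v;
}
```

The memcpy calls stand in for the bitcasts; a value such as 0xC000 (the half-precision encoding of -2.0) should pass through the i16 path unchanged, which is why no float conversion is needed for these two intrinsics.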
Attachments:
0001-llvm-Implemented-vget-vset_lane_f16-intrinsics.patch
0001-clang-Implemented-vget-vset_lane_f16-intrinsics.patch
_______________________________________________
cfe-commits mailing list
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits