SjoerdMeijer added inline comments.
================ Comment at: clang/include/clang/Basic/arm_fp16.td:58 +class IInst<string n, string p, string t> : Inst<n, p, t, OP_NONE> {} + +// ARMv8.2-A FP16 intrinsics. ---------------- az wrote: > SjoerdMeijer wrote: > > There's a little bit of duplication here: the definitions above are the > > same as in arm_neon.td. Would it be easy to share this, with e.g. an > > include? > The duplication is small compared to the overall infrastructure/data > structure needed to automatically generate the intrinsics. There are 3 ways > to do this: 1) copy only the needed data structure in arm_fp16.td (this is > what was done in original review) 2) put all data structure in a newly > created file and include it in arm_neon.td and arm_fp16.td (done here). 3) > put only the duplication in a new file and include it. I did not go for this > one given that we create a new file for the only purpose of avoiding a small > duplication but I am fine of going with 3 too. Note that some of the > duplicated structure in the original arm_fp16.td was a stripped down version > of the copied one. Given that the duplication is tiny, I don't have strong opinions to be honest. Would be nice to share these definitions if that's easy to do, otherwise we can perfectly live with this I think. ================ Comment at: clang/lib/CodeGen/CGBuiltin.cpp:4102 NEONMAP1(vuqadds_s32, aarch64_neon_suqadd, Add1ArgType), + // FP16 scalar intrinisics go here. + NEONMAP1(vabdh_f16, aarch64_sisd_fabd, Add1ArgType), ---------------- az wrote: > SjoerdMeijer wrote: > > Looks like a few intrinsic descriptions are missing here. For example, the > > first 2-operand intrinsic vaddh_f16 is missing, but there are also more. Is > > this intentional, or might they have slipped through the cracks (or am I > > missing something)? > I agree that this is confusing. For the intrinsics listed in this table, code > generation happens in a generic way based on the info in the table. The ones > not listed in this table are addressed in a more specific way below in a the > function called EmitAArch64BuiltinExpr. While I do not like how few things > were implemented in generating the intrinsics, I am in general following the > approach taken for arm_neon instead of introducing a new approach. Ah, I see, I somehow missed that. Fair enough. ================ Comment at: clang/lib/CodeGen/CGBuiltin.cpp:6511 "vgetq_lane"); + case NEON::BI__builtin_neon_vaddh_f16: + Ops.push_back(EmitScalarExpr(E->getArg(1))); ---------------- nit: indentation seems off by 1 for these case statements. ================ Comment at: clang/lib/CodeGen/CGBuiltin.cpp:6531 + Value *F = CGM.getIntrinsic(Intrinsic::fma, HalfTy); + Value *Zero = llvm::ConstantFP::getZeroValueForNegation(HalfTy); + Value* Sub = Builder.CreateFSub(Zero, EmitScalarExpr(E->getArg(1)), "vsubh"); ---------------- nit: indentation seems off by 1 ================ Comment at: clang/utils/TableGen/NeonEmitter.cpp:2464 + " * Permission is hereby granted, free of charge, to any person " + "obtaining " + "a copy\n" ---------------- more nits: I see this is copied from above, but I think this and the next line can be on the same line, just increasing readability a tiny bit. ================ Comment at: clang/utils/TableGen/NeonEmitter.cpp:2468 + "\"Software\")," + " to deal\n" + " * in the Software without restriction, including without limitation " ---------------- same here ================ Comment at: clang/utils/TableGen/NeonEmitter.cpp:2470 + " * in the Software without restriction, including without limitation " + "the " + "rights\n" ---------------- and same here ================ Comment at: llvm/include/llvm/IR/IntrinsicsAArch64.td:250 def int_aarch64_neon_umax : AdvSIMD_2VectorArg_Intrinsic; - def int_aarch64_neon_fmax : AdvSIMD_2VectorArg_Intrinsic; + def int_aarch64_neon_fmax : AdvSIMD_2FloatArg_Intrinsic; def int_aarch64_neon_fmaxnmp : AdvSIMD_2VectorArg_Intrinsic; ---------------- There's a scalar and vector variant of FMAX and thus I am wondering if we don't need two definitions here: one using AdvSIMD_2FloatArg_Intrinsic and the other AdvSIMD_2VectorArg_Intrinsic? ================ Comment at: llvm/include/llvm/IR/IntrinsicsAArch64.td:262 def int_aarch64_neon_umin : AdvSIMD_2VectorArg_Intrinsic; - def int_aarch64_neon_fmin : AdvSIMD_2VectorArg_Intrinsic; + def int_aarch64_neon_fmin : AdvSIMD_2FloatArg_Intrinsic; def int_aarch64_neon_fminnmp : AdvSIMD_2VectorArg_Intrinsic; ---------------- Same here for FMIN https://reviews.llvm.org/D41792 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits