[PATCH] D41792: [AArch64] Add ARMv8.2-A FP16 scalar intrinsics

Sjoerd Meijer via Phabricator via cfe-commits Thu, 11 Jan 2018 01:35:59 -0800

SjoerdMeijer added inline comments.


================
Comment at: clang/include/clang/Basic/arm_fp16.td:58
+class IInst<string n, string p, string t> : Inst<n, p, t, OP_NONE> {}
+
+// ARMv8.2-A FP16 intrinsics.
----------------
az wrote:
> SjoerdMeijer wrote:
> > There's a little bit of duplication here: the definitions above are the 
> > same as in arm_neon.td. Would it be easy to share this, with e.g. an 
> > include?
> The duplication is small compared to the overall infrastructure/data 
> structure needed to automatically generate the intrinsics. There are 3 ways 
> to do this: 1) copy only the needed data structure in arm_fp16.td (this is 
> what was done in original review) 2) put all data structure in a newly 
> created file and include it in arm_neon.td and arm_fp16.td (done here). 3) 
> put only the duplication in a new file and include it. I did not go for this 
> one given that we create a new file for the only purpose of avoiding a small 
> duplication but I am fine of going with 3 too. Note that some of the 
> duplicated structure in the original arm_fp16.td was a stripped down version 
> of the copied one.
Given that the duplication is tiny, I don't have strong opinions to be honest. 
Would be nice to share these definitions if that's easy to do, otherwise we can 
perfectly live with this I think.


================
Comment at: clang/lib/CodeGen/CGBuiltin.cpp:4102
   NEONMAP1(vuqadds_s32, aarch64_neon_suqadd, Add1ArgType),
+  // FP16 scalar intrinisics go here.
+  NEONMAP1(vabdh_f16, aarch64_sisd_fabd, Add1ArgType),
----------------
az wrote:
> SjoerdMeijer wrote:
> > Looks like  a few intrinsic descriptions are missing here. For example, the 
> > first 2-operand intrinsic vaddh_f16 is missing, but there are also more. Is 
> > this intentional, or might they have slipped through the cracks (or am I 
> > missing something)?
> I agree that this is confusing. For the intrinsics listed in this table, code 
> generation happens in a generic way based on the info in the table. The ones 
> not listed in this table are addressed in a more specific way below in a the 
> function called EmitAArch64BuiltinExpr. While I do not like how few things 
> were implemented in generating the intrinsics, I am in general following the 
> approach taken for arm_neon instead of introducing a new approach. 
Ah, I see, I somehow missed that. Fair enough.


================
Comment at: clang/lib/CodeGen/CGBuiltin.cpp:6511
                                         "vgetq_lane");
+   case NEON::BI__builtin_neon_vaddh_f16:
+     Ops.push_back(EmitScalarExpr(E->getArg(1)));
----------------
nit: indentation seems off by 1 for these case statements.


================
Comment at: clang/lib/CodeGen/CGBuiltin.cpp:6531
+     Value *F = CGM.getIntrinsic(Intrinsic::fma, HalfTy);
+    Value *Zero = llvm::ConstantFP::getZeroValueForNegation(HalfTy);
+     Value* Sub = Builder.CreateFSub(Zero, EmitScalarExpr(E->getArg(1)), 
"vsubh");
----------------
nit: indentation seems off by 1


================
Comment at: clang/utils/TableGen/NeonEmitter.cpp:2464
+        " * Permission is hereby granted, free of charge, to any person "
+        "obtaining "
+        "a copy\n"
----------------
more nits: I see this is copied from above, but I think this and the next line 
can be on the same line, just increasing readability a tiny bit.


================
Comment at: clang/utils/TableGen/NeonEmitter.cpp:2468
+        "\"Software\"),"
+        " to deal\n"
+        " * in the Software without restriction, including without limitation "
----------------
same here


================
Comment at: clang/utils/TableGen/NeonEmitter.cpp:2470
+        " * in the Software without restriction, including without limitation "
+        "the "
+        "rights\n"
----------------
and same here


================
Comment at: llvm/include/llvm/IR/IntrinsicsAArch64.td:250
   def int_aarch64_neon_umax : AdvSIMD_2VectorArg_Intrinsic;
-  def int_aarch64_neon_fmax : AdvSIMD_2VectorArg_Intrinsic;
+  def int_aarch64_neon_fmax : AdvSIMD_2FloatArg_Intrinsic;
   def int_aarch64_neon_fmaxnmp : AdvSIMD_2VectorArg_Intrinsic;
----------------
There's a scalar and vector variant of FMAX and thus I am wondering if we don't 
need two definitions here: one using AdvSIMD_2FloatArg_Intrinsic and the other 
AdvSIMD_2VectorArg_Intrinsic?


================
Comment at: llvm/include/llvm/IR/IntrinsicsAArch64.td:262
   def int_aarch64_neon_umin : AdvSIMD_2VectorArg_Intrinsic;
-  def int_aarch64_neon_fmin : AdvSIMD_2VectorArg_Intrinsic;
+  def int_aarch64_neon_fmin : AdvSIMD_2FloatArg_Intrinsic;
   def int_aarch64_neon_fminnmp : AdvSIMD_2VectorArg_Intrinsic;
----------------
Same here for FMIN


https://reviews.llvm.org/D41792



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D41792: [AArch64] Add ARMv8.2-A FP16 scalar intrinsics

Reply via email to