[PATCH] D126158: [MLIR][GPU] Replace fdiv on fp16 with promoted (fp32) multiplication with reciprocal plus one (conditional) Newton iteration.

Stephan Herhut via Phabricator via cfe-commits Wed, 25 May 2022 08:27:05 -0700

herhut added inline comments.


================
Comment at: mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp:158
+// by the same divisor.
+struct ExpandDivF16 : public ConvertOpToLLVMPattern<LLVM::FDivOp> {
+  using ConvertOpToLLVMPattern<LLVM::FDivOp>::ConvertOpToLLVMPattern;
----------------
This pattern is a bit misplaced here, as `LLVM::FDivOp` is not really a GPU 
dialect operation. Instead, should this be a special lowering of the arith 
dialect to NVVM (which we do not have yet) or a rewrite at the LLVM dialect 
level?

When lowering to LLVM, we already typically configure a different lowering for 
math dialect, so configuring the lowering of arith dialect differently seems 
like an OK option. That would mean a specialized pattern for `arith.divf` with 
higher priority. That would also give users a choice.


================
Comment at: mlir/lib/Conversion/GPUToNVVM/LowerGpuOpsToNVVMOps.cpp:304
 
+  patterns.add<ExpandDivF16>(converter);
+
----------------
I assume this is to differentiate this pattern somehow but there is no need for 
an extra `patterns.add` here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D126158/new/

https://reviews.llvm.org/D126158

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D126158: [MLIR][GPU] Replace fdiv on fp16 with promoted (fp32) multiplication with reciprocal plus one (conditional) Newton iteration.

Reply via email to