masahi commented on a change in pull request #10177:
URL: https://github.com/apache/tvm/pull/10177#discussion_r800996189
##########
File path: python/tvm/contrib/cutlass/gen_conv2d.py
##########
@@ -252,6 +252,8 @@ def select_op(
lambda align: all([dim % align == 0 for dim in [IC, OC]]),
use_3xtf32,
profile_all_alignments,
+ # Use fp32 accumulation for wgrad to align with cuDNN
+ accumlator_dtype="float32" if conv_kind == ConvKind.Wgrad else
out_dtype,
Review comment:
Always using fp32 accum for other dtype will probably bring perf
regression (cuDNN also allows fp16 accumulation for fprop and dgrad). Ideally
we should add `accumulation_dtype` to `Conv2dAttr` to guide that decision, I
thought about doing that, but I realized that I have to change a lot of topi to
take that into account.
Also we need to discuss what the interface for `ToMixedPrecision` should be
if we want to allow changing accumulation dtype, right now we cannot flexibly
change even output dtype @AndrewZhaoLuo
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]