masahi commented on a change in pull request #10177:
URL: https://github.com/apache/tvm/pull/10177#discussion_r800996189



##########
File path: python/tvm/contrib/cutlass/gen_conv2d.py
##########
@@ -252,6 +252,8 @@ def select_op(
             lambda align: all([dim % align == 0 for dim in [IC, OC]]),
             use_3xtf32,
             profile_all_alignments,
+            # Use fp32 accumulation for wgrad to align with cuDNN
+            accumlator_dtype="float32" if conv_kind == ConvKind.Wgrad else 
out_dtype,

Review comment:
       Always using fp32 accum for other dtype will probably bring perf 
regression (cuDNN also allows fp16 accumulation for fprop and dgrad). Ideally 
we should add `accumulation_dtype` to `Conv2dAttr` to guide that decision, I 
thought about doing that, but I realized that I have to change a lot of topi to 
take that into account. 
   
   Also we need to discuss what the interface for `ToMixedPrecision` should be 
if we want to allow changing accumulation dtype, right now we cannot flexibly 
change even output dtype @AndrewZhaoLuo 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to