AndrewZhaoLuo commented on a change in pull request #10177:
URL: https://github.com/apache/tvm/pull/10177#discussion_r801068115



##########
File path: python/tvm/contrib/cutlass/gen_conv2d.py
##########
@@ -252,6 +252,8 @@ def select_op(
             lambda align: all([dim % align == 0 for dim in [IC, OC]]),
             use_3xtf32,
             profile_all_alignments,
+            # Use fp32 accumulation for wgrad to align with cuDNN
+            accumlator_dtype="float32" if conv_kind == ConvKind.Wgrad else out_dtype,

Review comment:
       Hmm yeah, so if I'm understanding correctly, for conv2d_winograd we want 
to accumulate in fp32, but if it's not winograd we are OK with accumulating in 
fp16. 
   
   `ToMixedPrecision` can configure the accumulation and output dtypes for any 
call node, but only using information obtained by examining that node. I'm not 
sure implementation details, such as whether the op is winograd, can be 
transmitted here. 
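
   To make that limitation concrete, here is a minimal pure-Python sketch of a per-node rule. `CallNode` and `conversion_rule` are hypothetical stand-ins written for illustration, not TVM's actual classes or registration API; the point is only that the rule sees the op name and attributes and nothing about which kernel is picked later.

```python
from dataclasses import dataclass, field


@dataclass
class CallNode:
    """Hypothetical stand-in for a Relay call node (not a TVM class)."""
    op_name: str
    attrs: dict = field(default_factory=dict)


def conversion_rule(call, mixed_precision_type):
    """Return (accumulation_dtype, output_dtype) for a single call node.

    The rule can only inspect `call` itself. Whether the op is later
    lowered to a particular kernel is not visible at this point.
    """
    if call.op_name in ("nn.conv2d", "nn.dense"):
        # Assumed policy for this sketch: accumulate wide, emit narrow.
        return "float32", mixed_precision_type
    return mixed_precision_type, mixed_precision_type


conv = CallNode("nn.conv2d", {"kernel_size": (3, 3)})
print(conversion_rule(conv, "float16"))
```

   Any kernel-specific choice would have to be encoded in the node's op name or attributes for a rule of this shape to act on it.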
   
   I will say that at the Relay level all we care about is type checking, imo, 
so just get the output_dtype correct. For example, accumulate all you like in 
fp32 internally, but make sure the output has the type declared in the 
interface. The extra cast here may be wasteful, but maybe we can repair it 
further down at the TOPI/TIR level.
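
   As a rough illustration of "accumulate in fp32, emit fp16", the sketch below emulates both accumulation widths with the standard `struct` module (`'e'` is IEEE half precision, `'f'` is single precision). The helper names `round_to` and `dot` are made up for this example, and this is plain Python arithmetic, not what CUTLASS or cuDNN actually execute.

```python
import struct


def round_to(fmt, x):
    """Round a Python float to an IEEE format: 'e' = fp16, 'f' = fp32."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


def dot(a, b, accum_fmt, out_fmt):
    """Dot product that rounds the accumulator to `accum_fmt` after every
    step and casts the final result to `out_fmt`."""
    acc = 0.0
    for x, y in zip(a, b):
        prod = round_to(accum_fmt, x * y)
        acc = round_to(accum_fmt, acc + prod)
    return round_to(out_fmt, acc)


ones = [1.0] * 4096

# fp16 accumulation stalls once the spacing between representable values
# exceeds the addend: 2048 + 1 rounds back to 2048.
fp16_accum = dot(ones, ones, accum_fmt="e", out_fmt="e")

# fp32 accumulation keeps the exact sum; a single cast at the end still
# yields an fp16-typed output, which is all the type checker needs.
fp32_accum = dot(ones, ones, accum_fmt="f", out_fmt="e")

print(fp16_accum, fp32_accum)  # 2048.0 4096.0
```

   Both calls return a value representable in fp16, so the declared output type is the same either way; only the internal accumulation differs.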



