AndrewZhaoLuo commented on issue #8294:
URL: https://github.com/apache/tvm/issues/8294#issuecomment-866176947


   Hmm yeah, the problem has to do with what I said. Specifically, in 
`python/tvm/topi/cuda/conv2d_winograd.py` the winograd matrix `G` is cast to 
the output dtype while the kernel isn't, so there is a type mismatch.
   
   In general it seems reasonable to have implicit type promotion to higher-bit 
floating point types. Furthermore, it might also be good for most binary 
arithmetic ops to take an output dtype.
   
   E.g. right now there isn't a good way to represent adding two fp16 numbers 
into an fp32 result.
   
   Newer NVIDIA GPUs support this as a more primitive operation, so maybe we 
should have a better representation.
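
   To make the fp16-into-fp32 case concrete, here is a minimal sketch of what 
an output dtype on a binary op buys you. It uses NumPy as a stand-in (an 
assumption for illustration only, this is not the TVM API): `np.add` takes a 
`dtype` argument that selects the result type.

```python
import numpy as np

# Two fp16 values whose sum exceeds the fp16 maximum (65504).
a = np.array([60000.0], dtype=np.float16)
b = np.array([60000.0], dtype=np.float16)

# fp16 + fp16 -> fp16: the result does not fit and overflows to inf.
with np.errstate(over="ignore"):
    sum_fp16 = a + b

# fp16 + fp16 -> fp32: request an fp32 result through the ufunc's `dtype`.
# NumPy is only a stand-in here; it is not how TVM represents this op.
sum_fp32 = np.add(a, b, dtype=np.float32)

print(sum_fp16.dtype, sum_fp16)
print(sum_fp32.dtype, sum_fp32)
```

   Without an output dtype on the op itself, the only option is to cast both 
inputs up first and then add, which hides the fact that the inputs were fp16.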



