Laurawly commented on a change in pull request #4550: [Perf] Add CublasLt extern support for better Igemm performance
URL: https://github.com/apache/incubator-tvm/pull/4550#discussion_r360626382
##########
File path: tests/python/contrib/test_cublas.py
##########
@@ -73,11 +132,14 @@ def verify(target="cuda"):
     verify()
 def test_matmul_add():
-    verify_matmul_add('float', 'float')
+    verify_matmul_add('float', 'float', rtol=1e-3)
Review comment:
I tested this on a Tesla T4 GPU with CUDA 10.1, and this case fails the
accuracy check unless rtol is relaxed to 1e-3 as above. Can you reproduce
the failure on your end?
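
For context, here is a minimal sketch of how a relative tolerance decides
whether such a check passes. It assumes the comparison follows the
criterion documented for numpy.testing.assert_allclose,
|actual - desired| <= atol + rtol * |desired|. The helper name
within_tolerance, the sample values, and the strict rtol of 1e-5 are
hypothetical and are not taken from the test file or from a real T4 run.

```python
def within_tolerance(actual, desired, rtol, atol=0.0):
    """Return True if every element satisfies
    |actual - desired| <= atol + rtol * |desired|."""
    return all(
        abs(a - d) <= atol + rtol * abs(d)
        for a, d in zip(actual, desired)
    )


# Hypothetical values: a CPU reference result and a GPU result whose
# elements differ from it by about 2 parts in 10,000.
reference = [100.0, 250.0, -40.0]
gpu_result = [100.02, 249.95, -40.008]

# Assumed strict tolerance: too tight for this size of error.
strict_ok = within_tolerance(gpu_result, reference, rtol=1e-5)

# The relaxed tolerance from the diff above: the same error is accepted.
relaxed_ok = within_tolerance(gpu_result, reference, rtol=1e-3)

print("strict rtol=1e-5 passes:", strict_ok)
print("relaxed rtol=1e-3 passes:", relaxed_ok)
```

With these made-up numbers the strict check fails and the relaxed one
passes. That is the behavior described above, though the actual error
magnitude on the T4 may differ.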