Lunderberg commented on issue #8690: URL: https://github.com/apache/tvm/issues/8690#issuecomment-896977421
It looks like the numpy results are identically zero for the values around the edge, while the cudnn results are non-zero there. The numpy results are correct because of the padding used for that value.

Something that might be complicating the investigation is that [cudnn attempts](https://github.com/apache/tvm/blob/main/python/tvm/topi/cuda/conv2d.py#L125) to find the most efficient method on its own if nothing is specified. Last time I touched this file, I hoisted the algorithm choice up to be a configurable knob, but it still defaults to letting cudnn run its own benchmarks and select the fastest.

My current mental model is that in some cases, cudnn's benchmark selects an algorithm that is faster but uses approximations that result in slightly non-zero padding. If that cudnn implementation is selected, then the test fails in CI. Unfortunately, of the 18 options that cudnn supports, I only have hardware support to test out 4 of them, and all 4 of them keep the padding identically equal to zero.
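
For anyone trying to reproduce this, here is a rough numpy-only sketch of how one might check whether a mismatch is confined to the edge values. It is not taken from the TVM test suite: the helper name `max_abs_diff_edge_vs_interior`, the `border` parameter, and the assumption that the last two axes are the spatial ones (NCHW-style layout) are all hypothetical. The `out` array below is synthetic data standing in for a cudnn result.

```python
import numpy as np


def max_abs_diff_edge_vs_interior(ref, out, border=1):
    """Return (edge_max, interior_max) of |ref - out|.

    Hypothetical helper: assumes the last two axes are spatial (H, W)
    and treats a ring of width `border` as the "edge".
    """
    if border < 1:
        raise ValueError("border must be at least 1")
    diff = np.abs(np.asarray(ref, dtype="float64") - np.asarray(out, dtype="float64"))

    # Boolean mask over (H, W): True on the outer ring, False inside.
    edge_mask = np.ones(diff.shape[-2:], dtype=bool)
    edge_mask[border:-border, border:-border] = False

    # `initial=0.0` keeps max() well-defined if a region is empty.
    edge_max = diff[..., edge_mask].max(initial=0.0)
    interior_max = diff[..., ~edge_mask].max(initial=0.0)
    return edge_max, interior_max


# Synthetic example: reference is exactly zero on the edge,
# "out" has a small non-zero value along the top row.
ref = np.zeros((1, 1, 4, 4))
ref[..., 1:-1, 1:-1] = 1.0
out = ref.copy()
out[..., 0, :] = 1e-3

edge_max, interior_max = max_abs_diff_edge_vs_interior(ref, out)
print("edge:", edge_max, "interior:", interior_max)
```

If the interior difference stays within tolerance and only the edge difference is non-zero, that would be consistent with the padding hypothesis above, though it would not by itself identify which cudnn algorithm was selected.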
