icemelon9 commented on issue #4324: [Relay] Use opaque for where op
URL: https://github.com/apache/incubator-tvm/pull/4324#issuecomment-557727463
 
 
   @tqchen After some benchmarking, I do see a slight improvement on the 
CPU instances. On C5.9xl, with where being opaque, the latency of BERT is 
34.51ms (std: 0.59ms), while with where being broadcast, the latency is 36.06ms 
(std: 0.50ms). The difference is about 1.5ms, or 4%. The difference on 
C5.4xl is smaller: opaque 46.99ms (std: 0.13ms) vs broadcast 47.87ms (std: 
0.24ms), a 1.8% improvement. I also evaluated this on the GPU instance P3.2xl, 
where I didn't see much difference in performance.
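
   For reference, the percentages above can be reproduced from the quoted mean 
latencies. This is only a minimal sketch: the helper name relative_improvement 
is made up for illustration, and it assumes the improvement is measured 
relative to the broadcast latency, which matches the 1.8% figure.

```python
def relative_improvement(baseline_ms, candidate_ms):
    """Return (saving in ms, saving as a percentage of the baseline)."""
    saving = baseline_ms - candidate_ms
    return saving, 100.0 * saving / baseline_ms


# Mean BERT latencies quoted above, in milliseconds: (broadcast, opaque).
# Assumption: broadcast is treated as the baseline.
results = {
    "C5.9xl": (36.06, 34.51),
    "C5.4xl": (47.87, 46.99),
}

for instance, (broadcast_ms, opaque_ms) in results.items():
    saving_ms, saving_pct = relative_improvement(broadcast_ms, opaque_ms)
    print(f"{instance}: {saving_ms:.2f}ms saved ({saving_pct:.1f}%)")
```

   Note that the savings (about 1.55ms and 0.88ms) are only a few times the 
reported standard deviations, so the gap is small relative to run-to-run noise.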
   
   Given that the improvement is not very significant on CPU, I think we 
should not make this change for now. In the future, we could potentially have 
a performance-based op fusion pass that determines which kind of fusion gives 
the best performance.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
