Lunderberg commented on pull request #8196:
URL: https://github.com/apache/tvm/pull/8196#issuecomment-857140287


   Running some performance tests, it looks like the refactor has very little 
impact on the overall runtime.  The plots below show the Q1/median/Q3 runtimes 
for different low-level tasks that would need to access the thread-specific 
resources.  The only significant difference is in copying data to the 
device, where the runtime is slightly higher for very small buffer copies.
   
   
![image](https://user-images.githubusercontent.com/3888575/121226379-5933e480-c83f-11eb-9713-fde3a2d0b755.png)
   
   Benchmarking details: I used `pytest-benchmark`, mostly with default 
settings.  The number of iterations was chosen based on the runtime of the 
first iteration, such that each data point is collected in ~1 second.  
Repeated initialization was allowed to use up to 10 seconds per data point.
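
   For readers unfamiliar with this style of measurement, here is a rough standard-library sketch of the idea described above: time one call, pick an iteration count that fills a time budget, and summarize the data points as Q1/median/Q3. It is not `pytest-benchmark`'s actual implementation and not the code used for the plots. The function names and the `target_seconds` parameter are hypothetical, chosen only for illustration.

```python
import statistics
import time


def choose_iterations(func, target_seconds=1.0):
    """Time a single call and estimate how many calls fit in target_seconds."""
    start = time.perf_counter()
    func()
    first = time.perf_counter() - start
    # Guard against a zero reading from a very fast call.
    return max(1, int(target_seconds / max(first, 1e-9)))


def collect_data_point(func, iterations):
    """Return the mean time per call over `iterations` back-to-back calls."""
    start = time.perf_counter()
    for _ in range(iterations):
        func()
    return (time.perf_counter() - start) / iterations


def quartiles(samples):
    """Return (Q1, median, Q3) of the samples (needs at least two samples)."""
    q1, median, q3 = statistics.quantiles(samples, n=4)
    return q1, median, q3


if __name__ == "__main__":
    # Stand-in workload; a real test would call the operation being measured.
    def workload():
        sum(range(1000))

    # A small budget keeps this demo fast; the comment above used ~1 second.
    n = choose_iterations(workload, target_seconds=0.01)
    points = [collect_data_point(workload, n) for _ in range(5)]
    print(quartiles(points))
```

   Averaging over many back-to-back calls per data point matters most for very fast operations such as small buffer copies, where the time of a single call is close to the timer resolution.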
   



