CaptainDuke edited a comment on pull request #8479:
URL: https://github.com/apache/tvm/pull/8479#issuecomment-884044279


   > Could you provide timing information for a variety of shapes and ranks? I just want to make sure this is faster on all inputs.
   
   
![ScatterND_performance](https://user-images.githubusercontent.com/24515303/126465264-74e04075-9c99-47f1-a958-fc1047c94419.png)
   
   @tkonolige 
   We evaluated the performance on three combinations of rank and shape. 
Timings were collected with Nsight Systems.
   
   As long as the extent of the original `with ib.for_range() as i` loop is 
large enough, separating the work into two kernels enlarges dimGrid and 
significantly improves parallelism.
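
To make the dimGrid argument concrete, here is a rough back-of-the-envelope sketch. It is not the code in this PR: it assumes that a separated kernel gives each iteration of the former serial loop its own thread, and the names `grid_blocks`, `fused_shape`, `loop_extent`, and the 1024-thread block size are all hypothetical values chosen for illustration.

```python
def ceil_div(a, b):
    """Integer division rounded up."""
    return (a + b - 1) // b


def grid_blocks(parallel_extent, threads_per_block=1024):
    """Blocks needed to cover parallel_extent with one thread per element."""
    return ceil_div(parallel_extent, threads_per_block)


# Hypothetical sizes, not measurements from the PR.
fused_shape = 4096   # elements mapped to threads in the original kernel
loop_extent = 64     # serial iterations each thread ran inside for_range

# Original kernel: the loop runs serially inside every thread.
blocks_before = grid_blocks(fused_shape)

# Separated kernel (assumed): loop iterations become additional threads.
blocks_after = grid_blocks(fused_shape * loop_extent)

print(blocks_before, blocks_after)
```

Under these assumptions the grid grows by roughly the loop extent, which is why the benefit depends on that extent being large; with a small extent the extra kernel launch may outweigh the gain.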


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
