masahi commented on pull request #7137:
URL: https://github.com/apache/tvm/pull/7137#issuecomment-750733915


   As one data point: when I was investigating the GPU NMS performance 
issue, the old code took 630 milliseconds, while with this fix it took 2.1 
seconds. But again, that's because the new code is dealing with far more boxes.
   
   According to the numbers I posted in 
https://github.com/apache/tvm/pull/7154, NMS on CPU is fast: the old code 
spent only 8 milliseconds there. So I don't expect NMS on CPU to be a big issue.
    
   After NMS, the PyTorch detection model does a post-NMS top-k, which selects 
1000 boxes for later processing. The later stages therefore see the same number 
of boxes either way, so the perf difference should only be in NMS.
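
   To make the argument concrete, here is a minimal pure-Python sketch of greedy 
NMS followed by a post-NMS top-k. It is only an illustration, not the TVM or 
PyTorch implementation: the function names (`iou`, `greedy_nms`, 
`post_nms_topk`), the (x1, y1, x2, y2) box layout, and the 0.5 IoU threshold are 
all assumptions made for the example. The point is that NMS cost grows with the 
number of input boxes, while the top-k caps what every later stage has to process.

```python
# Illustrative sketch only: names, box layout, and thresholds are assumptions,
# not the TVM or PyTorch API.

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first.

    Each candidate is compared against every box kept so far, so the work
    grows with the number of input boxes.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


def post_nms_topk(keep, k=1000):
    """`keep` is already sorted by score, so top-k is a plain truncation."""
    return keep[:k]
```

   However many boxes go into `greedy_nms`, `post_nms_topk` hands at most `k` 
indices to the rest of the pipeline, which is why only the NMS step itself should 
show a timing difference.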



