masahi edited a comment on pull request #8174: URL: https://github.com/apache/tvm/pull/8174#issuecomment-854239897
lol, thrust is the bottleneck now? Note that our sorting performance improved thanks to https://github.com/apache/tvm/pull/7611, so thrust is no longer a requirement for good performance. I see that the number of boxes is quite small in your model; I believe the new implementation would be much faster when the number of boxes is large. Do you have other models, preferably ones with more input boxes? I've been testing TF2 models from https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md: their SSD MobileNet V2 model (FPNLite 320x320) has 12480 boxes, SSD ResNet50 V1 has more than 50000, and EfficientDet D2 has 110484 boxes.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected]
