trevor-m commented on pull request #8174: URL: https://github.com/apache/tvm/pull/8174#issuecomment-855112421
> @trevor-m Also it is worth trying out the graph runtime for the new implementation, since there is no dynamic shape after NMS.
>
> Thrust numbers actually make sense given that we are now running sort on `(batch * class, num_boxes)` along the second axis. Since batched sort with Thrust is implemented via two calls to stable sort, when the axis to sort is small the overhead from the two calls becomes relatively big. And for small sorts, TVM's sort should be faster than Thrust.
>
> In contrast, the previous implementation does sort on `(batch, num_boxes * class)` along the second axis, so Thrust sort is fast.

I see, that makes sense. I was using the models from https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf1_detection_zoo.md

I think these still cannot use the graph runtime because there is a loop at the beginning.
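
For readers following the sort discussion quoted above, here is a minimal NumPy sketch of the general "two stable sorts" pattern for a batched (segmented) sort. It is only an illustration under assumptions: the thread does not show the actual Thrust calls used in TVM, and `segmented_sort_two_pass` and the small shapes below are hypothetical. It shows why the cost is two full passes over all elements regardless of how short each sorted row is.

```python
import numpy as np

def segmented_sort_two_pass(data):
    """Sort each row of a 2-D array using two global stable sorts.

    Hypothetical helper for illustration; not TVM's or Thrust's code.
    """
    rows, cols = data.shape
    flat = data.ravel()
    # Segment id of every element: the row index repeated `cols` times.
    seg = np.repeat(np.arange(rows), cols)
    # Pass 1: stable sort of all values, ignoring row boundaries.
    order = np.argsort(flat, kind="stable")
    flat, seg = flat[order], seg[order]
    # Pass 2: stable sort by segment id; ties keep the value order from pass 1.
    order = np.argsort(seg, kind="stable")
    return flat[order].reshape(rows, cols)

# Assumed toy sizes, chosen only to make the two layouts visible.
batch, num_class, num_boxes = 2, 3, 4
scores = np.random.default_rng(0).random((batch, num_class, num_boxes))

# New layout from the quoted comment: many short rows.
per_class = segmented_sort_two_pass(scores.reshape(batch * num_class, num_boxes))
# Previous layout: few long rows.
all_class = segmented_sort_two_pass(scores.reshape(batch, num_class * num_boxes))
```

Both layouts sort the same total number of elements with the same two passes, so with short rows the fixed overhead of the two calls is not amortized, which matches the explanation in the quoted comment.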
