masahi edited a comment on pull request #7154: URL: https://github.com/apache/tvm/pull/7154#issuecomment-750589039
> When we do the nested loop, we always have to check if the id of instance k matches the id of instance j. Since the input shape is (batch_size, num_anchors, features), and features = 6 here, I wouldn't be surprised if checking the instance of k ends up reading all of the features of k into registers, and that memory read is the expensive operation. Once it's in memory, actually doing the iou calculation is relatively cheap, so skipping it doesn't help that much.

That's highly possible. Looking at this `if` condition in the triangle inner loop: https://github.com/apache/tvm/blob/9956b5b859a24c8f4dec1abf308723b1257ffc66/python/tvm/topi/cuda/nms.py#L535-L540

Previously, `force_suppress` was always True, so this condition short-circuits and the accesses to `out[base_idx + offset_k + id_index]` and `out[base_idx + offset_j + id_index]` just below never happen. But now, to make NMS class aware, I had to change `force_suppress` to False, so those two accesses always happen. This may be cancelling out the speedup from the reduced IOU tests. Storing the class IDs in a separate 1D tensor may help.

That brings me to my pain point with our NMS API: I believe it needs to be reworked. The current way of packing class IDs and scores together with the bbox coordinates is a design mistake that we inherited from MXNet. To store class IDs, I have to cast them to float32 and update and pass `id_index` appropriately. Since our NMS API also requires scores to be packed with the bboxes, every frontend except MXNet needs to do this concatenation. The worst part is that in the NMS IR, the very first thing we do is extract a 1D score tensor from the packed data. So I see no good reason to pack the score tensor with the bbox coordinates.
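The short-circuit behaviour and the proposed alternative can be sketched as follows. This is a minimal illustration, not TVM's actual IR: the function names are hypothetical, and the layout assumes the packed MXNet-style data where each box occupies `features` consecutive floats with the class id stored (as a float) at slot `id_index`.

```python
def same_class_packed(out, base_idx, offset_j, offset_k, id_index, force_suppress):
    """Packed layout: with force_suppress=True the `or` short-circuits,
    so the two packed reads below never execute. With force_suppress=False
    (class-aware NMS) both reads happen for every pair (j, k)."""
    return force_suppress or (
        out[base_idx + offset_j + id_index] == out[base_idx + offset_k + id_index]
    )


def same_class_separate(class_id, j, k, force_suppress):
    """Alternative: class ids live in their own 1D tensor, so the check
    touches two scalars instead of rows of the packed data."""
    return force_suppress or class_id[j] == class_id[k]


if __name__ == "__main__":
    features, id_index = 6, 0
    # Two boxes with class ids 1.0 and 2.0 packed at slot 0 of each row:
    # [class_id, score, x1, y1, x2, y2]
    out = [1.0, 0.9, 10, 10, 20, 20,
           2.0, 0.8, 11, 11, 21, 21]
    j, k = 0, 1
    print(same_class_packed(out, 0, j * features, k * features, id_index, False))
    print(same_class_separate([1.0, 2.0], j, k, False))
```

With `force_suppress=False`, the packed variant pays two strided reads into `out` per pair, while the separate-tensor variant compares two contiguous scalars, which is the locality argument for splitting class ids out of the packed layout.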
