masahi edited a comment on pull request #7154:
URL: https://github.com/apache/tvm/pull/7154#issuecomment-750589039


   > When we do the nested loop, we always have to check if the id of instance 
k matches the id of instance j. Since the input shape is (batch_size, 
num_anchors, features), and features = 6 here, I wouldn't be surprised if 
checking the instance of k ends up reading all of the features of k into 
registers, and that memory read is the expensive operation. Once it's in 
memory, actually doing the iou calculation is relatively cheap, so skipping it 
doesn't help that much.
   
   That's highly possible. Looking at this if condition in the triangle inner 
loop:
   
https://github.com/apache/tvm/blob/9956b5b859a24c8f4dec1abf308723b1257ffc66/python/tvm/topi/cuda/nms.py#L535-L540
   
   previously, `force_suppress` was always True, so this condition 
short-circuits and the accesses to `out[base_idx + offset_k + id_index]` and 
`out[base_idx + offset_j + id_index]` just below never happen. But now, to make 
NMS class aware, I had to change `force_suppress` to False, so the accesses to 
`out[base_idx + offset_k + id_index]` and `out[base_idx + offset_j + id_index]` 
always happen. This may be cancelling out the speedup from the reduced IOU 
tests. Storing the class IDs in a separate 1D tensor may help.
   
   That brings me to one of my pain points with our NMS API: I believe it 
needs to be reworked. The current way of packing class IDs and scores together 
with the bbox coordinates is a design mistake we inherited from MXNet. To store 
class IDs, I have to cast them to float32 and update and pass `id_index` 
appropriately. Since our NMS API also requires scores to be packed with the 
bboxes, I had to update `score_index` too, and every frontend except MXNet 
needs to do this concatenation. The worst part is that in the NMS IR, the very 
first thing we do is extract a 1D score tensor from the packed data. So I see 
no good reason to pack the score tensor with the bbox coordinates.
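   A small numpy sketch of the round trip, assuming `id_index = 0` and `score_index = 1` (illustrative values, not fixed by the API): the frontend concatenates, and the IR immediately undoes it:

```python
import numpy as np

# Hypothetical packed layout: data[b, n] = [class_id, score, x1, y1, x2, y2]
batch, num_anchors = 1, 3
class_ids = np.array([[0, 1, 0]], dtype="float32")   # ids cast from int to float32
scores = np.array([[0.9, 0.8, 0.7]], dtype="float32")
bboxes = np.random.rand(batch, num_anchors, 4).astype("float32")

# The concatenation every non-MXNet frontend has to build:
packed = np.concatenate(
    [class_ids[..., None], scores[..., None], bboxes], axis=-1
)
id_index, score_index = 0, 1  # must be passed alongside the packed tensor

# ...and the first thing the NMS IR does is extract the scores back out:
extracted_scores = packed[:, :, score_index]
assert np.allclose(extracted_scores, scores)
```

Keeping ids, scores, and bboxes as separate inputs would avoid both the float32 cast and this pack/unpack round trip.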
                               


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

