Lunderberg commented on PR #12949:
URL: https://github.com/apache/tvm/pull/12949#issuecomment-1264013889

   Rats.  I figured it was worth a shot.  Two additional questions come to 
mind:
   
   1. Would it work to construct a `DLTensor` of shape `[1]`, then iterate over 
`DLTensor::byte_offset` for each element?  I *think* that would work within the 
public methods using `DLTensor`, but the per-element overhead might be large.
   2. If the per-element overhead of a virtual function call is large, 
would it be worth using `DetectIterMap` to attempt to identify regions that 
are contiguous in both the source and destination layouts, in order to copy 
those entire regions at once?
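
To make the trade-off between the two options concrete, here is a rough Python sketch. It is not TVM code, and every name in it is hypothetical. A flat byte buffer stands in for the memory a `DLTensor` would point to, the explicit byte offsets play the role of `DLTensor::byte_offset`, and a brute-force scan over a precomputed index map stands in for whatever `DetectIterMap` might be able to find symbolically. The example layout (a 4x4 row-major buffer re-tiled into column blocks of width 2) is only an assumption for illustration.

```python
def copy_per_element(src, dst, index_map, itemsize):
    """Option 1: one copy per element, recomputing the byte offsets each time.

    index_map[k] is the destination element index for source element k.
    Returns the number of copy calls issued.
    """
    for src_idx, dst_idx in enumerate(index_map):
        s = src_idx * itemsize  # stand-in for the source byte_offset
        d = dst_idx * itemsize  # stand-in for the destination byte_offset
        dst[d:d + itemsize] = src[s:s + itemsize]
    return len(index_map)


def contiguous_runs(index_map):
    """Yield (src_start, dst_start, length) for each maximal run in which
    consecutive source elements map to consecutive destination elements."""
    start = 0
    for i in range(1, len(index_map) + 1):
        if i == len(index_map) or index_map[i] != index_map[i - 1] + 1:
            yield start, index_map[start], i - start
            start = i


def copy_by_runs(src, dst, index_map, itemsize):
    """Option 2: one copy per region contiguous in both layouts.

    Returns the number of copy calls issued.
    """
    calls = 0
    for src_start, dst_start, length in contiguous_runs(index_map):
        nbytes = length * itemsize
        s = src_start * itemsize
        d = dst_start * itemsize
        dst[d:d + nbytes] = src[s:s + nbytes]
        calls += 1
    return calls


# Hypothetical layout transform: element (i, j) of a row-major 4x4 buffer
# moves to a layout tiled into column blocks of width TILE.
ROWS, COLS, TILE, ITEMSIZE = 4, 4, 2, 4
index_map = [
    (j // TILE) * (ROWS * TILE) + i * TILE + (j % TILE)
    for i in range(ROWS)
    for j in range(COLS)
]

src = bytes(range(ROWS * COLS * ITEMSIZE))
dst_a = bytearray(len(src))
dst_b = bytearray(len(src))

calls_a = copy_per_element(src, dst_a, index_map, ITEMSIZE)
calls_b = copy_by_runs(src, dst_b, index_map, ITEMSIZE)
```

For this particular layout both functions fill the destination identically, but the per-element version issues 16 copies while the run-based version issues 8, since only pairs of elements stay adjacent. How much that matters in practice would depend on the actual per-call overhead and on how long the contiguous regions turn out to be.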

