MasterJH5574 commented on PR #16111:
URL: https://github.com/apache/tvm/pull/16111#issuecomment-1807377050

   @junrushao Yes, there is certainly some waste in terms of storage size. 
As long as the upper bound is as tight as possible, I feel the 
“over-allocation” is not going to be a severe issue. As for the number of 
allocations, the change in this PR does not increase it compared to before. 
(If we don't use the pool allocator, the number of allocations remains the 
same. And if we use the pool allocator, only one allocation happens.)
   
   For the batching example, say we can analyze the maximum possible batch 
size ahead of time and annotate that value as the upper bound. In a serving 
engine, every integer in `[1, max_batch_size]` will effectively be used as the 
real batch size. In this case only one static storage is allocated, which I 
believe is a great thing. If we do not allocate the output storage statically, 
there will be `max_batch_size` allocations, each with a 
different storage size, which, in the worst case, will cause the total memory 
held by the pool allocator (when enabled) to be `O(max_batch_size^2)` times 
the size of a single allocated storage.
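
   To make the worst case concrete, here is a rough sketch of the arithmetic. 
It is not TVM's actual allocator: it assumes a pool that keeps one cached block 
per distinct requested size and never reuses a block for a different size, and 
it reads “a single allocated storage” as the storage for one batch element. 
The function names and `bytes_per_sample` are made up for illustration.

```python
def pool_footprint_dynamic(max_batch_size: int, bytes_per_sample: int) -> int:
    """Worst-case bytes held by a size-keyed pool (assumed model, not TVM's).

    Every batch size in [1, max_batch_size] requests a block of a different
    size, so the pool ends up caching one block per batch size.
    """
    return sum(b * bytes_per_sample for b in range(1, max_batch_size + 1))


def pool_footprint_static(max_batch_size: int, bytes_per_sample: int) -> int:
    """Bytes held when one block is sized for the annotated upper bound
    and reused for every batch size."""
    return max_batch_size * bytes_per_sample


if __name__ == "__main__":
    for n in (8, 64, 256):
        dynamic = pool_footprint_dynamic(n, bytes_per_sample=1024)
        static = pool_footprint_static(n, bytes_per_sample=1024)
        print(f"max_batch_size={n}: dynamic={dynamic} B, static={static} B")
```

   Under these assumptions the dynamic total is 
`max_batch_size * (max_batch_size + 1) / 2` per-element units, i.e. quadratic, 
while the static plan holds only `max_batch_size` units.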

