shtinsa opened a new pull request, #16517: URL: https://github.com/apache/tvm/pull/16517
This commit adds a new memory manager that limits the number of cached buffers. The problem appeared in a scenario with four sequential transformer-based networks that generate outputs of different sizes on each iteration. In this case the default PooledAllocator retains a large number of buffers: for example, about 2000 free buffers (~1.9 GB of memory) were sitting in the pool at the moment it was cleared. The new allocator limits the pool to 256 entries.
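To illustrate the idea of capping the pool, here is a minimal Python sketch. It is not the code from this PR and does not use the TVM API: the class name `BoundedPool`, the exact-size matching, and the evict-oldest policy are assumptions made for illustration only. The description above states only that the pool is limited to 256 entries.

```python
from collections import defaultdict, deque


class BoundedPool:
    """Toy pooled allocator that caches at most `max_cached` free buffers.

    Hypothetical illustration; names and eviction policy are assumptions.
    """

    def __init__(self, max_cached=256):
        self.max_cached = max_cached
        self._free = defaultdict(list)  # size -> free buffers, oldest first
        self._order = deque()           # sizes of cached buffers, oldest first

    def alloc(self, size):
        bucket = self._free.get(size)
        if bucket:
            # Reuse a cached buffer of exactly this size.
            self._order.remove(size)
            return bucket.pop()
        # Nothing cached for this size: allocate a fresh buffer.
        return bytearray(size)

    def free(self, buf):
        if self.max_cached <= 0:
            return  # caching disabled, let the buffer be released
        if len(self._order) >= self.max_cached:
            # Pool is full: drop the oldest cached buffer (assumed policy).
            oldest_size = self._order.popleft()
            self._free[oldest_size].pop(0)
        self._free[len(buf)].append(buf)
        self._order.append(len(buf))

    def cached(self):
        """Number of free buffers currently held by the pool."""
        return len(self._order)


pool = BoundedPool(max_cached=4)
bufs = [pool.alloc(n) for n in range(1, 11)]  # ten distinct sizes
for b in bufs:
    pool.free(b)
print(pool.cached())
```

When every request has a different size, as in the scenario described above, exact-size lookups rarely hit, so an unbounded pool keeps growing. With a cap, the number of idle buffers stays bounded regardless of how many distinct sizes are seen.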
