csullivan opened a new pull request #7690:
URL: https://github.com/apache/tvm/pull/7690
This PR utilizes the core StorageToken allocation framework provided by the
graph memory planner, but extends it to additionally support optimization of
two-dimensional allocations. The storage_ids and storage_scopes for each Expr
output are then provided to the graph runtime, which uses them to allocate a
minimal set of one-dimensional flat buffer storage pools and two-dimensional
texture memory storage pools.
**GraphMemoryPlanner**
- The StorageAllocator Request, Alloc, and CheckRelease, as well as a few
helpers, have been refactored into a TokenAllocator, which can dispatch to a
one- or two-dimensional token allocator based on the token storage scope.
- TokenAllocator1D's implementation is identical to what existed prior to
this PR.
- TokenAllocator2D utilizes a similar bin packing strategy but also tracks
the excess space that increasing a storage pool would incur. The strategy is as
follows:
- For a given token allocation request search among the idle texture
pools, keeping track of which pool would require the least amount of additional
memory. If a requested allocation fits within an existing pool, no additional
memory would be required. If not, expand the pool which results in the least
amount of added space.
- For any set of free storage pools which do not require expansion for the
given allocation request, choose the pool which would minimize the amount of
wasted memory.
- Note also that during optimization, requests are only queried against
candidate pools of the same type, as this is a requirement of texture memory
alignment.
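
To make the TokenAllocator2D strategy above concrete, here is a minimal Python sketch of the pool selection step. It is not the code in this PR: `Pool2D`, `expansion_cost`, and `request_2d` are hypothetical names, and ordering candidates by the pair (added space, wasted space) is one assumed reading of the description.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Pool2D:
    """Hypothetical idle texture pool: extent in elements plus element type."""
    width: int
    height: int
    dtype: str


def expansion_cost(pool: Pool2D, width: int, height: int) -> Tuple[int, int]:
    """Return (added, wasted) element counts if the request reuses this pool."""
    new_w = max(pool.width, width)
    new_h = max(pool.height, height)
    added = new_w * new_h - pool.width * pool.height   # growth of the pool
    wasted = new_w * new_h - width * height            # space the request leaves unused
    return added, wasted


def request_2d(idle: List[Pool2D], width: int, height: int, dtype: str) -> Optional[int]:
    """Pick the idle pool with the least added space, breaking ties by least waste.

    Returns the index of the chosen pool (expanded in place if needed), or None
    when no idle pool of the same type exists and a new pool must be allocated.
    """
    best_idx: Optional[int] = None
    best_cost: Optional[Tuple[int, int]] = None
    for idx, pool in enumerate(idle):
        if pool.dtype != dtype:  # only pools of the same type are candidates
            continue
        cost = expansion_cost(pool, width, height)
        if best_cost is None or cost < best_cost:  # lexicographic comparison
            best_idx, best_cost = idx, cost
    if best_idx is None:
        return None
    chosen = idle[best_idx]
    chosen.width = max(chosen.width, width)
    chosen.height = max(chosen.height, height)
    return best_idx
```

Because a request that already fits has zero added space, pools that need no expansion always sort ahead of pools that do, and the second tuple element then selects the tightest fit among them.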
**Graph runtime**
- The graph runtime PoolEntry previously tracked a single dimension for
allocation (size). This PR replaces that with a std::vector<int64_t> shape. For
global scope, this shape is only one dimensional. For texture scope this shape
is three dimensional (width, height, channel).
- The graph runtime uses the storage ids and scope information to grow pool
entries along one or two axes for storage pools of global or texture scope,
respectively.
- For 1D pools nothing has changed: the size of the allocation is determined
from the maximum number of bytes required for all entries in a storage pool,
and the resulting pool is of kDLFloat type and byte aligned.
- For 2D texture pools the size of an image pool is determined from the
maximum number of row and column elements required for all entries of a storage
pool, and the pool retains the type of the underlying pool entries.
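
The pool growth rules above can be sketched as follows. This is an illustrative Python model rather than the graph runtime's C++ code: `grow_pool_entry` is a hypothetical helper, the 1D shape is assumed to hold a single byte count, and the channel extent of a texture pool is assumed to be the same for every entry sharing a storage id.

```python
from typing import List


def grow_pool_entry(pool_shape: List[int], entry_shape: List[int], scope: str) -> List[int]:
    """Return the pool shape after accommodating one more entry.

    global scope:  shape is [nbytes]; the pool grows along one axis.
    texture scope: shape is [width, height, channel]; the pool grows along
                   the width and height axes only.
    """
    if scope == "global":
        return [max(pool_shape[0], entry_shape[0])]
    if scope == "texture":
        pool_w, pool_h, pool_c = pool_shape
        width, height, channel = entry_shape
        if pool_c != channel:
            # Assumption of this sketch: entries in one pool share a channel extent.
            raise ValueError("channel extent differs between pool and entry")
        return [max(pool_w, width), max(pool_h, height), channel]
    raise ValueError(f"unsupported storage scope: {scope}")
```

For example, folding texture entries of shape `[64, 8, 4]` and `[16, 32, 4]` into one pool gives `[64, 32, 4]`, which is larger than either entry alone but sufficient for both.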
An RFC is in progress; once it is posted I will add a link here.