yelite commented on code in PR #15064:
URL: https://github.com/apache/tvm/pull/15064#discussion_r1224614780
##########
src/runtime/relax_vm/lm_support.cc:
##########
@@ -167,12 +167,59 @@ class AttentionKVCache : public ObjectRef {
TVM_REGISTER_OBJECT_TYPE(AttentionKVCacheObj);
+/*!
+ * \brief Create multiple KV caches with the same shape from a single memory allocation.
+ * \param init_data The initial data to put into the cache. Ignored if init_fill_count is
+ * less than 0.
+ * \param reserve_shape The shape of each cache.
+ * \param init_fill_count The number of initial rows to fill into the cache.
+ * \param num_caches The number of caches to create.
+ */
+Array<AttentionKVCache> CreateMultipleKVCaches(NDArray init_data, ShapeTuple reserve_shape,
+                                               int init_fill_count, int num_caches) {
+ DLDataType dtype = init_data->dtype;
+
+ int64_t cache_size = (dtype.bits * dtype.lanes + 7) / 8;
+ for (const auto dim : reserve_shape) {
+ cache_size *= dim;
+ }
+
+ // Add padding so that each cache is aligned to kAllocAlignment
+ using tvm::runtime::kAllocAlignment;
+ int64_t padding = (kAllocAlignment - cache_size % kAllocAlignment) % kAllocAlignment;
+ int64_t cache_offset = cache_size + padding;
+
+ auto block = NDArray::Empty(ShapeTuple({cache_offset * num_caches}), dtype, init_data->device);
+ auto block_view = block.CreateView(reserve_shape, dtype);
+
+ Array<AttentionKVCache> result;
+ for (int i = 0; i < num_caches; ++i) {
+ // Use DLManagedTensor to prevent underlying memory from being freed
+ DLManagedTensor* data_view = block_view.ToDLPack();
Review Comment:
Thanks! I updated the code to use the storage interface, and it looks cleaner. However, it can now print a warning message if the requested allocator type does not match the allocator created at VM initialization.