MasterJH5574 opened a new pull request, #16112: URL: https://github.com/apache/tvm/pull/16112
This PR adds a constructor parameter that controls whether the KV cache is allowed to grow automatically. Previously, the KV cache always grew whenever it was full and more capacity was requested. Although automatic growth can be convenient, in practice we often want the pre-allocated memory to be static: large enough up front and never resized, which makes memory management more predictable. This PR therefore lets the caller specify whether growth is allowed, and the KV cache throws an error if growth is required when it is not allowed. This PR also adds an auxiliary function to the KV cache for querying the number of available pages.
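To make the intended behavior concrete, here is a minimal toy sketch of a page allocator with a growth switch. It is not the code in this PR and not TVM's API: the class `ToyPagedKVCache`, the `allow_growth` parameter name, the `reserve` method, and the capacity-doubling policy are all assumptions made for illustration only.

```python
class ToyPagedKVCache:
    """Toy page bookkeeping only; hypothetical, not TVM's KV cache."""

    def __init__(self, num_total_pages: int, allow_growth: bool) -> None:
        self.num_total_pages = num_total_pages
        self.allow_growth = allow_growth  # hypothetical constructor flag
        self.num_used_pages = 0

    def num_available_pages(self) -> int:
        """Return how many pages can still be reserved without growing."""
        return self.num_total_pages - self.num_used_pages

    def reserve(self, num_pages: int) -> None:
        """Reserve pages, growing the pool only if growth is allowed."""
        if num_pages < 0:
            raise ValueError("num_pages must be non-negative")
        if num_pages > self.num_available_pages():
            if not self.allow_growth:
                raise RuntimeError(
                    f"KV cache is full: requested {num_pages} pages, "
                    f"{self.num_available_pages()} available, growth disabled"
                )
            # Assumed policy for this sketch: double capacity until it fits.
            while self.num_available_pages() < num_pages:
                self.num_total_pages = max(1, self.num_total_pages * 2)
        self.num_used_pages += num_pages


# Static cache: a request beyond the pre-allocated capacity is an error.
static_cache = ToyPagedKVCache(num_total_pages=4, allow_growth=False)
static_cache.reserve(3)
try:
    static_cache.reserve(2)
    static_error = None
except RuntimeError as err:
    static_error = err

# Growable cache: the same request succeeds by enlarging the pool.
growable_cache = ToyPagedKVCache(num_total_pages=4, allow_growth=True)
growable_cache.reserve(3)
growable_cache.reserve(2)
```

With growth disabled, the caller can check `num_available_pages()` before reserving and decide how to handle a full cache, instead of relying on an implicit reallocation.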
