MasterJH5574 opened a new pull request, #16112:
URL: https://github.com/apache/tvm/pull/16112

   This PR adds a constructor parameter that controls whether the KV cache is 
allowed to grow automatically. Previously, the KV cache always grew whenever 
it was full and more capacity was requested.
   
   Although automatic growth can be convenient, in practice we often want the 
pre-allocated memory to be static: large enough up front and fixed in size, 
which makes memory management more predictable. Hence, this PR allows 
specifying whether growth is permitted, and the KV cache throws an error if 
growth is required but not allowed.
   
   This PR also adds an auxiliary function to the KV cache that returns the 
number of available pages.
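
   The behavior described above can be illustrated with a minimal sketch. 
This is not the TVM implementation or its API: the class name 
`PagedKVCacheSketch`, the parameter `allow_growth`, the method names, and the 
doubling growth policy are all hypothetical, chosen only to show a page pool 
that either grows or raises an error when full, plus a query for the number 
of available pages.

```python
class PagedKVCacheSketch:
    """Toy page pool for illustration only; not TVM's KV cache."""

    def __init__(self, num_total_pages: int, allow_growth: bool = False):
        # Hypothetical constructor parameter mirroring the growth switch
        # described in this PR.
        self.num_total_pages = num_total_pages
        self.allow_growth = allow_growth
        self.num_used_pages = 0

    def num_available_pages(self) -> int:
        """Return how many pre-allocated pages are still free."""
        return self.num_total_pages - self.num_used_pages

    def allocate_page(self) -> int:
        """Hand out the next page id, growing only if growth is allowed."""
        if self.num_used_pages == self.num_total_pages:
            if not self.allow_growth:
                raise RuntimeError(
                    "KV cache is full and automatic growth is disabled"
                )
            # Assumed growth policy: double the capacity (at least 1 page).
            self.num_total_pages = max(1, self.num_total_pages * 2)
        page_id = self.num_used_pages
        self.num_used_pages += 1
        return page_id


# A fixed-size cache rejects allocation once every page is in use.
fixed = PagedKVCacheSketch(num_total_pages=2, allow_growth=False)
fixed.allocate_page()
fixed.allocate_page()
try:
    fixed.allocate_page()
    fixed_raised = False
except RuntimeError:
    fixed_raised = True

# A growable cache expands its capacity instead of failing.
growable = PagedKVCacheSketch(num_total_pages=2, allow_growth=True)
for _ in range(3):
    growable.allocate_page()
```

   With growth disabled, a caller can check `num_available_pages()` before 
allocating and decide how to handle a full cache instead of relying on the 
error path.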

