MasterJH5574 opened a new pull request, #16849:
URL: https://github.com/apache/tvm/pull/16849

   This PR updates PagedKVCache to initialize one more page than specified via
the constructor. The reason is that applications usually depend on the number
of free pages (returned from `GetNumAvailablePages`) to decide the KV cache
operation policy. Without this extra page, the KV cache reports no available
pages even when the last allocated pages are not full, which may give
applications the false impression that the KV cache is already completely
full and cause further issues.
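
To illustrate the situation described above, here is a minimal toy model in Python. It is not TVM's implementation: the class `ToyPagedCache`, its methods, and the `extra_page` flag are hypothetical names made up for this sketch, and it tracks only a single sequence. It shows how a free-page count can drop to zero while the last allocated page still has room, and how reserving one extra page changes what the application sees.

```python
class ToyPagedCache:
    """Hypothetical single-sequence page allocator (illustration only)."""

    def __init__(self, num_pages, page_size, extra_page=False):
        self.page_size = page_size
        # Assumption for this sketch: the fix is modeled as one additional free page.
        self.free_pages = num_pages + (1 if extra_page else 0)
        self.pages_used = 0
        self.seq_len = 0

    def num_available_pages(self):
        # Stand-in for the free-page query an application would poll.
        return self.free_pages

    def remaining_slots(self):
        # Unused token slots in the pages that are already allocated.
        return self.pages_used * self.page_size - self.seq_len

    def append(self, n_tokens):
        # Pages required after the append (ceiling division), minus pages already held.
        total_pages = -(-(self.seq_len + n_tokens) // self.page_size)
        needed = total_pages - self.pages_used
        if needed > self.free_pages:
            raise RuntimeError("not enough free pages")
        self.free_pages -= needed
        self.pages_used += needed
        self.seq_len += n_tokens


# Requested capacity: 2 pages of 16 tokens each; 20 tokens are appended.
without_extra = ToyPagedCache(num_pages=2, page_size=16)
without_extra.append(20)

with_extra = ToyPagedCache(num_pages=2, page_size=16, extra_page=True)
with_extra.append(20)

print(without_extra.num_available_pages(), without_extra.remaining_slots())
print(with_extra.num_available_pages(), with_extra.remaining_slots())
```

In the first cache the free-page count is zero although 12 token slots remain in the last page, so an application that looks only at the free-page count would treat the cache as full. With the extra page the count stays above zero until the requested capacity is actually exhausted.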

