apoorvmittal10 commented on PR #17474:
URL: https://github.com/apache/kafka/pull/17474#issuecomment-2409130891

   >If our goal is to avoid the cache filling up with ‘ghost’ clients
   
   Not necessarily ghost clients, but also clients that are spawned for a very 
short span and then stopped. For example, the issue fixed in the PR 
https://github.com/apache/kafka/pull/17431 should never be encountered if a 
Kafka Consumer is not closed just after instantiation. But we have seen 
applications that do have such workloads, i.e. clients that are created for a 
very short span. Those instances are legitimate, yet each one occupies a cache 
entry until time-based eviction happens.
   
   >perhaps we could introduce prioritization in the cache. The priority of a 
client instance could be based on either the number of PUSH_TELEMETRY_REQUEST 
calls or its lifetime
   
   This is a good idea. But do we want the cache to carry the additional 
compute and tracking for the number of requests as well? I think that would be 
extra and not required when our goal is to get rid of instances that will never 
send telemetry data again, i.e. whose connections are now disconnected.
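
   To illustrate the disconnect-based cleanup described above, here is a 
minimal sketch. It is not the code in this PR or Kafka's actual cache: the 
class `DisconnectAwareCacheSketch` and its methods are hypothetical names, and 
it assumes the broker can call a hook when a connection closes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Kafka's real client telemetry cache.
final class DisconnectAwareCacheSketch {
    // connection id -> client instance id seen on that connection
    private final Map<String, String> instanceIdByConnection = new HashMap<>();
    // client instance id -> timestamp of the last telemetry request
    private final Map<String, Long> lastSeenMsByInstance = new HashMap<>();

    // Called for every telemetry request (assumed hook).
    void onTelemetryRequest(String connectionId, String clientInstanceId, long nowMs) {
        instanceIdByConnection.put(connectionId, clientInstanceId);
        lastSeenMsByInstance.put(clientInstanceId, nowMs);
    }

    // Called when a connection closes (assumed hook). The instance is dropped
    // only if no other open connection still refers to it.
    void onDisconnect(String connectionId) {
        String clientInstanceId = instanceIdByConnection.remove(connectionId);
        if (clientInstanceId != null
                && !instanceIdByConnection.containsValue(clientInstanceId)) {
            lastSeenMsByInstance.remove(clientInstanceId);
        }
    }

    boolean contains(String clientInstanceId) {
        return lastSeenMsByInstance.containsKey(clientInstanceId);
    }

    int size() {
        return lastSeenMsByInstance.size();
    }
}
```

   No per-instance request counter or priority is tracked here; an entry 
leaves the cache as soon as its last connection is gone, without waiting for 
time-based eviction.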
   
   This PR does not solve the scenario where a broker is handling long-running 
connections for 16384 clients, in which case the cache can still fill up. That 
requires separate handling: if a broker can handle such a large number of 
long-running active clients, then the cache size should be bumped. I can do a 
follow-up PR to expose the cache config in server.properties.
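
   A rough sketch of what a configurable cache size could look like follows. 
The property name `client.telemetry.cache.max.size` is made up for 
illustration and is not an existing Kafka config. The sketch also assumes a 
least-recently-used policy for size-based eviction; only the default of 16384 
comes from the discussion above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical sketch of a size-bounded cache whose limit comes from config.
final class ConfigurableCacheSketch {
    // Made-up property name, not a real Kafka broker config.
    static final String MAX_SIZE_CONFIG = "client.telemetry.cache.max.size";
    static final int DEFAULT_MAX_SIZE = 16384;

    // Reads the limit from the given properties, falling back to the default.
    static int maxSize(Properties props) {
        return Integer.parseInt(
            props.getProperty(MAX_SIZE_CONFIG, String.valueOf(DEFAULT_MAX_SIZE)));
    }

    // Access-ordered LinkedHashMap that drops the least recently used entry
    // once the configured limit is exceeded.
    static <K, V> Map<K, V> newCache(int maxSize) {
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxSize;
            }
        };
    }
}
```

   A broker that serves more long-running clients than the default allows 
would then raise the value in server.properties instead of needing a code 
change.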
   
   

