apoorvmittal10 commented on PR #17474: URL: https://github.com/apache/kafka/pull/17474#issuecomment-2409130891
> If our goal is to avoid the cache filling up with ‘ghost’ clients

Not necessarily ghost clients, but also clients that are spawned for a really short span and then stopped. For example, the issue fixed in https://github.com/apache/kafka/pull/17431 should never be encountered if a Kafka Consumer is not closed just after instantiation. But we have seen applications that do have such workloads, i.e. clients are created for a very short span as well. Hence those instances are legitimate but will occupy a cache entry until time-based eviction happens.

> perhaps we could introduce prioritization in the cache. The priority of a client instance could be based on either the number of PUSH_TELEMETRY_REQUEST calls or its lifetime

This is a good idea. But do we want additional compute and tracking of the number of requests in the cache as well? I think that would be extra and is not required when our goal is to get rid of instances that will never send telemetry data again, i.e. those whose connections are disconnected.

With this PR, we are not solving the scenario where a broker is handling long-running connections for 16384 clients, in which case the cache can again become full. That requires separate handling: if a broker can handle such a large number of long-running active clients, then the cache size should be bumped. I can do a follow-up PR to expose the cache config in server.properties.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
