jbarotin opened a new issue, #21605: URL: https://github.com/apache/pulsar/issues/21605
### Search before asking

- [X] I searched in the [issues](https://github.com/apache/pulsar/issues) and found nothing similar.

### Version

Tested on versions 3.0.0 and 3.1.1.

### Minimal reproduce step

We found that the problem is caused by our key distribution, which is not homogeneous. We believe we have reproduced this behavior as follows:

- First step: produce messages with the following key distribution:
  - 10 keys with 1000 messages
  - 100 keys with 100 messages
  - 5890 keys with 9 messages
- Second step: launch 3 clients with 10 consumers at the same time, all of them on the same subscription, to consume this topic. We simulate processing with a sleep of 500 ms.

The test code is available in this GitHub repository: https://github.com/jbarotin/pulsar-simulation

### What did you expect to see?

A quick calculation gives `(10*1000 + 100*100 + 5890*9) * 0.5 / (3*10) ≈ 1217 s`, so I estimate the optimal time at about 20 minutes.

I then ran this test.

### What did you see instead?

- On my dev machine, with Pulsar running in Docker in standalone mode, consuming all the messages took 30 minutes in total.
- On a distributed cluster of 3 OVH Cloud R2-15 instances (2 cores and 15 GB RAM), it took 61 minutes.

We measured idle time: many consumers stop for several minutes and then resume consuming messages. It seems that availablePermit is equal to zero during these stops.

### Anything else?

_No response_

### Are you willing to submit a PR?

- [ ] I'm willing to submit a PR!
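The estimate under "What did you expect to see?" can be reproduced with a short Python sketch. It only redoes the arithmetic from the figures reported in this issue and is not taken from the linked repository. The variable names are illustrative, and the calculation assumes the work is spread perfectly evenly over all consumers.

```python
# Key distribution from the reproduce step: (number of keys, messages per key).
KEY_DISTRIBUTION = [(10, 1000), (100, 100), (5890, 9)]

PROCESSING_SECONDS = 0.5    # simulated processing time per message (sleep of 500 ms)
CLIENTS = 3
CONSUMERS_PER_CLIENT = 10

total_messages = sum(keys * per_key for keys, per_key in KEY_DISTRIBUTION)
total_consumers = CLIENTS * CONSUMERS_PER_CLIENT

# Ideal case (assumption): every consumer is busy for the whole run.
ideal_seconds = total_messages * PROCESSING_SECONDS / total_consumers

print(f"total messages: {total_messages}")
print(f"ideal duration: {ideal_seconds:.0f} s ({ideal_seconds / 60:.1f} min)")

# Observed durations reported in this issue, in minutes.
for label, minutes in [("standalone Docker", 30), ("3-node cluster", 61)]:
    ratio = minutes * 60 / ideal_seconds
    print(f"{label}: {ratio:.2f}x the ideal duration")
```

The ratio printed for each environment shows how far the observed run is from the evenly balanced case, which matches the idle consumers described above.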
