ross710 commented on issue #11100: URL: https://github.com/apache/pulsar/issues/11100#issuecomment-1006218658
We were able to reproduce this issue with one of our applications. With ~40 producers spun up but idle (not producing any messages), we tested Pulsar client 2.6.3 and 2.7.2 and analyzed the CPU usage with Datadog's profiler.

2.6.3: <img width="1505" alt="2 6 3" src="https://user-images.githubusercontent.com/3477670/148313853-53695457-e8e0-4f1e-970c-484b9147e901.png">

2.7.2: <img width="752" alt="2 7 2" src="https://user-images.githubusercontent.com/3477670/148313889-dc62bcc8-07e3-4a17-ac87-0e4db404b7ac.png">

---

2.7.2 used significantly more CPU, mostly in `ProducerImpl`'s batching task.

We did notice this change between 2.6.x and 2.7.x: https://github.com/apache/pulsar/pull/7733/files#diff-d6fcf8aa2d0035cc386dca0942a452343d6854763c7fd397efa4e660c0069767R1233

Looking at the code, there seems to be a slight behavior regression in how this task is scheduled. Previously, the behavior mimicked [scheduleWithFixedDelay](https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ScheduledExecutorService.html#scheduleWithFixedDelay(java.lang.Runnable,%20long,%20long,%20java.util.concurrent.TimeUnit)), but the new code uses [scheduleAtFixedRate](https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ScheduledExecutorService.html#scheduleAtFixedRate(java.lang.Runnable,%20long,%20long,%20java.util.concurrent.TimeUnit)). With a fixed delay, each run waits the full interval after the previous run finishes; with a fixed rate, runs are timed from the initial start, so executions that fall behind schedule can fire back-to-back to catch up. So we believe that the batching task for producers now runs much more frequently, causing higher CPU usage.

We think replacing `scheduleAtFixedRate` with `scheduleWithFixedDelay` will likely fix this issue.
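To illustrate the difference in scheduling semantics, here is a minimal sketch using the plain JDK `ScheduledExecutorService`. It is not Pulsar's code: the class name `SchedulingDemo`, the 10 ms interval, and the artificial 200 ms stall on the first run are all assumptions chosen for the demonstration. The first run stalls once, and the sketch then counts how many times the task runs in a 400 ms window under each mode.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; not part of the Pulsar client.
public class SchedulingDemo {

    // Runs a periodic task for about 400 ms and returns how often it executed.
    // The first execution stalls for 200 ms to simulate a delayed run.
    static int countRuns(boolean fixedRate) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger runs = new AtomicInteger();
        Runnable task = () -> {
            if (runs.getAndIncrement() == 0) {
                try {
                    Thread.sleep(200); // simulated stall on the first run only
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        };
        try {
            if (fixedRate) {
                // Runs are timed from the initial start: missed slots are caught up.
                executor.scheduleAtFixedRate(task, 0, 10, TimeUnit.MILLISECONDS);
            } else {
                // Each run starts 10 ms after the previous one finishes.
                executor.scheduleWithFixedDelay(task, 0, 10, TimeUnit.MILLISECONDS);
            }
            Thread.sleep(400);
            executor.shutdownNow();
            executor.awaitTermination(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while measuring", e);
        }
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println("fixed rate runs:  " + countRuns(true));
        System.out.println("fixed delay runs: " + countRuns(false));
    }
}
```

On a typical machine the fixed-rate count should come out noticeably higher, because the runs missed during the stall are executed in a burst afterwards. Exact numbers depend on timer granularity and system load, and an event loop implementation may behave differently from the JDK executor used here.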
