racorn opened a new issue #6414: High CPU load when PulsarClient (Java) is idle
URL: https://github.com/apache/pulsar/issues/6414
 
 
   **Describe the bug**
   Running a simple consumer on Windows using the `PulsarClient`, I notice that 
there is considerable CPU usage when the consumer is just idling. Profiling 
showed that the `pulsar-timer-x-y` thread is heavy on the CPU. Digging into the 
source code, we can see that the `HashedWheelTimer` is set up with a 1 ms tick 
time:
   
   `timer = new HashedWheelTimer(getThreadFactory("pulsar-timer"), 1, 
TimeUnit.MILLISECONDS);`
   
   A simple test with similar values shows that a running `HashedWheelTimer` 
can consume CPU equal to 100% of one hyperthread:
   
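   ```
       import io.netty.util.HashedWheelTimer;
       import io.netty.util.Timeout;
       import io.netty.util.TimerTask;
       import java.util.concurrent.TimeUnit;
       import org.junit.Test;

       // The class name is arbitrary; JUnit 4 is assumed.
       public class TimerWheelTest {

           @Test
           public void testTimerWheel() throws Exception {
               // Same 1 ms tick duration as in the PulsarClient setup quoted above.
               HashedWheelTimer timer = new HashedWheelTimer(1, TimeUnit.MILLISECONDS);

               // A trivial task that re-arms itself every 500 ms.
               timer.newTimeout(new TimerTask() {
                   @Override
                   public void run(final Timeout timeout) throws Exception {
                       timer.newTimeout(this, 500, TimeUnit.MILLISECONDS);
                   }
               }, 500, TimeUnit.MILLISECONDS);

               // Observe the CPU usage of the timer thread while the test sleeps.
               Thread.sleep(30_000);
               timer.stop();
           }
       }
   ```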
   My understanding of the `HashedWheelTimer` is that it is intended to be used 
for a large number of approximated timeouts and not for millisecond precision 
scheduling, but I may be mistaken.
   
   **To Reproduce**
   Run a simple consumer subscribing to a topic with no messages, or run the 
simple JUnit test above. Observe CPU usage.
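   
   To attribute the load to a specific thread rather than to the whole 
process, per-thread CPU time can be read from the JVM itself. The following is 
a minimal sketch using the standard `ThreadMXBean`; the class name 
`ThreadCpuProbe` and the demo thread are made up for illustration, and the 
probe has to run inside the same JVM as the client to see its threads.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Pulsar.
public class ThreadCpuProbe {

    /** CPU time in nanoseconds for each live thread whose name starts with the prefix. */
    public static Map<String, Long> cpuNanosByPrefix(String prefix) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<String, Long> result = new LinkedHashMap<>();
        if (!mx.isThreadCpuTimeSupported()) {
            return result; // this JVM cannot report per-thread CPU time
        }
        for (long id : mx.getAllThreadIds()) {
            ThreadInfo info = mx.getThreadInfo(id);
            if (info == null || !info.getThreadName().startsWith(prefix)) {
                continue; // thread already gone, or not one we are interested in
            }
            long nanos = mx.getThreadCpuTime(id); // -1 if measurement is disabled
            if (nanos >= 0) {
                result.put(info.getThreadName() + "#" + id, nanos);
            }
        }
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        // Demo only: a busy-spinning thread standing in for the timer thread.
        Thread busy = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) { /* spin */ }
        }, "demo-busy");
        busy.setDaemon(true);
        busy.start();

        Thread.sleep(1_000);
        cpuNanosByPrefix("demo-").forEach((name, nanos) ->
                System.out.println(name + " used " + nanos / 1_000_000 + " ms of CPU"));
        busy.interrupt();
    }
}
```

   Inside an application that holds a `PulsarClient`, calling 
`cpuNanosByPrefix("pulsar-timer")` twice, some seconds apart, and comparing the 
values should show how much CPU the timer thread consumed in that interval.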
   
   **Expected behavior**
   An idle consumer should not incur high CPU load. For example, when running 8 
consumers on an i7-8700 (4 cores 2 threads per core), Task Manager reported 100% 
CPU usage even though there was no traffic and no messages on the topics.
   
   **Discussion**
   High CPU load is a high price to pay for immediate batch dispatching (which 
may be the primary reason for setting the 1 ms tick time). It should be 
possible to relax immediate batch dispatching in favor of leaving more compute 
resources available to other tasks.
   
   An alternative is to use a plain old `ScheduledExecutorService` for batch 
dispatching/reception. My tests suggest it is also lighter on the CPU when 
millisecond precision is required and the number of tasks is not too high. For 
example, I could schedule 2000 simple tasks every millisecond 
(`scheduleAtFixedRate`) on a single thread and still have compute to spare. 
When scheduling 10k tasks every millisecond, the thread was working 100%, but 
scheduling 10k tasks every 10 milliseconds was OK. So if a client is used to 
create a huge number of producers/consumers, it might be better to relax the 
batch timeout so the timeout thread has a chance to actually perform its work. 
Another observation is that the `HashedWheelTimer` used with a 1 millisecond 
tick time is not able to run a single simple task every millisecond, partly due 
to the approximate nature of the `HashedWheelTimer`, and maybe partly because 
spinning the wheel competes with task expiration.
   
   Note that when using `ScheduledExecutorService`, each task might not get 
exactly the same number of executions. For example, when scheduling 1000 tasks 
to execute every millisecond for 30 seconds, the tasks were executed between 
29 986 and 30 012 times each.
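   
   The kind of measurement described above can be sketched as follows. This is 
not the exact test that produced the numbers in this issue; the class name 
`FixedRateSpread` and its parameters are made up for illustration. It schedules 
a number of trivial tasks with `scheduleAtFixedRate` on a single thread and 
reports how many times each task ran.

```java
import java.util.Arrays;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical benchmark sketch, not part of Pulsar.
public class FixedRateSpread {

    /** Runs taskCount trivial periodic tasks on one thread and returns per-task run counts. */
    public static long[] run(int taskCount, long periodMillis, long durationMillis)
            throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicLongArray counts = new AtomicLongArray(taskCount);
        for (int i = 0; i < taskCount; i++) {
            final int slot = i;
            // Each task only increments its own counter.
            scheduler.scheduleAtFixedRate(
                    () -> counts.incrementAndGet(slot),
                    periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        }
        Thread.sleep(durationMillis);
        scheduler.shutdownNow();
        scheduler.awaitTermination(1, TimeUnit.SECONDS);

        long[] result = new long[taskCount];
        for (int i = 0; i < taskCount; i++) {
            result[i] = counts.get(i);
        }
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        // 1000 tasks, 1 ms period, 30 seconds, as in the example above.
        long[] counts = run(1000, 1, 30_000);
        long min = Arrays.stream(counts).min().getAsLong();
        long max = Arrays.stream(counts).max().getAsLong();
        System.out.println("executions per task: min=" + min + ", max=" + max);
    }
}
```

   The resulting spread depends on the machine and on the load, so the figures 
will differ from run to run.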
   
   Some suggestions:
   1) The tick time could be configurable, with default value = 1 millisecond 
to preserve current behavior.
   2) Use default tick time 100 milliseconds, then create a 
`ScheduledExecutorService` in `PulsarClientImpl` that is exposed the same way 
as the `Timer` object for other classes to use. The `ProducerImpl` and 
`ConsumerBase` could use the `ScheduledExecutorService` to schedule batch 
dispatching/reception.
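   
   To make suggestion 2 more concrete, here is a rough sketch of batch 
dispatching driven by a shared `ScheduledExecutorService`. It is not Pulsar 
code: `BatchFlusher`, its constructor parameters and the `sink` callback are 
hypothetical stand-ins for the batching logic in a producer. One assumption in 
this sketch is that the flush is only scheduled while there is pending data, so 
an idle client schedules nothing at all.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical stand-in for a producer's batching logic, not Pulsar's API.
public class BatchFlusher<T> {
    private final ScheduledExecutorService scheduler; // shared, owned by the client
    private final long maxDelayMillis;                // configurable batch delay
    private final Consumer<List<T>> sink;             // receives each completed batch
    private final List<T> pending = new ArrayList<>();
    private ScheduledFuture<?> flushTask;             // null while nothing is pending

    public BatchFlusher(ScheduledExecutorService scheduler,
                        long maxDelayMillis,
                        Consumer<List<T>> sink) {
        this.scheduler = scheduler;
        this.maxDelayMillis = maxDelayMillis;
        this.sink = sink;
    }

    /** Adds an item and arms a one-shot flush if none is scheduled yet. */
    public synchronized void add(T item) {
        pending.add(item);
        if (flushTask == null) {
            flushTask = scheduler.schedule(this::flush, maxDelayMillis, TimeUnit.MILLISECONDS);
        }
    }

    /** Hands the pending items to the sink as one batch. */
    private synchronized void flush() {
        flushTask = null;
        if (pending.isEmpty()) {
            return;
        }
        List<T> batch = new ArrayList<>(pending);
        pending.clear();
        sink.accept(batch);
    }
}
```

   A client could create one scheduler, for example with 
`Executors.newSingleThreadScheduledExecutor()`, and pass it to every 
`BatchFlusher`, much like the `Timer` object is shared today.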
   
   **Additional context**
   Observed under Windows 10 64-bit, running Java 11.
   
