vicaya opened a new issue #5200: Broker high cpu usage under light load
URL: https://github.com/apache/pulsar/issues/5200
 
 
   **Describe the bug**
   All 3 of the brokers in a cluster gradually consume ~3 cores/vcpus (per 
broker) and went into load balancing mode (according to the broker log, it was 
using 90+% cpu, which is over the 85% threshold) under very light load (sum of 
pulsar_rate_in of the cluster is under 1000msg/s with average message size 
under 200 bytes with a few producers and a few hundred consumers). Restarting 
the brokers resolved the high cpu usage. They went back consuming only a 
fraction of a core under the same load.
   
   **To Reproduce**
   Not able to reproduce the problem in a separate cluster yet.
   
   **Expected behavior**
   Broker cpu usage should be kept low under light load.
   
   **Profile information**
   Output of https://github.com/stevegury/jstack-profiler when the broker was 
pegging ~3 cores.
   ```
   Runnable methods breakout
   -------------------------
   rank    time    name
   0       84.66%  java.lang.Thread.run(Thread.java:748)
   1       84.66%  
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
   2       76.68%  
io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:909)
   3       54.47%  io.netty.channel.epoll.Native.epollWait(Native.java:114)
   4       54.47%  io.netty.channel.epoll.Native.epollWait0(Native Method)
   5       54.47%  
io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:278)
   6       54.47%  
io.netty.channel.epoll.EpollEventLoop.epollWait(EpollEventLoop.java:253)
   7       15.34%  
io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:765)
   8       15.34%  sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
   9       15.34%  
org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1145)
   10      15.34%  
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:349)
   11      15.34%  
sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
   12      15.34%  sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
   13      15.34%  sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
   14      15.34%  io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:413)
   15      15.34%  
io.netty.channel.nio.SelectedSelectionKeySetSelector.select(SelectedSelectionKeySetSelector.java:62)
   16      7.99%   
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
   17      7.99%   
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
   18      5.75%   
io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:335)
   19      3.83%   
sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
   20      3.83%   sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
   21      3.83%   
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
   22      3.83%   
org.eclipse.jetty.server.AbstractConnector$Acceptor.run(AbstractConnector.java:648)
   23      3.83%   
org.eclipse.jetty.server.ServerConnector.accept(ServerConnector.java:377)
   24      3.83%   
sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
   25      3.83%   
org.apache.pulsar.broker.web.PulsarServerConnector.accept(PulsarServerConnector.java:57)
   26      3.83%   
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
   27      3.83%   
org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
   28      3.67%   
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produceTask(EatWhatYouKill.java:357)
   29      3.67%   
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:181)
   30      3.51%   sun.nio.ch.SelectorImpl.select(SelectorImpl.java:101)
   31      3.51%   
org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:359)
   32      3.51%   sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
   33      3.51%   
org.eclipse.jetty.io.ManagedSelector$SelectorProducer.select(ManagedSelector.java:422)
   34      3.35%   
io.netty.util.concurrent.AbstractScheduledEventExecutor.pollScheduledTask(AbstractScheduledEventExecutor.java:111)
   35      3.35%   
io.netty.util.concurrent.SingleThreadEventExecutor.fetchFromScheduledTaskQueue(SingleThreadEventExecutor.java:282)
   36      3.35%   
io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:393)
   37      3.35%   
io.netty.util.internal.DefaultPriorityQueue.poll(DefaultPriorityQueue.java:31)
   38      3.35%   
io.netty.util.internal.DefaultPriorityQueue.poll(DefaultPriorityQueue.java:116)
   39      3.35%   java.util.AbstractQueue.remove(AbstractQueue.java:113)
   40      3.04%   
io.netty.util.internal.DefaultPriorityQueue.bubbleDown(DefaultPriorityQueue.java:266)
   41      2.40%   
io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
   42      2.40%   
io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
   43      2.08%   
org.apache.pulsar.client.impl.PersistentAcknowledgmentsGroupingTracker$$Lambda$413/1524624927.run(Unknown
 Source)
   44      2.08%   
io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:133)
   45      2.08%   
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
   46      1.12%   
io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:268)
   47      1.12%   
io.netty.channel.DefaultSelectStrategy.calculateStrategy(DefaultSelectStrategy.java:30)
   48      1.12%   
io.netty.channel.epoll.EpollEventLoop.epollWaitNow(EpollEventLoop.java:257)
   49      1.12%   
io.netty.channel.epoll.EpollEventLoop.access$000(EpollEventLoop.java:46)
   50      1.12%   
io.netty.channel.epoll.EpollEventLoop$1.get(EpollEventLoop.java:74)
   51      0.16%   
org.apache.bookkeeper.mledger.impl.ManagedLedgerFactoryImpl$$Lambda$42/65438140.run(Unknown
 Source)
   52      0.16%   
org.apache.bookkeeper.proto.BookieClientImpl.lambda$new$0(BookieClientImpl.java:121)
   53      0.16%   
org.slf4j.helpers.MessageFormatter.safeObjectAppend(MessageFormatter.java:299)
   54      0.16%   
io.netty.channel.AbstractChannelHandlerContext.writeAndFlush(AbstractChannelHandlerContext.java:794)
   55      0.16%   org.eclipse.jetty.util.log.Slf4jLog.debug(Slf4jLog.java:97)
   ```
   
   **Additional context**
   The brokers were using the official container image: 
apachepulsar/pulsar-all:2.4.0

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

Reply via email to