You can take a profile with Java Flight Recorder if you use Java 11, or
with async-profiler otherwise. See the comment below for the latter:

https://issues.apache.org/jira/browse/KAFKA-9339?page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel&focusedCommentId=17013400#comment-17013400
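
As a rough sketch of what the two options look like on the command line:
the PID, duration, recording name and output paths are placeholders, and
profiler.sh is the launcher name in async-profiler 1.x/2.x (check the
README of the version you install). The helpers only print the command so
it can be reviewed before running it as the same user as the broker.

```shell
# Sketch only: names, durations and /tmp paths below are assumptions.

# Java 11: print a jcmd command that starts a time-limited JFR recording.
jfr_command() {
  pid="$1"
  seconds="${2:-60}"
  echo "jcmd ${pid} JFR.start name=kafka-cpu settings=profile duration=${seconds}s filename=/tmp/kafka-${pid}.jfr"
}

# Older JVMs: print an async-profiler CPU profiling command.
# Assumes it is run from the async-profiler install directory.
async_profiler_command() {
  pid="$1"
  seconds="${2:-60}"
  echo "./profiler.sh -e cpu -d ${seconds} -f /tmp/kafka-${pid}.svg ${pid}"
}
```

For example, `jfr_command 12345 60` prints the jcmd line for a broker
running as PID 12345.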

It's worth filing a JIRA and discussing it there.
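
When attaching data to the JIRA, it helps to tie the busy threads seen in
`top -H` to the thread dump. The nid values in the dump quoted below are
hex native thread ids, while `top -H` shows them in decimal. A minimal
sketch of the conversion (the helper name is made up):

```shell
# Convert a thread dump "nid" (hex) to the decimal thread id shown by `top -H`.
nid_to_tid() {
  printf '%d\n' "$1"
}

# The three network threads from the quoted dump.
for nid in 0x581f 0x5820 0x5821; do
  echo "${nid} -> $(nid_to_tid "${nid}")"
done
```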

Ismael

On Sun, Jan 12, 2020 at 10:28 PM Navneeth Krishnan <reachnavnee...@gmail.com>
wrote:

> Hi Ismael,
>
> We were previously running on 0.10.2.1 with 8 brokers at around 80% CPU.
> We have now upgraded to 2.3 with 16 brokers. The message rate, topics,
> producers and consumers are the same, but CPU is still >80%. How can we
> troubleshoot to find exactly where the problem is?
>
> Thanks
>
> On Wed, Jan 8, 2020 at 10:33 AM Ismael Juma <ism...@juma.me.uk> wrote:
>
> > Has the behavior changed after an upgrade or has it been consistent since
> > the start?
> >
> > Ismael
> >
> > On Thu, Jan 2, 2020 at 4:18 PM Navneeth Krishnan <reachnavnee...@gmail.com>
> > wrote:
> >
> > > Hi All,
> > >
> > > We have a Kafka cluster with 12 nodes, and we are seeing about 90% CPU
> > > usage on all of them. The details are below. We need some help figuring
> > > out what the problem is and how to overcome it.
> > >
> > > *Cluster:*
> > > Kafka version: 2.3.0
> > > Number of brokers in cluster: 12
> > > Node type: 4 vCores 32GB mem
> > > Network In: 10Mbps per broker
> > > Network Out: 16Mbps per broker
> > > Topics: 10 (approximately)
> > > Partitions: 20 (Max), some has only partitions
> > > Replication Factor: 3
> > >
> > > *CPU Usage:*
> > > [image: image.png]
> > >
> > > *VMStat*
> > >
> > > [root]# vmstat 1 10
> > >
> > > procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
> > >  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
> > >  8  0      0 234444  19064 24046980    0    0    17  2026    1    3 38 33 28  0  1
> > >  7  0      0 256444  19036 24023880    0    0   768     0 64027 22708 44 40 16  0  1
> > >  7  0      0 245356  19052 24034560    0    0   256   472 63509 23276 44 39 17  0  1
> > >  7  0      0 235096  19052 24046616    0    0     0     0 62277 22516 46 38 15  0  1
> > >  8  0      0 260548  19036 24020084    0    0   516 49888 62364 22894 43 38 18  0  1
> > >  5  0      0 249232  19036 24030924    0    0   512     0 61022 24589 41 39 20  0  1
> > >  6  0      0 238072  19036 24042512    0    0  1024     0 63358 23063 44 38 17  0  0
> > >  5  0      0 262904  19052 24017972    0    0     0   440 63078 23499 46 37 17  0  1
> > >  7  0      0 250324  19052 24030008    0    0     0     0 64615 22617 48 38 14  0  1
> > >  6  0      0 237920  19052 24042372    0    0  1024 48900 63223 23029 42 40 18  0  1
> > >
> > >
> > > *IO Stat:*
> > >
> > > [root]# iostat -m
> > >
> > > Linux 4.14.72-73.55.amzn2.x86_64 (loc-kafka11.internal.dnaspaces.io) 01/02/2020 _x86_64_ (4 CPU)
> > >
> > > avg-cpu:  %user   %nice %system %iowait  %steal   %idle
> > >           38.11    0.00   33.09    0.11    0.61   28.08
> > >
> > > Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> > > xvda              2.36         0.01         0.01      26760      43360
> > > nvme0n1           0.00         0.00         0.00          2          0
> > > xvdf             70.95         0.06         7.67     185908   25205338
> > >
> > > *Top Kafka broker threads:*
> > > [image: image.png]
> > >
> > > *Top 3:*
> > >
> > > "data-plane-kafka-network-thread-10-ListenerName(PLAINTEXT)-PLAINTEXT-0" #60 prio=5 os_prio=0 tid=0x00007f8b1ab56000 nid=0x581f runnable [0x00007f8a886ce000]
> > >
> > > "data-plane-kafka-network-thread-10-ListenerName(PLAINTEXT)-PLAINTEXT-2" #62 prio=5 os_prio=0 tid=0x00007f8b1ab59000 nid=0x5821 runnable [0x00007f8a6aefd000]
> > >
> > > "data-plane-kafka-network-thread-10-ListenerName(PLAINTEXT)-PLAINTEXT-1" #61 prio=5 os_prio=0 tid=0x00007f8b1ab57800 nid=0x5820 runnable [0x00007f8a885cd000]
> > >
> > > It doesn't look like GC or IO is the problem.
> > >
> > > Thanks
> > >
> >
>
