Andrey Konyaev created KAFKA-3900:
-------------------------------------

             Summary: High CPU util on broker
                 Key: KAFKA-3900
                 URL: https://issues.apache.org/jira/browse/KAFKA-3900
             Project: Kafka
          Issue Type: Bug
         Environment: kafka = 2.11-0.10.0.0
java version "1.8.0_91"
amazon linux
            Reporter: Andrey Konyaev


I run a Kafka cluster in Amazon on m4.xlarge instances (4 CPUs and 16 GB of memory, with 14 GB allocated to the Kafka heap). The cluster has three nodes.

The load is not high (6000 messages/sec) and cpu_idle is around 70%, but sometimes (about once a day) I see this message in server.log:

[2016-06-24 14:52:22,299] WARN [ReplicaFetcherThread-0-2], Error in fetch kafka.server.ReplicaFetcherThread$FetchRequest@6eaa1034 (kafka.server.ReplicaFetcherThread)
java.io.IOException: Connection to 2 was disconnected before the response was read
        at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1$$anonfun$apply$1.apply(NetworkClientBlockingOps.scala:87)
        at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1$$anonfun$apply$1.apply(NetworkClientBlockingOps.scala:84)
        at scala.Option.foreach(Option.scala:257)
        at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1.apply(NetworkClientBlockingOps.scala:84)
        at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1.apply(NetworkClientBlockingOps.scala:80)
        at kafka.utils.NetworkClientBlockingOps$.recursivePoll$2(NetworkClientBlockingOps.scala:137)
        at kafka.utils.NetworkClientBlockingOps$.kafka$utils$NetworkClientBlockingOps$$pollContinuously$extension(NetworkClientBlockingOps.scala:143)
        at kafka.utils.NetworkClientBlockingOps$.blockingSendAndReceive$extension(NetworkClientBlockingOps.scala:80)
        at kafka.server.ReplicaFetcherThread.sendRequest(ReplicaFetcherThread.scala:244)
        at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:229)
        at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:42)
        at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:107)
        at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:98)
        at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)


I know this can be a network glitch, but why does Kafka eat all the CPU time?
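As an illustration of how a disconnect can turn into a busy CPU (a minimal sketch, not Kafka's actual code; the real loop runs in AbstractFetcherThread.doWork inside ShutdownableThread.run, as in the trace above): if the error path after an IOException retries with no backoff, the fetcher loop spins a core, while even a small sleep bounds the retry rate. Kafka has a replica.fetch.backoff.ms setting in this area; whether it applies on this particular error path in 0.10.0.0 is the question.

```java
import java.io.IOException;

// Sketch only: contrasts a no-backoff retry loop with a backed-off one.
public class FetchLoopSketch {
    // Retry a failing "fetch" for roughly windowMs; return the retry count.
    static int retryLoop(long windowMs, long backoffMs) {
        int attempts = 0;
        long deadline = System.currentTimeMillis() + windowMs;
        while (System.currentTimeMillis() < deadline) {
            try {
                // Stand-in for the replica fetch that keeps failing above.
                throw new IOException("Connection to 2 was disconnected");
            } catch (IOException e) {
                attempts++;
                if (backoffMs > 0) {
                    try {
                        Thread.sleep(backoffMs); // bounds the retry rate
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        return attempts;
                    }
                }
                // backoffMs == 0: the loop spins as fast as the CPU allows
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        int spin = retryLoop(200, 0);   // retries as fast as possible
        int calm = retryLoop(200, 50);  // roughly 4 retries in 200 ms
        System.out.println("no backoff: " + spin
                + " retries in ~200ms; 50ms backoff: " + calm);
    }
}
```

With no backoff the retry count in a 200 ms window is thousands of times higher, which is exactly a pegged core; the broker would show the same symptom if its fetcher retried a dead connection without waiting.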

My config:

inter.broker.protocol.version=0.10.0.0
log.message.format.version=0.10.0.0

default.replication.factor=3
num.partitions=3

replica.lag.time.max.ms=15000

broker.id=0
listeners=PLAINTEXT://:9092
log.dirs=/mnt/kafka/kafka
log.retention.check.interval.ms=300000
log.retention.hours=168
log.segment.bytes=1073741824
num.io.threads=20
num.network.threads=10
num.partitions=1
num.recovery.threads.per.data.dir=2
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
socket.send.buffer.bytes=102400
zookeeper.connection.timeout.ms=6000
delete.topic.enable = true
broker.max_heap_size=10 GiB 
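One thing worth checking in the config above: num.partitions appears twice (3, then 1). Properties-style parsing keeps the last assignment, so the broker most likely runs with num.partitions=1. A tiny checker like the one below (a hypothetical helper, not a Kafka tool) flags such duplicates:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch: flag keys assigned more than once in a server.properties-style text.
public class DupKeyCheck {
    static List<String> duplicateKeys(String config) {
        Set<String> seen = new HashSet<>();
        List<String> dups = new ArrayList<>();
        for (String line : config.split("\n")) {
            String t = line.trim();
            if (t.isEmpty() || t.startsWith("#") || !t.contains("=")) continue;
            String key = t.split("=", 2)[0].trim();
            if (!seen.add(key)) dups.add(key); // key already defined earlier
        }
        return dups;
    }

    public static void main(String[] args) {
        String cfg = "num.partitions=3\nnum.io.threads=20\nnum.partitions=1\n";
        System.out.println(duplicateKeys(cfg)); // expect [num.partitions]
    }
}
```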
  
Any ideas?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
