[ https://issues.apache.org/jira/browse/KAFKA-6064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ismael Juma resolved KAFKA-6064.
--------------------------------
    Resolution: Auto Closed

0.8.2.1 is no longer supported. Many bugs related to topic deletion have been 
fixed in the releases since then. I suggest upgrading.
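
For reference, newer releases expose topic deletion through the Java AdminClient 
(available since 0.11), so a bulk delete no longer needs to go through the old 
ZooKeeper-driven path. A minimal sketch in Scala; the broker address and topic 
names are placeholders, not taken from this report:

    import java.util.{Arrays, Properties}
    import org.apache.kafka.clients.admin.AdminClient

    object DeleteTopicsExample {
      def main(args: Array[String]): Unit = {
        val props = new Properties()
        props.put("bootstrap.servers", "broker1:9092") // placeholder address
        val admin = AdminClient.create(props)
        try {
          // Batch the deletions; all().get() blocks until the controller has
          // accepted every request (or one of them fails)
          admin.deleteTopics(Arrays.asList("topic_2", "topic_3")).all().get()
        } finally {
          admin.close()
        }
      }
    }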

> Cluster hung when the controller tried to delete a bunch of topics 
> -------------------------------------------------------------------
>
>                 Key: KAFKA-6064
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6064
>             Project: Kafka
>          Issue Type: Bug
>          Components: controller
>    Affects Versions: 0.8.2.1
>         Environment: RHEL 6, 12 cores, 48 GB RAM
>            Reporter: Chaitanya GSK
>              Labels: controller, kafka-0.8
>
> Hi, 
> We have been running 0.8.2.1 in our Kafka cluster and had a full cluster 
> outage when we programmatically tried to delete 220 topics: the controller 
> hung and ran out of memory. This somehow led to a whole-cluster outage, and 
> clients could no longer push data at the expected rate. AFAIK, the controller 
> shouldn't affect the write rate to the other brokers, but in this case it 
> did. Below is the client error.
> [WARN] Failed to send kafka.producer.async request with correlation id 1613935688 to broker 44 with data for partitions [topic_2,65],[topic_2,167],[topic_3,2],[topic_4,0],[topic_5,30],[topic_2,48],[topic_2,150]
> java.io.IOException: Broken pipe
>       at sun.nio.ch.FileDispatcherImpl.writev0(Native Method) ~[?:1.8.0_60]
>       at sun.nio.ch.SocketDispatcher.writev(SocketDispatcher.java:51) ~[?:1.8.0_60]
>       at sun.nio.ch.IOUtil.write(IOUtil.java:148) ~[?:1.8.0_60]
>       at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:504) ~[?:1.8.0_60]
>       at java.nio.channels.SocketChannel.write(SocketChannel.java:502) ~[?:1.8.0_60]
>       at kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:56) ~[stormjar.jar:?]
>       at kafka.network.Send$class.writeCompletely(Transmission.scala:75) ~[stormjar.jar:?]
>       at kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:26) ~[stormjar.jar:?]
>       at kafka.network.BlockingChannel.send(BlockingChannel.scala:103) ~[stormjar.jar:?]
>       at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:73) ~[stormjar.jar:?]
>       at kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:72) ~[stormjar.jar:?]
>       at kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SyncProducer.scala:103) ~[stormjar.jar:?]
>       at kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:103) ~[stormjar.jar:?]
>       at kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:103) ~[stormjar.jar:?]
>       at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33) ~[stormjar.jar:?]
>       at kafka.producer.SyncProducer$$anonfun$send$1.apply$mcV$sp(SyncProducer.scala:102) ~[stormjar.jar:?]
>       at kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:102) ~[stormjar.jar:?]
>       at kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:102) ~[stormjar.jar:?]
>       at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33) ~[stormjar.jar:?]
>       at kafka.producer.SyncProducer.send(SyncProducer.scala:101) ~[stormjar.jar:?]
>       at kafka.producer.async.YamasKafkaEventHandler.kafka$producer$async$YamasKafkaEventHandler$$send(YamasKafkaEventHandler.scala:481) [stormjar.jar:?]
>       at kafka.producer.async.YamasKafkaEventHandler$$anonfun$dispatchSerializedData$2.apply(YamasKafkaEventHandler.scala:144) [stormjar.jar:?]
>       at kafka.producer.async.YamasKafkaEventHandler$$anonfun$dispatchSerializedData$2.apply(YamasKafkaEventHandler.scala:138) [stormjar.jar:?]
>       at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772) [stormjar.jar:?]
>       at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) [stormjar.jar:?]
>       at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) [stormjar.jar:?]
>       at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226) [stormjar.jar:?]
>       at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39) [stormjar.jar:?]
>       at scala.collection.mutable.HashMap.foreach(HashMap.scala:98) [stormjar.jar:?]
>       at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771) [stormjar.jar:?]
>       at kafka.producer.async.YamasKafkaEventHandler.dispatchSerializedData(YamasKafkaEventHandler.scala:138) [stormjar.jar:?]
>       at kafka.producer.async.YamasKafkaEventHandler.handle(YamasKafkaEventHandler.scala:79) [stormjar.jar:?]
>       at kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:105) [stormjar.jar:?]
>       at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:88) [stormjar.jar:?]
>       at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:68) [stormjar.jar:?]
>       at scala.collection.immutable.Stream.foreach(Stream.scala:547) [stormjar.jar:?]
>       at kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:67) [stormjar.jar:?]
>       at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:45) [stormjar.jar:?]
> We tried moving the controller to a different broker, but that didn't help; 
> we ultimately had to clean up the Kafka cluster to stabilize it. 
> Is this a known issue? If not, we would appreciate any insight from the 
> community into why a hung controller would bring down the cluster and why 
> deleting the topics would cause the controller to hang.
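
The report does not say how the 220 deletions were issued. One common 
programmatic route in 0.8.2.x was the Scala AdminUtils API, which only creates 
a child znode under /admin/delete_topics; the actual deletion is carried out 
asynchronously by the controller, which is why deletions stall when the 
controller is wedged. A rough sketch, assuming that API (ZooKeeper address and 
topic names are placeholders):

    import org.I0Itec.zkclient.ZkClient
    import kafka.admin.AdminUtils
    import kafka.utils.ZKStringSerializer

    object DeleteManyTopics {
      def main(args: Array[String]): Unit = {
        // ZKStringSerializer matches how the brokers write strings to ZooKeeper
        val zkClient = new ZkClient("zk1:2181", 30000, 30000, ZKStringSerializer)
        try {
          (1 to 220).map(i => s"topic_$i").foreach { topic =>
            // Only creates /admin/delete_topics/<topic>; the controller's
            // delete-topics logic does the real work later
            AdminUtils.deleteTopic(zkClient, topic)
          }
        } finally {
          zkClient.close()
        }
      }
    }

Note that the brokers must be running with delete.topic.enable=true for the 
controller to act on these deletion markers at all in 0.8.2.x.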



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)