[ https://issues.apache.org/jira/browse/KAFKA-6064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ismael Juma resolved KAFKA-6064.
--------------------------------
    Resolution: Auto Closed

0.8.2.1 is no longer supported. Many bugs related to topic deletion have been fixed in the releases since then. I suggest upgrading.
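For illustration only (not part of the original comment or report): a minimal sketch of what a bulk topic deletion like the one described below can look like after upgrading, using Kafka's Java AdminClient API. The broker address and topic names are placeholders; the reporter's actual deletion code is not shown in this issue.

import java.util.List;
import java.util.Properties;
import java.util.concurrent.ExecutionException;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.DeleteTopicsResult;

public class BulkTopicDelete {
    public static void main(String[] args) throws ExecutionException, InterruptedException {
        Properties props = new Properties();
        // Placeholder bootstrap address, not taken from the original report.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");

        // Placeholder names standing in for the ~220 topics mentioned in the report.
        List<String> topics = IntStream.rangeClosed(1, 220)
                .mapToObj(i -> "topic_" + i)
                .collect(Collectors.toList());

        try (AdminClient admin = AdminClient.create(props)) {
            // deleteTopics() only requests deletion; the controller completes it
            // asynchronously, so the futures are waited on here to surface errors.
            DeleteTopicsResult result = admin.deleteTopics(topics);
            result.all().get();
        }
    }
}

Note that brokers must have delete.topic.enable=true for the deletions to actually proceed; deleteTopics() is a request to the controller, which carries out the deletion asynchronously.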
> Cluster hung when the controller tried to delete a bunch of topics
> -------------------------------------------------------------------
>
>                 Key: KAFKA-6064
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6064
>             Project: Kafka
>          Issue Type: Bug
>          Components: controller
>    Affects Versions: 0.8.2.1
>         Environment: rhel 6, 12 core, 48GB
>            Reporter: Chaitanya GSK
>              Labels: controller, kafka-0.8
>
> Hi,
> We have been using 0.8.2.1 in our Kafka cluster, and we had a full cluster outage when we programmatically tried to delete 220 topics: the controller hung and ran out of memory. This somehow led to the whole-cluster outage, and clients were not able to push data at the expected rate. AFAIK, the controller shouldn't impact the write rate to the other brokers, but in this case it did. Below is the client error.
> [WARN] Failed to send kafka.producer.async request with correlation id 1613935688 to broker 44 with data for partitions [topic_2,65],[topic_2,167],[topic_3,2],[topic_4,0],[topic_5,30],[topic_2,48],[topic_2,150]
> java.io.IOException: Broken pipe
>     at sun.nio.ch.FileDispatcherImpl.writev0(Native Method) ~[?:1.8.0_60]
>     at sun.nio.ch.SocketDispatcher.writev(SocketDispatcher.java:51) ~[?:1.8.0_60]
>     at sun.nio.ch.IOUtil.write(IOUtil.java:148) ~[?:1.8.0_60]
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:504) ~[?:1.8.0_60]
>     at java.nio.channels.SocketChannel.write(SocketChannel.java:502) ~[?:1.8.0_60]
>     at kafka.network.BoundedByteBufferSend.writeTo(BoundedByteBufferSend.scala:56) ~[stormjar.jar:?]
>     at kafka.network.Send$class.writeCompletely(Transmission.scala:75) ~[stormjar.jar:?]
>     at kafka.network.BoundedByteBufferSend.writeCompletely(BoundedByteBufferSend.scala:26) ~[stormjar.jar:?]
>     at kafka.network.BlockingChannel.send(BlockingChannel.scala:103) ~[stormjar.jar:?]
>     at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:73) ~[stormjar.jar:?]
>     at kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:72) ~[stormjar.jar:?]
>     at kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SyncProducer.scala:103) ~[stormjar.jar:?]
>     at kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:103) ~[stormjar.jar:?]
>     at kafka.producer.SyncProducer$$anonfun$send$1$$anonfun$apply$mcV$sp$1.apply(SyncProducer.scala:103) ~[stormjar.jar:?]
>     at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33) ~[stormjar.jar:?]
>     at kafka.producer.SyncProducer$$anonfun$send$1.apply$mcV$sp(SyncProducer.scala:102) ~[stormjar.jar:?]
>     at kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:102) ~[stormjar.jar:?]
>     at kafka.producer.SyncProducer$$anonfun$send$1.apply(SyncProducer.scala:102) ~[stormjar.jar:?]
>     at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33) ~[stormjar.jar:?]
>     at kafka.producer.SyncProducer.send(SyncProducer.scala:101) ~[stormjar.jar:?]
>     at kafka.producer.async.YamasKafkaEventHandler.kafka$producer$async$YamasKafkaEventHandler$$send(YamasKafkaEventHandler.scala:481) [stormjar.jar:?]
>     at kafka.producer.async.YamasKafkaEventHandler$$anonfun$dispatchSerializedData$2.apply(YamasKafkaEventHandler.scala:144) [stormjar.jar:?]
>     at kafka.producer.async.YamasKafkaEventHandler$$anonfun$dispatchSerializedData$2.apply(YamasKafkaEventHandler.scala:138) [stormjar.jar:?]
>     at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:772) [stormjar.jar:?]
>     at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) [stormjar.jar:?]
>     at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98) [stormjar.jar:?]
>     at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226) [stormjar.jar:?]
>     at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39) [stormjar.jar:?]
>     at scala.collection.mutable.HashMap.foreach(HashMap.scala:98) [stormjar.jar:?]
>     at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:771) [stormjar.jar:?]
>     at kafka.producer.async.YamasKafkaEventHandler.dispatchSerializedData(YamasKafkaEventHandler.scala:138) [stormjar.jar:?]
>     at kafka.producer.async.YamasKafkaEventHandler.handle(YamasKafkaEventHandler.scala:79) [stormjar.jar:?]
>     at kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:105) [stormjar.jar:?]
>     at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:88) [stormjar.jar:?]
>     at kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:68) [stormjar.jar:?]
>     at scala.collection.immutable.Stream.foreach(Stream.scala:547) [stormjar.jar:?]
>     at kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:67) [stormjar.jar:?]
>     at kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:45) [stormjar.jar:?]
> We tried shifting the controller to a different broker, and that didn't help. We ultimately had to clean up the Kafka cluster to stabilize it.
> Wondering if this is a known issue; if not, we would appreciate it if anyone in the community could provide insights into why the hung controller would bring down the cluster and why deleting the topics would cause the controller to hang.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)