Hi Gyula,

I'm not aware of any recent issues with the Kafka Producer. However, there was one with the Kafka Consumer that prevented proper cancellation (https://issues.apache.org/jira/browse/FLINK-5048).
Which version of Flink and which Kafka Producer were you using?

Cheers,
Till

On Tue, Nov 22, 2016 at 10:03 AM, Gyula Fóra <gyf...@apache.org> wrote:

> Hi,
>
> Has anyone ever experienced the Kafka producer getting stuck in cancelling?
>
> I am aware that there were problems with the Kafka consumer before but I
> haven't seen this one yet. It happened simultaneously to 3 of my jobs last
> night, they were stuck from about 8 pm to 8 am (not exact times but you get
> the length.).
>
> The logs don't seem to be very helpful on the JobManager, they just show
> that all tasks start cancelling and then go cancelled except for one Kafka
> sink task. That goes into cancelling but only gets cancelled 12 hours
> later. On one of the task managers I have found this though:
>
> 2016-11-21 20:22:52,220 INFO  org.apache.flink.yarn.YarnTaskManager - Un-registering task and sending final execution state CANCELED to JobManager for task Execute EventProcessors (f030e71787a6dbd7a543e9745c42289d)
>
> 2016-11-22 08:49:35,181 WARN  org.apache.kafka.common.network.Selector - Error in I/O with kafka17.sto.midasplayer.com/172.25.82.212
> java.io.EOFException
>     at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:62)
>     at org.apache.kafka.common.network.Selector.poll(Selector.java:248)
>     at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:192)
>     at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:191)
>     at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:135)
>     at java.lang.Thread.run(Thread.java:745)
> 2016-11-22 08:49:35,183 INFO  org.apache.flink.runtime.taskmanager.Task - Sink: Kafka output (2/8) switched to CANCELED
>
> There might have been some network/kafka issue that caused 3 jobs to get
> stuck at the same time but I don't know what actually happened.
>
> Any ideas?
> Gyula