Reading from and writing to Kafka with Flink

宁吉浩 Mon, 21 Sep 2020 18:34:42 -0700

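Hi all,
I have recently been using Flink to write to Kafka and found that checkpoints fail very often: roughly 50% of them fail.
The checkpoint interval is between 30 and 60 s. There is no large state; essentially the only state being maintained is the offsets.
I would appreciate help working out what is causing this, and whether the checkpoint failure rate can be reduced.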


The main errors are as follows:
Caused by: org.apache.kafka.common.errors.ProducerFencedException: Producer 
attempted an operation with an old epoch. Either there is a newer producer with 
the same transactionalId, or the producer's transaction has been expired by the 
broker.


Caused by: org.apache.flink.util.FlinkRuntimeException: Committing one of 
transactions failed, logging first encountered failure

The Kafka log shows:
org.apache.kafka.common.errors.ProducerFencedException: Producer's epoch is no 
longer valid. There is probably another producer with a newer epoch. 16744 
(request epoch), 16745 (server epoch)
org.apache.kafka.common.errors.ProducerFencedException: Producer's epoch is no 
longer valid. There is probably another producer with a newer epoch. 16744 
(request epoch), 16745 (server epoch)
org.apache.kafka.common.errors.ProducerFencedException: Producer's epoch is no 
longer valid. There is probably another producer with a newer epoch. 16744 
(request epoch), 16745 (server epoch)
org.apache.kafka.common.errors.ProducerFencedException: Producer's epoch is no 
longer valid. There is probably another producer with a newer epoch. 16744 
(request epoch), 16745 (server epoch)

No anomalies have been found on HDFS so far.
