[jira] [Created] (FLINK-23839) Unclear severity of Kafka transaction recommit warning in logs

Nico Kruber (Jira) Tue, 17 Aug 2021 08:10:04 -0700

Nico Kruber created FLINK-23839:
-----------------------------------

             Summary: Unclear severity of Kafka transaction recommit warning in 
logs
                 Key: FLINK-23839
                 URL: https://issues.apache.org/jira/browse/FLINK-23839
             Project: Flink
          Issue Type: Sub-task
          Components: Connectors / Kafka
    Affects Versions: 1.13.2, 1.12.5, 1.11.4
            Reporter: Nico Kruber

In a transactional Kafka sink, after recovery, all transactions from the
recovered checkpoint are recommitted even though they may have already been
committed before because this is not part of the checkpoint.

This second commit can lead to a number of WARN entries in the logs coming from
[KafkaCommitter|https://github.com/apache/flink/blob/6c9818323b41a84137c52822d2993df788dbc9bb/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/connector/kafka/sink/KafkaCommitter.java#L66]
or
[FlinkKafkaProducer|https://github.com/apache/flink/blob/6c9818323b41a84137c52822d2993df788dbc9bb/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaProducer.java#L1057].

Examples:
{code}
WARN [org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer] ...
Encountered error org.apache.kafka.common.errors.InvalidTxnStateException: The
producer attempted a transactional operation in an invalid state. while
recovering transaction KafkaTransactionState [transactionalId=...,
producerId=12345, epoch=123]. Presumably this transaction has been already
committed before.

WARN [org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer] ...
Encountered error org.apache.kafka.common.errors.ProducerFencedException:
Producer attempted an operation with an old epoch. Either there is a newer
producer with the same transactionalId, or the producer's transaction has been
expired by the broker. while recovering transaction KafkaTransactionState
[transactionalId=..., producerId=12345, epoch=12345]. Presumably this
transaction has been already committed before
{code}

It sounds to me like the second exception is useful and indicates that the
transaction timeout is too short. The first exception, however, seems
superfluous and rather alerts the user more than it helps. Or what would you do
with it?

Can we instead filter out superfluous exceptions and at least put these onto
DEBUG logs instead?

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Created] (FLINK-23839) Unclear severity of Kafka transaction recommit warning in logs

Reply via email to