Henry Cai created KAFKA-8089:
--------------------------------
Summary: High level consumer from MirrorMaker is slow to deal with
SSL certification expiration
Key: KAFKA-8089
URL: https://issues.apache.org/jira/browse/KAFKA-8089
Project: Kafka
Issue Type: Bug
Components: clients, consumer
Affects Versions: 2.0.0
Reporter: Henry Cai
We have been using Kafka 2.0's MirrorMaker (which uses the high-level consumer)
to do replication. The topic is SSL-enabled and the certificate expires at a
random time within 12 hours. When the certificate expires, we see many
SSL-related exceptions in the log:
[2019-03-07 18:02:54,128] ERROR [Consumer
clientId=kafkamirror-euw1-use1-m10nkafka03-1,
groupId=kafkamirror-euw1-use1-m10nkafka03] Connection to node 3005 failed
authentication due to: SSL handshake failed
(org.apache.kafka.clients.NetworkClient)
This error will repeat for several hours.
However, even with the SSL error, the preexisting socket connections still
work, so the main fetching activity is not actually affected. The metadata
operations from the client and the heartbeats from the heartbeat thread are
affected, since they might open new socket connections. I think those errors
most likely originate from those side activities.
The situation lasts several hours, until the main fetcher thread tries to
open a new connection (usually due to a consumer rebalance); then the SSL
authentication exception aborts the operation and MirrorMaker exits.
During those several hours, the client cannot get the latest
metadata and the heartbeats also falter (we see rebalancing triggered because
of this).
When NetworkClient.processDisconnection() prints the above ERROR message, can
it also throw the AuthenticationException up to the caller? This would abort
KafkaConsumer.poll() and speed up the certificate recycling (in our case, we
restart MirrorMaker with the new certificate).
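
To illustrate the behavior requested above, here is a minimal sketch of a poll
loop that stops on the first authentication failure instead of logging it and
carrying on. It does not use the Kafka client: AuthFailure is a hypothetical
stand-in for org.apache.kafka.common.errors.AuthenticationException, and the
Runnable stands in for one KafkaConsumer.poll() call, so the example runs
without Kafka on the classpath. The names FailFastPollLoop and
runUntilAuthFailure are made up for this illustration.

```java
class FailFastPollLoop {

    /** Hypothetical stand-in for Kafka's AuthenticationException. */
    static class AuthFailure extends RuntimeException {
        AuthFailure(String message) {
            super(message);
        }
    }

    /**
     * Calls pollStep repeatedly, at most maxPolls times, and stops as soon as
     * one call throws AuthFailure. Returns the number of calls that completed
     * without an authentication failure.
     */
    static int runUntilAuthFailure(Runnable pollStep, int maxPolls) {
        int completed = 0;
        try {
            while (completed < maxPolls) {
                pollStep.run();   // stands in for one consumer.poll(...) call
                completed++;
            }
        } catch (AuthFailure e) {
            // Fail fast: leave the loop so the process can exit and be
            // restarted with a fresh certificate.
            System.err.println("Authentication failed, stopping: " + e.getMessage());
        }
        return completed;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated poll step: the third call fails as if the certificate
        // had expired and a new connection could not be authenticated.
        Runnable pollStep = () -> {
            if (++calls[0] == 3) {
                throw new AuthFailure("SSL handshake failed");
            }
        };
        int completed = runUntilAuthFailure(pollStep, 10);
        System.out.println("Completed polls before stopping: " + completed);
    }
}
```

In a real MirrorMaker deployment the equivalent would be for the exception to
reach the thread calling poll(), so that the process exits promptly and the
supervisor restarts it with the new certificate.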
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)