[ 
https://issues.apache.org/jira/browse/KAFKA-8089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajini Sivaram reassigned KAFKA-8089:
-------------------------------------

    Assignee: Rajini Sivaram

> High level consumer from MirrorMaker is slow to deal with SSL certification 
> expiration
> --------------------------------------------------------------------------------------
>
>                 Key: KAFKA-8089
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8089
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients, consumer
>    Affects Versions: 2.0.0
>            Reporter: Henry Cai
>            Assignee: Rajini Sivaram
>            Priority: Critical
>
> We have been using Kafka 2.0's MirrorMaker (which uses the high-level 
> consumer) for replication.  The topic is SSL-enabled, and the certificate 
> expires at a random time within 12 hours.  When the certificate expires, we 
> see many SSL-related exceptions in the log:
>  
> [2019-03-07 18:02:54,128] ERROR [Consumer 
> clientId=kafkamirror-euw1-use1-m10nkafka03-1, 
> groupId=kafkamirror-euw1-use1-m10nkafka03] Connection to node 3005 failed 
> authentication due to: SSL handshake failed 
> (org.apache.kafka.clients.NetworkClient)
> This error repeats for several hours.
> However, even with the SSL error, the preexisting socket connections still 
> work, so the main fetching activity is not actually affected.  The 
> metadata operations from the client and the heartbeats from the heartbeat 
> thread are affected, since they may open new socket connections.  I think 
> the errors most likely originate from those side activities.
> The situation lasts several hours, until the main fetcher thread tries to 
> open a new connection (usually due to a consumer rebalance); then the SSL 
> authentication exception aborts the operation and MirrorMaker exits.
> During those several hours, the client cannot get the latest metadata and 
> heartbeats also falter (we see rebalancing triggered because of this).
> In NetworkClient.processDisconnection(), when the method prints the above 
> ERROR message, could it also throw the AuthenticationException upward?  That 
> would abort KafkaConsumer.poll() and speed up certificate recycling (in our 
> case, we restart MirrorMaker with the new certificate).
>  
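
The fail-fast behaviour requested above can be sketched from the application
side as follows. This is only an illustration under stated assumptions, not
the Kafka implementation or the eventual fix: `Poller` and `FailFastLoop` are
hypothetical names, and the local `AuthenticationException` class is a
stand-in for org.apache.kafka.common.errors.AuthenticationException so that
the sketch compiles without the Kafka client jar.

```java
// Stand-in for org.apache.kafka.common.errors.AuthenticationException
// (assumption: defined locally for illustration only).
class AuthenticationException extends RuntimeException {
    AuthenticationException(String message) {
        super(message);
    }
}

// Hypothetical abstraction over a single KafkaConsumer.poll() call;
// returns the number of records fetched.
interface Poller {
    int pollOnce();
}

class FailFastLoop {
    /**
     * Polls up to maxPolls times and stops at the first authentication failure.
     * Returns true if every poll completed, false if an
     * AuthenticationException ended the loop early.
     */
    static boolean run(Poller poller, int maxPolls) {
        for (int i = 0; i < maxPolls; i++) {
            try {
                poller.pollOnce();
            } catch (AuthenticationException e) {
                // Surface the failure immediately instead of retrying for hours,
                // so a supervisor can restart the process with a new certificate.
                System.err.println("Authentication failed, stopping: " + e.getMessage());
                return false;
            }
        }
        return true;
    }
}
```

In a real deployment the caller would presumably treat a false return as a
signal to exit and be restarted with the renewed certificate. Whether
KafkaConsumer.poll() actually raises the exception in the scenario described
in this issue is the open question here, so the sketch only shows what the
caller could do if it did.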



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
