[ 
https://issues.apache.org/jira/browse/KAFKA-8188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16811415#comment-16811415
 ] 

Kartik commented on KAFKA-8188:
-------------------------------

[~candicewan]

I see the JAAS configuration error below: the file has no 'Client' section.

2019-04-03 08:25:19.611 
[zk-session-expiry-handler0-SendThread(host1:36100)]WARN 
org.apache.zookeeper.ClientCnxn - SASL configuration failed: 
javax.security.auth.login.LoginException: No JAAS configuration section named 
'Client' was found in specified JAAS configuration file: 
'file:/app0/common/config/ldap-auth.config'. Will 
continue connection to Zookeeper server without SASL authentication, if 
Zookeeper server allows it.
2019-04-03 08:25:19.611 [zk-session-expiry-handler0-SendThread(host1:36100)] 
INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server 
host1/169.30.47.206:36100
2019-04-03 08:25:19.611 [zk-session-expiry-handler0-EventThread] ERROR 
kafka.zookeeper.ZooKeeperClient - [ZooKeeperClient] Auth failed.

Because of this, the broker failed to connect to ZK. Since the JAAS config 
file has no 'Client' section, the broker tried connecting to ZK without SASL 
auth, but ZK refused the connection.
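
To confirm which sections the JAAS file actually defines, something like the 
sketch below may help. It is only a rough check, not anything from the Kafka 
code base: the class name JaasSectionCheck and the helper hasSection are made 
up for illustration, and it assumes the file is a standard file-based JAAS 
configuration. 'Client' is the section the ZooKeeper client looks up by 
default, and 'KafkaServer' is the section the broker uses for its own 
listeners.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.NoSuchAlgorithmException;
import java.security.URIParameter;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.Configuration;

// Hypothetical helper class, for illustration only.
public class JaasSectionCheck {

    /** Returns true if the JAAS file defines at least one login module under the given section. */
    static boolean hasSection(Path jaasFile, String section) throws NoSuchAlgorithmException {
        // "JavaLoginConfig" is the standard file-based JAAS configuration type.
        Configuration config = Configuration.getInstance(
                "JavaLoginConfig", new URIParameter(jaasFile.toUri()));
        // getAppConfigurationEntry returns null when the section is absent.
        AppConfigurationEntry[] entries = config.getAppConfigurationEntry(section);
        return entries != null && entries.length > 0;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: JaasSectionCheck <path-to-jaas-file>");
            return;
        }
        Path jaasFile = Paths.get(args[0]);
        // Report the two sections a SASL-enabled broker would typically need.
        for (String section : new String[] {"KafkaServer", "Client"}) {
            String state = hasSection(jaasFile, section) ? "present" : "MISSING";
            System.out.println(section + ": " + state);
        }
    }
}
```

Running it against /app0/common/config/ldap-auth.config should show whether 
'Client' is missing. If the ZK ensemble does not require SASL at all, an 
alternative worth considering is starting the broker with 
-Dzookeeper.sasl.client=false so the ZooKeeper client does not attempt SASL.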

> Zookeeper Connection Issue Take Down the Whole Kafka Cluster
> ------------------------------------------------------------
>
>                 Key: KAFKA-8188
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8188
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 2.1.1
>            Reporter: Candice Wan
>            Priority: Critical
>         Attachments: thread_dump.log
>
>
> We recently upgraded to 2.1.1 and saw below zookeeper connection issues which 
> took down the whole cluster. We've got 3 nodes in the cluster, 2 of which 
> threw below exceptions at the same second.
> 2019-04-03 08:25:19.603 [main-SendThread(host2:36100)] WARN 
> org.apache.zookeeper.ClientCnxn - Unable to reconnect to ZooKeeper service, 
> session 0x10071ff9baf0001 has expired
>  2019-04-03 08:25:19.603 [main-SendThread(host2:36100)] INFO 
> org.apache.zookeeper.ClientCnxn - Unable to reconnect to ZooKeeper service, 
> session 0x10071ff9baf0001 has expired, closing socket connection
>  2019-04-03 08:25:19.605 [main-EventThread] INFO 
> org.apache.zookeeper.ClientCnxn - EventThread shut down for session: 
> 0x10071ff9baf0001
>  2019-04-03 08:25:19.605 [zk-session-expiry-handler0] INFO 
> kafka.zookeeper.ZooKeeperClient - [ZooKeeperClient] Session expired.
>  2019-04-03 08:25:19.609 [zk-session-expiry-handler0] INFO 
> kafka.zookeeper.ZooKeeperClient - [ZooKeeperClient] Initializing a new 
> session to host1:36100,host2:36100,host3:36100.
>  2019-04-03 08:25:19.610 [zk-session-expiry-handler0] INFO 
> org.apache.zookeeper.ZooKeeper - Initiating client connection, 
> connectString=host1:36100,host2:36100,host3:36100 sessionTimeout=6000 
> watcher=kafka.zookeeper.ZooKeeperClient$ZooKeeperClientWatcher$@12f8b1d8
>  2019-04-03 08:25:19.610 [zk-session-expiry-handler0] INFO 
> o.apache.zookeeper.ClientCnxnSocket - jute.maxbuffer value is 4194304 Bytes
>  2019-04-03 08:25:19.611 [zk-session-expiry-handler0-SendThread(host1:36100)] 
> WARN org.apache.zookeeper.ClientCnxn - SASL configuration failed: 
> javax.security.auth.login.LoginException: No JAAS configuration section named 
> 'Client' was found in specified JAAS configuration file: 
> 'file:/app0/common/config/ldap-auth.config'. Will continue connection to 
> Zookeeper server without SASL authentication, if Zookeeper server allows it.
>  2019-04-03 08:25:19.611 [zk-session-expiry-handler0-SendThread(host1:36100)] 
> INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server 
> host1/169.30.47.206:36100
>  2019-04-03 08:25:19.611 [zk-session-expiry-handler0-EventThread] ERROR 
> kafka.zookeeper.ZooKeeperClient - [ZooKeeperClient] Auth failed.
>  2019-04-03 08:25:19.611 [zk-session-expiry-handler0-SendThread(host1:36100)] 
> INFO org.apache.zookeeper.ClientCnxn - Socket connection established, 
> initiating session, client: /169.20.222.18:56876, server: 
> host1/169.30.47.206:36100
>  2019-04-03 08:25:19.612 [controller-event-thread] INFO 
> k.controller.PartitionStateMachine - [PartitionStateMachine controllerId=3] 
> Stopped partition state machine
>  2019-04-03 08:25:19.613 [controller-event-thread] INFO 
> kafka.controller.ReplicaStateMachine - [ReplicaStateMachine controllerId=3] 
> Stopped replica state machine
>  2019-04-03 08:25:19.614 [controller-event-thread] INFO 
> kafka.controller.KafkaController - [Controller id=3] Resigned
>  2019-04-03 08:25:19.615 [controller-event-thread] INFO 
> kafka.zk.KafkaZkClient - Creating /brokers/ids/3 (is it secure? false)
>  2019-04-03 08:25:19.628 [zk-session-expiry-handler0-SendThread(host1:36100)] 
> INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on 
> server host1/169.30.47.206:36100, sessionid = 0x1007f4d2b810000, negotiated 
> timeout = 6000
>  2019-04-03 08:25:19.631 [/config/changes-event-process-thread] INFO 
> k.c.ZkNodeChangeNotificationListener - Processing notification(s) to 
> /config/changes
>  2019-04-03 08:25:19.637 [controller-event-thread] ERROR 
> k.zk.KafkaZkClient$CheckedEphemeral - Error while creating ephemeral at 
> /brokers/ids/3, node already exists and owner '72182936680464385' does not 
> match current session '72197563457011712'
>  2019-04-03 08:25:19.637 [controller-event-thread] INFO 
> kafka.zk.KafkaZkClient - Result of znode creation at /brokers/ids/3 is: 
> NODEEXISTS
>  2019-04-03 08:25:19.644 [controller-event-thread] ERROR 
> k.c.ControllerEventManager$ControllerEventThread - [ControllerEventThread 
> controllerId=3] Error processing event RegisterBrokerAndReelect
>  org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = 
> NodeExists
>  at org.apache.zookeeper.KeeperException.create(KeeperException.java:126)
>  at kafka.zk.KafkaZkClient.checkedEphemeralCreate(KafkaZkClient.scala:1631)
>  at kafka.zk.KafkaZkClient.registerBroker(KafkaZkClient.scala:87)
>  at 
> kafka.controller.KafkaController$RegisterBrokerAndReelect$.process(KafkaController.scala:1516)
>  at 
> kafka.controller.ControllerEventManager$ControllerEventThread.$anonfun$doWork$1(ControllerEventManager.scala:89)
>  at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
>  at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:31)
>  at 
> kafka.controller.ControllerEventManager$ControllerEventThread.doWork(ControllerEventManager.scala:89)
>  at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82)
>  
> Thread dump attached
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
