[GitHub] [pulsar-helm-chart] anant-ahuja opened a new issue, #346: Zookeeper exception and Bookie stuck on Init on EKS install

GitBox Fri, 13 Jan 2023 14:21:22 -0800


anant-ahuja opened a new issue, #346:
URL: https://github.com/apache/pulsar-helm-chart/issues/346


   With the 3.0.0 release of the Helm chart, installing Pulsar on an EKS cluster fails with the following errors.
   
   ZooKeeper exception:
   `2023-01-13T22:15:18,697+0000 [SessionTracker] INFO  org.apache.zookeeper.server.ZooKeeperServer - Expiring session 0x1000b086ca6062d, timeout of 30000ms exceeded
   2023-01-13T22:15:18,698+0000 [SessionTracker] INFO  org.apache.zookeeper.server.ZooKeeperServer - Expiring session 0x1000b086ca6062e, timeout of 30000ms exceeded
   2023-01-13T22:15:19,925+0000 [NIOWorkerThread-2] WARN  org.apache.zookeeper.server.NIOServerCnxn - Unexpected exception
   org.apache.zookeeper.server.ServerCnxn$EndOfStreamException: Unable to read additional data from client, it probably closed the socket: address = /192.168.140.224:50404, session = 0x1000b086ca60650
           at org.apache.zookeeper.server.NIOServerCnxn.handleFailedRead(NIOServerCnxn.java:163) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
           at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:326) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
           at org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:522) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
           at org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
           at java.lang.Thread.run(Thread.java:829) ~[?:?]
   2023-01-13T22:15:20,322+0000 [NIOWorkerThread-1] WARN  org.apache.zookeeper.server.NIOServerCnxn - Unexpected exception
   org.apache.zookeeper.server.ServerCnxn$EndOfStreamException: Unable to read additional data from client, it probably closed the socket: address = /192.168.140.223:43204, session = 0x1000b086ca60651
           at org.apache.zookeeper.server.NIOServerCnxn.handleFailedRead(NIOServerCnxn.java:163) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
           at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:326) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
           at org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:522) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
           at org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:154) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
           at java.lang.Thread.run(Thread.java:829) ~[?:?]`
   
   
   The Bookie pods are also stuck in the Init stage. Below are the logs from the Pulsar BookKeeper Verify Cluster ID init container in the Pulsar Bookie pod.
   
   `[1.348s][info ][safepoint    ] Application time: 0.0064771 seconds
   [1.348s][info ][safepoint    ] Entering safepoint region: RevokeBias
   [1.349s][info ][safepoint    ] Leaving safepoint region
   [1.349s][info ][safepoint    ] Total time for which application threads were stopped: 0.0000971 seconds, Stopping threads took: 0.0000107 seconds
   [1.350s][info ][safepoint    ] Application time: 0.0009467 seconds
   [1.350s][info ][safepoint    ] Entering safepoint region: RevokeBias
   [1.350s][info ][safepoint    ] Leaving safepoint region
   [1.350s][info ][safepoint    ] Total time for which application threads were stopped: 0.0000805 seconds, Stopping threads took: 0.0000102 seconds
   2023-01-13T22:18:36,156+0000 [main-SendThread(pulsar-zookeeper:2181)] WARN  org.apache.zookeeper.ClientCnxn - An exception was thrown while closing send thread for session 0x1000b086ca60725.
   org.apache.zookeeper.ClientCnxn$EndOfStreamException: Unable to read additional data from server sessionid 0x1000b086ca60725, likely server has closed socket
           at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:77) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
           at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
           at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1290) ~[org.apache.zookeeper-zookeeper-3.6.3.jar:3.6.3]
   2023-01-13T22:18:36,272+0000 [main-EventThread] INFO  org.apache.zookeeper.ClientCnxn - EventThread shut down for session: 0x1000b086ca60725
   2023-01-13T22:18:36,272+0000 [main] INFO  org.apache.zookeeper.ZooKeeper - Session: 0x1000b086ca60725 closed
   [1.453s][info ][safepoint    ] Application time: 0.1029216 seconds
   [1.453s][info ][safepoint    ] Entering safepoint region: RevokeBias
   [1.453s][info ][safepoint    ] Leaving safepoint region
   [1.453s][info ][safepoint    ] Total time for which application threads were stopped: 0.0001701 seconds, Stopping threads took: 0.0000845 seconds
   Exception in thread "main" com.google.common.util.concurrent.UncheckedExecutionException: org.apache.bookkeeper.bookie.BookieException$MetadataStoreException: Failed to get cluster instance id
           at org.apache.bookkeeper.tools.cli.commands.bookies.InstanceIdCommand.apply(InstanceIdCommand.java:61)
           at org.apache.bookkeeper.bookie.BookieShell$WhatIsInstanceId.runCmd(BookieShell.java:1495)
           at org.apache.bookkeeper.bookie.BookieShell$MyCommand.runCmd(BookieShell.java:238)
           at org.apache.bookkeeper.bookie.BookieShell.run(BookieShell.java:2278)
           at org.apache.bookkeeper.bookie.BookieShell.main(BookieShell.java:2369)
   Caused by: java.util.concurrent.ExecutionException: org.apache.bookkeeper.bookie.BookieException$MetadataStoreException: Failed to get cluster instance id
           at org.apache.bookkeeper.meta.MetadataDrivers.runFunctionWithMetadataBookieDriver(MetadataDrivers.java:355)
           at org.apache.bookkeeper.meta.MetadataDrivers.runFunctionWithRegistrationManager(MetadataDrivers.java:375)
           at org.apache.bookkeeper.tools.cli.commands.bookies.InstanceIdCommand.apply(InstanceIdCommand.java:49)
           ... 4 more
   Caused by: org.apache.bookkeeper.bookie.BookieException$MetadataStoreException: Failed to get cluster instance id
           at org.apache.bookkeeper.discover.ZKRegistrationManager.getClusterInstanceId(ZKRegistrationManager.java:429)
           at org.apache.bookkeeper.tools.cli.commands.bookies.InstanceIdCommand.lambda$apply$0(InstanceIdCommand.java:52)
           at org.apache.bookkeeper.meta.MetadataDrivers.lambda$runFunctionWithRegistrationManager$1(MetadataDrivers.java:375)
           at org.apache.bookkeeper.meta.MetadataDrivers.runFunctionWithMetadataBookieDriver(MetadataDrivers.java:350)
           ... 6 more
   Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for BookKeeper metadata
           at org.apache.bookkeeper.discover.ZKRegistrationManager.getClusterInstanceId(ZKRegistrationManager.java:419)
           ... 9 more
   [1.454s][info ][safepoint    ] Application time: 0.0016331 seconds
   [1.454s][info ][safepoint    ] Entering safepoint region: RevokeBias
   [1.454s][info ][safepoint    ] Leaving safepoint region
   [1.455s][info ][safepoint    ] Total time for which application threads were stopped: 0.0000907 seconds, Stopping threads took: 0.0000087 seconds
   [1.455s][info ][safepoint    ] Application time: 0.0001440 seconds
   [1.455s][info ][safepoint    ] Entering safepoint region: RevokeBias
   [1.455s][info ][safepoint    ] Leaving safepoint region
   [1.455s][info ][safepoint    ] Total time for which application threads were stopped: 0.0000836 seconds, Stopping threads took: 0.0000055 seconds
   [1.455s][info ][safepoint    ] Application time: 0.0001316 seconds
   [1.455s][info ][safepoint    ] Entering safepoint region: RevokeBias
   [1.455s][info ][safepoint    ] Leaving safepoint region
   [1.455s][info ][safepoint    ] Total time for which application threads were stopped: 0.0000824 seconds, Stopping threads took: 0.0000073 seconds
   [1.455s][info ][safepoint    ] Application time: 0.0001430 seconds
   [1.455s][info ][safepoint    ] Entering safepoint region: RevokeBias
   [1.455s][info ][safepoint    ] Leaving safepoint region
   [1.455s][info ][safepoint    ] Total time for which application threads were stopped: 0.0000758 seconds, Stopping threads took: 0.0000091 seconds
   [1.457s][info ][gc,heap,exit ] Heap
   [1.457s][info ][gc,heap,exit ]  garbage-first heap   total 133120K, used 68059K [0x00000000f0000000, 0x0000000100000000)
   [1.457s][info ][gc,heap,exit ]   region size 1024K, 62 young (63488K), 8 survivors (8192K)
   [1.457s][info ][gc,heap,exit ]  Metaspace       used 16284K, capacity 16690K, committed 17024K, reserved 1064960K
   [1.457s][info ][gc,heap,exit ]   class space    used 1907K, capacity 2055K, committed 2176K, reserved 1048576K
   [1.457s][info ][safepoint    ] Application time: 0.0020102 seconds
   [1.457s][info ][safepoint    ] Entering safepoint region: Halt`
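
Most of the init container output above is JVM safepoint and GC logging, which makes the actual failure hard to spot. Below is a minimal Python sketch for triage, assuming the raw log text has been captured with one log record per line (as the container emits it, not as re-wrapped by email). The names `JVM_LOG_LINE`, `strip_jvm_noise`, and `exception_chain` are illustrative only and are not part of the Helm chart or BookKeeper.

```python
import re

# JVM unified-logging lines look like "[1.348s][info ][safepoint    ] ...".
# This pattern is an assumption based on the lines seen in the log above.
JVM_LOG_LINE = re.compile(r"^\s*\[\d+\.\d+s\]\[\w+\s*\]\[[\w,]+\s*\]")


def strip_jvm_noise(log_text):
    """Return the log lines that are not JVM unified-logging output."""
    return [ln for ln in log_text.splitlines() if not JVM_LOG_LINE.match(ln)]


def exception_chain(log_text):
    """Return the 'Exception in thread' and 'Caused by:' lines, in order."""
    chain = []
    for ln in strip_jvm_noise(log_text):
        stripped = ln.strip()
        if stripped.startswith(("Exception in thread", "Caused by:")):
            chain.append(stripped)
    return chain
```

Run against the unwrapped log above, the last entry returned by `exception_chain` would be the `KeeperException$NoNodeException: KeeperErrorCode = NoNode for BookKeeper metadata` line, which is the innermost cause in the reported stack trace.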
   
   
   


