GBM-tamerm opened a new issue, #3292:
URL: https://github.com/apache/bookkeeper/issues/3292
**BUG REPORT**
***Describe the bug***
When we stop the ZK leader node, a new election starts and ZK clients get
disconnected. Any bookie node with auto-recovery running in the background
then shuts down with the exception below:
2022-05-24T02:13:33,263-0400 [AuditorElector-10.119.33.232:3181] ERROR org.apache.bookkeeper.replication.AuditorElector - Exception while performing auditor election
java.io.IOException: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /ledgers/underreplication/auditorelection/V_0000000079
    at org.apache.bookkeeper.meta.ZkLedgerAuditorManager.createMyVote(ZkLedgerAuditorManager.java:204) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.meta.ZkLedgerAuditorManager.tryToBecomeAuditor(ZkLedgerAuditorManager.java:98) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
    at org.apache.bookkeeper.replication.AuditorElector$3.run(AuditorElector.java:184) [org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
2022-05-24T02:13:33,362-0400 [AutoRecoveryDeathWatcher-3181] INFO org.apache.bookkeeper.replication.AutoRecoveryMain - AutoRecoveryDeathWatcher noticed the AutoRecovery is not running any more,exiting the watch loop!
2022-05-24T02:13:33,363-0400 [AutoRecoveryDeathWatcher-3181] ERROR org.apache.bookkeeper.common.component.ComponentStarter - Triggered exceptionHandler of Component: bookie-server because of Exception in Thread: Thread[AutoRecoveryDeathWatcher-3181,5,main]
java.lang.RuntimeException: AutoRecovery is not running any more
    at org.apache.bookkeeper.replication.AutoRecoveryMain$AutoRecoveryDeathWatcher.run(AutoRecoveryMain.java:237) ~[org.apache.bookkeeper-bookkeeper-server-4.14.4.jar:4.14.4]
2022-05-24T02:13:33,364-0400 [component-shutdown-thread] INFO org.apache.bookkeeper.common.component.ComponentStarter - Closing component bookie-server in shutdown hook.
2022-05-24T02:13:34,072-0400 [component-shutdown-thread] INFO org.apache.bookkeeper.replication.ReplicationWorker - Shutting down replication worker
2022-05-24T02:13:34,072-0400 [component-shutdown-thread] INFO org.apache.bookkeeper.replication.ReplicationWorker - Shutting down ReplicationWorker
2022-05-24T02:13:34,073-0400 [ReplicationWorker] INFO org.apache.bookkeeper.replication.ReplicationWorker - ReplicationWorker exited loop!
2022-05-24T02:13:34,237-0400 [main-EventThread] INFO org.apache.zookeeper.ClientCnxn - EventThread shut down for session: 0x500000042f40000
2022-05-24T02:13:34,238-0400 [component-shutdown-thread] INFO org.apache.bookkeeper.proto.BookieServer - Shutting down BookieServer
2022-05-24T02:13:34,238-0400 [component-shutdown-thread] INFO org.apache.bookkeeper.proto.BookieNettyServer - Shutting down BookieNettyServer
***To Reproduce***
Steps to reproduce the behavior:
1. Stop the ZK leader node.
2. Stop one BK node (e.g. bookie1) to trigger auto-recovery.
3. The other running BKs that have auto-recovery enabled shut down with the
above error.
***Expected behavior***
The other running BKs should not shut down.
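
To illustrate the expected behavior, here is a minimal sketch of retrying an operation that fails with a transient connection loss instead of letting the failure propagate. This is not BookKeeper's code and not a proposed patch: `RetryOnConnectionLoss`, `TransientConnectionLoss`, and `callWithRetry` are hypothetical names, and the exception class is only a stand-in for ZooKeeper's `KeeperException.ConnectionLossException`. Whether a retry is the right fix for the auditor election is an assumption.

```java
import java.util.concurrent.Callable;

class RetryOnConnectionLoss {

    /** Hypothetical stand-in for KeeperException.ConnectionLossException. */
    static class TransientConnectionLoss extends Exception {
        TransientConnectionLoss(String message) {
            super(message);
        }
    }

    /**
     * Runs op, retrying up to maxAttempts times when it fails with a
     * transient connection loss. Any other exception propagates immediately.
     */
    static <T> T callWithRetry(Callable<T> op, int maxAttempts, long backoffMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        TransientConnectionLoss last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (TransientConnectionLoss e) {
                last = e;
                if (attempt < maxAttempts) {
                    // Linear backoff gives the ZK ensemble time to elect a new leader.
                    Thread.sleep(backoffMillis * attempt);
                }
            }
        }
        // All attempts failed with a connection loss; surface the last one.
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated operation: fails twice with a connection loss, then succeeds.
        String result = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new TransientConnectionLoss("ConnectionLoss (simulated)");
            }
            return "vote-created";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

With handling along these lines, a short ZK leader election would delay the auditor election rather than take down the whole bookie process.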
***Additional context***
OS: Ubuntu 18.04
Java 8
Pulsar running as systemd service
6 brokers
6 bookies
5 ZK.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.