GitHub user otmanel31 added a comment to the discussion: [question] pulsar
broker reboot
1) Also, two days ago, we faced another issue (which triggered a shutdown of
our deployment), where the brokers timed out on ZooKeeper calls. Before the
first exception, I caught a lot of INFO logs within a few seconds, like the
ones below, for each topic on all brokers:
- 09:49:31.186 [ForkJoinPool.commonPool-worker-1] INFO org.apache.pulsar.broker.service.AbstractTopic - Disabling publish throttling for persistent://my_tenant/my_ns/action_down-f640fa62-3245-41af-81f5-edf868118a9a
- 09:49:31.186 [ForkJoinPool.commonPool-worker-1] INFO org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://my_tenant/my_ns/action_down-f640fa62-3245-41af-81f5-edf868118a9a] Policies updated successfully
For your information, we manage more than 25000 topics, so we get these 2 lines
for each active topic.
Are any requests sent to ZooKeeper when these 2 log lines appear?
2) Then the first exception thrown is:
09:50:06.167 [pulsar-ordered-OrderedExecutor-1-0] WARN org.apache.pulsar.broker.service.BrokerService - Got exception when reading persistence policy for persistent://my_tenant/my_ns/data_down-5034936e-c9bb-4720-874d-e7e6e5e6d897: null
java.util.concurrent.TimeoutException: null
    at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1784) ~[?:1.8.0_252]
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928) ~[?:1.8.0_252]
    at org.apache.pulsar.zookeeper.ZooKeeperDataCache.get(ZooKeeperDataCache.java:97) ~[org.apache.pulsar-pulsar-zookeeper-utils-2.6.1.jar:2.6.1]
    at org.apache.pulsar.broker.service.BrokerService.lambda$getManagedLedgerConfig$34(BrokerService.java:1074) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
    at org.apache.bookkeeper.mledger.util.SafeRun$2.safeRun(SafeRun.java:49) [org.apache.pulsar-managed-ledger-2.6.1.jar:2.6.1]
    at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_252]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_252]
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]
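
A side note on the "TimeoutException: null" in the trace above: the top frames are CompletableFuture.get called with a timeout, and as far as I can tell the JDK raises that exception without a message, so "null" is just the missing message and not a null value read from ZooKeeper. Here is a minimal sketch that reproduces only that JDK behaviour. The class and method names (TimedGetSketch, timedGetMessage) are made up for illustration and the 50 ms timeout is arbitrary; nothing here is Pulsar code.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration class; not part of Pulsar.
class TimedGetSketch {

    // Waits on a future that is never completed and returns the message of
    // the resulting TimeoutException as text ("null" when there is no message).
    static String timedGetMessage(long timeoutMillis) {
        // Stands in for a lookup that never answers within the timeout.
        CompletableFuture<String> neverCompleted = new CompletableFuture<>();
        try {
            neverCompleted.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return "completed";
        } catch (TimeoutException e) {
            // The timed get() gave up waiting; report the exception message.
            return String.valueOf(e.getMessage());
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("unexpected failure in sketch", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("TimeoutException message: " + timedGetMessage(50));
    }
}
```

If that reading is right, these two warnings only say that the cache lookup did not return before its timeout; they do not say why ZooKeeper was slow to answer.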
09:50:14.820 [prometheus-stats-43-1] ERROR org.apache.pulsar.broker.service.BacklogQuotaManager - Failed to read policies data, will apply the default backlog quota: namespace=my_tenant/my_ns
java.util.concurrent.TimeoutException: null
    at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1784) ~[?:1.8.0_252]
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928) ~[?:1.8.0_252]
    at org.apache.pulsar.zookeeper.ZooKeeperDataCache.get(ZooKeeperDataCache.java:97) ~[org.apache.pulsar-pulsar-zookeeper-utils-2.6.1.jar:2.6.1]
    at org.apache.pulsar.broker.service.BacklogQuotaManager.getBacklogQuota(BacklogQuotaManager.java:64) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
    at org.apache.pulsar.broker.service.persistent.PersistentTopic.getBacklogQuota(PersistentTopic.java:1859) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
    at org.apache.pulsar.broker.stats.prometheus.NamespaceStatsAggregator.getTopicStats(NamespaceStatsAggregator.java:97) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
    at org.apache.pulsar.broker.stats.prometheus.NamespaceStatsAggregator.lambda$null$0(NamespaceStatsAggregator.java:65) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
    at org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap$Section.forEach(ConcurrentOpenHashMap.java:388) ~[org.apache.pulsar-pulsar-common-2.6.1.jar:2.6.1]
    at org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap.forEach(ConcurrentOpenHashMap.java:160) ~[org.apache.pulsar-pulsar-common-2.6.1.jar:2.6.1]
    at org.apache.pulsar.broker.stats.prometheus.NamespaceStatsAggregator.lambda$null$1(NamespaceStatsAggregator.java:64) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
    at org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap$Section.forEach(ConcurrentOpenHashMap.java:388) ~[org.apache.pulsar-pulsar-common-2.6.1.jar:2.6.1]
    at org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap.forEach(ConcurrentOpenHashMap.java:160) ~[org.apache.pulsar-pulsar-common-2.6.1.jar:2.6.1]
    at org.apache.pulsar.broker.stats.prometheus.NamespaceStatsAggregator.lambda$generate$2(NamespaceStatsAggregator.java:63) ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
    at org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap$Section.forEach(ConcurrentOpenHashMap.java:388) ~[org.apache.pulsar-pulsar-common-2.6.1.jar:2.6.1]
    at org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap.forEach(ConcurrentOpenHashMap.java:160) ~[org.apache.pulsar-pulsar-common-2.6.1.jar