otmanel31 commented on issue #12160:
URL: https://github.com/apache/pulsar/issues/12160#issuecomment-933259609


   1) Two days ago we also faced another issue (which triggered a shutdown 
of our deployment): the brokers timed out on ZooKeeper calls. Before the first 
exception, I caught a lot of INFO logs within a few seconds, like the ones 
below, for each topic on all brokers:
   - 09:49:31.186 [ForkJoinPool.commonPool-worker-1] INFO  
org.apache.pulsar.broker.service.AbstractTopic - Disabling publish throttling 
for 
persistent://my_tenant/my_ns/action_down-f640fa62-3245-41af-81f5-edf868118a9a
   09:49:31.186 [ForkJoinPool.commonPool-worker-1] INFO  
org.apache.pulsar.broker.service.persistent.PersistentTopic - 
[persistent://my_tenant/my_ns/action_down-f640fa62-3245-41af-81f5-edf868118a9a] 
Policies updated successfully
   For your information, we manage more than 25000 topics, so we get these 2 
lines for each active topic.
   Is any request sent to ZooKeeper when these 2 log lines appear?
   
   2) Then the first exception thrown is:
   
   09:50:06.167 [pulsar-ordered-OrderedExecutor-1-0] WARN  
org.apache.pulsar.broker.service.BrokerService - Got exception when reading 
persistence policy for 
persistent://my_tenant/my_ns/data_down-5034936e-c9bb-4720-874d-e7e6e5e6d897: 
null
   java.util.concurrent.TimeoutException: null
   at 
java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1784) 
~[?:1.8.0_252]
   at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928) 
~[?:1.8.0_252]
   at 
org.apache.pulsar.zookeeper.ZooKeeperDataCache.get(ZooKeeperDataCache.java:97) 
~[org.apache.pulsar-pulsar-zookeeper-utils-2.6.1.jar:2.6.1]
   at 
org.apache.pulsar.broker.service.BrokerService.lambda$getManagedLedgerConfig$34(BrokerService.java:1074)
 ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
   at org.apache.bookkeeper.mledger.util.SafeRun$2.safeRun(SafeRun.java:49) 
[org.apache.pulsar-managed-ledger-2.6.1.jar:2.6.1]
   at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) 
[org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
   at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[?:1.8.0_252]
   at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[?:1.8.0_252]
   at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
 [io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
   at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]
   09:50:14.820 [prometheus-stats-43-1] ERROR 
org.apache.pulsar.broker.service.BacklogQuotaManager - Failed to read policies 
data, will apply the default backlog quota: namespace=my_tenant/my_ns
   java.util.concurrent.TimeoutException: null
   at 
java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1784) 
~[?:1.8.0_252]
   at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928) 
~[?:1.8.0_252]
   at 
org.apache.pulsar.zookeeper.ZooKeeperDataCache.get(ZooKeeperDataCache.java:97) 
~[org.apache.pulsar-pulsar-zookeeper-utils-2.6.1.jar:2.6.1]
   at 
org.apache.pulsar.broker.service.BacklogQuotaManager.getBacklogQuota(BacklogQuotaManager.java:64)
 ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
   at 
org.apache.pulsar.broker.service.persistent.PersistentTopic.getBacklogQuota(PersistentTopic.java:1859)
 ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
   at 
org.apache.pulsar.broker.stats.prometheus.NamespaceStatsAggregator.getTopicStats(NamespaceStatsAggregator.java:97)
 ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
   at 
org.apache.pulsar.broker.stats.prometheus.NamespaceStatsAggregator.lambda$null$0(NamespaceStatsAggregator.java:65)
 ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
   at 
org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap$Section.forEach(ConcurrentOpenHashMap.java:388)
 ~[org.apache.pulsar-pulsar-common-2.6.1.jar:2.6.1]
   at 
org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap.forEach(ConcurrentOpenHashMap.java:160)
 ~[org.apache.pulsar-pulsar-common-2.6.1.jar:2.6.1]
   at 
org.apache.pulsar.broker.stats.prometheus.NamespaceStatsAggregator.lambda$null$1(NamespaceStatsAggregator.java:64)
 ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
   at 
org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap$Section.forEach(ConcurrentOpenHashMap.java:388)
 ~[org.apache.pulsar-pulsar-common-2.6.1.jar:2.6.1]
   at 
org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap.forEach(ConcurrentOpenHashMap.java:160)
 ~[org.apache.pulsar-pulsar-common-2.6.1.jar:2.6.1]
   at 
org.apache.pulsar.broker.stats.prometheus.NamespaceStatsAggregator.lambda$generate$2(NamespaceStatsAggregator.java:63)
 ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
   at 
org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap$Section.forEach(ConcurrentOpenHashMap.java:388)
 ~[org.apache.pulsar-pulsar-common-2.6.1.jar:2.6.1]
   at 
org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap.forEach(ConcurrentOpenHashMap.java:160)
 ~[org.apache.pulsar-pulsar-common-2.6.1.jar:2.6.1]
   at 
org.apache.pulsar.broker.stats.prometheus.NamespaceStatsAggregator.generate(NamespaceStatsAggregator.java:60)
 ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
   at 
org.apache.pulsar.broker.stats.prometheus.PrometheusMetricsGenerator.generate(PrometheusMetricsGenerator.java:85)
 ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
   at 
org.apache.pulsar.broker.stats.prometheus.PrometheusMetricsServlet.lambda$doGet$0(PrometheusMetricsServlet.java:70)
 ~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
   at org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun(SafeRun.java:32) 
[org.apache.pulsar-managed-ledger-2.6.1.jar:2.6.1]
   at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) 
[org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[?:1.8.0_252]
   at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_252]
   at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
 [?:1.8.0_252]
   at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
 [?:1.8.0_252]
   at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[?:1.8.0_252]
   at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[?:1.8.0_252]
   at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
 [io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
   at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]
   
   The exceptions above were thrown only on my broker-0 (we have 4 running 
brokers and 4 ZooKeeper nodes).
   Then, as my broker-0 was down, it seems load balancing did not work 
correctly and did not dispatch the load to the other brokers.
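
   As a side note on the `TimeoutException: null` lines in the traces: both traces pass through `CompletableFuture.get` with a timeout, and the JDK throws `TimeoutException` without a message when that wait expires, which would explain the `null`. The second log entry also says the default backlog quota is applied after the failure. Below is a minimal, standalone sketch of that pattern. It is not Pulsar code: the class name, `readPolicyOrDefault`, `DEFAULT_POLICY`, and the timeout values are all hypothetical and only illustrate a timed wait with a fallback.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimedGetSketch {
    // Hypothetical fallback value, standing in for a default policy.
    static final String DEFAULT_POLICY = "default-policy";

    // Waits up to timeoutMs for the pending read; falls back to the default
    // when the wait expires or the read fails.
    static String readPolicyOrDefault(CompletableFuture<String> pending, long timeoutMs) {
        try {
            return pending.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // The JDK creates this exception without a message,
            // so getMessage() returns null here.
            System.out.println("Timed out, message=" + e.getMessage());
            return DEFAULT_POLICY;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return DEFAULT_POLICY;
        } catch (ExecutionException e) {
            return DEFAULT_POLICY;
        }
    }

    public static void main(String[] args) {
        // A future that never completes stands in for a slow backend call.
        CompletableFuture<String> slowRead = new CompletableFuture<String>();
        System.out.println(readPolicyOrDefault(slowRead, 50));

        // A completed future returns its value without waiting.
        CompletableFuture<String> fastRead = CompletableFuture.completedFuture("ns-policy");
        System.out.println(readPolicyOrDefault(fastRead, 50));
    }
}
```

   In this sketch every caller blocks its thread for the full timeout while the backend is slow, so many such reads in a short window can keep worker threads busy. Whether that is what happened on broker-0 is only a guess from the traces.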
   
   

