rdhabalia opened a new pull request #7964:
URL: https://github.com/apache/pulsar/pull/7964
### Motivation
We have seen frequent instances where topic unloading stuck and we have to
restart broker. Unloading topic gives 500 with below broker log
```
01:26:19.691 [pulsar-web-35-6] INFO
org.apache.pulsar.broker.admin.impl.PersistentTopicsBase - [admin] Unloading
topic persistent://tenant/cluster/ns/topic-partition-15
01:26:19.692 [pulsar-web-35-6] WARN
org.apache.pulsar.broker.service.persistent.PersistentTopic -
[persistent://tenant/cluster/ns/topic-partition-15] Topic is already being
closed or deleted
:
01:26:19.692 [pulsar-web-35-6] ERROR
org.apache.pulsar.broker.admin.impl.PersistentTopicsBase - [admin] Failed to
unload topic persistent://tenant/cluster/ns/topic-partition-15,
org.apache.pulsar.broker.service.BrokerServiceException$TopicFencedException:
Topic is already fenced
java.util.concurrent.ExecutionException:
org.apache.pulsar.broker.service.BrokerServiceException$TopicFencedException:
Topic is already fenced
at
java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395)
~[?:?]
at
java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999) ~[?:?]
```
It happens when first topic-unload stuck and unloading never completes.
After investigating broker heap-dump, we found out that
[PersistentDispatcherMultipleConsumers invoked consumer disconnect but it
couldn't
remove](https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentDispatcherMultipleConsumers.java#L183)
from the cached
[consumerSet](https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentDispatcherMultipleConsumers.java#L181).
We can also validate it in log where we can see consumer disconnect log but
can't see log after dispatcher removes the consumer from cache.
```
01:24:20.024 [pulsar-web-35-31] INFO
org.apache.pulsar.broker.service.Consumer - Disconnecting consumer:
Consumer{subscription=PersistentSubscription{topic=persistent://
tenant/bf1/ns/topic-partition-15, name=sub1}, consumerId=15,
consumerName=a36e0, address=/1.1.1.1:42808}
01:24:20.024 [pulsar-web-35-31] INFO
org.apache.pulsar.broker.service.Consumer - Disconnecting consumer:
Consumer{subscription=PersistentSubscription{topic=persistent://
tenant/bf1/ns/topic-partition-15, name=sub2}...
```
Therefore, we are adding info log for future investigation if Dispatcher
doesn't find consumer in set which should never happen.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]