GitHub user Radiancebobo edited a discussion: Issues with Topic-Level Policies 
after Cluster Restart in Pulsar 3.0.5

We are experiencing an issue with Pulsar 3.0.5 in our environment and would 
like to seek your advice.
# Environment Details
- Pulsar Version: 3.0.5
- Cluster Setup: 3 Bookies, 3 Brokers
- System Topic Enabled: Yes
- Topic-Level Policies Enabled: Yes
- Reason for Using Topic-Level Policies: Different replication requirements for topics under the same namespace.
## Bookie Configuration:
```yaml
journalSyncData: "true"
journalWriteData: "true"
```
## Broker Configuration:
```yaml
allowAutoTopicCreation: "true"
brokerDeleteInactiveTopicsEnabled: "false"
defaultNumPartitions: "3"
defaultRetentionSizeInMB: "103424"
defaultRetentionTimeInMinutes: "4320"
managedLedgerDefaultAckQuorum: "2"
managedLedgerDefaultEnsembleSize: "2"
managedLedgerDefaultWriteQuorum: "2"
managedLedgerMaxEntriesPerLedger: "50000"
managedLedgerMaxLedgerRolloverTimeMinutes: "240"
managedLedgerMinLedgerRolloverTimeMinutes: "10"
systemTopicEnabled: "true"
topicLevelPoliciesEnabled: "true"
```
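
For readers comparing the three `managedLedgerDefault*` values above, here is a minimal sketch of the basic BookKeeper quorum rule (ensemble size >= write quorum >= ack quorum), under which an entry is acknowledged once `ack quorum` bookies have confirmed it. The `Quorum` class and its method names are hypothetical and exist only for this illustration. The sketch models the quorum arithmetic only and is not a diagnosis of the errors reported below.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quorum:
    """Hypothetical helper mirroring the three broker settings above."""
    ensemble: int      # managedLedgerDefaultEnsembleSize
    write_quorum: int  # managedLedgerDefaultWriteQuorum
    ack_quorum: int    # managedLedgerDefaultAckQuorum

    def validate(self) -> None:
        # BookKeeper requires ensemble >= write quorum >= ack quorum.
        if not (self.ensemble >= self.write_quorum >= self.ack_quorum >= 1):
            raise ValueError("expected ensemble >= write_quorum >= ack_quorum >= 1")

    def guaranteed_copies(self) -> int:
        # An acknowledged entry is stored on at least ack_quorum bookies.
        return self.ack_quorum

    def tolerated_copy_losses(self) -> int:
        # Copies that can become unreadable while one readable copy remains.
        return self.ack_quorum - 1


current = Quorum(ensemble=2, write_quorum=2, ack_quorum=2)
current.validate()
print(current.guaranteed_copies(), current.tolerated_copy_losses())
```

With the values from our configuration, each acknowledged entry should exist on two bookies, so a read is expected to survive one unreadable copy but not two.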


# Problem Description
After a cluster restart (including a restart caused by a power outage), some 
topics occasionally fail with the errors below, and clients can no longer 
produce to or consume from those topics:
## Error 1: BrokerService Exception
```text
2025-10-31T14:08:29,658+0000 [pulsar-io-5-8] ERROR org.apache.pulsar.broker.service.BrokerService - Topic creation encountered an exception by initialize topic policies service. topic_name=persistent://10001001/default/log-partition-4 error_message=The subscription multiTopicsReader-f5fb22e226 of the topic persistent://10001001/default/__change_events-partition-0 gets the last message id was failed
{"errorMsg":"Failed to read last entry of the compacted Ledger Error while reading ledger","reqId":4227693217171430891, "remote":"pulsar-broker-0.pulsar-broker.pulsar.svc.cluster.local/22.25.102.149:6650", "local":"/22.25.102.149:59422"}
org.apache.pulsar.client.api.PulsarClientException$BrokerMetadataException: The subscription multiTopicsReader-f5fb22e226 of the topic persistent://10001001/default/__change_events-partition-0 gets the last message id was failed
{"errorMsg":"Failed to read last entry of the compacted Ledger Error while reading ledger","reqId":4227693217171430891, "remote":"pulsar-broker-0.pulsar-broker.pulsar.svc.cluster.local/22.25.102.149:6650", "local":"/22.25.102.149:59422"}
        at org.apache.pulsar.client.api.PulsarClientException.wrap(PulsarClientException.java:993) ~[org.apache.pulsar-pulsar-client-api-v3.0.5-v1.0.1.jar:v3.0.5-v1.0.1]
        at org.apache.pulsar.client.impl.ConsumerImpl.lambda$internalGetLastMessageIdAsync$64(ConsumerImpl.java:2566) ~[org.apache.pulsar-pulsar-client-original-v3.0.5-v1.0.1.jar:v3.0.5-v1.0.1]
        at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:990) ~[?:?]
        at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:974) ~[?:?]
        at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510) ~[?:?]
        at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162) ~[?:?]
        at org.apache.pulsar.client.impl.ClientCnx.handleError(ClientCnx.java:792) ~[org.apache.pulsar-pulsar-client-original-v3.0.5-v1.0.1.jar:v3.0.5-v1.0.1]
        at org.apache.pulsar.common.protocol.PulsarDecoder.channelRead(PulsarDecoder.java:192) ~[org.apache.pulsar-pulsar-common-v3.0.5-v1.0.1.jar:v3.0.5-v1.0.1]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[io.netty-netty-transport-4.1.115.Final.jar:4.1.115.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[io.netty-netty-transport-4.1.115.Final.jar:4.1.115.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[io.netty-netty-transport-4.1.115.Final.jar:4.1.115.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:346) ~[io.netty-netty-codec-4.1.115.Final.jar:4.1.115.Final]
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:318) ~[io.netty-netty-codec-4.1.115.Final.jar:4.1.115.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444) ~[io.netty-netty-transport-4.1.115.Final.jar:4.1.115.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[io.netty-netty-transport-4.1.115.Final.jar:4.1.115.Final]
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412) ~[io.netty-netty-transport-4.1.115.Final.jar:4.1.115.Final]
        at io.netty.handler.flush.FlushConsolidationHandler.channelRead(FlushConsolidationHandler.java:152) ~[io.netty-netty-handler-4.1.115.Final.jar:4.1.115.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442) ~[io.netty-netty-transport-4.1.115.Final.jar:4.1.115.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420) ~[io.netty-netty-transport-4.1.115.Final.jar:4.1.115.Final]
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:868) ~[io.netty-netty-transport-4.1.115.Final.jar:4.1.115.Final]
        at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:799) ~[io.netty-netty-transport-classes-epoll-4.1.115.Final.jar:4.1.115.Final]
        at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:501) ~[io.netty-netty-transport-classes-epoll-4.1.115.Final.jar:4.1.115.Final]
        at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:399) ~[io.netty-netty-transport-classes-epoll-4.1.115.Final.jar:4.1.115.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[io.netty-netty-common-4.1.115.Final.jar:4.1.115.Final]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[io.netty-netty-common-4.1.115.Final.jar:4.1.115.Final]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[io.netty-netty-common-4.1.115.Final.jar:4.1.115.Final]
        at java.lang.Thread.run(Thread.java:840) ~[?:?]
```
## Error 2: BookKeeper Read Failures
```text
2025-10-31T14:08:30,971+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.proto.PerChannelBookieClient - Read for failed on bookie pulsar-bookie-2.pulsar-bookie.pulsar.svc.cluster.local:3181 code EIO
2025-10-31T14:08:30,971+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] INFO  org.apache.bookkeeper.client.PendingReadOp - Error: Error while reading ledger while reading L674 E0 from bookie: pulsar-bookie-2.pulsar-bookie.pulsar.svc.cluster.local:3181
2025-10-31T14:08:30,971+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] ERROR org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L674 E0-E0, Sent to [pulsar-bookie-2.pulsar-bookie.pulsar.svc.cluster.local:3181, pulsar-bookie-1.pulsar-bookie.pulsar.svc.cluster.local:3181], Heard from [] : bitset = {}, Error = 'Error while reading ledger'. First unread entry is (-1, rc = null)
2025-10-31T14:08:30,971+0000 [BookKeeperClientWorker-OrderedExecutor-0-0] WARN  org.apache.pulsar.broker.service.ServerCnx - [/22.25.102.149:59422][persistent://business/sec/__change_events-partition-1][multiTopicsReader-f080765823] Failed to create consumer: consumerId=2407, Error while reading ledger -  ledger=674 - operation=Failed to read entry - entry=0
2025-10-31T14:08:30,972+0000 [pulsar-io-5-8] WARN  org.apache.pulsar.client.impl.ClientCnx - [id: 0xc7781652, L:/22.25.102.149:59422 - R:pulsar-broker-0.pulsar-broker.pulsar.svc.cluster.local/22.25.102.149:6650] Received error from server: Error while reading ledger -  ledger=674 - operation=Failed to read entry - entry=0
2025-10-31T14:08:30,972+0000 [pulsar-io-5-8] WARN  org.apache.pulsar.client.impl.ConsumerImpl - [persistent://business/sec/__change_events-partition-1][multiTopicsReader-f080765823] Failed to subscribe to topic on pulsar-broker-0.pulsar-broker.pulsar.svc.cluster.local/22.25.102.149:6650
```
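
To see which ledgers and entries are affected across many broker log lines, the `ledger=... - operation=... - entry=...` fragments can be collected with a small script. This is a minimal sketch that assumes the message format stays exactly as it appears in the logs quoted in this post. The function name `failed_reads` and the `sample` string are made up for illustration.

```python
import re
from collections import defaultdict

# Matches the fragment seen in the broker logs, for example:
# "ledger=674 - operation=Failed to read entry - entry=0"
LEDGER_ERROR = re.compile(r"ledger=(\d+)\s*-\s*operation=(.*?)\s*-\s*entry=(\d+)")


def failed_reads(log_text: str) -> dict[int, set[int]]:
    """Return {ledger_id: {entry_id, ...}} for every read failure found."""
    failures: dict[int, set[int]] = defaultdict(set)
    for match in LEDGER_ERROR.finditer(log_text):
        ledger_id, _operation, entry_id = match.groups()
        failures[int(ledger_id)].add(int(entry_id))
    return dict(failures)


# Two message fragments copied from the errors in this post.
sample = (
    "Failed to create consumer: consumerId=2407, Error while reading ledger -  "
    "ledger=674 - operation=Failed to read entry - entry=0\n"
    'error_message={"errorMsg":"Error while reading ledger -  ledger=684 - '
    'operation=Failed to read entry - entry=0","reqId":3147838028978352523}\n'
)
print(failed_reads(sample))
```

The resulting ledger IDs (674 and 684 in our logs) are the ones to inspect on the bookies.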
## Error 3: Admin API Failures
Running `pulsar-admin topics stats persistent://business/notification/entry-partition-0` 
returns:
```text
--- An unexpected error occurred in the server ---

Message: Topic creation encountered an exception by initialize topic policies service. topic_name=persistent://business/notification/entry-partition-0 error_message={"errorMsg":"Error while reading ledger -  ledger=684 - operation=Failed to read entry - entry=0","reqId":3147838028978352523, "remote":"pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local/22.25.106.148:6650", "local":"/22.25.102.143:43158"}

Stacktrace:

org.apache.pulsar.broker.service.BrokerServiceException$ServiceUnitNotReadyException: Topic creation encountered an exception by initialize topic policies service. topic_name=persistent://business/notification/entry-partition-0 error_message={"errorMsg":"Error while reading ledger -  ledger=684 - operation=Failed to read entry - entry=0","reqId":3147838028978352523, "remote":"pulsar-broker-2.pulsar-broker.pulsar.svc.cluster.local/22.25.106.148:6650", "local":"/22.25.102.143:43158"}
        at org.apache.pulsar.broker.service.BrokerService.lambda$getTopic$28(BrokerService.java:1080)
        at java.base/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:990)
        at java.base/java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:974)
        at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
        at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162)
        at org.apache.pulsar.client.impl.PulsarClientImpl.lambda$createSingleTopicReaderAsync$14(PulsarClientImpl.java:689)
        at java.base/java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:990)
        at java.base/java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:974)
        at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
        at java.base/java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2162)
        at org.apache.pulsar.client.impl.MultiTopicsConsumerImpl.lambda$new$2(MultiTopicsConsumerImpl.java:193)
        at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
        at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
        at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
        at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2147)
        at org.apache.pulsar.client.impl.MultiTopicsConsumerImpl.lambda$closeAsync$24(MultiTopicsConsumerImpl.java:634)
        at java.base/java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:863)
        at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:841)
        at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:510)
        at java.base/java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:2147)
        at org.apache.pulsar.client.impl.ConsumerBase.lambda$failPendingReceive$1(ConsumerBase.java:349)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:840)
```
# Attempted Recovery Methods
- Recreating the __change_events topics: the errors persisted.
- Disabling topic-level policies: this resolved the production/consumption issues.
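
Because every failure above points at a `__change_events` topic in the same tenant and namespace as the failing user topic, it can help to map each affected topic to its namespace's system topic before deciding which ones to inspect or recreate. The following is a minimal sketch based only on the naming pattern visible in our logs. The function name `change_events_topic` is hypothetical, and the result is the base topic name without any `-partition-N` suffix.

```python
def change_events_topic(topic: str) -> str:
    """Map a topic name to the __change_events topic of its namespace.

    Assumes the 'persistent://<tenant>/<namespace>/<local-name>' layout
    seen in the logs in this post.
    """
    _scheme, rest = topic.split("://", 1)
    tenant, namespace, _local_name = rest.split("/", 2)
    return f"persistent://{tenant}/{namespace}/__change_events"


# Topics that failed in the errors quoted above.
failing_topics = [
    "persistent://10001001/default/log-partition-4",
    "persistent://business/notification/entry-partition-0",
]
for name in failing_topics:
    print(name, "->", change_events_topic(name))
```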

# Questions for the Community
Is this a known bug in Pulsar? If so, in which version was it fixed?

Are there any temporary workarounds besides disabling topic-level policies 
entirely?

Currently, we're considering reverting to namespace-level policies. Are there 
other recommended solutions that would let us keep our topic-level policy 
requirements while avoiding this issue?

Any insights or suggestions would be greatly appreciated.

Thank you!

GitHub link: https://github.com/apache/pulsar/discussions/24930
