milos-matijasevic opened a new issue #8991:
URL: https://github.com/apache/pulsar/issues/8991
**Describe the bug**
Null pointer exception while reading entries
```
00:14:43.672 [pulsar-io-25-1] INFO
org.apache.bookkeeper.mledger.impl.ManagedCursorImpl -
[public/default/persistent/xx] Rewind from 85116:72902 to 85116:72902
00:14:43.672 [bookkeeper-ml-workers-OrderedExecutor-0-0] WARN
org.apache.bookkeeper.mledger.impl.OpReadEntry -
[public/default/persistent/x][x] read failed from ledger at
position:85116:72902 : Unknown exception
00:14:43.672 [broker-topic-workers-OrderedScheduler-5-0] ERROR
org.apache.pulsar.broker.service.persistent.PersistentDispatcherSingleActiveConsumer
- [persistent://public/default/x/
x-Consumer{subscription=PersistentSubscription{topic=persistent://public/default/x,
name=x}, consumerId=1, consumerName=guvgf, address=/172.16.37.221:41704}]
Error reading entries at 85116:72902 : Unknown exception - Retrying to read in
15.0 seconds
```
```
00:20:06.485 [bookkeeper-ml-workers-OrderedExecutor-5-0] ERROR
org.apache.bookkeeper.common.util.SafeRunnable - Unexpected throwable caught
java.lang.NullPointerException: null
at
org.apache.bookkeeper.mledger.impl.OpReadEntry.lambda$readEntriesFailed$0(OpReadEntry.java:94)
~[org.apache.pulsar-managed-ledger-2.6.1.jar:2.6.1]
at
org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun(SafeRun.java:32)
~[org.apache.pulsar-managed-ledger-2.6.1.jar:2.6.1]
at
org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36)
[org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
at
org.apache.bookkeeper.common.util.OrderedExecutor$TimedRunnable.run(OrderedExecutor.java:203)
[org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
at
org.apache.bookkeeper.common.util.OrderedExecutor$TimedRunnable.run(OrderedExecutor.java:203)
[org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[?:1.8.0_252]
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[?:1.8.0_252]
at
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
[io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]
```
**To Reproduce**
Pulsar go client is used, and consumer which tries to read entries.
It fails here
https://github.com/apache/pulsar/blob/v2.6.1/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpReadEntry.java#L94
in line `callback.readEntriesComplete(entries, ctx);`
```java
if (!entries.isEmpty()) {
// There were already some entries that were read before, we can return
them
cursor.ledger.getExecutor().execute(safeRun(() -> {
callback.readEntriesComplete(entries, ctx);
recycle();
}));
}
```
The pipeline was that cursors are set to entry by the `SeekByTime` function,
after that first message is consumed just to print it and it wasn't acked. So
this probably happened after that.
Also, another exception occurred during resetting cursors
```
10:36:19.518 [bookkeeper-ml-workers-OrderedExecutor-4-0] WARN
org.apache.pulsar.broker.service.ServerCnx -
[/172.16.41.184:38110][PersistentSubscription{topic=persistent://public/default/z,
name=z}] Failed to reset subscription:
org.apache.pulsar.broker.service.BrokerServiceException:
org.apache.bookkeeper.mledger.ManagedLedgerException: Unknown exception
java.util.concurrent.CompletionException:
org.apache.pulsar.broker.service.BrokerServiceException:
org.apache.bookkeeper.mledger.ManagedLedgerException: Unknown exception
at
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
~[?:1.8.0_252]
at
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
~[?:1.8.0_252]
at
java.util.concurrent.CompletableFuture.uniRun(CompletableFuture.java:714)
~[?:1.8.0_252]
at
java.util.concurrent.CompletableFuture$UniRun.tryFire(CompletableFuture.java:701)
~[?:1.8.0_252]
at
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
~[?:1.8.0_252]
at
java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
~[?:1.8.0_252]
at
org.apache.pulsar.broker.service.persistent.PersistentSubscription$5.findEntryFailed(PersistentSubscription.java:656)
~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
at
org.apache.pulsar.broker.service.persistent.PersistentMessageFinder.findEntryFailed(PersistentMessageFinder.java:118)
~[org.apache.pulsar-pulsar-broker-2.6.1.jar:2.6.1]
at
org.apache.bookkeeper.mledger.impl.OpFindNewest.readEntryFailed(OpFindNewest.java:116)
~[org.apache.pulsar-managed-ledger-2.6.1.jar:2.6.1]
at
org.apache.bookkeeper.mledger.impl.EntryCacheImpl.lambda$asyncReadEntry0$0(EntryCacheImpl.java:215)
~[org.apache.pulsar-managed-ledger-2.6.1.jar:2.6.1]
at
java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
[?:1.8.0_252]
at
java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
[?:1.8.0_252]
at
java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:456)
[?:1.8.0_252]
at
org.apache.bookkeeper.common.util.OrderedExecutor$TimedRunnable.run(OrderedExecutor.java:203)
[org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[?:1.8.0_252]
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[?:1.8.0_252]
at
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
[io.netty-netty-common-4.1.48.Final.jar:4.1.48.Final]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_252]
Caused by: org.apache.pulsar.broker.service.BrokerServiceException:
org.apache.bookkeeper.mledger.ManagedLedgerException: Unknown exception
... 12 more
Caused by: org.apache.bookkeeper.mledger.ManagedLedgerException: Unknown
exception
10:36:19.570 [pulsar-io-25-2] INFO
org.apache.pulsar.broker.service.ServerCnx - Closed connection from
/172.16.41.184:38110
```
But after trying multiple times thay are reset to successfully.
There were multiple subscriptions for the same topic, but this didn't happen
for all subscriptions.
Cursors somehow recover and continue but this is cursor stats.
```
"cursors" : {
"markDeletePosition" : "87378:46711",
"readPosition" : "87378:47273",
"waitingReadOp" : false,
"pendingReadOps" : -20,
"messagesConsumedCounter" : -18228941,
"cursorLedger" : 88621,
"cursorLedgerLastEntry" : 781,
"individuallyDeletedMessages" : "[]",
"lastLedgerSwitchTimestamp" : "2020-12-17T15:36:44.377Z",
"state" : "Open",
"numberOfEntriesSinceFirstNotAckedMessage" : 562,
"totalNonContiguousDeletedMessagesRange" : 0,
"properties" : { }
},
```
I think every time this happens, `pendingReadOps` decreases. While i am
writing this `pendingReadOps` goes to -25
Also, this happens only on topics that have multiple subscriptions.
**Desktop (please complete the following information):**
- OS: Linux
- Pulsar: v2.6.1
- Client: Go v0.3.0
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]