jujugrrr opened a new issue #7246:
URL: https://github.com/apache/pulsar/issues/7246
**Describe the bug**
Once ledgers have been offloaded and deleted from local storage (BookKeeper), a
reader can no longer retrieve their messages.
**To Reproduce**
1. Produce messages
2. Read messages from the beginning (SUCCESS)
3. Roll over ledgers
4. Offload
5. Wait for the offload deletion lag
6. Read messages from the beginning (FAIL - Timeout)
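
The rollover, offload, and deletion-lag steps above can be sketched as a sequence of `pulsar-admin` calls. This is only a sketch: the topic and namespace names come from the broker logs in this report, while the one-minute lag, the `1M` size threshold, and the use of `topics unload` to force a ledger rollover are assumptions, not necessarily the commands used here. Producing and reading the messages is left to the client scripts.

```python
import shlex
import subprocess

TOPIC = "persistent://ten/ns/my-topic"  # topic name as it appears in the broker logs
NAMESPACE = "ten/ns"


def repro_commands(topic=TOPIC, namespace=NAMESPACE, lag="1m", threshold="1M"):
    """Return the pulsar-admin calls for the rollover/offload part of the repro.

    The lag and threshold defaults are illustrative assumptions.
    """
    return [
        # Delete offloaded ledgers from BookKeeper shortly after the offload completes.
        ["pulsar-admin", "namespaces", "set-offload-deletion-lag", "--lag", lag, namespace],
        # Unloading the topic closes the current ledger; a new one is opened on reload.
        ["pulsar-admin", "topics", "unload", topic],
        # Offload closed ledgers, keeping at most `threshold` of data in BookKeeper.
        ["pulsar-admin", "topics", "offload", "--size-threshold", threshold, topic],
        # Block until the offload has finished.
        ["pulsar-admin", "topics", "offload-status", "-w", topic],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run(repro_commands())
```

With `dry_run=True` (the default) the script only prints the commands, so they can be reviewed before being run against a real cluster.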
**Expected behavior**
I should still be able to retrieve messages from the beginning of my topic
after they have been offloaded to S3.
**Additional context**
Pulsar on AWS EKS, deployed with the latest Helm chart.
**More details**
I'm testing Pulsar offloading to S3. I have one script producing 1M messages
and another one reading them. Reading works the first few times (I rerun from
scratch each time), but then I start getting exceptions:
```
10:28:56.701 [pulsar-io-24-1] INFO
org.apache.bookkeeper.mledger.impl.ManagedCursorImpl -
[ten/ns/persistent/my-topic-reader-3558c16521] Rewind from 233:0 to 233:0
10:28:56.701 [pulsar-io-24-1] INFO
org.apache.pulsar.broker.service.persistent.PersistentTopic -
[persistent://ten/ns/my-topic] There are no replicated subscriptions on the
topic
10:28:56.701 [pulsar-io-24-1] INFO
org.apache.pulsar.broker.service.persistent.PersistentTopic -
[persistent://ten/ns/my-topic][reader-3558c16521] Created new subscription for 0
10:28:56.701 [pulsar-io-24-1] INFO
org.apache.pulsar.broker.service.ServerCnx - [/x.x.x.x7:53908] Created
subscription on topic persistent://ten/ns/my-topic / reader-3558c16521
10:28:56.705 [bookkeeper-ml-workers-OrderedExecutor-6-0] WARN
org.apache.bookkeeper.mledger.impl.OpReadEntry -
[ten/ns/persistent/my-topic][reader-3558c16521] read failed from ledger at
position:233:0 : Unknown exception
10:28:56.705 [broker-topic-workers-OrderedScheduler-3-0] ERROR
org.apache.pulsar.broker.service.persistent.PersistentDispatcherSingleActiveConsumer
- [persistent://ten/ns/my-topic /
reader-3558c16521-Consumer{subscription=PersistentSubscription{topic=persistent://ten/ns/my-topic,
name=reader-3558c16521}, consumerId=0, consumerName=,
address=/x.x.x.x7:53908}] Error reading entries at 233:0 : Unknown exception -
Retrying to read in 15.0 seconds
```
Those ledgers are being offloaded to S3. It looks like as soon as a ledger is
removed from local storage (after the `set-offload-deletion-lag` period), I get
the exception shown above. The deletion itself is logged as:
```
10:28:28.452 [bookkeeper-ml-workers-OrderedExecutor-6-0] INFO
org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl -
[ten/ns/persistent/my-topic] End TrimConsumedLedgers. ledgers=3
totalSize=38942923
10:28:28.452 [bookkeeper-ml-workers-OrderedExecutor-6-0] INFO
org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl -
[ten/ns/persistent/my-topic] Deleting offloaded ledger 233 from bookkeeper -
size: 15432415
10:28:28.452 [bookkeeper-ml-workers-OrderedExecutor-6-0] INFO
org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl -
[ten/ns/persistent/my-topic] Deleting offloaded ledger 234 from bookkeeper -
size: 15504438
- size: 16168356
```
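
To make the correlation explicit, here is a small sketch that matches the ledger in the failing read position against the ledgers the broker reports as deleted. The log lines are the ones quoted in this report, unwrapped onto single lines with timestamps and class names dropped; the regular expressions are assumptions based only on those lines.

```python
import re

# Broker log lines from this report, unwrapped and trimmed to the message part.
LOG_LINES = [
    "[ten/ns/persistent/my-topic] Deleting offloaded ledger 233 from bookkeeper - size: 15432415",
    "[ten/ns/persistent/my-topic] Deleting offloaded ledger 234 from bookkeeper - size: 15504438",
    "[ten/ns/persistent/my-topic][reader-3558c16521] read failed from ledger at "
    "position:233:0 : Unknown exception",
]

DELETED = re.compile(r"Deleting offloaded ledger (\d+) from bookkeeper")
FAILED = re.compile(r"read failed from ledger at position:(\d+):(\d+)")


def failing_deleted_ledgers(lines):
    """Return ledger IDs that were both deleted from BookKeeper and failed to read."""
    deleted, failed = set(), set()
    for line in lines:
        match = DELETED.search(line)
        if match:
            deleted.add(int(match.group(1)))
            continue
        match = FAILED.search(line)
        if match:
            failed.add(int(match.group(1)))
    return sorted(deleted & failed)


if __name__ == "__main__":
    print(failing_deleted_ledgers(LOG_LINES))
```

On the lines above, the only ledger in both sets is 233, the one whose deletion was logged shortly before the read at `233:0` failed.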
I can also see that the ledgers were removed from ZooKeeper. Is there a
configuration option I'm missing? Is there a way to understand why:
```
read failed from ledger at position:233:0 : Unknown exception
```
is happening? Thank you!
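
One check that might help narrow this down is whether the topic's managed-ledger metadata still lists the deleted ledgers as offloaded. The sketch below assumes that the JSON printed by `pulsar-admin topics stats-internal` contains a `ledgers` array whose entries carry `ledgerId` and `offloaded` fields; the sample document is made up for illustration and is not output from this cluster.

```python
import json


def offloaded_ledgers(stats_internal_json):
    """Return the IDs of ledgers flagged as offloaded in a stats-internal document."""
    stats = json.loads(stats_internal_json)
    return [
        ledger["ledgerId"]
        for ledger in stats.get("ledgers", [])
        if ledger.get("offloaded")
    ]


# Hypothetical sample, shaped like the assumed stats-internal output.
SAMPLE = """
{
  "ledgers": [
    {"ledgerId": 233, "offloaded": true},
    {"ledgerId": 234, "offloaded": true},
    {"ledgerId": 236, "offloaded": false}
  ]
}
"""

if __name__ == "__main__":
    print(offloaded_ledgers(SAMPLE))
```

If ledger 233 is missing from that list, or is present but not flagged as offloaded, the broker may have no way to locate the data in S3, which would be consistent with the failed read.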