frankjkelly opened a new issue #9266:
URL: https://github.com/apache/pulsar/issues/9266


   **Describe the bug**
   Retrieving data from Tiered Storage (S3) fails initially according to Broker 
logs ("unknown exception") then succeeds after 15 second retry resulting in a 
15 second response time minimum.
   I can reproduce this consistently.
   
   **To Reproduce**
   Steps to reproduce the behavior:
   1. Configure Tiered storage offload to S3
   2. Upload data to a topic
   2. Wait for Managed Ledger Rollver
   3. Offload Topic
   4. Attempt to retrieve data
   
   **Expected behavior**
   I expect some delay but not an exception followed by a 15 second retry
   
   **Screenshots**
   Logs
   
   ```
    17:50:18.735 [BookKeeperClientWorker-OrderedExecutor-2-0] INFO  
org.apache.bookkeeper.mledger.impl.MetaStoreImpl - 
[cogito-dialog/wav/persistent/57f5ea00-a7a2-49c3-95d3-c8237d893d31] 
[PulsarService-31955749-c92b-4bfe-a06a-2985d66b69c7] Updating cursor info 
ledgerId=1108490 mark-delete=1107259:-1
   17:50:18.739 [bookkeeper-ml-workers-OrderedExecutor-5-0] INFO  
org.apache.bookkeeper.mledger.impl.ManagedCursorImpl - 
[cogito-dialog/wav/persistent/57f5ea00-a7a2-49c3-95d3-c8237d893d31] Updated 
cursor PulsarService-31955749-c92b-4bfe-a06a-2985d66b69c7 with ledger id 
1108490 md-position=1107259:-1 rd-position=1107259:0
   17:50:18.739 [bookkeeper-ml-workers-OrderedExecutor-5-0] INFO  
org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - 
[cogito-dialog/wav/persistent/57f5ea00-a7a2-49c3-95d3-c8237d893d31] Opened new 
cursor: 
ManagedCursorImpl{ledger=cogito-dialog/wav/persistent/57f5ea00-a7a2-49c3-95d3-c8237d893d31,
 name=PulsarService-31955749-c92b-4bfe-a06a-2985d66b69c7, ackPos=1107259:-1, 
readPos=1107259:0}
   17:50:18.739 [bookkeeper-ml-workers-OrderedExecutor-5-0] INFO  
org.apache.bookkeeper.mledger.impl.ManagedCursorImpl - 
[cogito-dialog/wav/persistent/57f5ea00-a7a2-49c3-95d3-c8237d893d31-PulsarService-31955749-c92b-4bfe-a06a-2985d66b69c7]
 Rewind from 1047055:0 to 1047055:0
   17:50:18.739 [bookkeeper-ml-workers-OrderedExecutor-5-0] INFO  
org.apache.pulsar.broker.service.persistent.PersistentTopic - 
[persistent://cogito-dialog/wav/57f5ea00-a7a2-49c3-95d3-c8237d893d31] There are 
no replicated subscriptions on the topic
   17:50:18.739 [bookkeeper-ml-workers-OrderedExecutor-5-0] INFO  
org.apache.pulsar.broker.service.persistent.PersistentTopic - 
[persistent://cogito-dialog/wav/57f5ea00-a7a2-49c3-95d3-c8237d893d31][PulsarService-31955749-c92b-4bfe-a06a-2985d66b69c7]
 Created new subscription for 0
   17:50:18.739 [bookkeeper-ml-workers-OrderedExecutor-5-0] INFO  
org.apache.pulsar.broker.service.ServerCnx - [/127.0.0.1:43862] Created 
subscription on topic 
persistent://cogito-dialog/wav/57f5ea00-a7a2-49c3-95d3-c8237d893d31 / 
PulsarService-31955749-c92b-4bfe-a06a-2985d66b69c7
   17:50:18.771 [bookkeeper-ml-workers-OrderedExecutor-5-0] WARN  
org.apache.bookkeeper.mledger.impl.OpReadEntry - 
[cogito-dialog/wav/persistent/57f5ea00-a7a2-49c3-95d3-c8237d893d31][PulsarService-31955749-c92b-4bfe-a06a-2985d66b69c7]
 read failed from ledger at position:1047055:0 : Unknown exception
   17:50:18.771 [broker-topic-workers-OrderedScheduler-4-0] ERROR 
org.apache.pulsar.broker.service.persistent.PersistentDispatcherSingleActiveConsumer
 - [persistent://cogito-dialog/wav/57f5ea00-a7a2-49c3-95d3-c8237d893d31 / 
PulsarService-31955749-c92b-4bfe-a06a-2985d66b69c7-Consumer{subscription=PersistentSubscription{topic=persistent://cogito-dialog/wav/57f5ea00-a7a2-49c3-95d3-c8237d893d31,
 name=PulsarService-31955749-c92b-4bfe-a06a-2985d66b69c7}, consumerId=0, 
consumerName=8f423, address=/127.0.0.1:43862}] Error reading entries at 
1047055:0 : Unknown exception - Retrying to read in 15.0 seconds
   ```
   
   There was no stack trace for that error
   
   **Desktop (please complete the following information):**
    - OS: Linux
    - Pulsar: 2.6.1
   
   **Additional context**
   
   Tiered Offload Settings
   ```
   $ ./apache-pulsar-2.6.1/bin/pulsar-admin --admin-url 
http://platform-pulsar-broker:8080 namespaces get-offload-threshold 
cogito-dialog/wav
   0
   $ ./apache-pulsar-2.6.1/bin/pulsar-admin --admin-url 
http://platform-pulsar-broker:8080 namespaces get-offload-deletion-lag 
cogito-dialog/wav
   15 minute(s)
   ```
   Broker Overrides to defaults
   ```
           managedLedgerMaxLedgerRolloverTimeMinutes: "90"
   ```
   
   Other Broker settings
   ```
   # grep ffload broker.conf  | grep -v "#"
   managedLedgerOffloadDeletionLagMs=14400000
   managedLedgerOffloadAutoTriggerSizeThresholdBytes=-1
   offloadersDirectory=./offloaders
   managedLedgerOffloadDriver=aws-s3
   managedLedgerOffloadMaxThreads=2
   managedLedgerOffloadPrefetchRounds=1
   s3ManagedLedgerOffloadRegion=us-east-1
   s3ManagedLedgerOffloadBucket=eks-saas-pulsar-455275ef9748510192359f2f92675b2e
   s3ManagedLedgerOffloadServiceEndpoint=
   s3ManagedLedgerOffloadMaxBlockSizeInBytes=67108864
   s3ManagedLedgerOffloadReadBufferSizeInBytes=1048576
   gcsManagedLedgerOffloadRegion=
   gcsManagedLedgerOffloadBucket=
   gcsManagedLedgerOffloadMaxBlockSizeInBytes=67108864
   gcsManagedLedgerOffloadReadBufferSizeInBytes=1048576
   gcsManagedLedgerOffloadServiceAccountKeyFile=
   fileSystemProfilePath=../conf/filesystem_offload_core_site.xml
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to