codelipenghui opened a new pull request #13629:
URL: https://github.com/apache/pulsar/pull/13629


   ### Motivation
   
   Fix the reader skipping the remaining compacted data after the topic has been unloaded.
   #11287 fixed the skipped-data issue for a reader that starts reading from the `earliest` position. But if the reader has already consumed some, but not all, of the messages in the compacted ledger, the start position is no longer `earliest`, and the broker rewinds the reader's cursor to the next valid position in the original topic.
   As a result, the remaining messages in the compacted ledger are skipped.
   
   Here are the logs from the broker:
   
   ```
   10:44:36.035 [bookkeeper-ml-scheduler-OrderedScheduler-4-0] INFO  org.apache.pulsar.broker.service.BrokerService - Created topic persistent://xxx/product-full-prod/5126 - dedup is disabled
   10:44:36.035 [bookkeeper-ml-scheduler-OrderedScheduler-4-0] INFO  org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://xxx/product-full-prod/5126][xxx] Creating non-durable subscription at msg id 181759:14:-1:-1
   10:44:36.035 [bookkeeper-ml-scheduler-OrderedScheduler-4-0] INFO  org.apache.bookkeeper.mledger.impl.NonDurableCursorImpl - [xxx/product-full-prod/persistent/5126] Created non-durable cursor read-position=221199:0 mark-delete-position=181759:13
   10:44:36.035 [bookkeeper-ml-scheduler-OrderedScheduler-4-0] INFO  org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [xxx/product-full-prod/persistent/5126] Opened new cursor: NonDurableCursorImpl{ledger=xxx/product-full-prod/persistent/5126, ackPos=181759:13, readPos=221199:0}
   10:44:36.035 [bookkeeper-ml-scheduler-OrderedScheduler-4-0] INFO  org.apache.bookkeeper.mledger.impl.ManagedCursorImpl - [xxx/product-full-prod/persistent/5126-xxx] Rewind from 221199:0 to 221199:0
   ```
   
   There are many compacted messages after `181759:13`, but the broker does not dispatch them to the reader.
   The issue can also be reproduced by the unit test added in this PR.
   
   ### Modification
   
   If the cursor has `readCompacted = true`, rewind only to the message immediately after the mark-delete position,
   so that the reader can continue reading from the compacted ledger.
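   
   The decision described above can be sketched roughly as follows. This is an illustrative model only, not the code changed in this PR: `RewindSketch`, `Position`, and `rewindTarget` are hypothetical names, and the "next valid position in the original topic" is passed in as a parameter rather than computed from real ledgers. The values are taken from the broker logs in the Motivation section.
   
   ```java
   // Hypothetical sketch; none of these types are Pulsar or BookKeeper classes.
   final class RewindSketch {
   
       /** Simplified stand-in for a managed-ledger position (ledgerId:entryId). */
       static final class Position {
           final long ledgerId;
           final long entryId;
   
           Position(long ledgerId, long entryId) {
               this.ledgerId = ledgerId;
               this.entryId = entryId;
           }
   
           /** The entry immediately after this one, in the same ledger. */
           Position next() {
               return new Position(ledgerId, entryId + 1);
           }
   
           @Override
           public String toString() {
               return ledgerId + ":" + entryId;
           }
       }
   
       /**
        * Picks the position the cursor is rewound to.
        * With readCompacted, stay right after the mark-delete position so the
        * rest of the compacted ledger is still read; otherwise jump to the
        * next valid position in the original topic (assumed to be given).
        */
       static Position rewindTarget(Position markDelete,
                                    Position nextValidInOriginalTopic,
                                    boolean readCompacted) {
           return readCompacted ? markDelete.next() : nextValidInOriginalTopic;
       }
   
       public static void main(String[] args) {
           Position markDelete = new Position(181759L, 13L);
           Position nextValid = new Position(221199L, 0L);
           System.out.println(rewindTarget(markDelete, nextValid, true));  // prints 181759:14
           System.out.println(rewindTarget(markDelete, nextValid, false)); // prints 221199:0
       }
   }
   ```
   
   In this model, the `readCompacted` case lands on `181759:14`, the message ID the reader asked for in the logs, instead of `221199:0`.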
   
   ### Verification
   
   Added a new test verifying that the reader receives all compacted and non-compacted messages from the topic across a topic unload.
   
   ### Documentation
   
   Check the box below or label this PR directly (if you have committer 
privilege).
   
   Need to update docs? 
   
   - [ ] `doc-required` 
     
     (If you need help on updating docs, create a doc issue)
     
   - [x] `no-need-doc` 
     
     (Please explain why)
     
   - [ ] `doc` 
     
     (If this PR contains doc changes)
   
   
   

