codelipenghui opened a new pull request #12429:
URL: https://github.com/apache/pulsar/pull/12429


   This PR fixes compacted data being lost during data compaction.
   Only a few events were deleted, yet the number of compacted events dropped
   sharply.
   
   
![image](https://user-images.githubusercontent.com/12592133/138008777-00eb7c0b-358e-4291-bfd4-f4b27cbedbf4.png)
   
   Further investigation showed that only the first read operation reads
   data from the compacted ledger; from the second read operation onward,
   the broker starts reading data from the original topic.
   
   ```
   2021-10-19T23:09:30,021+0800 [broker-topic-workers-OrderedScheduler-7-0] INFO  org.apache.pulsar.compaction.CompactedTopicImpl - =====[public/default/persistent/c499d42c-75d7-48d1-9225-2e724c0e1d83] Read from compacted Ledger = cursor position: -1:-1, Horizon: 16:-1, isFirstRead: true
   2021-10-19T23:09:30,049+0800 [broker-topic-workers-OrderedScheduler-7-0] INFO  org.apache.pulsar.compaction.CompactedTopicImpl - =====[public/default/persistent/c499d42c-75d7-48d1-9225-2e724c0e1d83] Read from original Ledger = cursor position: 16:0, Horizon: 16:-1, isFirstRead: false
   ```
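
To make the two log lines above easier to follow, here is a minimal sketch of the kind of comparison that would produce them. It is not the Pulsar implementation: the class `CompactedReadSketch`, the nested `Position` type, and the method `readsFromCompactedLedger` are hypothetical names, and the rule "read from the compacted ledger only while the read position is at or before the compaction horizon" is an assumption inferred from the logged values.

```java
public final class CompactedReadSketch {

    /** Hypothetical (ledgerId, entryId) pair, ordered by ledger id, then entry id. */
    public static final class Position implements Comparable<Position> {
        final long ledgerId;
        final long entryId;

        public Position(long ledgerId, long entryId) {
            this.ledgerId = ledgerId;
            this.entryId = entryId;
        }

        @Override
        public int compareTo(Position other) {
            int byLedger = Long.compare(ledgerId, other.ledgerId);
            return byLedger != 0 ? byLedger : Long.compare(entryId, other.entryId);
        }
    }

    /**
     * Assumed rule: serve the read from the compacted ledger only while the
     * read position has not moved past the compaction horizon.
     */
    public static boolean readsFromCompactedLedger(Position horizon, Position readPosition) {
        return horizon != null && horizon.compareTo(readPosition) >= 0;
    }

    public static void main(String[] args) {
        // Values taken from the two log lines: Horizon 16:-1 in both cases.
        Position horizon = new Position(16, -1);

        // First read: cursor position -1:-1 is before the horizon.
        System.out.println(readsFromCompactedLedger(horizon, new Position(-1, -1)));

        // Second read: cursor position 16:0 is already past the horizon.
        System.out.println(readsFromCompactedLedger(horizon, new Position(16, 0)));
    }
}
```

Under this assumed rule, the first call selects the compacted ledger and the second selects the original ledger, which matches the `isFirstRead: true` and `isFirstRead: false` entries in the log.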
   
   The compaction task builds the new snapshot from the last snapshot plus
   the incremental entries, so the compactor has to read data that lies
   before the compaction cursor's mark-delete position. For the compaction
   cursor, we therefore need to force a seek of the read position to ensure
   the compactor can read the complete last snapshot.
   
   

