candlerb opened a new issue #5394: Add ability to truncate raw partition after 
compaction
URL: https://github.com/apache/pulsar/issues/5394
 
 
   **Is your feature request related to a problem? Please describe.**
   When using an event stream as a "table", Create/Update/Delete operations are 
modelled as posting new key/value pairs.  The stream needs an infinite 
retention policy, because some keys may never get new values.
   
   Pulsar already has a Compaction feature, which converts a raw topic into a 
compacted topic, containing only the latest value for each key.  Clients can 
choose to read the compacted stream.  However the raw topic remains 
indefinitely on disk.
   
   This means that the raw topic can grow without bound, even for a table of 
limited size, because the entire history of updates is kept indefinitely.
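
   To illustrate the growth problem, here is a minimal pure-Python model of 
the behaviour described above. It does not use the Pulsar client; `compact` 
and the `Event` alias are hypothetical names, and treating a `None` value as a 
delete marker is an assumption made for the sketch.

```python
from typing import Dict, List, Optional, Tuple

# (key, value); a value of None models a delete in this sketch (assumption).
Event = Tuple[str, Optional[str]]


def compact(raw: List[Event]) -> List[Event]:
    """Keep only the latest event for each key and drop deleted keys."""
    last_index: Dict[str, int] = {}
    for i, (key, _value) in enumerate(raw):
        last_index[key] = i  # remember where each key was last written
    return [
        (key, value)
        for i, (key, value) in enumerate(raw)
        if last_index[key] == i and value is not None
    ]


# A three-row "table" that is updated 1000 times, then one row is deleted.
raw: List[Event] = []
for round_no in range(1000):
    for key in ("a", "b", "c"):
        raw.append((key, f"v{round_no}"))
raw.append(("c", None))

compacted = compact(raw)
print(len(raw), len(compacted))
```

   The compacted view stays at the size of the table, while the raw list 
keeps every update ever posted.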
   
   **Describe the solution you'd like**
   Add the ability to truncate the raw topic data after compaction.  This might 
mean rotating the compacted ledger into the place of the original raw ledger.
   
   Since compaction is done on demand, this rotation can be done on demand too. 
 It could be an attribute of the compaction request.
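
   As a rough sketch of the requested behaviour (not of how Pulsar is 
implemented), the rotation could look like the toy model below. The `Topic` 
class, `run_compaction` and its `truncate_raw` flag are all hypothetical names 
invented for illustration; no such option is known to exist in Pulsar.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# (key, value); a value of None models a delete in this sketch (assumption).
Event = Tuple[str, Optional[str]]


@dataclass
class Topic:
    """Toy in-memory stand-in for a topic with a raw and a compacted ledger."""
    raw: List[Event] = field(default_factory=list)
    compacted: List[Event] = field(default_factory=list)

    def publish(self, key: str, value: Optional[str]) -> None:
        self.raw.append((key, value))

    def run_compaction(self, truncate_raw: bool = False) -> None:
        latest: Dict[str, Optional[str]] = {}
        for key, value in self.raw:
            latest.pop(key, None)  # re-insert so order follows the last write
            latest[key] = value
        self.compacted = [(k, v) for k, v in latest.items() if v is not None]
        if truncate_raw:
            # Hypothetical rotation: the compacted data replaces the raw data.
            self.raw = list(self.compacted)


topic = Topic()
topic.publish("a", "1")
topic.publish("b", "1")
topic.publish("a", "2")
topic.publish("b", None)  # delete "b"

topic.run_compaction()  # today's behaviour: raw history is kept
raw_len_before = len(topic.raw)

topic.run_compaction(truncate_raw=True)  # requested behaviour
raw_len_after = len(topic.raw)
print(raw_len_before, raw_len_after)
```

   In this model, running compaction again on the truncated topic is a 
no-op, which is one possible answer to the aside below.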
   
   (Aside: I don't fully understand what happens when compaction is run on an 
already-compacted topic; and I don't know what happens if a client asks to read 
the compacted version of a topic which has not yet been compacted)
   
   **Describe alternatives you've considered**
   As far as I can see, the only other option is to spool all the data out of 
the compacted topic into a fresh topic.
   
   **Additional context**
   This would bring feature parity with Kafka's [log 
compaction](https://kafka.apache.org/documentation/#compaction).
   
   Possibly related to #2736
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
