On Mon, Oct 14, 2019 at 5:42 AM Brian Candler <[email protected]> wrote:

> On 14/10/2019 13:24, xiaolong ran wrote:
>
> For more details on compact topic, you can refer to:
> https://github.com/apache/pulsar/wiki/PIP-14:-Topic-compaction
>
> There it says:
>
> "Compaction doesn't directly interact with message expiration. Once a
> topic is compacted, the backlog still exists. However, subscribers with
> cursors before the compaction horizon will move quickly through the
> backlog, so it will eventually get cleaned up as there'll be no subscribers."
>
> Now suppose my usage model is: I want to use a pulsar topic as an analogue
> of a "table".  I need to keep the last value for any given key in the
> backlog forever - but previous values of that key can be forgotten.
>
> As far as I can see, the only way to keep a key forever is either:
>
> 1. Set the topic's retention
> <https://pulsar.apache.org/docs/en/cookbooks-retention-expiry/#retention-policies>
> time to infinity
>
> 2. Have a dummy subscriber which never consumes
>
> But in either case, it seems to me that the main backlog will grow
> forever.  The compacted topic is an optimisation for *readers* who want to
> skip all key/value pairs except the most recent ones for each key; but the
> original topic continues to grow without bounds.
>
> Have I understood that correctly - or else what have I missed?
>

Your understanding is correct. The design decision for topic compaction was
to keep both the raw data and the compacted data, so that users can choose
which copy of the data to consume.
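
As a rough illustration of that model, here is a toy in-memory sketch in
Python. It is not Pulsar code: the `compact` helper, the `raw_log` list and
the sample keys are all made up for this example, and it only assumes that
compaction keeps the most recent value for each key. It shows that the
compacted view is bounded by the number of distinct keys, while the raw log
is left untouched and keeps growing.

```python
from typing import Dict, List, Tuple

Message = Tuple[str, str]  # (key, value); hypothetical message shape


def compact(raw_log: List[Message]) -> List[Message]:
    """Return only the latest value per key, ordered by each key's last offset."""
    latest: Dict[str, Tuple[int, str]] = {}
    for offset, (key, value) in enumerate(raw_log):
        # A later message with the same key overwrites the earlier entry.
        latest[key] = (offset, value)
    ordered = sorted(latest.items(), key=lambda item: item[1][0])
    return [(key, value) for key, (_, value) in ordered]


# Raw backlog: every message ever published, including superseded values.
raw_log: List[Message] = [
    ("a", "1"), ("b", "1"), ("a", "2"), ("c", "1"), ("b", "2"), ("a", "3"),
]

# Compacted copy: built alongside the raw log, which is not modified.
compacted_view = compact(raw_log)

print(len(raw_log), "raw messages are still stored")
print(len(compacted_view), "messages in the compacted view")
```

In a real client, the choice between the two copies is made on the consumer
or reader side (the read-compacted option); the raw backlog is still subject
only to the usual retention and subscription rules.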

However, I think a couple of tasks remain unfinished for topic compaction.
One example is the ability to truncate the raw data once it has been
compacted. Feel free to create a GitHub issue requesting this feature.


> Is there a way to recover the storage from the original topic?
>

Do you mean reclaiming the storage occupied by the raw data?


>   Logically what I want to do is replace the original topic with the
> compacted one.  Would the application be expected to copy all messages from
> the compacted topic to a new one periodically?
>
> Thanks,
>
> Brian.
>
