On Tue, Nov 7, 2017 at 10:42 PM, Sijie Guo <guosi...@gmail.com> wrote:
> yeah, if you are looking for this feature, you probably should checkout
> pulsar (which is bookkeeper based pub/sub):
> https://pulsar.incubator.apache.org/
>
> the topic compaction feature might come in next release or so.

Ok, good to know!

>> My use case is really simple: a website is crawled in regular periods,
>> and it is easy to create a content identifier out of it. I would store
>> the different versions of it in a ledger for downstream processing,
>> but there is really no need to preserve all of the versions of the
>> past if the content identifier is the same.
>
>
> it seems a messaging pub/sub system like pulsar is good for your use case.
>
> just fyi, a bookkeeper ledger is single writer semantic. once the ledger is
> closed or the writer fails, you can not reopen the ledger to write. Is this
> an expected behavior for you?

I believe I'm missing the implication of this. Does that mean we need
to name ledgers logically so that we can keep track of them, because
each ledger allows only one uninterrupted write session and is
read-only after that?

Thanks,
Istvan
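
To make the question above concrete, here is a minimal sketch of what
single-writer semantics imply for bookkeeping on the client side. It is a
toy in-memory model, not the BookKeeper API: the names Ledger,
LedgerCatalog, open_for_write and read_all are hypothetical and chosen
only for illustration. The assumption it illustrates is that every new
write session needs a fresh ledger, so the application keeps an ordered
list of ledger ids under a logical name and reads them back in order.

```python
class LedgerClosedError(Exception):
    """Raised when writing to a ledger whose single write session has ended."""


class Ledger:
    """Toy stand-in for a ledger: append-only, writable until closed."""

    def __init__(self, ledger_id):
        self.ledger_id = ledger_id
        self.entries = []
        self.closed = False

    def add_entry(self, data):
        # Once closed, a ledger never accepts writes again.
        if self.closed:
            raise LedgerClosedError(f"ledger {self.ledger_id} is read-only")
        self.entries.append(data)
        return len(self.entries) - 1  # entry id within this ledger

    def close(self):
        self.closed = True


class LedgerCatalog:
    """Hypothetical catalog mapping a logical name to an ordered list of ledger ids."""

    def __init__(self):
        self._next_id = 0
        self._ledgers = {}
        self._by_name = {}

    def open_for_write(self, name):
        # Each write session gets a brand-new ledger; older ones stay read-only.
        ledger = Ledger(self._next_id)
        self._next_id += 1
        self._ledgers[ledger.ledger_id] = ledger
        self._by_name.setdefault(name, []).append(ledger.ledger_id)
        return ledger

    def ledger_ids(self, name):
        return list(self._by_name.get(name, []))

    def read_all(self, name):
        # Read the logical stream by walking its ledgers in creation order.
        return [
            entry
            for ledger_id in self._by_name.get(name, [])
            for entry in self._ledgers[ledger_id].entries
        ]
```

In this sketch, a crawler that restarts would call open_for_write again
with the same logical name and get a second ledger, while read_all still
returns every version in order. In a real deployment the list of ledger
ids would have to live in some durable metadata store; where exactly is
left open here.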