On Tue, Nov 7, 2017 at 12:18 PM, Istvan Soos <[email protected]> wrote:

> On Tue, Nov 7, 2017 at 8:09 PM, Sijie Guo <[email protected]> wrote:
> > But I would to learn more about your use case and to see how we can support
> > you.
>
> It is a nice feature in Kafka, and I've seen a complex app using it:
> https://kafka.apache.org/documentation.html#compaction

Yeah, if you are looking for this feature, you should probably check out
Pulsar (which is a BookKeeper-based pub/sub system):
https://pulsar.incubator.apache.org/

The topic compaction feature might come in the next release or so.

> My use case is really simple: a website is crawled in regular periods,
> and it is easy to create a content identifier out of it. I would store
> the different versions of it in a ledger for downstream processing,
> but there is really no need to preserve all of the versions of the
> past if the content identifier is the same.

It seems a pub/sub messaging system like Pulsar is a good fit for your use
case.

Just FYI, a BookKeeper ledger has single-writer semantics. Once the ledger
is closed or the writer fails, you cannot reopen the ledger for writing.
Is this expected behavior for you?

> I know this specific case can be handled in many different ways, but
> if the ledger could do that on its own, it could simplify the overall
> architecture (free GC).

Yeah, I totally see the value in it.

> Istvan
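
For reference, the compaction semantics discussed in this thread (keep only
the latest version per content identifier) can be sketched roughly as
follows. This is only a minimal illustration in Python, not Kafka's or
Pulsar's implementation and not a BookKeeper API; the `compact` function and
the (content_id, payload) entry shape are hypothetical names chosen for the
example.

```python
from typing import Iterable, List, Tuple

# Hypothetical entry shape: (content_id, payload). Not a BookKeeper type.
Entry = Tuple[str, bytes]


def compact(entries: Iterable[Entry]) -> List[Entry]:
    """Keep only the most recent entry for each content_id.

    Surviving entries keep their original relative order, which mirrors the
    general idea of key-based log compaction.
    """
    items = list(entries)

    # Record the index of the last occurrence of every content_id.
    last_index = {}
    for i, (content_id, _payload) in enumerate(items):
        last_index[content_id] = i

    # An entry survives only if it is the last one written for its id.
    return [e for i, e in enumerate(items) if last_index[e[0]] == i]


# Example: two crawls of page-a, one crawl of page-b.
log = [("page-a", b"v1"), ("page-b", b"v1"), ("page-a", b"v2")]
compacted = compact(log)
print(compacted)
```

Here the older version of page-a is dropped, while page-b and the newer
page-a remain for downstream processing.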
