Hello Blake,

Thank you very much for sharing the CEP. It looks really promising and should address many of the current pain points! I have a few questions to clarify:

1) Is a single separate commit log expected to be created for all tables with the new replication type?
2) What is the granularity at which mutation ids are stored in the memtable? Is it per cell?
3) If we update the same row multiple times while it is in a memtable, are all of the mutation ids appended to some kind of collection?
4) What is the expected size of a single id?
5) Do we plan to support multi-table batches (single or multi-partition) for this replication type?

Thank you,
Dmitry

On Wed, 8 Jan 2025 at 19:22, Blake Eggleston <beggles...@apple.com> wrote:

> Hello dev@,
>
> We'd like to propose CEP-45: Mutation Tracking for adoption by the
> community. CEP-45 proposes adding a replication mechanism to track and
> reconcile individual mutations, as well as processes to actively reconcile
> missing mutations.
>
> For keyspaces with mutation tracking enabled, the immediate benefits of
> this CEP are:
> * reduced replication lag with a continuous background reconciliation
> process
> * eliminate the disk load caused by repair merkle tree calculation
> * eliminate repair overstreaming
> * reduce disk load of reads on cluster to close to 1/CL
> * fix longstanding mutation atomicity issues caused by read repair and
> short read protection
>
> Additionally, although it's outside the scope of this CEP, mutation
> tracking would enable:
> * completion of witness replicas / transient replication, making the
> feature usable for all workloads
> * lightweight witness only datacenters
>
> The CEP is linked here:
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-45%3A+Mutation+Tracking,
> but please keep the discussion on the dev list.
>
> Thanks!
>
> Blake Eggleston
>

--
Dmitry Konstantinov