Re: [DISCUSS] CEP-11: Pluggable memtable implementations

Michael Burman Wed, 21 Jul 2021 03:34:16 -0700

Hi,

It is nice to see these going forward (and a great use of CEP), so thanks
for the proposal. I have reservations about linking the memtable to the
CommitLog and to flushing: abstractions should not leak from one component
to another. I also don't see the reasoning for why they should be linked; it
doesn't seem to add anything other than tight coupling of components,
reducing reuse and making things unnecessarily complicated. The streaming
notions also seem weird to me - how are they related to the memtable? Why
should the memtable care about behavior outside its responsibility?

Some misc (with some thoughts split / duplicated to different parts) quotes
and comments:

> Tight coupling between CFS and memtable will be reduced: flushing
> functionality is to be extracted, controlling memtable memory and period
> expiration will be handled by the memtable.

Why is flushing control bad to do in CFS and better in the memtable? Doing
it outside the memtable would allow controlling the flushing regardless of
how the actual memtable is implemented. For example, let's say someone
wants to implement something like HBase's Accordion in Cassandra. It
shouldn't matter what the implementation of the memtable is, as the
compaction of different memtables could be beneficial to all
implementations. Or the flushing could push the memtable to a proper cache
instead of only to disk.
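
To make the point concrete, here is a minimal sketch of a flush decision
that lives with the table and only needs a couple of numbers from the
memtable. All names (MemtableStats, FlushPolicy and so on) are made up for
illustration; nothing here is taken from the CEP or from Cassandra's code.

```java
// Hypothetical names throughout; none of these types exist in Cassandra or CEP-11.

/** The little a table needs to know about a memtable to decide on flushing. */
interface MemtableStats {
    long liveDataBytes();
    long ageMillis();
}

/** Flush decision owned by the table (CFS), independent of the memtable type. */
interface FlushPolicy {
    boolean shouldFlush(MemtableStats stats);
}

/** Flush once the memtable holds at least maxBytes of data. */
final class SizeFlushPolicy implements FlushPolicy {
    private final long maxBytes;

    SizeFlushPolicy(long maxBytes) { this.maxBytes = maxBytes; }

    @Override
    public boolean shouldFlush(MemtableStats stats) {
        return stats.liveDataBytes() >= maxBytes;
    }
}

/** Flush once the memtable is at least maxAgeMillis old. */
final class AgeFlushPolicy implements FlushPolicy {
    private final long maxAgeMillis;

    AgeFlushPolicy(long maxAgeMillis) { this.maxAgeMillis = maxAgeMillis; }

    @Override
    public boolean shouldFlush(MemtableStats stats) {
        return stats.ageMillis() >= maxAgeMillis;
    }
}

/** Flush as soon as any of the delegate policies asks for it. */
final class AnyOfFlushPolicy implements FlushPolicy {
    private final FlushPolicy[] delegates;

    AnyOfFlushPolicy(FlushPolicy... delegates) { this.delegates = delegates.clone(); }

    @Override
    public boolean shouldFlush(MemtableStats stats) {
        for (FlushPolicy policy : delegates) {
            if (policy.shouldFlush(stats)) return true;
        }
        return false;
    }
}

/** Fixed snapshot of stats, standing in for whatever a real memtable reports. */
final class FixedStats implements MemtableStats {
    private final long liveDataBytes;
    private final long ageMillis;

    FixedStats(long liveDataBytes, long ageMillis) {
        this.liveDataBytes = liveDataBytes;
        this.ageMillis = ageMillis;
    }

    @Override public long liveDataBytes() { return liveDataBytes; }
    @Override public long ageMillis() { return ageMillis; }
}
```

With something of this shape, the same policy object could be applied to
any memtable implementation, and a different table could use a different
policy with the same memtable.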

Or if we had a per-table caching structure, we could control the flushing
of memtables and the cache structure separately. Some data benefits from
LRU and some from MRW (most-recently-written) caching strategies. But both
could benefit from the same memtable implementation; it's the data and how
it's used that should control how the flushing works. For example, time
series data behaves quite differently in terms of data access than
something more "random".

Or even "total memory control", which would check which tables need more
memory for their writes and which do not. Or one that makes sure memory
doesn't grow over a boundary and manages how much is dedicated to caching
and how much to memtables waiting to be flushed. Or one that delays
flushing because the disks can't keep up, etc. None of this needs to be
implemented in this CEP, but pushing the flushing strategy into the
memtable would prevent many such features.

> Beyond thread-safety, the concurrency constraints of the memtable are
> intentionally left unspecified.

I like this. I could see use cases where a single-threaded implementation
could actually outperform some concurrent data structures. But it also
raises a question: is this proposal going to take an angle towards
per-range memtables? There are certainly benefits to splitting the
memtables, as it would reduce the "n" in the operations, thus lowering the
overhead of lookups and writes. Taking it one step back, I could also see
the benefit of having a commitlog per range, which would allow higher
utilization of NVMe drives with larger queue depths. And why not per-range
SSTables for faster scale-outs? That is a bit outside the scope of the CEP;
I just want to ensure that the implementation does not block such
improvements.
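
As a rough illustration of what I mean by per-range memtables, a sketch
where each token range gets its own small sorted map, so every lookup and
write works on a smaller "n". The class name is invented, tokens are plain
longs and values plain Strings to keep it short, and it is not thread-safe;
it says nothing about how Cassandra actually partitions data.

```java
import java.util.TreeMap;

// Hypothetical sketch, not Cassandra code: tokens are longs, values are Strings.
final class RangeShardedMemtable {
    // Maps the inclusive lower bound of each range to that range's own sorted map.
    private final TreeMap<Long, TreeMap<Long, String>> shards = new TreeMap<>();

    /** Creates one shard per lower bound, plus a catch-all shard for lower tokens. */
    RangeShardedMemtable(long... lowerBounds) {
        shards.put(Long.MIN_VALUE, new TreeMap<>());
        for (long bound : lowerBounds) {
            shards.put(bound, new TreeMap<>());
        }
    }

    /** Picks the shard whose lower bound is the greatest one not above the token. */
    private TreeMap<Long, String> shardFor(long token) {
        // Never null: the Long.MIN_VALUE shard is a floor for every possible token.
        return shards.floorEntry(token).getValue();
    }

    void put(long token, String value) {
        shardFor(token).put(token, value);
    }

    /** Returns the stored value, or null if the token was never written. */
    String get(long token) {
        return shardFor(token).get(token);
    }

    int shardCount() {
        return shards.size();
    }

    /** Number of entries in the shard that owns the given token. */
    int shardSize(long token) {
        return shardFor(token).size();
    }
}
```

Each shard could then be owned by a single thread, or paired with its own
commitlog segment, without the other shards having to know about it.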

Interfaces:

> boolean writesAreDurable()
> boolean writesShouldSkipCommitLog()

The placement of these methods inside the memtable implementation just
feels incredibly wrong to me. The writing pipeline should have these
configured, and they could differ for each table even with the same
memtable implementation. Let's take the example of an in-memory memtable
use case that's never written to an SSTable. We could have one table with
simply in-memory cached storage and another one with Redis-style AOF
persistence, where writes would be written to the commitlog for fast
recovery, but the data is otherwise always kept only in the memtable
instead of being written to an SSTable (for performance reasons). Still the
same memtable implementation.

Why would the write process of the table not ask the table what settings it
has, and instead ask the memtable what settings the table has? This seems
counterintuitive to me. Even the persistent memory case is a bit
questionable: why not simply disable the commitlog in the writing process?
Why ask the memtable?
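
Roughly what I have in mind, as a sketch with invented names
(TableWriteOptions, TableWritePath, SimpleMemtable) and a plain list
standing in for the commitlog: the write path reads the table's settings,
and the memtable only stores data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical names throughout; none of these types exist in Cassandra or CEP-11.

/** Per-table settings, owned by the table rather than by the memtable. */
final class TableWriteOptions {
    final boolean useCommitLog;
    final boolean flushToSSTable;

    TableWriteOptions(boolean useCommitLog, boolean flushToSSTable) {
        this.useCommitLog = useCommitLog;
        this.flushToSSTable = flushToSSTable;
    }
}

/** A memtable that only stores data and knows nothing about durability. */
interface SimpleMemtable {
    void put(String key, String value);
    String get(String key);
}

final class TreeMapMemtable implements SimpleMemtable {
    private final TreeMap<String, String> data = new TreeMap<>();

    @Override public void put(String key, String value) { data.put(key, value); }
    @Override public String get(String key) { return data.get(key); }
}

/** The write path consults the table's options, never the memtable. */
final class TableWritePath {
    private final TableWriteOptions options;
    private final SimpleMemtable memtable;
    // Stand-in for a real commitlog: just an in-memory list of appended records.
    private final List<String> commitLog = new ArrayList<>();

    TableWritePath(TableWriteOptions options, SimpleMemtable memtable) {
        this.options = options;
        this.memtable = memtable;
    }

    void write(String key, String value) {
        if (options.useCommitLog) {
            commitLog.add(key + "=" + value);
        }
        memtable.put(key, value);
    }

    String read(String key) { return memtable.get(key); }

    int commitLogSize() { return commitLog.size(); }

    /** Whether a flush of this table should produce an SSTable at all. */
    boolean flushesToSSTable() { return options.flushToSSTable; }
}
```

A pure cache table would be new TableWriteOptions(false, false), and the
AOF-style table new TableWriteOptions(true, false), both on top of the same
TreeMapMemtable class.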

This feels like the memtable is going to be the write pipeline, but to me
that doesn't feel like the correct architectural decision. I'd rather see
these decisions made outside the memtable. Even a persistent memory
memtable user might want to have a commitlog enabled for data capture / log
shipping, or for layers of persistence speed. Persistent memory as a whole,
without any commercially known future, is a bit weird at the moment (even
Optane has no known manufacturing anymore, with the last factory being
dismantled, based on public information).

> boolean streamToMemtable()

And that one I don't understand. Why is streaming in the memtable? This
smells like scope creep from something else. The explanation indicates to
me that the wanted behavior is just disabling automated flushing.

But these are just some questions that came to my mind while reading this.
And I don't want to sound too negative (most of the features are really
something I'd like to see); perhaps I just misunderstood some of the
motivations for why things should be brought into the memtable instead of
being implemented outside it. Perhaps there's something else in the write
pipeline architecture that needs fixing but is now hidden inside this CEP.

I'm definitely interested to hear more.

  - Micke

On Wed, 21 Jul 2021 at 08:24, Berenguer Blasi <berenguerbl...@gmail.com>
wrote:

> +1. De-tangling, going more modular and clean interfaces sgtm.
>
> On 20/7/21 21:45, Nate McCall wrote:
> > Yay for pluggable memtables!! I havent gone over this in detail yet, but
> > personally I've always thought integrating something like Arrow would be
> > cool for sharing data (that's as far as i've gotten, but anything that
> > makes that kind of experimentation easier would also help with mocking
> > test plumbing, so +1 from me).
> >
> > Thanks for putting this together!
> >
> > -Nate
> >
> > On Tue, Jul 20, 2021 at 10:11 PM Branimir Lambov <
> > branimir.lam...@datastax.com> wrote:
> >
> >> Proposal for a mechanism for plugging in memtable implementations:
> >>
> >>
> >> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-11%3A+Pluggable+memtable+implementations
> >>
> >> The proposal supports using custom memtable implementations to support
> >> development and testing of improved alternatives, but also enables a
> >> broader definition of "memtable" to better support more advanced use
> >> cases like persistent memory. To this end, memtable implementations are given
> >> control over flushing and storing data in the commit log, enabling
> >> solutions that implement their own durability mechanisms and live much
> >> longer than their classical counterparts. Taken to the extreme, this
> >> also enables memtables that never flush (in other words, alternative storage
> >> engines) in a minimally-invasive manner.
> >>
> >> I am curious to hear your thoughts on the proposal.
> >>
> >> Regards,
> >> Branimir
> >>
>
