TBH, I don't have an opinion on the configuration. I just want to say
that if in the end we decide the configuration in the YAML should
override the table schema, I would recommend that we specify a list of
whitelisted (or blacklisted) "templates" in the YAML file. The template
chosen by the table schema is used if it is enabled; otherwise we fall
back to a default template, which could be the first element of the
whitelist if a whitelist is used, or a separate configuration entry if a
blacklist is used. The list should be optional in the YAML, and an empty
or absent list means everything is enabled.
Advantages of this:
1. it doesn't require the operator to configure this, as an empty or
absent list by default enables all templates and should work fine in
most cases.
2. it allows the operator to whitelist / blacklist any template if ever
needed (e.g. due to a bug), and also allows them to choose a fallback option.
3. the table schema has priority as long as the chosen template is not
explicitly disabled by the YAML.
4. it allows the operator to selectively disable some templates without
forcing all tables to use the same template specified by the YAML.
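A minimal sketch of the selection rule described above, in Python for brevity. The function name resolve_template and its parameters are hypothetical and not part of Cassandra or the CEP-11 patch. Only the whitelist variant is shown, and falling back to the first known template when no whitelist is given is an assumption.

```python
from typing import Optional, Sequence


def resolve_template(schema_choice: Optional[str],
                     known: Sequence[str],
                     whitelist: Optional[Sequence[str]] = None) -> str:
    """Pick the memtable template for a table (hypothetical helper)."""
    # An absent or empty whitelist means every known template is enabled.
    enabled = list(whitelist) if whitelist else list(known)
    # The table schema wins as long as its choice has not been disabled.
    if schema_choice in enabled:
        return schema_choice
    # Otherwise fall back to the first enabled entry
    # (the first whitelist element when a whitelist is used).
    return enabled[0]
```

For example, with known templates ["skiplist", "trie"] and a whitelist of ["skiplist"], a table asking for "trie" would get "skiplist"; with no whitelist it would get "trie".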
On 09/02/2022 09:43, bened...@apache.org wrote:
Why not have some default templates that can be specified by the
schema without touching the yaml, but overridden in the yaml as necessary?
*From: *Branimir Lambov <blam...@apache.org>
*Date: *Wednesday, 9 February 2022 at 09:35
*To: *dev@cassandra.apache.org <dev@cassandra.apache.org>
*Subject: *Re: [DISCUSS] CEP-19: Trie memtable implementation
If I understand this correctly, you prefer _not_ to have an option to
give the configuration explicitly in the schema. I.e. force the
configurations ("templates" in current terms) to be specified in the
yaml, and only allow tables to specify which one to use among them?
This does sound at least as good to me, and I'll happily change the API.
Regards,
Branimir
On Tue, Feb 8, 2022 at 10:40 PM Dinesh Joshi <djo...@apache.org> wrote:
My quick reading of the code suggests that the schema will override
the operator's default preference in the YAML. In the event of a
bug in the new implementation, there could be a situation where the
operator might need to override this via the YAML.
On Feb 8, 2022, at 12:29 PM, Jeremiah D Jordan
<jeremiah.jor...@gmail.com> wrote:
I don’t really see most users touching the default
implementation. I would expect the main reasons someone would
change it to be:
1. They run into some bug that is only in one of the
implementations.
2. They have persistent memory and so want to use
https://issues.apache.org/jira/browse/CASSANDRA-13981
Given that I doubt most people will touch it, I think it is
good to give advanced operators the ability to have more
control over switching to things that have new performance
characteristics. So I like that the proposed
configuration approach allows someone to change to a new
implementation one node at a time and only for specific tables.
On Feb 8, 2022, at 2:21 PM, Dinesh Joshi
<djo...@apache.org> wrote:
Thank you for sharing the perf test results.
Going back to the schema vs yaml configuration. I am
concerned users may pick the wrong implementation for
their use-case. Is there any chance for us to
automatically pick a MemTable implementation based on
heuristics? Do we foresee users ever picking the existing
SkipList implementation over the Trie? Given the
performance tests, it seems the Trie implementation is the
clear winner.
To be clear, I am not suggesting we remove the existing
implementation. I am for maintaining a pluggable API for
various components.
Dinesh
On Feb 7, 2022, at 8:39 AM, Branimir Lambov
<blam...@apache.org> wrote:
Added some performance results to the ticket:
https://issues.apache.org/jira/browse/CASSANDRA-17240
Regards,
Branimir
On Sat, Feb 5, 2022 at 10:59 PM Dinesh Joshi
<djo...@apache.org> wrote:
This is excellent. Thanks for opening up this CEP.
It would be great to get some stats around GC
allocation rate / memory pressure, read & write
latencies, etc. compared to the existing implementation.
Dinesh
On Jan 18, 2022, at 2:13 AM, Branimir Lambov
<blam...@apache.org> wrote:
The memtable pluggability API (CEP-11) is
per-table to enable memtable selection
that suits specific workflows. It also makes
full sense to permit per-node configuration,
both to be able to modify the configuration to
suit heterogeneous deployments better, as well
as to test changes for improvements such as
this one.
Recognizing this, the patch comes with a
modification to the API
<https://github.com/blambov/cassandra/commit/24b558ba2f71a2f040804e28993cc914b31298f5>
that defines memtable templates in
cassandra.yaml (i.e. per node) and allows the
schema to select a template (in addition to
being able to specify the full memtable
configuration). One could use this e.g. by adding:
    memtable_templates:
        trie:
            class: TrieMemtable
            shards: 16
        skiplist:
            class: SkipListMemtable
    memtable:
        template: skiplist
(which defines two templates and specifies the
default memtable implementation to use) to
cassandra.yaml and specifying WITH memtable =
{'template' : 'trie'} in the table schema.
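To make the lookup concrete, here is a rough Python sketch of how a table's memtable option might be resolved against the per-node templates. The dict mirrors the cassandra.yaml fragment quoted above. The helper name effective_memtable is hypothetical, and the real resolution logic in the patch may differ, for instance in how it validates options.

```python
# Plain-dict mirror of the cassandra.yaml fragment quoted above.
NODE_CONFIG = {
    "memtable_templates": {
        "trie": {"class": "TrieMemtable", "shards": 16},
        "skiplist": {"class": "SkipListMemtable"},
    },
    "memtable": {"template": "skiplist"},
}


def effective_memtable(table_option, node_config=NODE_CONFIG):
    """Return the memtable settings a table ends up with (hypothetical helper)."""
    templates = node_config["memtable_templates"]
    # No table-level option: use the node's default from the `memtable` entry.
    option = table_option or node_config["memtable"]
    if "template" in option:
        # The schema names a template that is defined per node in the yaml.
        return dict(templates[option["template"]])
    # Otherwise the schema carries the full configuration itself.
    return dict(option)
```

Under these assumptions, a table created WITH memtable = {'template' : 'trie'} resolves to the TrieMemtable settings with 16 shards on this node, while a table with no memtable option resolves to SkipListMemtable.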
I intend to commit this modification with the
memtable API (CASSANDRA-17034/CEP-11).
Performance comparisons will be published soon.
Regards,
Branimir
On Fri, Jan 14, 2022 at 4:15 PM Jeff Jirsa
<jji...@gmail.com> wrote:
Sounds like a great addition
Can you share some of the details around
gc and latency improvements you’ve
observed with the list?
Any specific reason the configuration is
through schema vs yaml? Presumably it’s so
a user can test per table, but this
changes every host in a cluster, so the
impact of a bug/regression is much higher.
On Jan 10, 2022, at 1:30 AM, Branimir
Lambov <blam...@apache.org> wrote:
We would like to contribute our
TrieMemtable to Cassandra.
https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-19%3A+Trie+memtable+implementation
This is a new memtable solution aimed
to replace the legacy implementation,
developed with the following objectives:
- lowering the on-heap complexity and
making it possible to store memtable
indexing structures off-heap,
- leveraging byte order and a trie
structure to lower the memory
footprint and improve mutation and
lookup performance.
The new memtable relies on
CASSANDRA-6936 to translate to and
from byte-ordered representations of
types, and CASSANDRA-17034 / CEP-11 to
plug into Cassandra. The memtable is
built on multiple shards of custom
in-memory single-writer
multiple-reader tries, whose
implementation uses a combination of
state-of-the-art and novel features
for greater efficiency.
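As an illustration of the general idea only, here is a toy Python sketch of byte-keyed tries split across shards. It is not the CEP's implementation: the class names, the routing rule, and the dict-based nodes are all assumptions, and it shows neither the single-writer multiple-reader concurrency nor the off-heap storage.

```python
class ByteTrie:
    """Toy trie keyed by bytes; children are visited in byte order."""

    def __init__(self):
        self.children = {}      # byte value (int) -> ByteTrie
        self.value = None
        self.has_value = False

    def put(self, key: bytes, value) -> None:
        node = self
        for b in key:
            node = node.children.setdefault(b, ByteTrie())
        node.value, node.has_value = value, True

    def get(self, key: bytes):
        node = self
        for b in key:
            node = node.children.get(b)
            if node is None:
                return None
        return node.value if node.has_value else None

    def items(self, prefix: bytes = b""):
        # Pre-order walk over sorted children yields keys in byte order.
        if self.has_value:
            yield prefix, self.value
        for b in sorted(self.children):
            yield from self.children[b].items(prefix + bytes([b]))


class ShardedMemtable:
    """Splits keys across independent tries (hypothetical layout)."""

    def __init__(self, shards: int = 4):
        self.shards = [ByteTrie() for _ in range(shards)]

    def _shard(self, key: bytes) -> ByteTrie:
        # Assumed routing rule: sum of the key bytes modulo the shard count.
        return self.shards[sum(key) % len(self.shards)]

    def put(self, key: bytes, value) -> None:
        self._shard(key).put(key, value)

    def get(self, key: bytes):
        return self._shard(key).get(key)
```

Because keys that share a prefix share nodes, and iteration follows byte order, a structure like this needs no separate comparator once types have been translated to a byte-ordered representation.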
The CEP's JIRA ticket
(https://issues.apache.org/jira/browse/CASSANDRA-17240)
contains the initial version of the
implementation. In its current form it
achieves much better garbage
collection latency, significantly
bigger data sizes between flushes for
the same memory allocation, as well as
drastically increased write
throughput, and we expect the memory
and garbage collection improvements to
go much further with upcoming
improvements to the solution.
I am interested in hearing your
thoughts on the proposal.
Regards,
Branimir