[
https://issues.apache.org/jira/browse/CASSANDRA-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17623988#comment-17623988
]
David Capwell edited comment on CASSANDRA-17240 at 10/25/22 7:04 PM:
---------------------------------------------------------------------
Been talking in Slack, moving context here.
The config side of this patch is problematic: the table param depends on a
local yaml being consistent (at least the name) across the cluster, but our
schema logic doesn't actually allow this, which then causes a schema mismatch
across the cluster... This will also cause the conflicting nodes to fail to
reboot (guess a slight positive?)...
I strongly feel that we shouldn't depend on the config if we are adding a table
param; the table param should be self-contained, similar to
compression/compaction. We should do something like this
{code}
WITH memtable = {'type': 'trie', 'shards': 42}
{code}
Now, this also gets to the question: what do you do if a node doesn't
understand the type? A simple example is a rolling upgrade picking up a new
implementation or a custom one... in this case we should fall back to the
default (the coordinator should attempt to validate and reject unknown types,
the rest should fall back).
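To make the split between coordinator and replica behavior concrete, here is a minimal Java sketch. Every name in it (MemtableTypeResolver, validateOnCoordinator, resolveOnReplica) is hypothetical and not taken from the patch, and treating "skiplist" as the default type is an assumption for illustration only.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names come from the Cassandra code base.
final class MemtableTypeResolver {
    // Assumed default type, used only for illustration.
    static final String DEFAULT_TYPE = "skiplist";

    // Memtable types this particular node knows how to build.
    private final Set<String> knownTypes;

    MemtableTypeResolver(Set<String> knownTypes) {
        this.knownTypes = knownTypes;
    }

    // Coordinator path: reject a schema change that names an unknown type.
    void validateOnCoordinator(Map<String, String> params) {
        String type = params.getOrDefault("type", DEFAULT_TYPE);
        if (!knownTypes.contains(type))
            throw new IllegalArgumentException("Unknown memtable type: " + type);
    }

    // Replica path: the schema was already accepted elsewhere, so do not fail;
    // use the requested type if known, otherwise fall back to the default.
    String resolveOnReplica(Map<String, String> params) {
        String type = params.getOrDefault("type", DEFAULT_TYPE);
        return knownTypes.contains(type) ? type : DEFAULT_TYPE;
    }

    public static void main(String[] args) {
        // A node that has not yet been upgraded and only knows "skiplist".
        MemtableTypeResolver oldNode = new MemtableTypeResolver(Set.of("skiplist"));
        System.out.println(oldNode.resolveOnReplica(Map.of("type", "trie")));
    }
}
```

With this shape, a node that does not know 'trie' keeps serving the table with the default memtable rather than diverging on schema or failing to start.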
Second comment is that the yaml is very dense and hard to use; we could
simplify by making it strongly typed. We should move away from
{code}
memtable:
  configurations:
    node1:
      class_name: TrieMemtable
      parameters:
        shards: 42
{code}
To
{code}
memtable:
  type: TrieMemtable # or "trie", cool with alias or explicit
  shards: 42
{code}
If we want to override at the table level we could also add a
{code}
memtable_table_overrides:
  ks.table: # or however we wish to call out the name
    type: trie
    shards: 4
  ks: # override at the keyspace level
    type: skiplist
{code}
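One way the override lookup could resolve, sketched in Java under assumptions: a table-level entry wins over a keyspace-level entry, which wins over the node-wide memtable setting. The class name MemtableOverrides, the resolve method, and the precedence order itself are all hypothetical; the comment above only proposes the yaml shape.

```java
import java.util.Map;

// Hypothetical sketch of resolving memtable_table_overrides; not from the patch.
final class MemtableOverrides {
    private MemtableOverrides() {}

    // Assumed precedence: "ks.table" entry, then "ks" entry, then node default.
    // Each spec is the parsed yaml mapping, e.g. {type=trie, shards=4}.
    static Map<String, Object> resolve(String keyspace,
                                       String table,
                                       Map<String, Object> nodeDefault,
                                       Map<String, Map<String, Object>> overrides) {
        Map<String, Object> byTable = overrides.get(keyspace + "." + table);
        if (byTable != null)
            return byTable;
        Map<String, Object> byKeyspace = overrides.get(keyspace);
        return byKeyspace != null ? byKeyspace : nodeDefault;
    }

    public static void main(String[] args) {
        Map<String, Object> nodeDefault = Map.of("type", "trie", "shards", 42);
        Map<String, Object> tableSpec = Map.of("type", "trie", "shards", 4);
        Map<String, Object> keyspaceSpec = Map.of("type", "skiplist");
        Map<String, Map<String, Object>> overrides =
            Map.of("ks.table", tableSpec, "ks", keyspaceSpec);

        System.out.println(resolve("ks", "table", nodeDefault, overrides).get("type"));
        System.out.println(resolve("ks", "other", nodeDefault, overrides).get("type"));
        System.out.println(resolve("elsewhere", "t", nodeDefault, overrides).get("type"));
    }
}
```

Whether a keyspace-level entry should inherit unspecified fields (such as shards) from the node default is left open here; this sketch returns the matching entry as-is.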
> CEP-19: Trie memtable implementation
> ------------------------------------
>
> Key: CASSANDRA-17240
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17240
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/Memtable
> Reporter: Branimir Lambov
> Assignee: Branimir Lambov
> Priority: Normal
> Fix For: 4.2
>
> Attachments: SkipListMemtable-OSS.png, TrieMemtable-OSS.png,
> density_SG.html.gz, density_test_with_sharding.html.gz, latency-1_1-95.png,
> latency-9_1-95.png, throughput_SG.png, throughput_apache.png
>
> Time Spent: 13.5h
> Remaining Estimate: 0h
>
> Trie-based memtable implementation as described in CEP-19, built on top of
> CASSANDRA-17034 and CASSANDRA-6936.
> The implementation is available in this
> [branch|https://github.com/blambov/cassandra/tree/CASSANDRA-17240].