Re: [DISCUSS] CEP-10: Cluster and Code Simulations

bened...@apache.org Tue, 13 Jul 2021 03:11:08 -0700

Perhaps it’s worth looking forward at the roadmap that we plan to develop, and 
considering whether such a facility would be welcome for proving the safety of 
the features on it? We can then worry about evolving the specifics of any 
API(s) together as we deploy the capability. Looking ahead, there are very few 
major features I wouldn’t want to see exercised with this approach, given the 
choice.


The LWT Verifier by itself is an integration test that covers many of the 
affected subsystems, including sstables, memtables and repair. But we will have 
the ability to introduce dedicated verification for each of these features and 
systems, and we will necessarily produce more robust code (repair is a great 
example of a brittle system that would have been impossible to produce with 
such an adversarial test system in place).


*Query side improvements:*

  * Storage Attached Index or SAI. The CEP can be found at
https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-7%3A+Storage+Attached+Index
  * Add support for OR predicates in the CQL where clause
  * Allow to aggregate by time intervals (CASSANDRA-11871) and allow UDFs
in GROUP BY clause
  * Ability to read the TTL and WRITE TIME of an element in a collection
(CASSANDRA-8877)
  * Multi-Partition LWTs
  * Materialized views hardening: Addressing the different Materialized
Views issues (see CASSANDRA-15921 and [1] for some of the work involved)

*Security improvements:*

  * SSTables encryption (CASSANDRA-9633)
  * Add support for Dynamic Data Masking (CEP pending)
  * Allow the creation of roles that have the ability to assign arbitrary
privileges, or scoped privileges without also granting those roles access
to database objects.
  * Filter rows from system and system_schema based on the user's permissions
(CASSANDRA-15871)

*Performance improvements:*

  * Trie-based index format (CEP pending)
  * Trie-based memtables (CEP pending)
  * Paxos improvements: Paxos / LWT implementation that would enable the
database to serve serial writes with two round-trips and serial reads with
one round-trip in the uncontended case

*Safety/Usability improvements:*

  * Guardrails. The CEP can be found at
https://cwiki.apache.org/confluence/display/CASSANDRA/%28DRAFT%29+-+CEP-3%3A+Guardrails
  * Add ability to track state in repair (CASSANDRA-15399)
  * Repair coordinator improvements (CASSANDRA-15399)
  * Make incremental backup configurable per keyspace and table
(CASSANDRA-15402)
  * Add ability to blacklist a CQL partition so all requests are ignored
(CASSANDRA-12106)
  * Add default and required keyspace replication options (CASSANDRA-14557)
  * Transactional Cluster Metadata: Use of transactions to propagate
cluster metadata
  * Downgrade-ability: Ability to downgrade in the event that a serious
issue has been identified

*Pluggability improvements:*

  * Pluggable schema manager (CEP pending)
  * Pluggable filesystem (CEP pending)
  * Pluggable authenticator for CQLSH (CASSANDRA-16456). A CEP draft can be
found at
https://docs.google.com/document/d/1_G-OZCAEmDyuQuAN2wQUYUtZBEJpMkHWnkYELLhqvKc/edit
  * Memtable API (CEP pending). The goal being to allow improvements such
as CASSANDRA-13981 to be easily plugged into Cassandra

*Memtable pluggable implementation:*

  * Enable Cassandra for Persistent Memory (CASSANDRA-13981)




From: bened...@apache.org <bened...@apache.org>
Date: Tuesday, 13 July 2021 at 10:51
To: dev@cassandra.apache.org <dev@cassandra.apache.org>
Subject: Re: [DISCUSS] CEP-10: Cluster and Code Simulations
Ach, editing code in the email editor isn’t smart when editors all have 
different meanings for key combinations (accidentally hit send), but you get 
the idea. The simulator would intercept these thread executions and the memory 
accesses to the annotated field, and evaluate them in varying orders so that in 
some cases the assertion would fail.

This is obviously a toy example that is not very interesting, but the main real 
example we have is too complicated to produce a snippet to demonstrate. In my 
view, the long term outcome of this work is likely the enablement of many unit 
tests that are a little more complicated than this, on less obvious code.

But that is not the headline goal of the CEP. By itself, the LWT Verifier 
demonstrates the power and utility of the work. I don’t believe it is terribly 
helpful to focus on secondary justifications like the example I gave. For me, 
the _ability_ to prove the correctness of difficult but critical systems is 
justification enough, whether or not we deliver a simple API as part of the CEP.



From: bened...@apache.org <bened...@apache.org>
Date: Tuesday, 13 July 2021 at 10:43
To: dev@cassandra.apache.org <dev@cassandra.apache.org>
Subject: Re: [DISCUSS] CEP-10: Cluster and Code Simulations
> Should target release be 4.1. (not 4.0.x) ?



No, in my opinion the target should be 4.0.x. We are reaching for a shippable 
trunk and this has no public API impacts. This work is IMO central to achieving 
a shippable trunk, either way. The only reason I do not target 3.x is that it 
would be too burdensome.

> My concern is that changing code and tests at the same time risks regressions…



I’ve never heard this position before. Would you care to elaborate? It is quite 
normal for us to update tests alongside changes to the code.

> And seconding Benjamin's comments… some documentation on how to write a test, 
> and a simple test example, that this CEP then allows us to write would help a 
> lot (a la "working backwards").

1) This work is to _enable_ the development of tests; the only test originally 
planned to arrive alongside it is the fairly sophisticated LWT Verifier. This 
is something we have sorely needed as a project, as we have had serious 
correctness violations for multiple years. This broad category of integrated 
test for verifying correctness is the main goal of the work and is not easily 
condensed into an example snippet.
2) It is _possible_ that some simple and fluid APIs will be introduced in a 
later phase of this work, but they haven’t been designed yet, so I cannot share 
snippets.

In principle, however, you would be able to do something like:

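@Nemesis volatile int x = 0;   // accesses to this field are intercepted
int foo() {
    x = x + 1;                 // read-modify-write on a volatile is not atomic
    return x;
}

@Test
void test() throws Exception {
    // Typed futures so the results can be compared with ints.
    Future<Integer> f1 = executor.submit(() -> foo());
    Future<Integer> f2 = executor.submit(() -> foo());
    // Fails in the interleavings where both calls return 2.
    Assert.assertTrue(f1.get() == 1 || f2.get() == 1);
}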
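
To make "in some cases the assertions would fail" concrete, here is a minimal 
sketch that is not the CEP-10 simulator and uses none of its APIs. It assumes 
each call to foo() breaks down into three steps (load x, store x + 1, load x 
for the return) and simply enumerates every ordering of those steps across two 
calls. The class and method names (InterleavingSketch, run, schedules, 
failingSchedules) are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Cassandra or the CEP-10 simulator.
class InterleavingSketch {

    // Runs both foo() calls under one schedule and returns {result of call 0, result of call 1}.
    // Each character of the schedule ('0' or '1') names the call that takes its next step.
    static int[] run(String schedule) {
        int x = 0;                  // the shared field
        int[] step = new int[2];    // index of the next step for each call
        int[] read = new int[2];    // value each call loaded from x in its first step
        int[] result = new int[2];  // value each call returns
        for (char c : schedule.toCharArray()) {
            int t = c - '0';
            switch (step[t]++) {
                case 0: read[t] = x; break;      // load x
                case 1: x = read[t] + 1; break;  // store x + 1
                case 2: result[t] = x; break;    // load x again for "return x"
                default: throw new IllegalArgumentException("more than three steps for call " + t);
            }
        }
        return result;
    }

    // Enumerates every ordering of left0 steps of call 0 and left1 steps of call 1.
    static List<String> schedules(String prefix, int left0, int left1) {
        List<String> out = new ArrayList<>();
        if (left0 == 0 && left1 == 0) {
            out.add(prefix);
            return out;
        }
        if (left0 > 0) out.addAll(schedules(prefix + "0", left0 - 1, left1));
        if (left1 > 0) out.addAll(schedules(prefix + "1", left0, left1 - 1));
        return out;
    }

    // The assertion from the test above: at least one call must return 1.
    static boolean assertionHolds(int[] result) {
        return result[0] == 1 || result[1] == 1;
    }

    // Collects the schedules under which the assertion is violated.
    static List<String> failingSchedules() {
        List<String> failing = new ArrayList<>();
        for (String s : schedules("", 3, 3)) {
            if (!assertionHolds(run(s))) failing.add(s);
        }
        return failing;
    }

    public static void main(String[] args) {
        for (String s : failingSchedules()) {
            int[] r = run(s);
            System.out.println(s + " -> f1=" + r[0] + ", f2=" + r[1]);
        }
    }
}
```

Under these assumptions, 4 of the 20 possible schedules break the assertion, 
and in all of them both calls return 2: one call completes its store before 
the other loads x, but does not perform its final load until after the other 
call's store. A plain thread pool rarely hits such orderings on demand, which 
is the gap that a simulator with control over scheduling would close.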


From: Mick Semb Wever <m...@apache.org>
Date: Tuesday, 13 July 2021 at 10:28
To: dev@cassandra.apache.org <dev@cassandra.apache.org>
Subject: Re: [DISCUSS] CEP-10: Cluster and Code Simulations
>
> To achieve this, significant modifications will be required to the codebase, 
> mostly cleaning up existing abstractions. Specifically, we will need to be 
> able to mock executors, any blocking concurrency primitives, time, filesystem 
> access and internode streaming.
>
> The work is – in large part – already complete, with JIRA and PRs to follow 
> in the coming weeks. Of course, the work is subject to the usual community 
> input and review, so this does not preclude changes to the work (even 
> significant ones, if they are warranted). I know a lot of incoming CEPs are 
> likely to be backed up by significant off-list development as a result of the 
> focus on a shippable 4.0. Hopefully this is just a temporary growing pain, 
> particularly as we move towards a shippable trunk.
>
> I hope this work will be of huge value to the project, particularly as we 
> race to catch up on years of limited feature development.
>
> JIRA and PRs will follow, but I wanted to kick-off discussion in advance.
>



Should target release be 4.1. (not 4.0.x) ?

I'd be interested in seeing a rough timeline/plan of how the proposed
changes are to be defined in JIRAs and ordered.

I'd like to hear a bit more about the test plan. Not so much about how
the CEP itself improves testability of the project, but for example
the testing required to be in place to introduce the changes of the
CEP (and if it already exists, where). My concern is that changing
code and tests at the same time risks regressions…

And seconding Benjamin's comments… some documentation on how to write
a test, and a simple test example, that this CEP then allows us to
write would help a lot (a la "working backwards").

