I am generally for this CEP, particularly the sizeOf guardrail. For example, we recently had an incident caused by a client that wrote outside of the contract we had verbally established. The constraint would have let us encode that contract into the database. In this case, clients write large blobs at the application layer and the client performs chunking internally. We had established a chunk size of 64k, for example. However, the application team wanted to use a programming language that we don't provide a client for, so they wrote their own. The new client had a bug: it did not honor the agreed-upon chunk size and wrote chunks that were MBs in size. This eventually led to a production incident, and the issue was only discovered after a bunch of analysis (dumping sstables, etc). Had we had the sizeOf guardrail, it would have turned a production incident with hours of investigation into a bug found immediately during development. Could this be done with a node-level guardrail? Likely. But config has the issues described above, and it's possible to have two tables with different constraints around similar fields (for example, two different chunk size configs due to data shape). Could it be done at the client layer? Yes, that's what we are doing now, but this incident highlights the weakness of that approach (having to implement the contract everywhere and having disjoint features across clients).
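To make the chunking contract concrete, here is a minimal Python sketch of the kind of write-time size check described above. It is only an illustration: the names `check_size_of`, `chunk_blob`, `write_chunks`, and `ConstraintViolation` are hypothetical and are not part of the CEP or of any Cassandra client, and "64k" is assumed to mean 64 KiB.

```python
from typing import List

# Assumption: the agreed "64k" chunk size means 64 KiB.
MAX_CHUNK_BYTES = 64 * 1024


class ConstraintViolation(Exception):
    """Hypothetical error raised when a write violates a size constraint."""


def check_size_of(column: str, value: bytes, max_bytes: int = MAX_CHUNK_BYTES) -> None:
    """Reject a value larger than max_bytes, using only the value being written."""
    if len(value) > max_bytes:
        raise ConstraintViolation(
            f"{column}: {len(value)} bytes exceeds limit of {max_bytes} bytes"
        )


def chunk_blob(blob: bytes, chunk_size: int = MAX_CHUNK_BYTES) -> List[bytes]:
    """Split a blob into chunks of at most chunk_size bytes (a well-behaved client)."""
    return [blob[i:i + chunk_size] for i in range(0, len(blob), chunk_size)]


def write_chunks(chunks: List[bytes]) -> int:
    """Stand-in for the write path: validate every chunk, return how many passed."""
    for chunk in chunks:
        check_size_of("chunk", chunk)
    return len(chunks)
```

A client that chunks correctly passes the check for every chunk, while a client that sends a multi-MB chunk fails on its first write. The point of the sketch is that the check lives in one place rather than in each client implementation.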
I also think there is benefit to application owners. Encoding constraints in the database ensures continuity as ownership and contributors change, and reduces the need for comments or documentation as the means to enforce or share this knowledge.

I think enforcing them at write time makes sense. Thinking about it in the scope of compaction, for example, reminds me of a data loss incident where someone ran a validation in an older version (like 2.0 or 2.1) and a bunch of 4-byte ints were thrown away because the field expected an 8-byte long. My primary concern would be ensuring that we don't implement constraints that require a read before write (not inList comes to mind as an example of one that could imply reading before writing and could confuse a user if it doesn't).

Regarding the conflict with existing guardrails, I do think that is tougher. On one hand, I find this feature to be more evolved than those guardrails and would be fine to see them be replaced by it. On the other, the guardrails provide sole control to the operator, which is nice but adds some complexity that has been rightly called out. But I don't see that as a reason not to go forward with this feature. We should pick a path and accept the tradeoffs.

Jordan

On Thu, Jun 13, 2024 at 2:39 PM Bernardo Botella <conta...@bernardobotella.com> wrote:

> Thanks a lot for your comments Abe!
>
> I do agree that the Constraint clause should be as simple as possible. I
> will add a note on the CEP along with some specifics about the proposed
> constraints (removing the ones that are contentious, and adding them to a
> possible future additions section). And yeah, I also think that these
> constraints will help different Cassandra operating paradigms (multi-tenant
> clusters and diverse workflows).
>
> Besides that, I hope that I’ve addressed all the potential concerns and
> feedback on the thread. Let’s let a bit more time for others to chime in
> (any further feedback will be more than welcome), but I’d like to move
> forward with a voting soon if no other concerns are pointed out.
>
> All and all, thanks a lot to everyone that participated in the thread and
> added to the discussion!
> Bernardo
>
> > On Jun 12, 2024, at 2:37 PM, Abe Ratnofsky <a...@aber.io> wrote:
> >
> > I've thought about this some more. It would be useful for Cassandra to
> > support user-defined "guardrails" (or constraints, whatever you want to
> > call them), that could be applied per keyspace or table. Whether a user or
> > an operator is considered the owner of a table depends on the organization
> > deploying Cassandra, so allowing both parties to protect their tables
> > against mis-use seems good to me, especially for large multi-tenant
> > clusters with diverse workloads.
> >
> > For example, it would be really useful if a user could set the
> > Guardrails.{read,write}ConsistencyLevels for their tables, or declare
> > whether all operations should be over LWTs to avoid mixing regular and LWT
> > workloads.
> >
> > I'm hesitant about adding lots of expression syntax to the CONSTRAINT
> > clause. I think I'd prefer a function calling syntax that represents:
> > 1. Whether the constraint is system / keyspace / table scoped
> > 2. Where in query processing the constraint is checked
> > 3. What is executed by the check
> >