This is counterintuitive to me. The constraint should be applied to the table, 
not to the update. NOT NULL should imply a value is always specified.

How are you handling this for tables that already exist? Can we alter table to 
add constraints, and if so what are the semantics?

> On 10 Feb 2025, at 14:50, Bernardo Botella <conta...@bernardobotella.com> 
> wrote:
> 
> Hi everyone,
> 
> Stefan Miklosovic and I have been working on a NOT_NULL 
> (https://github.com/apache/cassandra/pull/3867) constraint to be added to the 
> constraints tool belt, and a really interesting conversation came up.
> 
> First, as a problem statement, let's consider this:
> 
> -----------------------------------------
> CREATE TABLE ks.tb2 (
>    id int,
>    cl1 int,
>    cl2 int,
>    val text CHECK NOT_NULL(val),
>    PRIMARY KEY (id, cl1, cl2)
> )
> 
> cassandra@cqlsh> INSERT INTO ks.tb2 (id, cl1, cl2, val) VALUES ( 1, 2, 3, 
> null);
> InvalidRequest: Error from server: code=2200 [Invalid query] message="Column 
> value does not satisfy value constraint for column 'val' as it is null."
> 
> cassandra@cqlsh> INSERT INTO ks.tb2 (id, cl1, cl2, val) VALUES ( 1, 2, 3, 
> “text");
> cassandra@cqlsh> select * from ks.tb2;
> 
> id | cl1 | cl2 | val
> ----+-----+-----+------
>  1 |   2 |   3 | text
> 
> (1 rows)
> cassandra@cqlsh> INSERT INTO ks.tb2 (id, cl1, cl2) VALUES ( 1, 2, 4);
> cassandra@cqlsh> select * from ks.tb2;
> 
> id | cl1 | cl2 | val
> ----+-----+-----+------
>  1 |   2 |   3 | text
>  1 |   2 |   4 | null
> 
> -----------------------------------------
> 
> As you see, we have a hole in which a 'null' value is getting written on 
> column val even if we have a NOT_NULL on that particular column whenever the 
> column is NOT specified on the write. That raises the question on how this 
> particular constraint should behave.
> 
> If we consider the other constraints (scalar constraint and length constraint 
> so far), this particular behavior is fine. But, if the constraint is 
> NOT_NULL, then it becomes a little bit trickier.
> 
> The conclusions we have reached is that the meaning of constraints should be 
> interpreted like: I check whatever you give me as part of the write, ignoring 
> everything else. Let me elaborate:
> If we decide to treat this particular NOT_NULL constraint differently, and 
> check if the value for that column is present in the insert statement, we 
> then open a different can of worms. What happens if the row already exists 
> with a valid value, and that insert statement is only trying to do an update 
> to a different column in the row? If that was the case, we would be forcing 
> the user to specify the 'val' column value for every update, even if it is 
> not needed.
> 
> Mainly for this reason, we think it is better to treat this NOT_NULL 
> constraint just like the other constraints, and execute it ONLY on the values 
> that are present on the insert statement.
> 
> The main con is that it may lead to a little bit of confussion (as in, why I 
> just added a null value to the table even if I have a NOT_NULL constraint?). 
> We have thought on aliviating this particular confusion by:
> - Extensive documentation. Let's be upfront on what this constraint does and 
> does not. 
> (https://github.com/apache/cassandra/blob/ed58c404e8c880b69584e71a3690d3d9f73ef9fa/doc/modules/cassandra/pages/developing/cql/constraints.adoc#not_null-constraint)
> - Adding, as part of this patch, yet another constraint (STRICTLY_NOT_NULL), 
> that checks for the actual column value to be present in the insert 
> statement..
> 
> If you've made it until here, that means you are really interested in 
> constraints. Thanks! The question for you is, would you have any concern with 
> this approach?
> 
> Thanks,
> Bernardo

Reply via email to