We have consensus then. Let’s ditch the non strict version, and rename the 
STRICTLY_NOT_NULL to NOT_NULL.

Thanks everyone!
Bernardo

> On Feb 10, 2025, at 8:58 AM, Štefan Miklošovič <smikloso...@apache.org> wrote:
> 
> I agree.
> 
> The only reason would be purely practical: if a user has a table consisting 
> of 1000 columns not being null and a user wants to modify 1 column only, then 
> a user would be forced to specify the remaining 999 columns just for the sake 
> of it.
> 
> But in this case, I think it would be more practical just to ensure in the 
> application that what he is putting there is not null rather than having 1000 
> constraints on the table.
> 
> On Mon, Feb 10, 2025 at 5:52 PM Yifan Cai <yc25c...@gmail.com 
> <mailto:yc25c...@gmail.com>> wrote:
>> While LOOSE_NOT_NULL might improve the clarity a bit, what is the value of 
>> such constraint provides to users? It still permits null. Meanwhile, it is 
>> easier to check the nullness of the bound values on the application side.
>> IMO, what benefits users is a way to ensure no null value can exist for the 
>> constrained columns. Reading the thread, it is the behavior of the strict 
>> version. 
>> How about we just drop the LOOSE one and call the STRICT one “NOT_NULL”?
>> 
>> - Yifan
>> From: Bernardo Botella <conta...@bernardobotella.com 
>> <mailto:conta...@bernardobotella.com>>
>> Sent: Monday, February 10, 2025 8:44:13 AM
>> To: dev@cassandra.apache.org <mailto:dev@cassandra.apache.org> 
>> <dev@cassandra.apache.org <mailto:dev@cassandra.apache.org>>
>> Subject: Re: [DISCUSS] NOT_NULL constraint vs STRICTLY_NOT_NULL constraint
>>  
>> To recap,
>> 
>> The sentiment I am getting is that NOT_NULL allowing null values is too 
>> confusing. Nice, that’s why we started the thread.
>> 
>> As an alternative, instead of ditching the loose not null constraint, I 
>> propose we change the “default” behavior. From my initial proposal, I 
>> suggest renaming the Constraints:
>> - NOT_NULL -> LOOSE_NOT_NULL
>> - STRICTLY_NOT_NULL -> NOT_NULL
>> 
>> The reasoning behind trying to keep it is:
>> - It is already implemented.
>> - By being explicit with it being loose, we avoid the confusion of allowing 
>> nulls.
>> - It still adds value on its own.
>> 
>> With, the “by default” not null doesn’t allow null or non present values on 
>> the insert statement, while we still support the more relaxed LOOSE_NOT_NULL 
>> for updates.
>> 
>> Thoughts?
>> 
>> 
>>> On Feb 10, 2025, at 8:29 AM, Štefan Miklošovič <smikloso...@apache.org 
>>> <mailto:smikloso...@apache.org>> wrote:
>>> 
>>> 
>>> 
>>> On Mon, Feb 10, 2025 at 5:20 PM Dinesh Joshi <djo...@apache.org 
>>> <mailto:djo...@apache.org>> wrote:
>>> In my head NOT_NULL constraint implies that the column must be specified on 
>>> each write and must not be NULL. If a column with the NOT_NULL constraint 
>>> is omitted during a write then shouldn’t it be treated as if it was 
>>> specified and set to NULL?
>>> 
>>> Well, yes. One may also look at it that way. But then we would end up with 
>>> "null" in a column, while it would be quite surprising for users to see 
>>> that because they were thinking that if they specified it as NOT NULL on a 
>>> table creation, then it is "guaranteed" that it will not be null ever 
>>> again. It just looks strange to say in table schema it is not null but then 
>>> it actually might be.
>>>  
>>> 
>>> If the column has a non-NULL value that was previously written and you’re 
>>> updating the rest of the columns, you still have to force the user to 
>>> specify it otherwise you will have to perform a read before write to 
>>> validate that the column was not NULL. I think this is a fine compromise 
>>> given that the goal here is to ensure that an application shouldn’t 
>>> inadvertently write a NULL value for a column specified as NOT_NULL.
>>> 
>>> 
>>> Yes. I see it the same way. 
>>>  
>>> On Mon, Feb 10, 2025 at 6:50 AM Bernardo Botella 
>>> <conta...@bernardobotella.com <mailto:conta...@bernardobotella.com>> wrote:
>>> Hi everyone,
>>> 
>>> Stefan Miklosovic and I have been working on a NOT_NULL 
>>> (https://github.com/apache/cassandra/pull/3867) constraint to be added to 
>>> the constraints tool belt, and a really interesting conversation came up.
>>> 
>>> First, as a problem statement, let's consider this:
>>> 
>>> -----------------------------------------
>>> CREATE TABLE ks.tb2 (
>>>     id int,
>>>     cl1 int,
>>>     cl2 int,
>>>     val text CHECK NOT_NULL(val),
>>>     PRIMARY KEY (id, cl1, cl2)
>>> ) 
>>> 
>>> cassandra@cqlsh> INSERT INTO ks.tb2 (id, cl1, cl2, val) VALUES ( 1, 2, 3, 
>>> null);
>>> InvalidRequest: Error from server: code=2200 [Invalid query] 
>>> message="Column value does not satisfy value constraint for column 'val' as 
>>> it is null."
>>> 
>>> cassandra@cqlsh> INSERT INTO ks.tb2 (id, cl1, cl2, val) VALUES ( 1, 2, 3, 
>>> “text");
>>> cassandra@cqlsh> select * from ks.tb2;
>>> 
>>>  id | cl1 | cl2 | val
>>> ----+-----+-----+------
>>>   1 |   2 |   3 | text
>>> 
>>> (1 rows)
>>> cassandra@cqlsh> INSERT INTO ks.tb2 (id, cl1, cl2) VALUES ( 1, 2, 4);
>>> cassandra@cqlsh> select * from ks.tb2;
>>> 
>>>  id | cl1 | cl2 | val
>>> ----+-----+-----+------
>>>   1 |   2 |   3 | text
>>>   1 |   2 |   4 | null
>>> 
>>> -----------------------------------------
>>> 
>>> As you see, we have a hole in which a 'null' value is getting written on 
>>> column val even if we have a NOT_NULL on that particular column whenever 
>>> the column is NOT specified on the write. That raises the question on how 
>>> this particular constraint should behave.
>>> 
>>> If we consider the other constraints (scalar constraint and length 
>>> constraint so far), this particular behavior is fine. But, if the 
>>> constraint is NOT_NULL, then it becomes a little bit trickier.
>>> 
>>> The conclusions we have reached is that the meaning of constraints should 
>>> be interpreted like: I check whatever you give me as part of the write, 
>>> ignoring everything else. Let me elaborate:
>>> If we decide to treat this particular NOT_NULL constraint differently, and 
>>> check if the value for that column is present in the insert statement, we 
>>> then open a different can of worms. What happens if the row already exists 
>>> with a valid value, and that insert statement is only trying to do an 
>>> update to a different column in the row? If that was the case, we would be 
>>> forcing the user to specify the 'val' column value for every update, even 
>>> if it is not needed. 
>>> 
>>> Mainly for this reason, we think it is better to treat this NOT_NULL 
>>> constraint just like the other constraints, and execute it ONLY on the 
>>> values that are present on the insert statement.
>>> 
>>> The main con is that it may lead to a little bit of confussion (as in, why 
>>> I just added a null value to the table even if I have a NOT_NULL 
>>> constraint?). We have thought on aliviating this particular confusion by:
>>> - Extensive documentation. Let's be upfront on what this constraint does 
>>> and does not. 
>>> (https://github.com/apache/cassandra/blob/ed58c404e8c880b69584e71a3690d3d9f73ef9fa/doc/modules/cassandra/pages/developing/cql/constraints.adoc#not_null-constraint)
>>> - Adding, as part of this patch, yet another constraint 
>>> (STRICTLY_NOT_NULL), that checks for the actual column value to be present 
>>> in the insert statement..
>>> 
>>> If you've made it until here, that means you are really interested in 
>>> constraints. Thanks! The question for you is, would you have any concern 
>>> with this approach?
>>> 
>>> Thanks,
>>> Bernardo
>> 

Reply via email to