We have consensus then. Let’s ditch the non strict version, and rename the STRICTLY_NOT_NULL to NOT_NULL.
Thanks everyone! Bernardo > On Feb 10, 2025, at 8:58 AM, Štefan Miklošovič <smikloso...@apache.org> wrote: > > I agree. > > The only reason would be purely practical: if a user has a table consisting > of 1000 columns not being null and a user wants to modify 1 column only, then > a user would be forced to specify the remaining 999 columns just for the sake > of it. > > But in this case, I think it would be more practical just to ensure in the > application that what he is putting there is not null rather than having 1000 > constraints on the table. > > On Mon, Feb 10, 2025 at 5:52 PM Yifan Cai <yc25c...@gmail.com > <mailto:yc25c...@gmail.com>> wrote: >> While LOOSE_NOT_NULL might improve the clarity a bit, what is the value of >> such constraint provides to users? It still permits null. Meanwhile, it is >> easier to check the nullness of the bound values on the application side. >> IMO, what benefits users is a way to ensure no null value can exist for the >> constrained columns. Reading the thread, it is the behavior of the strict >> version. >> How about we just drop the LOOSE one and call the STRICT one “NOT_NULL”? >> >> - Yifan >> From: Bernardo Botella <conta...@bernardobotella.com >> <mailto:conta...@bernardobotella.com>> >> Sent: Monday, February 10, 2025 8:44:13 AM >> To: dev@cassandra.apache.org <mailto:dev@cassandra.apache.org> >> <dev@cassandra.apache.org <mailto:dev@cassandra.apache.org>> >> Subject: Re: [DISCUSS] NOT_NULL constraint vs STRICTLY_NOT_NULL constraint >> >> To recap, >> >> The sentiment I am getting is that NOT_NULL allowing null values is too >> confusing. Nice, that’s why we started the thread. >> >> As an alternative, instead of ditching the loose not null constraint, I >> propose we change the “default” behavior. From my initial proposal, I >> suggest renaming the Constraints: >> - NOT_NULL -> LOOSE_NOT_NULL >> - STRICTLY_NOT_NULL -> NOT_NULL >> >> The reasoning behind trying to keep it is: >> - It is already implemented. >> - By being explicit with it being loose, we avoid the confusion of allowing >> nulls. >> - It still adds value on its own. >> >> With, the “by default” not null doesn’t allow null or non present values on >> the insert statement, while we still support the more relaxed LOOSE_NOT_NULL >> for updates. >> >> Thoughts? >> >> >>> On Feb 10, 2025, at 8:29 AM, Štefan Miklošovič <smikloso...@apache.org >>> <mailto:smikloso...@apache.org>> wrote: >>> >>> >>> >>> On Mon, Feb 10, 2025 at 5:20 PM Dinesh Joshi <djo...@apache.org >>> <mailto:djo...@apache.org>> wrote: >>> In my head NOT_NULL constraint implies that the column must be specified on >>> each write and must not be NULL. If a column with the NOT_NULL constraint >>> is omitted during a write then shouldn’t it be treated as if it was >>> specified and set to NULL? >>> >>> Well, yes. One may also look at it that way. But then we would end up with >>> "null" in a column, while it would be quite surprising for users to see >>> that because they were thinking that if they specified it as NOT NULL on a >>> table creation, then it is "guaranteed" that it will not be null ever >>> again. It just looks strange to say in table schema it is not null but then >>> it actually might be. >>> >>> >>> If the column has a non-NULL value that was previously written and you’re >>> updating the rest of the columns, you still have to force the user to >>> specify it otherwise you will have to perform a read before write to >>> validate that the column was not NULL. I think this is a fine compromise >>> given that the goal here is to ensure that an application shouldn’t >>> inadvertently write a NULL value for a column specified as NOT_NULL. >>> >>> >>> Yes. I see it the same way. >>> >>> On Mon, Feb 10, 2025 at 6:50 AM Bernardo Botella >>> <conta...@bernardobotella.com <mailto:conta...@bernardobotella.com>> wrote: >>> Hi everyone, >>> >>> Stefan Miklosovic and I have been working on a NOT_NULL >>> (https://github.com/apache/cassandra/pull/3867) constraint to be added to >>> the constraints tool belt, and a really interesting conversation came up. >>> >>> First, as a problem statement, let's consider this: >>> >>> ----------------------------------------- >>> CREATE TABLE ks.tb2 ( >>> id int, >>> cl1 int, >>> cl2 int, >>> val text CHECK NOT_NULL(val), >>> PRIMARY KEY (id, cl1, cl2) >>> ) >>> >>> cassandra@cqlsh> INSERT INTO ks.tb2 (id, cl1, cl2, val) VALUES ( 1, 2, 3, >>> null); >>> InvalidRequest: Error from server: code=2200 [Invalid query] >>> message="Column value does not satisfy value constraint for column 'val' as >>> it is null." >>> >>> cassandra@cqlsh> INSERT INTO ks.tb2 (id, cl1, cl2, val) VALUES ( 1, 2, 3, >>> “text"); >>> cassandra@cqlsh> select * from ks.tb2; >>> >>> id | cl1 | cl2 | val >>> ----+-----+-----+------ >>> 1 | 2 | 3 | text >>> >>> (1 rows) >>> cassandra@cqlsh> INSERT INTO ks.tb2 (id, cl1, cl2) VALUES ( 1, 2, 4); >>> cassandra@cqlsh> select * from ks.tb2; >>> >>> id | cl1 | cl2 | val >>> ----+-----+-----+------ >>> 1 | 2 | 3 | text >>> 1 | 2 | 4 | null >>> >>> ----------------------------------------- >>> >>> As you see, we have a hole in which a 'null' value is getting written on >>> column val even if we have a NOT_NULL on that particular column whenever >>> the column is NOT specified on the write. That raises the question on how >>> this particular constraint should behave. >>> >>> If we consider the other constraints (scalar constraint and length >>> constraint so far), this particular behavior is fine. But, if the >>> constraint is NOT_NULL, then it becomes a little bit trickier. >>> >>> The conclusions we have reached is that the meaning of constraints should >>> be interpreted like: I check whatever you give me as part of the write, >>> ignoring everything else. Let me elaborate: >>> If we decide to treat this particular NOT_NULL constraint differently, and >>> check if the value for that column is present in the insert statement, we >>> then open a different can of worms. What happens if the row already exists >>> with a valid value, and that insert statement is only trying to do an >>> update to a different column in the row? If that was the case, we would be >>> forcing the user to specify the 'val' column value for every update, even >>> if it is not needed. >>> >>> Mainly for this reason, we think it is better to treat this NOT_NULL >>> constraint just like the other constraints, and execute it ONLY on the >>> values that are present on the insert statement. >>> >>> The main con is that it may lead to a little bit of confussion (as in, why >>> I just added a null value to the table even if I have a NOT_NULL >>> constraint?). We have thought on aliviating this particular confusion by: >>> - Extensive documentation. Let's be upfront on what this constraint does >>> and does not. >>> (https://github.com/apache/cassandra/blob/ed58c404e8c880b69584e71a3690d3d9f73ef9fa/doc/modules/cassandra/pages/developing/cql/constraints.adoc#not_null-constraint) >>> - Adding, as part of this patch, yet another constraint >>> (STRICTLY_NOT_NULL), that checks for the actual column value to be present >>> in the insert statement.. >>> >>> If you've made it until here, that means you are really interested in >>> constraints. Thanks! The question for you is, would you have any concern >>> with this approach? >>> >>> Thanks, >>> Bernardo >>