Re: reserved keywords, “check” is currently not one, and I don’t think the 
proposal requires making it one.

________________________________
From: C. Scott Andreas <sc...@paradoxica.net>
Sent: Thursday, April 10, 2025 7:59:35 AM
To: dev@cassandra.apache.org <dev@cassandra.apache.org>
Cc: dev@cassandra.apache.org <dev@cassandra.apache.org>
Subject: Re: Constraint's "not null" alignment with transactions and their 
simplification

If the proposal does not introduce “check” as a reserved keyword that would 
require quoting in existing DDL/DML, this concern doesn’t apply and the email 
below can be ignored. This might be the case if “CHECK NOT NULL” is the full 
token introduced rather than “CHECK” separately from constraints that are 
checked.

If “check” is introduced as a standalone reserved keyword: my primary feedback 
is on the introduction of reserved words in the CQL grammar that may affect 
compatibility of existing schemas.

In the Cassandra 3.x series, several new CQL reserved words were added (more 
than necessary) and subsequently backed out, because they required users to 
begin quoting schemas and introduced incompatibility between 3.x and 4.x for 
queries and DDL that “just worked” before.

The word “check” is used in many domains (test/evaluation engineering, finance, 
business processes, etc) and is likely to be used in user schemas. If the 
proposal introduces this as a reserved word that would require it to be quoted 
if used in table or column names, this will create incompatibility for existing 
user queries on upgrade.
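
To illustrate the compatibility concern, here is a minimal Python sketch of how 
promoting a word to reserved status changes the way an identifier must be 
written. The keyword set and the helper render_identifier are hypothetical and 
exist only for this example; the set is not Cassandra's actual reserved-word 
list, and the sketch ignores other reasons an identifier may need quoting (such 
as mixed case).

```python
# Hypothetical, deliberately tiny keyword set for illustration only;
# it is NOT Cassandra's real reserved-word list.
RESERVED = {"select", "from", "where", "table"}


def render_identifier(name, reserved):
    """Return the identifier as it would have to appear in a statement.

    Assumption for this sketch: a reserved word can only be used as an
    identifier when wrapped in double quotes; anything else is left as-is.
    """
    if name.lower() in reserved:
        return '"' + name + '"'
    return name


# Before: "check" is not reserved, so an existing column name works unquoted.
before = render_identifier("check", RESERVED)

# After: if "check" became reserved, the same column would need quoting,
# and existing unquoted queries would stop parsing.
after = render_identifier("check", RESERVED | {"check"})

print(f"CREATE TABLE ks.tb (id int PRIMARY KEY, {before} text);")
print(f"CREATE TABLE ks.tb (id int PRIMARY KEY, {after} text);")
```

Any statement written in the first form would have to be rewritten into the 
second form on upgrade, which is the incompatibility described above.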

Otherwise, ignore me. :)

Thanks,

– Scott

–––
Mobile

On Apr 10, 2025, at 7:47 AM, Jon Haddad <j...@rustyrazorblade.com> wrote:


This looks like a really nice improvement to me.


On Thu, Apr 10, 2025 at 7:27 AM Štefan Miklošovič <smikloso...@apache.org> 
wrote:
Recently, David Capwell commented on constraints in one of the Slack threads 
(1) in the dev channel, and he suggested that the current form of the "not 
null" constraint we have in place right now, e.g. like this

create table ks.tb (id int primary key, val int check not_null(val));

could instead be written like this:

create table ks.tb (id int primary key, val int check not null);

That is, without the name of a column in the constraint's argument. The 
reasoning was that it is not only easier to read, but that transactions 
(CEP-15) also use "not null" in some fashion, and it would be nice if the two 
were aligned so that a user does not encounter two usages of "not null" that 
are written differently, syntax-wise.

Could the usage of "not null" in transactions be confirmed?

This rather innocent suggestion gave us the idea that the syntax of constraints 
could be simplified considerably; consider this:

val int check not_null(val)
val text check json(val)
val text check length(val) < 1000

to be used like this:

val int check not null
val text check json
val text check length() < 1000

more involved checks like this:

val text check not_null(val) and json(val) and length(val) < 1000

might be just simplified to:

val text check not null and json and length() < 1000

It almost reads like plain English. Isn't this just easier on the eye?

The reason we kept the column names in constraint definitions is that, frankly 
speaking, we did not know any better at the time it was implemented. Dropping 
the column names is a little trickier because in Parser.g (ANTLR) we bound the 
constraint grammar directly to a column name. Once column names are no longer 
part of the syntax, we need to bind the constraint to its column later, in the 
server code behind the parser. It is doable; it just takes a bit more work 
there.
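
As a rough illustration of binding after parsing, here is a minimal Python 
sketch. It is not the actual Cassandra parser or server code: the names 
UnboundConstraint, BoundConstraint, parse_check and bind are hypothetical, and 
the regex-based parsing is a naive stand-in for a real grammar (it would break 
on arguments that themselves contain " and ").

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class UnboundConstraint:
    """What a parser could emit: a constraint with no column attached yet."""
    name: str                        # e.g. "not null", "json", "length"
    args: Tuple[str, ...] = ()       # arguments inside the parentheses, if any
    operator: Optional[str] = None   # e.g. "<" in "length() < 1000"
    operand: Optional[str] = None    # e.g. "1000"


@dataclass(frozen=True)
class BoundConstraint:
    """The same constraint after it has been tied to a concrete column."""
    column: str
    constraint: UnboundConstraint


# name, optional "(args)", optional "<op> <operand>"
_PART = re.compile(
    r"^(?P<name>[a-z_ ]+?)\s*"
    r"(?:\((?P<args>[^)]*)\))?\s*"
    r"(?:(?P<op><=|>=|<|>|=)\s*(?P<operand>\S+))?$"
)


def parse_check(expression: str) -> List[UnboundConstraint]:
    """Split 'a and b and c' into constraints; no column name is needed here."""
    constraints = []
    for part in re.split(r"\s+and\s+", expression.strip(), flags=re.IGNORECASE):
        match = _PART.match(part.strip().lower())
        if match is None:
            raise ValueError(f"cannot parse constraint: {part!r}")
        raw_args = match.group("args")
        args = tuple(a.strip() for a in raw_args.split(",")) if raw_args else ()
        constraints.append(
            UnboundConstraint(
                name=match.group("name"),
                args=args,
                operator=match.group("op"),
                operand=match.group("operand"),
            )
        )
    return constraints


def bind(column: str, constraints: List[UnboundConstraint]) -> List[BoundConstraint]:
    """The later step: attach every parsed constraint to the column it sits on."""
    return [BoundConstraint(column, c) for c in constraints]


# "val text check not null and json and length() < 1000"
bound = bind("val", parse_check("not null and json and length() < 1000"))
for b in bound:
    print(b)
```

The point of the split is that parse_check never sees a column name; the column 
is supplied only in bind, which mirrors moving the binding out of the grammar 
and into the code behind the parser.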

Another reason to keep the name of a column was that a constraint might 
reference columns other than the one it is defined on, giving us cross-column 
constraints. We abandoned this idea altogether for other reasons, which 
rendered the occurrence of a column name in a constraint definition redundant.

To have some overview of what would be possible to do with this proposal:

val3 text CHECK SOMECONSTRAINT('a');
val3 text CHECK JSON;
val3 text CHECK SOMECONSTRAINT('a') > 1;
val3 text CHECK SOMECONSTRAINT('a', 'b', 'c') > 1;
val3 text CHECK JSON AND LENGTH() < 600;
afternoon time CHECK afternoon >= '12:00:00' AND afternoon <= '23:59:59';
val3 text CHECK NOT NULL AND JSON AND LENGTH() < 1024

In addition to specifying constraints without columns, it would also be 
possible to pass arguments to constraints. This is currently not possible, and 
no existing constraint accepts arguments to its function, but I think that to 
be as flexible as possible and prepare for the future, we might implement it as 
well.

Constraints in their current form are already usable. However, I think that if 
we do not simplify, align and extend the syntax right now, before it is baked 
into a release, we will never do it: extending it later without breaking it 
would be quite tricky, and maintaining two grammars at the same time would be 
very complex if not flat-out impossible.

Are you open to the simplification of constraint definitions as suggested and 
what is your feedback about that? I already have a working POC which just needs 
to be polished and tests fixed to accommodate the new approach.

Regards

(1) https://the-asf.slack.com/archives/CK23JSY2K/p1742409054164389
