This looks like a really nice improvement to me.
On Thu, Apr 10, 2025 at 7:27 AM Štefan Miklošovič <smikloso...@apache.org> wrote:

> Recently, David Capwell was commenting on constraints in one of Slack
> threads (1) in dev channel and he suggested that the current form of "not
> null" constraint we have right now in place, e.g. like this
>
> create table ks.tb (id int primary key, val int check not_null(val));
>
> could be instead of that form used like this:
>
> create table ks.tb (id int primary key, val int check not null);
>
> That is - without the name of a column in the constraint's argument. The
> reasoning behind that was that it is not only easier to read but there is
> also this concept in transactions (cep-15) where there is also "not null"
> used in some fashion and it would be nice if this was aligned so a user
> does not encounter two usages of "not null"-s which are written down
> differently, syntax-wise.
>
> Could the usage of "not null" in transactions be confirmed?
>
> This rather innocent suggestion brought an idea to us that constraints
> could be quite simplified when it comes to their syntax, consider this:
>
> val int check not_null(val)
> val text check json(val)
> val text check length(val) < 1000
>
> to be used like this:
>
> val int check not null
> val text check json
> val text check length() < 1000
>
> more involved checks like this:
>
> val text check not_null(val) and json(val) and length(val) < 1000
>
> might be just simplified to:
>
> val text check not null and json and length() < 1000
>
> It almost reads like plain English. Isn't this just easier for an eye?
>
> The reason we kept the column names in constraint definitions is that,
> frankly speaking, we just did not know any better at the time it was about
> to be implemented. It is a little bit more tricky to be able to use it
> without column names because in Parser.g / Antlr we just bound the grammar
> around constraints to a column name directly there. When column names are
> not going to be there anymore, we need to bind it later in the code behind
> the parser in server code. It is doable, it was just about being a little
> bit more involved there.
>
> Also, one reason to keep the name of a column was that we might specify
> different columns in a constraint from a column that is defined on to have
> cross-column constraints but we abandoned this idea altogether for other
> reasons which rendered the occurrence of a column name in a constraint
> definition redundant.
>
> To have some overview of what would be possible to do with this proposal:
>
> val3 text CHECK SOMECONSTRAINT('a');
> val3 text CHECK JSON;
> val3 text CHECK SOMECONSTRAINT('a') > 1;
> val3 text CHECK SOMECONSTRAINT('a', 'b', 'c') > 1;
> val3 text CHECK JSON AND LENGTH() < 600;
> afternoon time CHECK afternoon >= '12:00:00' AND afternoon <= '23:59:59';
> val3 text CHECK NOT NULL AND JSON AND LENGTH() < 1024
>
> In addition to the specification of constraints without columns, what
> would be possible to do is to also specify arguments to constraints. It is
> currently not possible and there is no constraint which would accept
> arguments to its function but I think that to be as flexible as possible
> and prepare for the future, we might implement it as well.
>
> Constraints in their current form are already usable however I just think
> that if we do not simplify, align and extend the syntax right now, before
> it is baked in in a release, then we will never do it as it will be quite
> tricky to extend this without breaking it and maintaining two grammars at
> the same time would be very complex if not flat out impossible.
>
> Are you open to the simplification of constraint definitions as suggested
> and what is your feedback about that? I already have a working POC which
> just needs to be polished and tests fixed to accommodate the new approach.
>
> Regards
>
> (1) https://the-asf.slack.com/archives/CK23JSY2K/p1742409054164389
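
For what it's worth, the "bind it later in the code behind the parser" part could look roughly like the toy below. This is only a Python sketch of the idea, not Cassandra's Parser.g or server code: UnboundConstraint, BoundConstraint, parse_constraints, bind_all and violations are names made up for illustration, only "not null", "json" and "length()" are handled, and letting a null value pass every check except "not null" is an assumption on my side.

```python
import json
import operator
import re
from dataclasses import dataclass
from typing import Any, List, Optional

# Comparison operators accepted after length(); a hypothetical subset.
OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
       ">=": operator.ge, "=": operator.eq, "!=": operator.ne}

LENGTH_RE = re.compile(r"length\(\)\s*(<=|>=|!=|<|>|=)\s*(\d+)")


@dataclass(frozen=True)
class UnboundConstraint:
    """What the parser produces: a check that names no column yet."""
    name: str
    op: Optional[str] = None
    limit: Optional[int] = None


@dataclass(frozen=True)
class BoundConstraint:
    """The same check after it is attached to the column it was declared on."""
    column: str
    spec: UnboundConstraint

    def holds(self, value: Any) -> bool:
        if self.spec.name == "not null":
            return value is not None
        if value is None:
            return True  # assumption: only "not null" rejects a missing value
        if self.spec.name == "json":
            try:
                json.loads(value)
                return True
            except (TypeError, ValueError):
                return False
        # Remaining case: length() <op> <limit>
        return OPS[self.spec.op](len(value), self.spec.limit)


def parse_constraints(text: str) -> List[UnboundConstraint]:
    """Parse e.g. 'NOT NULL AND JSON AND LENGTH() < 1000' with no column name."""
    specs = []
    for part in re.split(r"\s+and\s+", text.strip().lower()):
        if re.fullmatch(r"not\s+null", part):
            specs.append(UnboundConstraint("not null"))
        elif part == "json":
            specs.append(UnboundConstraint("json"))
        else:
            match = LENGTH_RE.fullmatch(part)
            if match is None:
                raise ValueError(f"unsupported constraint: {part!r}")
            specs.append(
                UnboundConstraint("length", match.group(1), int(match.group(2))))
    return specs


def bind_all(column: str, specs: List[UnboundConstraint]) -> List[BoundConstraint]:
    """Late binding: the column name comes from the column definition itself."""
    return [BoundConstraint(column, spec) for spec in specs]


def violations(constraints: List[BoundConstraint], value: Any) -> List[str]:
    """Names of the checks that the given value fails."""
    return [c.spec.name for c in constraints if not c.holds(value)]
```

The point of the split is that the grammar never sees a column name: parse_constraints yields column-less specs, and bind_all attaches the column afterwards, which is the step that moves out of the parser.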