This looks like a really nice improvement to me.

On Thu, Apr 10, 2025 at 7:27 AM Štefan Miklošovič <smikloso...@apache.org>
wrote:

> Recently, David Capwell commented on constraints in one of the Slack
> threads (1) in the dev channel. He suggested that the current form of the
> "not null" constraint, e.g.
>
> create table ks.tb (id int primary key, val int check not_null(val));
>
> could instead be written like this:
>
> create table ks.tb (id int primary key, val int check not null);
>
> That is, without the column name in the constraint's argument. The
> reasoning was that it is not only easier to read, but transactions
> (CEP-15) also use "not null" in some fashion, and it would be nice to
> align the two so that a user does not encounter two usages of "not null"
> that are written differently, syntax-wise.
>
> Could the usage of "not null" in transactions be confirmed?
>
> This rather innocent suggestion gave us the idea that constraint syntax
> could be simplified considerably. Consider this:
>
> val int check not_null(val)
> val text check json(val)
> val text check length(val) < 1000
>
> to be used like this:
>
> val int check not null
> val text check json
> val text check length() < 1000
>
> more involved checks like this:
>
> val text check not_null(val) and json(val) and length(val) < 1000
>
> might be just simplified to:
>
> val text check not null and json and length() < 1000
>
> It almost reads like plain English. Isn't this just easier on the eye?
>
> The reason we kept the column names in constraint definitions is that,
> frankly speaking, we just did not know any better at the time it was
> implemented. It is a little trickier to support constraints without
> column names because in Parser.g / Antlr we bound the constraint grammar
> directly to a column name. Once column names are no longer there, we need
> to bind the constraint to its column later, in the server code behind the
> parser. It is doable, just a little more involved.
>
> Also, one reason to keep the column name was that a constraint might
> reference columns other than the one it is defined on, giving us
> cross-column constraints. We abandoned this idea altogether for other
> reasons, which rendered the column name in a constraint definition
> redundant.
>
> To give an overview of what would be possible with this proposal:
>
> val3 text CHECK SOMECONSTRAINT('a');
> val3 text CHECK JSON;
> val3 text CHECK SOMECONSTRAINT('a') > 1;
> val3 text CHECK SOMECONSTRAINT('a', 'b', 'c') > 1;
> val3 text CHECK JSON AND LENGTH() < 600;
> afternoon time CHECK afternoon >= '12:00:00' AND afternoon <= '23:59:59';
> val3 text CHECK NOT NULL AND JSON AND LENGTH() < 1024
>
> In addition to specifying constraints without columns, it would also be
> possible to pass arguments to constraints. This is currently not possible,
> and no existing constraint accepts arguments to its function, but I think
> that to be as flexible as possible and prepare for the future, we might
> implement it as well.
>
> Constraints in their current form are already usable. However, I think
> that if we do not simplify, align and extend the syntax right now, before
> it is baked into a release, we will never do it: it will be quite
> tricky to extend without breaking it, and maintaining two grammars at
> the same time would be very complex, if not flat-out impossible.
>
> Are you open to the simplification of constraint definitions as suggested
> and what is your feedback about that? I already have a working POC which
> just needs to be polished and tests fixed to accommodate the new approach.
>
> Regards
>
> (1) https://the-asf.slack.com/archives/CK23JSY2K/p1742409054164389
>
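
To make the late-binding idea in the quoted message concrete, here is a minimal Python sketch of how a constraint parsed without a column name (e.g. "check not null and json and length() < 1000") could be bound to its column afterwards. This is only an illustration and not Cassandra's implementation: the names UnboundConstraint, BoundConstraint and bind_all are hypothetical, and the choice to let "json" and "length" pass on null values is an assumption of the sketch.

```python
import json
import operator
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

# Comparison operators a constraint such as `length() < 1000` may use.
COMPARATORS = {"<": operator.lt, "<=": operator.le,
               ">": operator.gt, ">=": operator.ge}


def _is_json(value: Any) -> bool:
    """Return True if value is a string holding valid JSON."""
    try:
        json.loads(value)
        return True
    except (TypeError, ValueError):
        return False


@dataclass(frozen=True)
class UnboundConstraint:
    """Hypothetical parser output: a constraint that names no column yet."""
    name: str                   # "not null", "json" or "length"
    op: Optional[str] = None    # only used by "length"
    limit: Optional[int] = None

    def bind(self, column: str) -> "BoundConstraint":
        # The binding step that would happen in code behind the parser.
        return BoundConstraint(column, self)


@dataclass(frozen=True)
class BoundConstraint:
    """A constraint attached to the column it was declared on."""
    column: str
    spec: UnboundConstraint

    def is_satisfied(self, row: Dict[str, Any]) -> bool:
        value = row.get(self.column)
        if self.spec.name == "not null":
            return value is not None
        if value is None:
            # Assumption of this sketch: only "not null" rejects nulls.
            return True
        if self.spec.name == "json":
            return _is_json(value)
        if self.spec.name == "length":
            return COMPARATORS[self.spec.op](len(value), self.spec.limit)
        raise ValueError(f"unknown constraint: {self.spec.name}")


def bind_all(column: str, specs: List[UnboundConstraint]) -> List[BoundConstraint]:
    """Bind every constraint declared on a column definition to that column."""
    return [spec.bind(column) for spec in specs]


# val text check not null and json and length() < 1000
checks = bind_all("val", [
    UnboundConstraint("not null"),
    UnboundConstraint("json"),
    UnboundConstraint("length", "<", 1000),
])
```

A row is then accepted only if all(c.is_satisfied(row) for c in checks) holds. The point of the sketch is that the column name is supplied once, by the column definition, rather than repeated in every constraint.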
