We can think about this. I don't mind supporting both. It needs a little bit more love in parser I guess. Most probably doable, as syntactic sugar.
On Fri, Apr 11, 2025 at 6:25 PM Jon Haddad <j...@rustyrazorblade.com> wrote: > I agree NOT NULL is nicer than check not null. > > Taking the JSON bit a step further, a dedicated JSON type is better than a > check, but that's probably out of scope. > > For inspiration, I've worked with Postgre's JSONB type in the past, and it > was awesome. > > https://www.postgresql.org/docs/current/functions-json.html > > On Fri, Apr 11, 2025 at 9:15 AM Benedict <bened...@apache.org> wrote: > >> [image: elephant.png] >> >> 5.5. Constraints >> <https://www.postgresql.org/docs/current/ddl-constraints.html#DDL-CONSTRAINTS-CHECK-CONSTRAINTS> >> postgresql.org >> <https://www.postgresql.org/docs/current/ddl-constraints.html#DDL-CONSTRAINTS-CHECK-CONSTRAINTS> >> >> <https://www.postgresql.org/docs/current/ddl-constraints.html#DDL-CONSTRAINTS-CHECK-CONSTRAINTS> >> >> >> See the note directly above section 5.5.2 - Postgres supports functions >> in the CHECK predicate and requires they are pure functions (ie always >> yield the same result). >> >> The point is, CHECK is consistently a predicate defined over a field >> expression. If we aren’t matching the feature, why are we matching the >> keyword? >> >> I also disagree that CHECK is a useful prefix to all constraints, and >> would prefer consistency with Postgres here that also uses plain NOT NULL >> for a NOT NULL constraint. >> >> >> On 11 Apr 2025, at 17:05, Štefan Miklošovič <smikloso...@apache.org> >> wrote: >> >> >> I went through Postgres' docs in a quite elaborate manner and I do not >> see any usage of "constraint functions" as we have them. Both MySQL and >> Postgres seem to only provide simple predicates (using relational >> operators) and "functions" as we have them (e.g. json / length) are not >> supported. So in this regard we are providing more than they do. Where have >> you taken "is_json(field)" form from? Not saying it's wrong, I am just >> curious where this is coming from. >> >> I do not like that we would have "NOT NULL" without "CHECK". I think we >> can go a little bit our own way as we have the comfort of modelling this >> from scratch. CQL is already different from SQL as is and I do not think >> that trying to follow SQL _orthodoxly_ is absolutely necessary but at the >> same time I find it easier and more welcoming for users coming to Cassandra >> for the first time to have syntax which is as close as possible to what >> they are used to. >> >> I find having constraints starting with "CHECK" _every time_ consistent. >> They do not need to think twice if "check" is going to be there or not. It >> is there every time. I do not know why SQL did not do it the same way, most >> probably because "NOT NULL" was the first being introduced and "CHECK" >> followed afterwards and it was just too late to it consistent. >> >> >> >> On Fri, Apr 11, 2025 at 5:33 PM Bernardo Botella < >> conta...@bernardobotella.com> wrote: >> >>> Benedict: >>> >>> An alternative for that, keeping the CHECK word, would be to change the >>> constraint name to IS_JSON. CHECK IS_JSON would read as you intend without >>> the need to jump to REQUIRE. I think that’s true for the rest of provided >>> constraints as well. >>> >>> Bernardo >>> >>> >>> On Apr 11, 2025, at 6:02 AM, Benedict <bened...@apache.org> wrote: >>> >>> We have taken a different approach though, as we do not actually take a >>> predicate on the RHS and do not supply the column name. In our examples we >>> had eg CHECK JSON, which doesn’t parse unambiguously to a human. The >>> equivalent to Postgres would seem to be CHECK is_json(field). >>> >>> I’m all for following an existing example, but once we decide to diverge >>> the justification is gone and we should decide holistically what we think >>> is best. So if we want to elide the column entirely and have a list of >>> built in restrictions, I’d prefer eg REQUIRE JSON since this parses >>> unambiguously to a human, whereas if we want to follow Postgres let’s do >>> that but do it but that means eg CHECK is_json(field). >>> >>> On 11 Apr 2025, at 10:57, Štefan Miklošovič <smikloso...@apache.org> >>> wrote: >>> >>> >>> While modelling that, we followed how it is done in SQL world, >>> PostgreSQL as well as MySQL both use CHECK. >>> >>> >>> https://www.postgresql.org/docs/current/ddl-constraints.html#DDL-CONSTRAINTS-CHECK-CONSTRAINTS >>> >>> https://dev.mysql.com/doc/refman/8.4/en/create-table-check-constraints.html >>> >>> On Fri, Apr 11, 2025 at 10:43 AM Benedict <bened...@apache.org> wrote: >>> >>>> I would prefer require/expect/is over check >>>> >>>> On 11 Apr 2025, at 08:05, Štefan Miklošovič <smikloso...@apache.org> >>>> wrote: >>>> >>>> >>>> Yes, you will have it like that :) Thank you for this idea. Great >>>> example of cooperation over diverse domains. >>>> >>>> On Fri, Apr 11, 2025 at 12:29 AM David Capwell <dcapw...@apple.com> >>>> wrote: >>>> >>>>> I am biased but I do prefer >>>>> >>>>> val3 text CHECK NOT NULL AND JSON AND LENGTH() < 1024 >>>>> >>>>> Here is a similar accord CQL >>>>> >>>>> BEGIN TRANSACTION >>>>> LET a = (…); >>>>> IF a IS NOT NULL >>>>> AND a.b IS NOT NULL >>>>> AND a.c IS NULL; THEN >>>>> — profit >>>>> END IF >>>>> COMMIT TRANSACTION >>>>> >>>>> On Apr 10, 2025, at 8:46 AM, Yifan Cai <yc25c...@gmail.com> wrote: >>>>> >>>>> Re: reserved keywords, “check” is currently not, and I don’t think it >>>>> needs to be a reserved keyword with the proposal. >>>>> >>>>> ------------------------------ >>>>> *From:* C. Scott Andreas <sc...@paradoxica.net> >>>>> *Sent:* Thursday, April 10, 2025 7:59:35 AM >>>>> *To:* dev@cassandra.apache.org <dev@cassandra.apache.org> >>>>> *Cc:* dev@cassandra.apache.org <dev@cassandra.apache.org> >>>>> *Subject:* Re: Constraint's "not null" alignment with transactions >>>>> and their simplification >>>>> >>>>> If the proposal does not introduce “check” as a reserved keyword that >>>>> would require quoting in existing DDL/DML, this concern doesn’t apply and >>>>> the email below can be ignored. This might be the case if “CHECK NOT NULL” >>>>> is the full token introduced rather than “CHECK” separately from >>>>> constraints that are checked. >>>>> >>>>> If “check” is introduced as a standalone reserved keyword: my primary >>>>> feedback is on the introduction of reserved words in the CQL grammar that >>>>> may affect compatibility of existing schemas. >>>>> >>>>> In the Cassandra 3.x series, several new CQL reserved words were added >>>>> (more than necessary) and subsequently backed out, because it required >>>>> users to begin quoting schemas and introduced incompatibility between 3.x >>>>> and 4.x for queries and DDL that “just worked” before. >>>>> >>>>> The word “check” is used in many domains (test/evaluation engineering, >>>>> finance, business processes, etc) and is likely to be used in user >>>>> schemas. >>>>> If the proposal introduces this as a reserved word that would require it >>>>> to >>>>> be quoted if used in table or column names, this will create >>>>> incompatibility for existing user queries on upgrade. >>>>> >>>>> Otherwise, ignore me. :) >>>>> >>>>> Thanks, >>>>> >>>>> – Scott >>>>> >>>>> ––– >>>>> Mobile >>>>> >>>>> On Apr 10, 2025, at 7:47 AM, Jon Haddad <j...@rustyrazorblade.com> >>>>> wrote: >>>>> >>>>> >>>>> This looks like a really nice improvement to me. >>>>> >>>>> >>>>> On Thu, Apr 10, 2025 at 7:27 AM Štefan Miklošovič < >>>>> smikloso...@apache.org> wrote: >>>>> >>>>> Recently, David Capwell was commenting on constraints in one of Slack >>>>> threads (1) in dev channel and he suggested that the current form of "not >>>>> null" constraint we have right now in place, e.g like this >>>>> >>>>> create table ks.tb (id int primary key, val int check not_null(val)); >>>>> >>>>> could be instead of that form used like this: >>>>> >>>>> create table ks.tb (id int primary key, val int check not null); >>>>> >>>>> That is - without the name of a column in the constraint's argument. >>>>> The reasoning behind that was that it is not only easier to read but there >>>>> is also this concept in transactions (cep-15) where there is also "not >>>>> null" used in some fashion and it would be nice if this was aligned so a >>>>> user does not encounter two usages of "not null"-s which are written down >>>>> differently, syntax-wise. >>>>> >>>>> Could the usage of "not null" in transactions be confirmed? >>>>> >>>>> This rather innocent suggestion brought an idea to us that constraints >>>>> could be quite simplified when it comes to their syntax, consider this: >>>>> >>>>> val int check not_null(val) >>>>> val text check json(val) >>>>> val text check lenght(val) < 1000 >>>>> >>>>> to be used like this: >>>>> >>>>> val int check not null >>>>> val text check json >>>>> val text check length() < 1000 >>>>> >>>>> more involved checks like this: >>>>> >>>>> val text check not_null(val) and json(val) and length(val) < 1000 >>>>> >>>>> might be just simplified to: >>>>> >>>>> val text check not null and json and length() < 1000 >>>>> >>>>> It almost reads like plain English. Isn't this just easier for an eye? >>>>> >>>>> The reason we kept the column names in constraint definitions is that, >>>>> frankly speaking, we just did not know any better at the time it was about >>>>> to be implemented. It is a little bit more tricky to be able to use it >>>>> without column names because in Parser.g / Antlr we just bound the grammar >>>>> around constraints to a column name directly there. When column names are >>>>> not going to be there anymore, we need to bind it later in the code behind >>>>> the parser in server code. It is doable, it was just about being a little >>>>> bit more involved there. >>>>> >>>>> Also, one reason to keep the name of a column was that we might >>>>> specify different columns in a constraint from a column that is defined on >>>>> to have cross-column constraints but we abandoned this idea altogether for >>>>> other reasons which rendered the occurrence of a column name in a >>>>> constraint definition redundant. >>>>> >>>>> To have some overview of what would be possible to do with this >>>>> proposal: >>>>> >>>>> val3 text CHECK SOMECONSTRAINT('a'); >>>>> val3 text CHECK JSON; >>>>> val3 text CHECK SOMECONSTRAINT('a') > 1; >>>>> val3 text CHECK SOMECONSTRAINT('a', 'b', 'c') > 1; >>>>> val3 text CHECK JSON AND LENGTH() < 600; >>>>> afternoon time CHECK afternoon >= '12:00:00' AND afternoon =< >>>>> '23:59:59'; >>>>> val3 text CHECK NOT NULL AND JSON AND LENGTH() < 1024 >>>>> >>>>> In addition to the specification of constraints without columns, what >>>>> would be possible to do is to also specify arguments to constraints. It is >>>>> currently not possible and there is no constraint which would accept >>>>> arguments to its function but I think that to be as flexible as possible >>>>> and prepare for the future, we might implement it as well. >>>>> >>>>> Constraints in their current form are already usable however I just >>>>> think that if we do not simplify, align and extend the syntax right now, >>>>> before it is baked in in a release, then we will never do it as it will be >>>>> quite tricky to extend this without breaking it and maintaining two >>>>> grammars at the same time would be very complex if not flat out >>>>> impossible. >>>>> >>>>> Are you open to the simplification of constraint definitions as >>>>> suggested and what is your feedback about that? I already have a working >>>>> POC which just needs to be polished and tests fixed to accommodate the new >>>>> approach. >>>>> >>>>> Regards >>>>> >>>>> (1) https://the-asf.slack.com/archives/CK23JSY2K/p1742409054164389 >>>>> >>>>> >>>>> >>>