>
> 2)
> Is part of an enum is somehow supplying the lack of enum types. Constraint
> could be something like CONSTRAINT belongsToEnum([list of valid values],
> field):
> CREATE TABLE keyspace.table (
>     field text CONSTRAINT belongsToEnum(['foo', 'foo2'], field),
>     ...
> );
> 3)
> Similarly, we can check and reject if a term is part of a list of blocked
> terms:
> CREATE TABLE keyspace.table (
>     field text CONSTRAINT isNotBlocked(['blocked_foo', 'blocked_foo2'], field),
>     ...
> );
Are these not just "CONSTRAINT inList([List of valid values], field);" and
"CONSTRAINT not inList([List of valid values], field);"?
At this point doesn't "CONSTRAINT p1 != p2" devolve to
"CONSTRAINT not inList([p1], p2);"?

Can "[List of values]" point to a variable containing a list? Or does it
require hard coding in the constraint itself?

On Tue, Jun 11, 2024 at 6:23 PM Bernardo Botella <
conta...@bernardobotella.com> wrote:

> Hi Štephan
>
> I'll address the different points:
> 1)
> An example (possibly a stretch) of a use case for the != constraint would be:
> Let's say you have a table in which you want to record a movement, from
> position p1 to position p2. You may want to check that those two are
> different to make sure there is actual movement.
>
> CREATE TABLE keyspace.table (
>     p1 int,
>     p2 int,
>     ...,
>     CONSTRAINT p1 != p2
> );
>
> For the case of ==, I agree that it is harder to come up with a valid use
> case, and I added it for completion.
>
> 2)
> Is part of an enum is somehow supplying the lack of enum types. Constraint
> could be something like CONSTRAINT belongsToEnum([list of valid values],
> field):
> CREATE TABLE keyspace.table (
>     field text CONSTRAINT belongsToEnum(['foo', 'foo2'], field),
>     ...
> );
>
> 3)
> Similarly, we can check and reject if a term is part of a list of blocked
> terms:
> CREATE TABLE keyspace.table (
>     field text CONSTRAINT isNotBlocked(['blocked_foo', 'blocked_foo2'], field),
>     ...
> );
>
> Please let me know if this helps,
> Bernardo
>
>
>
> On Jun 11, 2024, at 6:29 AM, Štefan Miklošovič <
> stefan.mikloso...@gmail.com> wrote:
>
> Hi Bernardo,
>
> 1) Could you elaborate on these two constraints?
>
> == and != ?
>
> What is the use case? Why would I want to have data in a database stored
> in some column which would need to be _same as my constraint_ and which
> _could not_ be same as my constraint? Can you give me at least one example
> of each?
> It looks like I am going to put a constant into a database in case
> of ==, wouldn't a static column be better?
>
> 2) For examples of text based types you mentioned: "is part of an enum" -
> how would you enforce this in Cassandra? What enum do we have in CQL?
> 3) What does "is it block listed" mean?
>
> In the meantime, I made changes to CEP-24 to move transactionality into
> optional features.
>
> On Tue, Jun 11, 2024 at 12:18 AM Bernardo Botella <
> conta...@bernardobotella.com> wrote:
>
>> Hi everyone,
>>
>> After the feedback, I'd like to make a recap of what we have discussed in
>> this thread and try to move forward with the conversation.
>>
>> I made some clarifications:
>> - Constraints are only applied at write time.
>> - Guardrail configurations should maintain preference over what's being
>> defined as a constraint.
>>
>> *Specify constraints:*
>> There is general feedback around adding more concrete examples than the
>> ones that can be found in the CEP document.
>> Basically, the initial constraints I am proposing are:
>> - SizeOf Constraint for String types, as in
>>     name text CONSTRAINT sizeOf(name) < 256
>>
>> - Value Constraint for numeric types
>>     number_of_items int CONSTRAINT number_of_items < 1000
>>
>> Those two alone and combined provide a lot of flexibility, and allow
>> complex validations that enable "new types" such as:
>>
>> CREATE TYPE keyspace.cidr_address_ipv4 (
>>     ip_adress inet,
>>     subnet_mask int,
>>     CONSTRAINT subnet_mask > 0,
>>     CONSTRAINT subnet_mask < 32
>> )
>>
>> CREATE TYPE keyspace.color (
>>     r int,
>>     g int,
>>     b int,
>>     CONSTRAINT r >= 0,
>>     CONSTRAINT r < 255,
>>     CONSTRAINT g >= 0,
>>     CONSTRAINT g < 255,
>>     CONSTRAINT b >= 0,
>>     CONSTRAINT b < 255
>> )
>>
>>
>> Those two initial Constraints are the fundamental constraints that would
>> give value to the feature.
>> The framework can (and will) be extended with
>> other Constraints, leaving us with the following:
>>
>> For numeric types:
>> - Max (<)
>> - Min (>)
>> - Equality (==)
>> - Difference (!=)
>>
>> For date types:
>> - Before (<)
>> - After (>)
>>
>> For text based types:
>> - Size (sizeOf)
>> - isJson (is the text a json?)
>> - complies with a given pattern
>> - Is it block listed?
>> - Is it part of an enum?
>>
>> General table constraints (including more than one column):
>> - Compare between numeric types (a < b, a > b, a != b, …)
>> - Compare between date types (date1 < date2, date1 > date2, date1 != date2, …)
>>
>> I have updated the CEP with this information.
>>
>> *Potential dependency on CEP-24:*
>> Given that the Constraints Framework provides a set of checks to be
>> performed alongside those that can be made using the Guardrails framework,
>> there may be some relation with CEP-24, which mentions transactional
>> Guardrails to prevent situations in which the limit configurations are
>> different across the cluster.
>>
>> This CEP-42 is not proposing modifying the Guardrails framework, and
>> therefore should not be affected by CEP-24. It is true that the
>> improvements provided by CEP-24 would benefit this Constraints framework,
>> but it is not dependent on them.
>>
>>
>> I hope I included all the points and addressed them in the CEP;
>> otherwise, please call it out and I’ll be more than happy to include it.
>>
>> Thanks everyone for all the inputs!
>> Bernardo
>>
>> On Jun 7, 2024, at 11:54 AM, Štefan Miklošovič <
>> stefan.mikloso...@gmail.com> wrote:
>>
>> How I see it is that in 5.1 there will be TCM for the very first time and
>> I do not think that config in TCM would make it into 5.1 based on what Sam
>> talks about (need for some stability etc.), which makes total sense to me.
>> TCM is quite a big feature to deliver on its own and putting even way more
>> stuff into that might be detrimental to the quality if we rush it.
>>
>> Then sometime after 5.1 we might take a serious look at config in TCM
>> itself.
>>
>> My plan, ideally, is to still ship CEP-24 without config in TCM, then
>> after 5.1 when config in TCM lands, CEP-24 might integrate with that on a
>> deeper level.
>>
>> If CEP-42 (this one) makes it into 5.1 as well, I think the same
>> could be done for it (integration with guardrails).
>>
>> On Fri, Jun 7, 2024 at 8:49 PM Sam Tunnicliffe <s...@beobal.com> wrote:
>>
>>> We've been working on a draft CEP for migrating config from yaml to
>>> cluster metadata but have been a bit short of time recently, I'll try to
>>> get something out for discussion as soon as possible.
>>> A little delay isn't such a bad thing IMO, as we're still ironing out
>>> the kinks in the TCM implementation itself. It'd be good to get a bit more
>>> road testing done with that before we start adding more to it, which I'm
>>> sure will start to ramp up once 5.0 is out.
>>>
>>> Thanks,
>>> Sam
>>>
>>> On 7 Jun 2024, at 19:19, Štefan Miklošovič <stefan.mikloso...@gmail.com>
>>> wrote:
>>>
>>> Yes, all configuration should be transactional (configuration which
>>> makes sense to require to be the same cluster-wide). Guardrails in TCM are
>>> just a subset of this problem. When I started to do CEP-24 I started with
>>> guardrails in TCM but then I realized it leads to more general "all config
>>> in TCM" and I found myself rabbit-hole-ing endlessly.
>>>
>>> BTW, once CEP-24 is in place without guardrails in TCM, I do not think
>>> that implementing it later would blow things up a lot. It is really just
>>> about a couple of mutable virtual tables and a couple of transformations
>>> for the various guardrail types we have, and I expect that its integration
>>> into more general config in TCM should be rather straightforward.
>>>
>>> Config in TCM definitely deserves its own CEP, it is too much to handle
>>> under CEP-24 and CEP-24 can go without it already.
>>> It just takes a little bit
>>> more configuration acumen to nail it down correctly.
>>>
>>> Regards
>>>
>>> On Fri, Jun 7, 2024 at 8:12 PM Doug Rohrer <droh...@apple.com> wrote:
>>>
>>>> There’s a difference between the two though. Constraints are part of
>>>> the table schema, and (independent of the interaction with Guardrails),
>>>> have no dependency on yaml files being perfectly in sync across the
>>>> cluster. Therefore, the feature (Constraints) doesn’t depend on
>>>> configuration files being correct in its own right. The only place where
>>>> this isn’t true is its interaction with Guardrails, which happen to be
>>>> yaml-file based and cause issues.
>>>>
>>>> CEP-24’s password length requirements, however, are intended to be
>>>> implemented *by adding a new guardrail*, which is totally dependent on
>>>> YAML files today (and thus the concerns around a single misconfigured
>>>> server allowing someone to use an insecure password). If CEP-24 fixes
>>>> guardrails’ dependence on yaml files, it would *also* fix the
>>>> problematic interaction between guardrails and constraints.
>>>>
>>>> I agree that it would be incredibly valuable to find a solution to the
>>>> “yaml files need to be correct everywhere or something breaks” problem, and
>>>> I think CEP-24, being security-focused, is more likely to be problematic
>>>> without a solution to this issue. That said, I think Dinesh is right in
>>>> that, at the end of the day, CEP-24 could be implemented without fixing the
>>>> yaml config issue.
>>>>
>>>> I do wonder if the “Guardrails should be transactional” should really
>>>> be “configuration should be transactional”, or at least as much config as
>>>> possible should be, but that would blow up CEP-24 fairly dramatically
>>>> (maybe?).
>>>> Maybe “cluster-wide configuration should be read from a
>>>> distributed source on startup/joining the cluster” or something would make
>>>> sense, so the yaml file works as the source of truth on startup, but as
>>>> soon as possible it’s read from a TCM-backed data source, and anything the
>>>> node can get from other nodes it would… but now I’m designing a different
>>>> CEP in a discuss thread, which is probably a bad idea...
>>>>
>>>> Regardless, I hope that I’m explaining why I see a difference between
>>>> constraints and guardrails, and why I think it makes sense that constraints
>>>> can move forward without a solution to the misconfiguration problem, while I
>>>> also think you were right in calling it out in CEP-24 (even if we
>>>> eventually move forward on CEP-24 without the solution in place).
>>>>
>>>> Doug
>>>>
>>>>
>>>>
>>>> On Jun 7, 2024, at 1:51 AM, Dinesh Joshi <djo...@apache.org> wrote:
>>>>
>>>> On Thu, Jun 6, 2024 at 1:03 PM Štefan Miklošovič <
>>>> stefan.mikloso...@gmail.com> wrote:
>>>>
>>>>> It is interesting to see this feedback. When I look at CEP-24 where I
>>>>> am obsessing about a user being able to misconfigure the password
>>>>> validation strength so if a user hits a "weak" node then she would be able
>>>>> to bypass it, and I see what our approach is here, then I am not sure what
>>>>> I was waiting so long for and I should probably be just more aggressive
>>>>> with the CEP and all the "caveats" could be just overlooked and deferred
>>>>> to "sometime later".
>>>>>
>>>>
>>>> Stefan, unfortunately I didn't participate in the CEP-24 DISCUSS
>>>> thread. Had I paid attention I would have suggested that waiting on TCM
>>>> doesn't make the feature any different. The feature is less likely to be
>>>> misconfigured in a cluster. CEP-24 is valuable and password compliance with
>>>> policies is a super useful feature which IMO shouldn't have been held back
>>>> due to lack of TCM.
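
P.S. To make the inList question at the top of this message concrete, here is a minimal sketch in Python of the reduction I have in mind. Everything in it (in_list, negate, not_equal, and rows modelled as plain dicts) is a hypothetical name chosen for illustration; it is not CEP-42 syntax and not Cassandra code. It only shows that belongsToEnum and isNotBlocked can be read as inList and its negation, and that p1 != p2 can be read as "not inList([p1], p2)" when the list is built from the row rather than hard coded.

```python
from typing import Any, Callable, Mapping, Sequence

# Hypothetical model: a row is a dict of column name -> value, and a
# constraint is a predicate evaluated against the row at write time.
Row = Mapping[str, Any]
Constraint = Callable[[Row], bool]


def in_list(allowed: Sequence[Any], column: str) -> Constraint:
    """inList([values], column): true when the column value is in `allowed`."""
    return lambda row: row[column] in allowed


def negate(constraint: Constraint) -> Constraint:
    """'not <constraint>': true exactly when the wrapped constraint is false."""
    return lambda row: not constraint(row)


def not_equal(left: str, right: str) -> Constraint:
    """p1 != p2 written as not inList([p1], p2).

    The one-element list is built from the row itself, i.e. the list is a
    reference to another column rather than a hard-coded literal.
    """
    return lambda row: negate(in_list([row[left]], right))(row)


# The two proposals from the quoted message, expressed through in_list only.
belongs_to_enum = in_list(["foo", "foo2"], "field")
is_not_blocked = negate(in_list(["blocked_foo", "blocked_foo2"], "field"))
has_moved = not_equal("p1", "p2")
```

Whether the list argument may reference a column or variable, as not_equal does here, is exactly the open question above; the sketch does not assume an answer for the CEP.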