Hi Bernardo, 1) Could you elaborate on these two constraints?
== and != ? What is the use case? Why would I want data stored in a column that would need to be _the same as my constraint_, or that _could not_ be the same as my constraint? Can you give me at least one example of each? With ==, it looks like I would be putting a constant into the database; wouldn't a static column be better?

2) Among the examples for text-based types you mentioned "is part of an enum". How would you enforce this in Cassandra? What enum do we have in CQL?

3) What does "is it block listed" mean?

In the meantime, I made changes to CEP-24 to move transactionality into optional features.

On Tue, Jun 11, 2024 at 12:18 AM Bernardo Botella <conta...@bernardobotella.com> wrote:

> Hi everyone,
>
> After the feedback, I'd like to make a recap of what we have discussed in
> this thread and try to move forward with the conversation.
>
> I made some clarifications:
> - Constraints are only applied at write time.
> - Guardrail configurations should take precedence over what's being
>   defined as a constraint.
>
> *Specify constraints:*
> There is general feedback around adding more concrete examples than the
> ones that can be found on the CEP document.
> Basically, the initial constraints I am proposing are:
> - SizeOf Constraint for String types, as in
>   name text CONSTRAINT sizeOf(name) < 256
>
> - Value Constraint for numeric types
>   number_of_items int CONSTRAINT number_of_items < 1000
>
> Those two alone and combined provide a lot of flexibility, and allow
> complex validations that enable "new types" such as:
>
> CREATE TYPE keyspace.cidr_address_ipv4 (
>   ip_address inet,
>   subnet_mask int,
>   CONSTRAINT subnet_mask > 0,
>   CONSTRAINT subnet_mask < 32
> )
>
> CREATE TYPE keyspace.color (
>   r int,
>   g int,
>   b int,
>   CONSTRAINT r >= 0,
>   CONSTRAINT r < 255,
>   CONSTRAINT g >= 0,
>   CONSTRAINT g < 255,
>   CONSTRAINT b >= 0,
>   CONSTRAINT b < 255
> )
>
> Those two initial Constraints are the fundamental constraints that would
> give value to the feature. The framework can (and will) be extended with
> other Constraints, leaving us with the following:
>
> For numeric types:
> - Max (<)
> - Min (>)
> - Equality (==)
> - Difference (!=)
>
> For date types:
> - Before (<)
> - After (>)
>
> For text based types:
> - Size (sizeOf)
> - isJson (is the text a json?)
> - complies with a given pattern
> - Is it block listed?
> - Is it part of an enum?
>
> General table constraints (including more than one column):
> - Compare between numeric types (a < b, a > b, a != b, …)
> - Compare between date types (date1 < date2, date1 > date2, date1 != date2, …)
>
> I have updated the CEP with this information.
>
> *Potential dependency on CEP-24:*
> Given that the Constraints Framework provides a set of checks to be
> performed alongside those that can be made using the Guardrails framework,
> there may be some relation with CEP-24, which mentions transactional
> Guardrails to prevent situations in which the limit configurations are
> different across the cluster.
>
> This CEP-42 does not propose modifying the Guardrails framework, and
> therefore should not be affected by CEP-24.
> It is true that the improvements provided by CEP-24 would benefit this
> Constraints framework, but it is not dependent on them.
>
> I hope I included all the points and addressed them on the CEP, otherwise,
> please call it out and I’ll be more than happy to include it.
>
> Thanks everyone for all the inputs!
> Bernardo
>
> On Jun 7, 2024, at 11:54 AM, Štefan Miklošovič
> <stefan.mikloso...@gmail.com> wrote:
>
> How I see it is that in 5.1 there will be TCM for the very first time and
> I do not think that config in TCM would make it into 5.1 based on what Sam
> talks about (need for some stability etc), that makes total sense to me.
> TCM is quite a big feature to deliver on its own and putting even more
> stuff into it might be detrimental to the quality if we rush it.
>
> Then sometime after 5.1 we might take a serious look at config in TCM
> itself.
>
> My plan, ideally, is to still ship CEP-24 without config in TCM, then
> after 5.1 when config in TCM lands, CEP-24 might integrate with that on a
> deeper level.
>
> If CEP-42 (this one) makes it into 5.1 as well, I think a similar case
> might be made for it as well (integration with guardrails).
>
> On Fri, Jun 7, 2024 at 8:49 PM Sam Tunnicliffe <s...@beobal.com> wrote:
>
>> We've been working on a draft CEP for migrating config from yaml to
>> cluster metadata but have been a bit short of time recently, I'll try to
>> get something out for discussion as soon as possible.
>> A little delay isn't such a bad thing IMO, as we're still ironing out the
>> kinks in the TCM implementation itself. It'd be good to get a bit more road
>> testing done with that before we start adding more to it, which I'm sure
>> will start to ramp up once 5.0 is out.
>>
>> Thanks,
>> Sam
>>
>> On 7 Jun 2024, at 19:19, Štefan Miklošovič <stefan.mikloso...@gmail.com>
>> wrote:
>>
>> Yes, all configuration should be transactional (configuration which makes
>> sense to require to be the same cluster-wide).
>> Guardrails in TCM are just a subset of this problem. When I started to do
>> CEP-24 I started with guardrails in TCM but then I realized it leads to
>> the more general "all config in TCM" and I found myself rabbit-hole-ing
>> endlessly.
>>
>> BTW I do not think that once CEP-24 is in place without guardrails in TCM
>> then implementing it would blow up things a lot. It is really just about a
>> couple of mutable virtual tables and a couple of transformations for the
>> various guardrail types we have, but I expect that its integration into
>> the more general config in TCM should be rather straightforward.
>>
>> Config in TCM definitely deserves its own CEP, it is too much to handle
>> under CEP-24 and CEP-24 can go without it already. It just takes a little
>> bit more configuration acumen to nail it down correctly.
>>
>> Regards
>>
>> On Fri, Jun 7, 2024 at 8:12 PM Doug Rohrer <droh...@apple.com> wrote:
>>
>>> There’s a difference between the two though. Constraints are part of the
>>> table schema, and (independent of the interaction with Guardrails), have no
>>> dependency on yaml files being perfectly in sync across the cluster.
>>> Therefore, the feature (Constraints) on its own doesn’t depend on
>>> configuration files to be correct in its own right. The only place where
>>> this isn’t true is its interaction with Guardrails, which happen to be
>>> yaml-file based and cause issues.
>>>
>>> CEP-24’s password length requirements, however, are intended to be
>>> implemented *by adding a new guardrail*, which is totally dependent on
>>> YAML files today (and thus the concerns around a single misconfigured
>>> server allowing someone to use an insecure password). If CEP-24 fixes
>>> guardrails’ dependence on yaml files, it would *also* fix the
>>> problematic interaction between guardrails and constraints.
>>>
>>> I agree that it would be incredibly valuable to find a solution to the
>>> “yaml files need to be correct everywhere or something breaks” problem, and
>>> I think CEP-24, being security-focused, is more likely to be problematic
>>> without a solution to this issue. That said, I think Dinesh is right in
>>> that, at the end of the day, CEP-24 could be implemented without fixing the
>>> yaml config issue.
>>>
>>> I do wonder if the “Guardrails should be transactional” should really be
>>> “configuration should be transactional”, or at least as much config as
>>> possible should be, but that would blow up CEP-24 fairly dramatically
>>> (maybe?). Maybe “cluster-wide configuration should be read from a
>>> distributed source on startup/joining the cluster” or something would make
>>> sense, so the yaml file works as the source of truth on startup, but as
>>> soon as possible it’s read from a TCM-backed data source, and anything the
>>> node can get from other nodes it would… but now I’m designing a different
>>> CEP in a discuss thread, which is probably a bad idea...
>>>
>>> Regardless, I hope that I’m explaining why I see a difference between
>>> constraints and guardrails, and why I think it makes sense that constraints
>>> can move forward without a solution to the misconfiguration problem, where
>>> I also think you were right in calling it out in CEP-24 (even if we
>>> eventually move forward on CEP-24 without the solution in place).
>>>
>>> Doug
>>>
>>> On Jun 7, 2024, at 1:51 AM, Dinesh Joshi <djo...@apache.org> wrote:
>>>
>>> On Thu, Jun 6, 2024 at 1:03 PM Štefan Miklošovič
>>> <stefan.mikloso...@gmail.com> wrote:
>>>
>>>> It is interesting to see this feedback.
>>>> When I look at CEP-24, where I am obsessing about a user being able to
>>>> misconfigure the password validation strength so that if a user hits a
>>>> "weak" node then she would be able to bypass it, and I see what our
>>>> approach is here, then I am not sure what I was waiting so long for and
>>>> I should probably be just more aggressive with the CEP and all the
>>>> "caveats" could be just overlooked and deferred to "sometime later".
>>>
>>> Stefan, unfortunately I didn't participate in the CEP-24 DISCUSS thread.
>>> Had I paid attention, I would have suggested that waiting on TCM doesn't
>>> make the feature any different. The feature is less likely to be
>>> misconfigured in a cluster. CEP-24 is valuable and password compliance
>>> with policies is a super useful feature which IMO shouldn't have been
>>> held back due to lack of TCM.