Could I make a suggestion? Well, I will make a suggestion :-) , but if it's not useful then feel free to ignore it.

Could we talk a bit about how users/operators would work with the CREATE ROLE features you're proposing?

The use-case as I understand it is that there are organizations that have or are going to create large numbers of clusters (say > 3), and they would appreciate some automation around creating role names and credentials for all those clusters. The proposal is to extend the CREATE ROLE statement to enable the database to generate those names and credentials automatically, including persisting them in the database itself.

One thing I'm wondering about is what kind of tooling those organizations are likely to be using for creating/managing all those clusters. Are they going to be scripting, or are they going to be using some third-party tooling like Terraform, CloudFormation, Puppet, etc.? If they're using tooling like that, which is going to be a more natural fit: making role/password generation available through CQL, or through Sidecar APIs, or ... ? I don't have an opinion at the moment so that's not a rhetorical question. I'd actually like to reason through what's going to work best for the folks who actually have to manage tons of clusters all day long.

Somewhat related to that ... is there any need for role "stability" across clusters: e.g. I want to create a role that can access existing tables but not create/drop tables or keyspaces, and for my own sanity I want that role to have the same name on every cluster I operate. Do I have to implement a custom role name generator to do that, or is that common enough functionality that it should be supportable by the tooling I'm using to manage my clusters?

I don't have strong opinions on CQL vs Sidecar, but I think one way to frame the debate is to look at which will work best with the tooling that people already use to manage large numbers of clusters.

Thanks -- Joel.

On 9/16/2025 3:15 PM, Štefan Miklošovič wrote:

Oh crap, what feedback! If nothing else, this is a lesson for everybody: if you are tired of waiting or impatient to move quickly, the surest way to get fast feedback is to propose your ideas and then boldly proclaim that you are going to do something, and the universe will mysteriously find somebody who will reject it. People are not always interested in agreeing. A lot of the time, they act only when they disagree and are confronted with it. So don't be afraid to take some flak as soon as possible!



On Tue, Sep 16, 2025 at 9:05 PM Patrick McFadin <pmcfa...@gmail.com> wrote:

    Thanks Mick, I'm just digging into this more after a long week of
    travel.

    Generally, I'm -1 for adding more custom syntax. Another concern
    of mine is adding control plane actions in DDL. I understand the
    usefulness of a feature like this in ops. It's a great idea. Here
    would be my counter proposal:

     - Leave the CQL as is and keep "CREATE ROLE" etc as is, and avoid
    making changes to core Cassandra.


Why should we keep it "as is"? Genuinely asking. Why? Where is this need for conserving stuff coming from? Is this what we are doing here? Adding as little as possible? I think we are stifling innovation unnecessarily. There was the same discussion about constraints and CHECK NOT NULL / NOT NULL where we were trying to follow "the Holy Postgres Grail". I just don't get it. Are we not obsessed with that at this point? Literally nobody cares if there will be CREATE GENERATED ROLE. Nobody. Cares. So I do not take this point of yours as valid without some strong backing from your side.

     - Move the generation & policy to the sidecar project. A sidecar
    endpoint will generate the role name/password, enforce
    prefix/suffix/length requirements, ensure uniqueness, and then
    return the role and password (or a secret handle) to the caller.


Well, the problem I see with putting this in Sidecar is that it would only be possible to do via HTTP(S). Not everybody is interested in that. Hardly anybody. Zero interest. Sidecar is at 0.2.0 at this point. Realistically, I don't think I am far from the truth if I say that practically nobody is using 0.2.0 in production. 0.2.0. I am not counting exceptions such as early adopters or Analytics.

Putting this in Sidecar almost guarantees that nobody is going to use this particular functionality. People have their own control planes and their own ways of generating this stuff, and they are not going to deploy Sidecar just because they want to delegate this task to it. Come on. I think that it would, paradoxically, create more problems for them. Not fewer. So again, I do not take this point as something which solves anything. This will have 0 users when put in Sidecar. I think it would be better if we just flat out refused this instead of putting it in Sidecar. It is even worse, imho.

Another problem I see with Sidecar is that the current implementation is pluggable. How do you want to make this pluggable in Sidecar? Pluggable how? People might have their own opinions on how role names should be generated. That is why you can just code your own generator / validator, put it on the class path and be done with it. How are you supposed to "patch Sidecar"? You create a custom implementation, then you put it on the class path of Sidecar? Is this even supported? I think that you proposed it in good faith, but I don't think that would fly.


    Why?
     - End users will have it faster since it will work with any
    version of Cassandra supporting the CREATE syntax. (No having to
    backport either)
     - Keeps control plane actions optional and separated. Not an
    attack surface inside core Cassandra


Thirdly, what _attack surface_? I think you are pretty aware of the fact that this feature is by default turned off. If you have an organisation deploying hundreds of clusters and for each they have to figure out some role name for a user which is going to use it, how is this going to be abused concretely? There are dedicated accounts for CQL management, creation of a role is tied to some workflow etc. What is attacked exactly and how? Concrete examples please.

Dineshi had the concern that "what if we just have a script which will generate roles repeatedly nonstop?" How is this different from having a script which would generate the roles itself, instead of Cassandra, and then execute that? What's the difference really? If you want to abuse it you will. There is no protection against that unless we put some rate limiting in front of it, which I have no problem addressing in follow-up work, as already explained.

     - We keep the syntax of CQL more generic and less one-off.


I don't think this is relevant, really. I think we should abandon this mindset. At this point, to make the point, I suspect that CQL had to "hurt you" somehow :)

Regards

     - k8s/Cloud native friendly with separation of control plane/data
    plane.

    Patrick


    On Tue, Sep 16, 2025 at 7:31 AM Mick <m...@apache.org> wrote:




        > I think enough time passed for everybody to participate in
        the discussion so I would just move on and start the voting
        thread soon.



        Can we give CEP discussions longer than ~one week, please.

        Folk are easily away/offline for a whole week.  Take for
        example many who were at Community over Code and may still be
        catching up on their inbox, thinking dev@ is a less urgent folder.

        I haven't looked at how fast the other CEP discuss threads have
        turned around, I apologise if I'm only singling one out, my
        concern applies generally.
