Could I make a suggestion? Well, I will make a suggestion :-) , but if
it's not useful then feel free to ignore it.
Could we talk a bit about how users/operators would work with the CREATE
ROLE features you're proposing?
The use-case as I understand it is that there are organizations that
have or are going to create large numbers of clusters (say > 3), and
they would appreciate some automation around creating role names and
credentials for all those clusters. The proposal is to extend the CREATE
ROLE statement to enable the database to generate those names and
credentials automatically, including persisting them in the database itself.
One thing I'm wondering about is what kind of tooling those
organizations are likely to be using for creating/managing all those
clusters. Are they going to be scripting, or are they going to be using
some third-party tooling like Terraform, CloudFormation, Puppet, etc.?
If they're using tooling like that, which is going to be a more natural
fit: making role/password generation available through CQL, or through
Sidecar APIs, or ... ? I don't have an opinion at the moment so that's
not a rhetorical question. I'd actually like to reason through what's
going to work best for the folks who actually have to manage tons of
clusters all day long.
Somewhat related to that ... is there any need for role "stability"
across clusters: e.g. I want to create a role that can access existing
tables but not create/drop tables or keyspaces, and for my own sanity I
want that role to have the same name on every cluster I operate. Do I
have to implement a custom role name generator to do that, or is that
common enough functionality that it should be supportable by the tooling
I'm using to manage my clusters?
I don't have strong opinions on CQL vs Sidecar, but I think one way to
frame the debate is to look at which will work best with the tooling
that people already use to manage large numbers of clusters.
Thanks -- Joel.
On 9/16/2025 3:15 PM, Štefan Miklošovič wrote:
Oh crap, what feedback! If nothing else, this is a lesson for
everybody: if you are tired of waiting and want fast feedback, the
surest way to get it is to propose your idea, boldly proclaim that you
are going to act on it, and the universe will mysteriously find
somebody who will reject it. People are not always interested in
agreeing. A lot of the time they act only when they disagree and are
put in front of a decision. So don't be afraid to take some flak as
early as possible!
On Tue, Sep 16, 2025 at 9:05 PM Patrick McFadin <pmcfa...@gmail.com>
wrote:
Thanks Mick, I'm just digging into this more after a long week of
travel.
Generally, I'm -1 for adding more custom syntax. Another concern
of mine is adding control plane actions in DDL. I understand the
usefulness of a feature like this in ops. It's a great idea. Here
is my counterproposal:
- Leave the CQL as is and keep "CREATE ROLE" etc as is, and avoid
making changes to core Cassandra.
Why should we keep it "as is"? Genuinely asking. Why? Where is this
need for conserving stuff coming from? Is this what we are doing here?
Adding as little as possible? I think we are stifling innovation
unnecessarily. There was the same discussion about constraints and
CHECK NOT NULL / NOT NULL where we were trying to follow "the Holy
Postgres Grail". I just don't get it. Are we not obsessed with that at
this point? Literally nobody cares if there will be CREATE GENERATED
ROLE. Nobody. Cares. So I do not take this point of yours as valid
without some strong backing from your side.
- Move the generation & policy to the sidecar project. A sidecar
endpoint will generate the role name/password, enforce
prefix/suffix/length requirements, ensure uniqueness, and then
return the role and password (or a secret handle) to the caller.
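For concreteness, the generation step this bullet describes could be
sketched roughly as below. This is a minimal Python sketch under stated
assumptions, not Sidecar or Cassandra code: the function name
`generate_role`, its parameters, and the in-memory `existing` set that
stands in for a uniqueness check against the cluster are all
hypothetical.

```python
import secrets
import string

# Characters used for the random part of the role name (an assumption).
ALPHABET = string.ascii_lowercase + string.digits


def generate_role(existing, prefix="", suffix="", name_length=16,
                  password_length=32, max_attempts=10):
    """Return a (role_name, password) pair whose name is not in `existing`."""
    body_length = name_length - len(prefix) - len(suffix)
    if body_length < 1:
        raise ValueError("prefix and suffix leave no room for a random part")
    for _ in range(max_attempts):
        body = "".join(secrets.choice(ALPHABET) for _ in range(body_length))
        name = f"{prefix}{body}{suffix}"
        if name not in existing:  # stand-in for a real uniqueness check
            existing.add(name)
            # token_urlsafe(n) yields at least n characters; trim to length.
            password = secrets.token_urlsafe(password_length)[:password_length]
            return name, password
    raise RuntimeError("could not find an unused role name")
```

The same logic could sit behind a CQL statement, an HTTP endpoint, or a
script; the sketch says nothing about which of those is the better home
for it.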
Well, the problem I see with putting this in Sidecar is that it would
only be possible to do via HTTP(S). Hardly anybody is interested in
that. Zero interest. Sidecar is at 0.2.0 at this point. Realistically
speaking, I do not think I am far from the truth if I say that
practically nobody is using 0.2.0 in production. 0.2.0. I do not count
exceptions such as early adopters or Analytics. Putting this in
Sidecar almost guarantees that nobody is going to use this particular
functionality. People have their own control planes and their own way
of generating this stuff, and they are not going to deploy Sidecar
just because they want to delegate this task to it. Come on. I think
that it would, paradoxically, create more problems for them, not
fewer. So again, I do not see this point as solving anything. This
will have 0 users if put in Sidecar. I think it would be better to
just flat out reject this than to put it in Sidecar. That is even
worse imho.
Another problem I see with Sidecar is that the current implementation
is pluggable. How do you want to make this pluggable in Sidecar?
Pluggable how? People might have their own opinions on how role names
should be generated. That is why you can just code your own generator
/ validator, put it on the class path and be done with it. How are you
supposed to "patch Sidecar"? You create a custom implementation, then
you put it on the class path of Sidecar? Is this even supported? I
think you proposed it in good faith, but I don't think it would fly.
Why?
- End users will have it faster since it will work with any
version of Cassandra supporting the CREATE syntax. (No having to
backport either)
- Keeps control plane actions optional and separated. Not an
attack surface inside core Cassandra
Thirdly, what _attack surface_? I think you are well aware that this
feature is turned off by default. If you have an organisation
deploying hundreds of clusters and for each one they have to figure
out a role name for the user who is going to use it, how is this going
to be abused concretely? There are dedicated accounts for CQL
management, creation of a role is tied to some workflow etc. What is
attacked exactly and how? Concrete examples please.
Dineshi had the concern that "what if we just have a script which will
generate roles repeatedly nonstop?" How is this different from having
a script which generates the roles itself, instead of Cassandra, and
then executes the statements? What's the difference really? If you
want to abuse it you will. There is no protection against that unless
we put some rate limiting in front of it, which I have no problem
addressing in follow-up work, as already explained.
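The thread does not say what that rate limiting would look like. One
common shape is a token bucket, sketched below in Python purely as an
illustration. The class name `TokenBucket` and its parameters are
hypothetical and do not correspond to any existing Cassandra or Sidecar
API.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` calls, refilled at a steady rate."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock  # injectable so the limiter can be tested
        self.tokens = float(capacity)
        self.last = clock()

    def try_acquire(self):
        """Consume one token if available; return whether the call may run."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_second)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A role-creation path guarded this way would reject a request whenever
`try_acquire()` returns `False`, whether the names are generated by the
database or by an external script.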
- We keep the syntax of CQL more generic and less one-off.
I don't think this is relevant, really. I think we should abandon this
mindset. At this point I suspect that CQL must have "hurt you"
somehow :)
Regards
- k8s/Cloud native friendly with separation of control plane/data
plane.
Patrick
On Tue, Sep 16, 2025 at 7:31 AM Mick <m...@apache.org> wrote:
> I think enough time passed for everybody to participate in
the discussion so I would just move on and start the voting
thread soon.
Can we give CEP discussions longer than ~one week, please?
Folk are easily away/offline for a whole week. Take for
example many who were at Community over Code and may still be
catching up on their inbox, thinking dev@ is a less urgent folder.
I haven't looked at how fast the other CEP discuss threads have
turned around, I apologise if I'm only singling one out, my
concern applies generally.