Thanks for looking into this in such depth!
I am OK with EXECUTE but what caught my eye is "WITH ARGS {"keyspace":
"distributed_test_keyspace", "table": "tbl", "keys":["k4", "k2", "k7"]};"
A nice thing about having it as JSON (which it appears to be) is that if we
ever offer running it via Sidecar, this JSON would basically become the body
of an HTTP request (maybe wrapped in an additional field, but still a JSON
object). Passing it along would then be pretty easy.
However, we would need to be careful not to introduce any "injection"
attacks there, so we would probably need to sanitize that payload anyway.
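
To make that concrete, here is a minimal sketch of what the Sidecar path
could look like. The endpoint path and port are made up, and whether the
args arrive wrapped in an extra field or as the raw body is an open
question; the point is just that the JSON object can travel as-is, while
the command name and argument keys get validated against the command's
metadata rather than being spliced into a statement or shell string:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SidecarCommandSketch
{
    public static void main(String[] args) throws Exception
    {
        // The same JSON object from "EXECUTE COMMAND ... WITH ARGS {...}" becomes the HTTP body.
        String argsJson = "{\"keyspace\": \"distributed_test_keyspace\", "
                        + "\"table\": \"tbl\", \"keys\": [\"k4\", \"k2\", \"k7\"]}";

        // Hypothetical Sidecar endpoint: command name in the path, args as the body.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://127.0.0.1:9043/api/v1/commands/forcecompact"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(argsJson))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
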
On Tue, Oct 7, 2025 at 9:23 PM Maxim Muzafarov <[email protected]> wrote:
> Hello Folks,
>
>
> First of all, thank you for your comments. Your feedback motivates me
> to implement these changes and refine the final result to the highest
> standard. To keep the vote thread clean, I'm addressing your questions
> in the discussion thread.
>
> The vote is here:
> https://lists.apache.org/thread/zmgvo2ty5nqvlz1xccsls2kcrgnbjh5v
>
>
> = The idea: =
>
> First, let me focus on the general idea, and then I will answer your
> questions in more detail.
>
> The main focus is on introducing a new API (CQL) to invoke the same
> node management commands. While this has an indirect effect on tooling
> (cqlsh, nodetool), the tooling itself is not the main focus. The scope
> (or Phase 1) of the initial changes is narrowed down to the API only,
> to ensure the PR remains reviewable.
>
> This implies the following:
> - the nodetool commands and the way they are implemented won't change
> - the nodetool commands will become accessible via CQL; their
> implementation will not change (nor will the execution locality)
> - this change introduces ONLY a new way to invoke management commands
> - this change is not about the tooling (cqlsh, nodetool), although it
> will help the tooling evolve
> - these changes are being introduced as an experimental API with a
> feature flag, disabled by default
>
>
> = The answers: =
>
> > how will the new CQL API behave if the user does not specify a hostname?
>
> The changes only affect the API part; improvements to the tooling will
> follow later. The command is executed on the node that the client is
> connected to.
> Note also that the port differs from 9042 (default) as a new
> management port will be introduced. See examples here [1].
>
> cqlsh 10.20.88.164 11211 -u myusername -p mypassword
> nodetool -h 10.20.88.164 -p 8081 -u myusername -pw mypassword
>
> If a host is not specified, I suppose the CLI tool will attempt to
> connect to localhost.
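
For what it's worth, here is a rough java-driver (4.x) sketch of the client
side, assuming the CEP syntax lands as proposed; the address, port and
credentials are the illustrative values from the cqlsh example above, and
the datacenter name is made up:

import java.net.InetSocketAddress;
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.ResultSet;
import com.datastax.oss.driver.api.core.cql.Row;

public class ManagementCommandClientSketch
{
    public static void main(String[] args)
    {
        try (CqlSession session = CqlSession.builder()
                .addContactPoint(new InetSocketAddress("10.20.88.164", 11211))
                .withLocalDatacenter("datacenter1")
                .withAuthCredentials("myusername", "mypassword")
                .build())
        {
            // The command executes on the node that coordinates this request,
            // i.e. the node the client is connected to, as answered above.
            ResultSet rs = session.execute(
                "EXECUTE COMMAND forcecompact WITH keyspace=distributed_test_keyspace "
              + "AND table=tbl AND keys=[\"k4\", \"k2\", \"k7\"]");
            for (Row row : rs)
                System.out.println(row.getFormattedContents());
        }
    }
}
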
>
>
> > My understanding is that commands like nodetool bootstrap typically run
> on a single node.
>
> This is correct; however, as I don't control the implementation of the
> command, it may in practice involve communication with other nodes.
> That, however, is not part of this CEP. I'm only reusing the commands we
> already have.
>
>
> > Will we continue requiring users to specify a hostname/port explicitly,
> or will the CQL API be responsible for orchestrating the command safely
> across the entire cluster or datacenter?
>
> It seems that you are confusing the API with the tooling. The tooling
> (cqlsh, nodetool) will continue to work as it does now. I am only
> adding a new way in which commands can be invoked (CQL);
> orchestration, however, is the subject of other projects (Cassandra
> Sidecar, perhaps).
>
>
> > It might, however, be worth verifying that the proposed CQL syntax
> aligns with PostgreSQL conventions, and adjusting it if needed for
> cross-compatibility.
>
> It's news to me that we're targeting PostgreSQL as the main
> reference and drifting towards invoking management operations the
> same way. However, I'm inclined to agree that the syntax should
> probably be more or less similar.
>
> We are introducing the new CQL syntax in a minimal and isolated manner.
> CEP-38 defines a small set of management-oriented CQL statements
> (EXECUTE COMMAND / DESCRIBE COMMAND) that can cover all existing
> nodetool commands at once, with further aliases as an option. This
> eliminates the need to introduce a new ANTLR grammar for each
> management operation.
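
To illustrate why one generic statement is enough: in a purely hypothetical
sketch of the dispatch side (class and method names are mine, not the
CEP's), the parser only has to extract a command name and an argument map,
and everything else is a registry lookup; the Map<String, String> in /
String out shape matches the command description further down in the
quoted thread.

import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

final class CommandRegistrySketch
{
    // Command name (or alias) -> a function from argument map to its textual result.
    private final Map<String, Function<Map<String, String>, String>> commands = new ConcurrentHashMap<>();

    void register(String name, Function<Map<String, String>, String> command)
    {
        commands.put(name.toLowerCase(Locale.ROOT), command);
    }

    String execute(String name, Map<String, String> args)
    {
        Function<Map<String, String>, String> command = commands.get(name.toLowerCase(Locale.ROOT));
        if (command == null)
            throw new IllegalArgumentException("Unknown command: " + name);
        return command.apply(args);
    }
}
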
>
> The command execution syntax is the main thing that users interact
> with in this CEP, but I'm taking a more relaxed approach to it for the
> following reasons:
> - the syntax is the tip of the iceberg; the unification of the JMX,
> CQL and a possible REST API for Cassandra is the priority;
> - the feature will be in an experimental state in the major release;
> we need to collect real feedback from users and their deployments;
> - aliasing will be used for some important commands, like
> compaction and bootstrap;
>
> Taking all of the above into account, I still think it's important to
> reach an agreement, or at least to avoid objections.
> So, I've checked the PostgreSQL and SQL standards to identify areas of
> alignment. The latter, I think, is relatively easy to support via
> aliases.
>
>
> The syntax proposed in the CEP:
>
> EXECUTE COMMAND forcecompact WITH keyspace=distributed_test_keyspace
> AND table=tbl AND keys=["k4", "k2", "k7"];
>
> Other Cassandra-style options that I had previously considered:
>
> 1. EXECUTE COMMAND forcecompact (keyspace=distributed_test_keyspace,
> table=tbl, keys=["k4", "k2", "k7"]);
> 2. EXECUTE COMMAND forcecompact WITH ARGS {"keyspace":
> "distributed_test_keyspace", "table": "tbl", "keys":["k4", "k2",
> "k7"]};
>
> In the PostgreSQL style [2], it could look like:
>
> COMPACT (keys=["k4", "k2", "k7"]) distributed_test_keyspace.tbl;
>
> The SQL-standard [3][4] procedural approach:
>
> CALL system_mgmt.forcecompact(
> keyspace => 'distributed_test_keyspace',
> table => 'tbl',
> keys => ['k4','k2','k7'],
> options => { "parallel": 2, "verbose": true }
> );
>
>
> Please let me know if you have any questions, or if you would like us
> to arrange a call to discuss all the details.
>
>
> [1]
> https://www.instaclustr.com/support/documentation/cassandra/using-cassandra/connect-to-cassandra-with-cqlsh/
> [2] https://www.postgresql.org/docs/current/sql-vacuum.html
> [3]
> https://en.wikipedia.org/wiki/Stored_procedure?utm_source=chatgpt.com#Implementation
> [4] https://www.postgresql.org/docs/9.3/functions-admin.html
>
> On Fri, 5 Sept 2025 at 14:12, Maxim Muzafarov <[email protected]> wrote:
> >
> > Hi Bernardo,
> >
> > Thanks for bumping up the discussion.
> > I plan to schedule the vote for next week.
> >
> > If anyone has any comments or concerns, please let me know so that I
> > can incorporate them into the CEP. The general design remains the
> > same, and with Picocli in place we can reuse the same commands
> > for the CQL.
> >
> > On Wed, 3 Sept 2025 at 17:58, Bernardo Botella
> > <[email protected]> wrote:
> > >
> > > Hi Maxim!
> > >
> > > I just wanted to resurface this thread as it looks like it fell through
> the cracks (unless I missed something?). I am excited about this feature as
> well (it should also help with the configuration via CQL that we discussed
> on CEP-44).
> > >
> > > I guess that the CEP has been up for discussion for a while, and if
> there is no further feedback or concerns, we could call a vote on it?
> > >
> > > Regards,
> > > Bernardo
> > >
> > > > On Jul 29, 2025, at 8:16 AM, Maxim Muzafarov <[email protected]>
> wrote:
> > > >
> > > > Hello everyone,
> > > >
> > > >
> > > > Now that the dust has settled on the Picocli transition, I would like
> > > > to update my prototype and prepare it for review. It will take some
> > > > time, but I hope to have everything ready within the next couple of
> > > > months. Although we haven't voted on this CEP yet, as far as I can
> > > > see, there is more or less consensus on the path forward.
> > > >
> > > > So, my question is:
> > > >
> > > > Should we wait until the prototype is ready for review, or should we
> > > > initiate a vote? I saw some concerns about this CEP online since it
> > > > hasn't been voted on, but I'm still eager to implement it. Anyway, a
> > > > new feature flag will be added as part of the implementation and the feature
> > > > will be disabled by default in the next release.
> > > >
> > > >
> > > >> no. Maxim and I have had some offline discussions. We need to make
> some changes before we can be ready to vote on it.
> > > >
> > > > I believe this has already been addressed. I've added new sections,
> > > > "Command Authorization" [1] and "AdminPort"[2] to the CEP.
> > > > Let me know if this is okay with you, Dinesh.
> > > >
> > > > [1]
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-38%3A+CQL+Management+API#CEP38:CQLManagementAPI-CommandAuthorization
> > > > [2]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=278465810#CEP38:CQLManagementAPI-AdminPort
> > > >
> > > > On Thu, 19 Sept 2024 at 20:11, Dinesh Joshi <[email protected]>
> wrote:
> > > >>
> > > >> no. Maxim and I have had some offline discussions. We need to make
> some changes before we can be ready to vote on it.
> > > >>
> > > >> On Thu, Sep 19, 2024 at 11:09 AM Patrick McFadin <
> [email protected]> wrote:
> > > >>>
> > > >>> There is no VOTE thread for this CEP. Is this ready for one?
> > > >>>
> > > >>> On Tue, Jan 9, 2024 at 3:28 AM Maxim Muzafarov <[email protected]>
> wrote:
> > > >>>>
> > > >>>> Jon,
> > > >>>>
> > > >>>> That sounds good. Let's make these commands rely on the settings
> > > >>>> virtual table and keep the initial changes as minimal as possible.
> > > >>>>
> > > >>>> We've also scheduled a Cassandra Contributor Meeting on January
> 30th
> > > >>>> 2024, so I'll prepare some slides with everything we've got so
> far and
> > > >>>> try to prepare some drafts to demonstrate the design.
> > > >>>>
> https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Contributor+Meeting
> > > >>>>
> > > >>>> On Tue, 9 Jan 2024 at 00:55, Jon Haddad <[email protected]>
> wrote:
> > > >>>>>
> > > >>>>> It's great to see where this is going and thanks for the
> discussion on the ML.
> > > >>>>>
> > > >>>>> Personally, I think adding two new ways of accomplishing the
> same thing is a net negative. It means we need more documentation and
> creates inconsistencies across tools and users. The tradeoffs you've
> listed are worth considering, but in my opinion adding 2 new ways to
> accomplish the same thing hurts the project more than it helps.
> > > >>>>>
> > > >>>>>> - I'd like to see a symmetry between the JMX and CQL APIs, so
> that users will have a sense of the commands they are using and are less
> > > >>>>> likely to check the documentation;
> > > >>>>>
> > > >>>>> I've worked with a couple hundred teams and I can only think of
> a few who use JMX directly. It's done very rarely. After 10 years, I
> still have to look up the JMX syntax to do anything useful, especially if
> there's any quoting involved. Power users might know a handful of JMX
> commands by heart, but I suspect most have a handful of bash scripts they
> use instead, or have a sidecar. I also think very few users will migrate
> their management code from JMX to CQL, nor do I imagine we'll move our own
> tools until the `disablebinary` problem is solved.
> > > >>>>>
> > > >>>>>> - It will be easier for us to move the nodetool from the jmx
> client that is used under the hood to an implementation based on a
> java-driver and use the CQL for the same;
> > > >>>>>
> > > >>>>> I can't imagine this would make a material difference. If
> someone's rewriting a nodetool command, how much time will be spent
> replacing the JMX call with a CQL one? Looking up a virtual table isn't
> going to be what consumes someone's time in this process. Again, this
> won't be done without solving `nodetool disablebinary`.
> > > >>>>>
> > > >>>>>> if we have cassandra-15254 merged, it will cost almost nothing
> to support the exec syntax for setting properties;
> > > >>>>>
> > > >>>>> My concern is more about the weird user experience of having two
> ways of doing the same thing, less about the technical overhead of adding a
> second implementation. I propose we start simple, see if any of the
> reasons you've listed are actually a real problem, then if they are,
> address the issue in a follow up.
> > > >>>>>
> > > >>>>> If I'm wrong, it sounds like it's fairly easy to add `exec` for
> changing configs. If I'm right, we'll have two confusing syntaxes
> forever. It's a lot easier to add something later than take it away.
> > > >>>>>
> > > >>>>> How does that sound?
> > > >>>>>
> > > >>>>> Jon
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>> On Mon, Jan 8, 2024 at 7:55 PM Maxim Muzafarov <
> [email protected]> wrote:
> > > >>>>>>
> > > >>>>>>> Some operations will no doubt require a stored procedure
> syntax, but perhaps it would be a good idea to split the work into two:
> > > >>>>>>
> > > >>>>>> These are exactly the first steps I have in mind:
> > > >>>>>>
> > > >>>>>> [Ready for review]
> > > >>>>>> Allow UPDATE on settings virtual table to change running
> configurations
> > > >>>>>> https://issues.apache.org/jira/browse/CASSANDRA-15254
> > > >>>>>>
> > > >>>>>> This issue is specifically aimed at changing the configuration
> > > >>>>>> properties we are talking about (value is in yaml format):
> > > >>>>>> e.g. UPDATE system_views.settings SET compaction_throughput =
> 128Mb/s;
> > > >>>>>>
> > > >>>>>> [Ready for review]
> > > >>>>>> Expose all table metrics in virtual table
> > > >>>>>> https://issues.apache.org/jira/browse/CASSANDRA-14572
> > > >>>>>>
> > > >>>>>> This is to observe the running configuration and all available
> metrics:
> > > >>>>>> e.g. select * from system_views.thread_pools;
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> I hope both of the issues above will become part of the trunk
> branch
> > > >>>>>> before we move on to the CQL management commands. In this
> topic, I'd
> > > >>>>>> like to discuss the design of the CQL API, and gather feedback,
> so
> > > >>>>>> that I can prepare a draft of changes to look at without any
> > > >>>>>> surprises, and that's exactly what this discussion is about.
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> cqlsh> UPDATE system.settings SET compaction_throughput = 128;
> > > >>>>>> cqlsh> exec setcompactionthroughput 128
> > > >>>>>>
> > > >>>>>> I don't mind removing the exec command from the CQL command API
> which
> > > >>>>>> is intended to change settings. Personally, I see the second
> option as
> > > >>>>>> just an alias for the first command, and in fact, they will
> have the
> > > >>>>>> same implementation under the hood, so please consider the
> rationale
> > > >>>>>> below:
> > > >>>>>>
> > > >>>>>> - I'd like to see a symmetry between the JMX and CQL APIs, so
> that
> > > >>>>>> users will have a sense of the commands they are using and are
> less
> > > >>>>>> likely to check the documentation;
> > > >>>>>> - It will be easier for us to move the nodetool from the jmx
> client
> > > >>>>>> that is used under the hood to an implementation based on a
> > > >>>>>> java-driver and use the CQL for the same;
> > > >>>>>> - if we have cassandra-15254 merged, it will cost almost
> nothing to
> > > >>>>>> support the exec syntax for setting properties;
> > > >>>>>>
> > > >>>>>> On Mon, 8 Jan 2024 at 20:13, Jon Haddad <[email protected]>
> wrote:
> > > >>>>>>>
> > > >>>>>>> Ugh, I moved some stuff around and 2 paragraphs got merged
> that shouldn't have been.
> > > >>>>>>>
> > > >>>>>>> I think there's no way we could rip out JMX, there's just too
> many benefits to having it and effectively zero benefits to removing.
> > > >>>>>>>
> > > >>>>>>> Regarding disablebinary, part of me wonders if this is a bit
> of a hammer, and what we really want is "disable binary for non-admins".
> I'm not sure what the best path is to get there. The local unix socket
> might be the easiest path as it allows us to disable network binary easily
> and still allow local admins, and allows the OS to reject the incoming
> connections vs passing that work onto a connection handler which would have
> to evaluate whether or not the user can connect. If a node is already in a
> bad spot requiring disable binary, it's probably not a good idea to have it
> get DDOS'ed as part of the remediation.
> > > >>>>>>>
> > > >>>>>>> Sorry for multiple emails.
> > > >>>>>>>
> > > >>>>>>> Jon
> > > >>>>>>>
> > > >>>>>>> On Mon, Jan 8, 2024 at 4:11 PM Jon Haddad <[email protected]>
> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> Syntactically, if we’re updating settings like compaction
> throughput, I would prefer to simply update a virtual settings table
> > > >>>>>>>>> e.g. UPDATE system.settings SET compaction_throughput = 128
> > > >>>>>>>>
> > > >>>>>>>> I agree with this, sorry if that wasn't clear in my previous
> email.
> > > >>>>>>>>
> > > >>>>>>>>> Some operations will no doubt require a stored procedure
> syntax,
> > > >>>>>>>>
> > > >>>>>>>> The alternative to the stored procedure syntax is to have
> first class support for operations like REPAIR or COMPACT, which could be
> interesting. It might be a little nicer if the commands are first class
> citizens. I'm not sure what the downside would be besides adding complexity
> to the parser. I think I like the idea as it would allow for intuitive tab
> completion (REPAIR <tab>) and mentally fit in with the rest of the
> permission system, and be fairly obvious what permission relates to what
> action.
> > > >>>>>>>>
> > > >>>>>>>> cqlsh > GRANT INCREMENTAL REPAIR ON mykeyspace.mytable TO jon;
> > > >>>>>>>>
> > > >>>>>>>> I realize the ability to grant permissions could be done for
> the stored procedure syntax as well, but I think it's a bit more consistent
> to represent it the same way as DDL and probably better for the end user.
> > > >>>>>>>>
> > > >>>>>>>> Postgres seems to generally do admin stuff with SELECT
> function(): https://www.postgresql.org/docs/9.3/functions-admin.html. It
> feels a bit weird to me to use SELECT to do things like kill DB
> connections, but that might just be b/c it's not how I typically work with
> a database. VACUUM is a standalone command though.
> > > >>>>>>>>
> > > >>>>>>>> Curious to hear what people's thoughts are on this.
> > > >>>>>>>>
> > > >>>>>>>>> I would like to see us move to decentralised structured
> settings management at the same time, so that we can set properties for the
> whole cluster, or data centres, or individual nodes via the same mechanism
> - all from any node in the cluster. I would be happy to help out with this
> work, if time permits.
> > > >>>>>>>>
> > > >>>>>>>> This would be nice. Spinnaker has this feature and I found
> it to be very valuable at Netflix when making large changes.
> > > >>>>>>>>
> > > >>>>>>>> Regarding JMX - I think since it's about as close as we can
> get to "free" I don't really consider it to be additional overhead, a
> decent escape hatch, and I can't see us removing any functionality that
> most teams would consider critical.
> > > >>>>>>>>
> > > >>>>>>>>> We need something that's available for use before the node
> comes fully online
> > > >>>>>>>>> Supporting backwards compat, especially for automated ops
> (i.e. nodetool, JMX, etc), is crucial. Painful, but crucial.
> > > >>>>>>>>
> > > >>>>>>>> I think there's no way we could rip out JMX, there's just too
> many benefits to having it and effectively zero benefits to removing. Part
> of me wonders if this is a bit of a hammer, and what we really want is
> "disable binary for non-admins". I'm not sure what the best path is to get
> there. The local unix socket might be the easiest path as it allows us to
> disable network binary easily and still allow local admins, and allows the
> OS to reject the incoming connections vs passing that work onto a
> connection handler which would have to evaluate whether or not the user can
> connect. If a node is already in a bad spot requiring disable binary, it's
> probably not a good idea to have it get DDOS'ed as part of the remediation.
> > > >>>>>>>>
> > > >>>>>>>> I think it's safe to say there's no appetite to remove JMX,
> at least not for anyone that would have to rework their entire admin
> control plane, plus whatever is out there in OSS provisioning tools like
> puppet / chef / etc that rely on JMX. I see no value whatsoever in
> removing it.
> > > >>>>>>>>
> > > >>>>>>>> I should probably have phrased my earlier email a bit
> differently. Maybe this is better:
> > > >>>>>>>>
> > > >>>>>>>> Fundamentally, I think it's better for the project if
> administration is fully supported over CQL in addition to JMX, without
> introducing a redundant third option, with the project's preference being
> CQL.
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> On Mon, Jan 8, 2024 at 9:10 AM Benedict Elliott Smith <
> [email protected]> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>> Syntactically, if we’re updating settings like compaction
> throughput, I would prefer to simply update a virtual settings table
> > > >>>>>>>>>
> > > >>>>>>>>> e.g. UPDATE system.settings SET compaction_throughput = 128
> > > >>>>>>>>>
> > > >>>>>>>>> Some operations will no doubt require a stored procedure
> syntax, but perhaps it would be a good idea to split the work into two: one
> part to address settings like those above, and another for maintenance
> operations such as triggering major compactions, repair and the like?
> > > >>>>>>>>>
> > > >>>>>>>>> I would like to see us move to decentralised structured
> settings management at the same time, so that we can set properties for the
> whole cluster, or data centres, or individual nodes via the same mechanism
> - all from any node in the cluster. I would be happy to help out with this
> work, if time permits.
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> On 8 Jan 2024, at 11:42, Josh McKenzie <[email protected]>
> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>> Fundamentally, I think it's better for the project if
> administration is fully done over CQL and we have a consistent, single way
> of doing things.
> > > >>>>>>>>>
> > > >>>>>>>>> Strongly agree here. With 2 caveats:
> > > >>>>>>>>>
> > > >>>>>>>>> Supporting backwards compat, especially for automated ops
> (i.e. nodetool, JMX, etc), is crucial. Painful, but crucial.
> > > >>>>>>>>> We need something that's available for use before the node
> comes fully online; the point Jeff always brings up when we discuss moving
> away from JMX. So long as we have some kind of "out-of-band" access to
> nodes or accommodation for that, we should be good.
> > > >>>>>>>>>
> > > >>>>>>>>> For context on point 2, see slack:
> https://the-asf.slack.com/archives/CK23JSY2K/p1688745128122749?thread_ts=1688662169.018449&cid=CK23JSY2K
> > > >>>>>>>>>
> > > >>>>>>>>> I point out that JMX works before and after the native
> protocol is running (startup, shutdown, joining, leaving), and also it's
> semi-common for us to disable the native protocol in certain circumstances,
> so at the very least, we'd then need to implement a totally different cql
> protocol interface just for administration, which nobody has committed to
> building yet.
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> I think this is a solvable problem, and I think the benefits
> of having a single, elegant way of interacting with a cluster and
> configuring it justifies the investment for us as a project. Assuming
> someone has the cycles to, you know, actually do the work. :D
> > > >>>>>>>>>
> > > >>>>>>>>> On Sun, Jan 7, 2024, at 10:41 PM, Jon Haddad wrote:
> > > >>>>>>>>>
> > > >>>>>>>>> I like the idea of the ability to execute certain commands
> via CQL, but I think it only makes sense for the nodetool commands that
> cause an action to take place, such as compact or repair. We already have
> virtual tables, I don't think we need another layer to run informational
> queries. I see little value in having the following (I'm using exec here
> for simplicity):
> > > >>>>>>>>>
> > > >>>>>>>>> cqlsh> exec tpstats
> > > >>>>>>>>>
> > > >>>>>>>>> which returns a string in addition to:
> > > >>>>>>>>>
> > > >>>>>>>>> cqlsh> select * from system_views.thread_pools
> > > >>>>>>>>>
> > > >>>>>>>>> which returns structured data.
> > > >>>>>>>>>
> > > >>>>>>>>> I'd also rather see updatable configuration virtual tables
> instead of
> > > >>>>>>>>>
> > > >>>>>>>>> cqlsh> exec setcompactionthroughput 128
> > > >>>>>>>>>
> > > >>>>>>>>> Fundamentally, I think it's better for the project if
> administration is fully done over CQL and we have a consistent, single way
> of doing things. I'm not dead set on it, I just think less is more in a
> lot of situations, this being one of them.
> > > >>>>>>>>>
> > > >>>>>>>>> Jon
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> On Wed, Jan 3, 2024 at 2:56 PM Maxim Muzafarov <
> [email protected]> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>> Happy New Year to everyone! I'd like to thank everyone for
> their
> > > >>>>>>>>> questions, because answering them forces us to move towards
> the right
> > > >>>>>>>>> solution, and I also like the ML discussions for the time
> they give to
> > > >>>>>>>>> investigate the code :-)
> > > >>>>>>>>>
> > > >>>>>>>>> I'm deliberately trying to limit the scope of the initial
> solution
> > > >>>>>>>>> (e.g. exclude the agent part) to keep the discussion short
> and clear,
> > > >>>>>>>>> but it's also important to have a glimpse of what we can do
> next once
> > > >>>>>>>>> we've finished with the topic.
> > > >>>>>>>>>
> > > >>>>>>>>> My view of the Command<> is that it is an abstraction in the
> broader
> > > >>>>>>>>> sense of an operation that can be performed on the local
> node,
> > > >>>>>>>>> involving one or a few internal components. This means that
> updating a
> > > >>>>>>>>> property in the settings virtual table via an update
> statement, or
> > > >>>>>>>>> executing e.g. the setconcurrentcompactors command are just
> aliases of
> > > >>>>>>>>> the same internal command via different APIs. Another
> example is the
> > > >>>>>>>>> netstats command, which simply aggregates the MessageService
> metrics
> > > >>>>>>>>> and returns them in a human-readable format (just another
> way of
> > > >>>>>>>>> looking at key-value metric pairs). More broadly, the
> command input is
> > > >>>>>>>>> Map<String, String> and String as the result (or
> List<String>).
> > > >>>>>>>>>
> > > >>>>>>>>> As Abe mentioned, Command and CommandRegistry should be
> largely based
> > > >>>>>>>>> on the nodetool command set at the beginning. We have a few
> options
> > > >>>>>>>>> for how we can initially construct command metadata during
> the
> > > >>>>>>>>> registry implementation (when moving command metadata from
> the
> > > >>>>>>>>> nodetool to the core part), so I'm planning to consult with
> the
> > > >>>>>>>>> command representations of the k8ssandra project, so that any
> > > >>>>>>>>> further registry adoptions have zero problems (by writing a
> test
> > > >>>>>>>>> openapi registry exporter and comparing the representation
> results).
> > > >>>>>>>>>
> > > >>>>>>>>> So, the MVP is the following:
> > > >>>>>>>>> - Command
> > > >>>>>>>>> - CommandRegistry
> > > >>>>>>>>> - CQLCommandExporter
> > > >>>>>>>>> - JMXCommandExporter
> > > >>>>>>>>> - the nodetool uses the JMXCommandExporter
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>> = Answers =
> > > >>>>>>>>>
> > > >>>>>>>>>> What do you have in mind specifically there? Do you plan on
> rewriting a brand new implementation which would be partially inspired by
> our agent? Or would the project integrate our agent code in-tree or as a
> dependency?
> > > >>>>>>>>>
> > > >>>>>>>>> Personally, I like the state of the k8ssandra project as it
> is now. My
> > > >>>>>>>>> understanding is that the server part of a database always
> lags behind
> > > >>>>>>>>> the client and sidecar parts in terms of the jdk version and
> the
> > > >>>>>>>>> features it provides. In contrast, sidecars should always be
> on top of
> > > >>>>>>>>> the market, so if we want to make an agent part in-tree,
> this should
> > > >>>>>>>>> be carefully considered for the flexibility which we may
> lose, as we
> > > >>>>>>>>> will not be able to change the agent part within the
> sidecar. The only
> > > >>>>>>>>> closest change I can see is that we can remove the
> interceptor part
> > > >>>>>>>>> once the CQL command interface is available. I suggest we
> move the
> > > >>>>>>>>> agent part to phase 2 and research it. wdyt?
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>>> How are the results of the commands expressed to the CQL
> client? Since the command is being treated as CQL, I guess it will be rows,
> right? If yes, some of the nodetool commands output are a bit hierarchical
> in nature (e.g. cfstats, netstats etc...). How are these cases handled?
> > > >>>>>>>>>
> > > >>>>>>>>> I think the result of the execution should be a simple
> string (or set
> > > >>>>>>>>> of strings), which by its nature matches the nodetool
> output. I would
> > > >>>>>>>>> avoid building complex output or output schemas for now to
> simplify
> > > >>>>>>>>> the initial changes.
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>>> Any changes expected at client/driver side?
> > > >>>>>>>>>
> > > >>>>>>>>> I'd like to keep the initial changes to a server part only,
> to avoid
> > > >>>>>>>>> scope inflation. For the driver part, I have checked the
> ExecutionInfo
> > > >>>>>>>>> interface provided by the java-driver, which should probably
> be used
> > > >>>>>>>>> as a command execution status holder. We'd like to have a
> unique
> > > >>>>>>>>> command execution id for each command that is executed on
> the node, so
> > > >>>>>>>>> the ExecutionInfo should probably hold such an id. Currently
> it has
> > > >>>>>>>>> the UUID getTracingId(), which is not well suited for our
> case and I
> > > >>>>>>>>> think further changes and follow-ups will be required here
> (including
> > > >>>>>>>>> the binary protocol, I think).
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>>> The term COMMAND is a bit abstract I feel (subjective)...
> And I also feel the settings part is overlapping with virtual tables.
> > > >>>>>>>>>
> > > >>>>>>>>> I think we should keep the term Command as broad as
> possible. As
> > > >>>>>>>>> long as we have a single implementation of a command, and
> the cost of
> > > >>>>>>>>> maintaining that piece of the source code is low, it's even
> better if
> > > >>>>>>>>> we have a few ways to achieve the same result using
> different APIs.
> > > >>>>>>>>> Personally, the only thing I would vote for is the
> separation of
> > > >>>>>>>>> command and metric terms (they shouldn't be mixed up).
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>>> How are the responses of different operations expressed
> through the Command API? If the Command Registry Adapters depend upon the
> command metadata for invoking/validating the command, then I think there
> has to be a way for them to interpret the response format also, right?
> > > >>>>>>>>>
> > > >>>>>>>>> I'm not sure that I've understood the question correctly. Are you
> talking
> > > >>>>>>>>> about the command execution result schema and the validation
> of that
> > > >>>>>>>>> schema?
> > > >>>>>>>>>
> > > >>>>>>>>> For now, I see the interface as follows, the result of the
> execution
> > > >>>>>>>>> is a type that can be converted to the same string as the
> nodetool has
> > > >>>>>>>>> for the corresponding command (so that the outputs match):
> > > >>>>>>>>>
> > > >>>>>>>>> Command<A, R>
> > > >>>>>>>>> {
> > > >>>>>>>>> printResult(A argument, R result, Consumer<String>
> printer);
> > > >>>>>>>>> }
> > > >>>>>>>>>
> > > >>>>>>>>> On Tue, 5 Dec 2023 at 16:51, Abe Ratnofsky <[email protected]>
> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>> Adding to Hari's comments:
> > > >>>>>>>>>>
> > > >>>>>>>>>>> Any changes expected at client/driver side? While using
> JMX/nodetool, it is clear that the command/operations are getting executed
> against which Cassandra node. But a client can connect to multiple hosts
> and trigger queries, then how can it ensure that commands are executed
> against the desired Cassandra instance?
> > > >>>>>>>>>>
> > > >>>>>>>>>> Clients are expected to set the node for the given CQL
> statement in cases like this; see docstring for example:
> https://github.com/apache/cassandra-java-driver/blob/4.x/core/src/main/java/com/datastax/oss/driver/api/core/cql/Statement.java#L124-L147
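
For reference, a minimal 4.x java-driver sketch of what Abe describes; how
the node is picked and the statement text (which assumes the CEP-38 syntax)
are illustrative:

import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.SimpleStatement;
import com.datastax.oss.driver.api.core.metadata.Node;

public class PinnedCommandSketch
{
    public static void main(String[] args)
    {
        try (CqlSession session = CqlSession.builder().build())
        {
            // Pick the node the command should run on (selection criteria are up to the caller).
            Node target = session.getMetadata().getNodes().values().iterator().next();

            // setNode() pins the statement to that coordinator, per the docstring linked above.
            SimpleStatement stmt = SimpleStatement
                    .newInstance("EXECUTE COMMAND forcecompact WITH keyspace=distributed_test_keyspace AND table=tbl")
                    .setNode(target);
            session.execute(stmt);
        }
    }
}
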
> > > >>>>>>>>>>
> > > >>>>>>>>>>> The term COMMAND is a bit abstract I feel (subjective).
> Some of the examples quoted are referring to updating settings (for
> example: EXECUTE COMMAND setconcurrentcompactors WITH
> concurrent_compactors=5;) and some are referring to operations. Updating
> settings and running operations are considerably different things. They may
> have to be handled in their own way. And I also feel the settings part is
> overlapping with virtual tables. If virtual tables support writes (at least
> the settings virtual table), then settings can be updated using the virtual
> table itself.
> > > >>>>>>>>>>
> > > >>>>>>>>>> I agree with this - I actually think it would be clearer if
> this was referred to as nodetool, if the set of commands is going to be
> largely based on nodetool at the beginning. There is a lot of documentation
> online that references nodetool by name, and changing the nomenclature
> would make that existing documentation harder to understand. If a user can
> understand this as "nodetool, but better and over CQL not JMX" I think
> that's a clearer transition than a new concept of "commands".
> > > >>>>>>>>>>
> > > >>>>>>>>>> I understand that this proposal includes more than just
> nodetool, but there's a benefit to having a tool with a name, and a web
> search for "cassandra commands" is going to have more competition and
> ambiguity.
> > > >>>>>>>>>
> > > >>>>>>>>>
> > >
>