Happy New Year to everyone! I'd like to thank everyone for their questions, because answering them forces us to move towards the right solution, and I also like the ML discussions for the time they give to investigate the code :-)
I'm deliberately trying to limit the scope of the initial solution (e.g. exclude the agent part) to keep the discussion short and clear, but it's also important to have a glimpse of what we can do next once we've finished with the topic.

My view of the Command<> is that it is an abstraction, in the broader sense, of an operation that can be performed on the local node, involving one of a few internal components. This means that updating a property in the settings virtual table via an update statement and executing e.g. the setconcurrentcompactors command are just aliases of the same internal command via different APIs. Another example is the netstats command, which simply aggregates the MessageService metrics and returns them in a human-readable format (just another way of looking at key-value metric pairs). More broadly, the command input is a Map<String, String> and the result is a String (or List<String>).

As Abe mentioned, Command and CommandRegistry should be largely based on the nodetool command set at the beginning. We have a few options for how we can initially construct command metadata during the registry implementation (when moving command metadata from the nodetool to the core part), so I'm planning to compare against the command representations of the k8ssandra project so that any further registry adoption goes smoothly (by writing a test OpenAPI registry exporter and comparing the resulting representations).

So, the MVP is the following:
- Command
- CommandRegistry
- CQLCommandExporter
- JMXCommandExporter
- the nodetool uses the JMXCommandExporter

= Answers =

> What do you have in mind specifically there? Do you plan on rewriting a brand
> new implementation which would be partially inspired by our agent? Or would
> the project integrate our agent code in-tree or as a dependency?

Personally, I like the state of the k8ssandra project as it is now.
My understanding is that the server part of a database always lags behind the client and sidecar parts in terms of the JDK version and the features it provides. In contrast, sidecars should always be at the cutting edge, so if we want to bring the agent part in-tree, we should carefully consider the flexibility we may lose, as we will not be able to change the agent part within the sidecar. The nearest change I can see is that we can remove the interceptor part once the CQL command interface is available. I suggest we move the agent part to phase 2 and research it. wdyt?

> How are the results of the commands expressed to the CQL client? Since the
> command is being treated as CQL, I guess it will be rows, right? If yes, some
> of the nodetool commands output are a bit hierarchical in nature (e.g.
> cfstats, netstats etc...). How are these cases handled?

I think the result of the execution should be a simple string (or set of strings), which by its nature matches the nodetool output. I would avoid building complex output or output schemas for now to simplify the initial changes.

> Any changes expected at client/driver side?

I'd like to keep the initial changes to the server part only, to avoid scope inflation. For the driver part, I have checked the ExecutionInfo interface provided by the java-driver, which should probably be used as a command execution status holder. We'd like to have a unique command execution id for each command that is executed on the node, so the ExecutionInfo should probably hold such an id. Currently it has UUID getTracingId(), which is not well suited for our case, and I think further changes and follow-ups will be required here (including to the binary protocol, I think).

> The term COMMAND is a bit abstract I feel (subjective)... And I also feel the
> settings part is overlapping with virtual tables.

I think we should keep the term Command as broad as possible.
As long as we have a single implementation of a command, and the cost of maintaining that piece of the source code is low, it's even better if we have a few ways to achieve the same result using different APIs. Personally, the only thing I would vote for is the separation of the command and metric terms (they shouldn't be mixed up).

> How are the responses of different operations expressed through the Command
> API? If the Command Registry Adapters depend upon the command metadata for
> invoking/validating the command, then I think there has to be a way for them
> to interpret the response format also, right?

I'm not sure I've understood the question correctly. Are you talking about the command execution result schema and the validation of that schema?

For now, I see the interface as follows: the result of the execution is a type that can be converted to the same string as the nodetool has for the corresponding command (so that the outputs match):

    interface Command<A, R>
    {
        void printResult(A argument, R result, Consumer<String> printer);
    }

On Tue, 5 Dec 2023 at 16:51, Abe Ratnofsky <a...@aber.io> wrote:
>
> Adding to Hari's comments:
>
> > Any changes expected at client/driver side? While using JMX/nodetool, it is
> > clear that the command/operations are getting executed against which
> > Cassandra node. But a client can connect to multiple hosts and trigger
> > queries, then how can it ensure that commands are executed against the
> > desired Cassandra instance?
>
> Clients are expected to set the node for the given CQL statement in cases
> like this; see docstring for example:
> https://github.com/apache/cassandra-java-driver/blob/4.x/core/src/main/java/com/datastax/oss/driver/api/core/cql/Statement.java#L124-L147
>
> > The term COMMAND is a bit abstract I feel (subjective). Some of the
> > examples quoted are referring to updating settings (for example: EXECUTE
> > COMMAND setconcurrentcompactors WITH concurrent_compactors=5;) and some are
> > referring to operations.
> > Updating settings and running operations are
> > considerably different things. They may have to be handled in their own
> > way. And I also feel the settings part is overlapping with virtual tables.
> > If virtual tables support writes (at least the settings virtual table),
> > then settings can be updated using the virtual table itself.
>
> I agree with this - I actually think it would be clearer if this was referred
> to as nodetool, if the set of commands is going to be largely based on
> nodetool at the beginning. There is a lot of documentation online that
> references nodetool by name, and changing the nomenclature would make that
> existing documentation harder to understand. If a user can understand this as
> "nodetool, but better and over CQL not JMX" I think that's a clearer
> transition than a new concept of "commands".
>
> I understand that this proposal includes more than just nodetool, but there's
> a benefit to having a tool with a name, and a web search for "cassandra
> commands" is going to have more competition and ambiguity.
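
To make the "one internal command, several APIs" idea a bit more concrete, here is a rough Java sketch of how a Command, a CommandRegistry and a Map<String, String> input could fit together. Everything beyond the printResult signature is an assumption made for illustration: name(), parse(), execute(), the registry's run() method, the toy SetConcurrentCompactors class and the line it prints are all hypothetical and not taken from the Cassandra code base.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch only: these names and signatures are illustrative,
// not the actual Cassandra API.
interface Command<A, R> {
    String name();

    // Turn the generic key-value input into a typed argument.
    A parse(Map<String, String> input);

    R execute(A argument);

    // Render the result line by line through the supplied printer.
    void printResult(A argument, R result, Consumer<String> printer);
}

final class CommandRegistry {
    private final Map<String, Command<?, ?>> commands = new LinkedHashMap<>();

    void register(Command<?, ?> command) {
        commands.put(command.name(), command);
    }

    // Single entry point that any exporter (CQL, JMX, ...) could delegate to.
    List<String> run(String name, Map<String, String> input) {
        Command<?, ?> command = commands.get(name);
        if (command == null)
            throw new IllegalArgumentException("Unknown command: " + name);
        return invoke(command, input);
    }

    // Generic helper so the wildcard types are captured consistently.
    private static <A, R> List<String> invoke(Command<A, R> command, Map<String, String> input) {
        A argument = command.parse(input);
        R result = command.execute(argument);
        List<String> lines = new ArrayList<>();
        command.printResult(argument, result, lines::add);
        return lines;
    }
}

// Toy stand-in for setconcurrentcompactors: it only updates a local field,
// and the printed line is made up for this sketch.
final class SetConcurrentCompactors implements Command<Integer, Integer> {
    private int concurrentCompactors = 1;

    public String name() {
        return "setconcurrentcompactors";
    }

    public Integer parse(Map<String, String> input) {
        String raw = input.get("concurrent_compactors");
        if (raw == null)
            throw new IllegalArgumentException("concurrent_compactors is required");
        return Integer.parseInt(raw);
    }

    public Integer execute(Integer value) {
        if (value <= 0)
            throw new IllegalArgumentException("concurrent_compactors must be positive");
        concurrentCompactors = value;
        return concurrentCompactors;
    }

    public void printResult(Integer argument, Integer result, Consumer<String> printer) {
        printer.accept("concurrent_compactors=" + result);
    }
}

final class CommandSketch {
    public static void main(String[] args) {
        CommandRegistry registry = new CommandRegistry();
        registry.register(new SetConcurrentCompactors());
        registry.run("setconcurrentcompactors", Map.of("concurrent_compactors", "5"))
                .forEach(System.out::println);
    }
}
```

In this sketch a CQL exporter and a JMX exporter would both end up calling registry.run(...) with the same name and the same map, which is what I mean by the different APIs being aliases of one internal command.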