Re: [DISCUSSION] CEP-38: CQL Management API

Jon Haddad Mon, 08 Jan 2024 11:11:47 -0800

> Syntactically, if we’re updating settings like compaction throughput, I
> would prefer to simply update a virtual settings table
> e.g. UPDATE system.settings SET compaction_throughput = 128


I agree with this, sorry if that wasn't clear in my previous email.

> Some operations will no doubt require a stored procedure syntax,

The alternative to the stored procedure syntax is to have first class
support for operations like REPAIR or COMPACT, which could be interesting.
It might be a little nicer if the commands are first class citizens. I'm
not sure what the downside would be besides adding complexity to the
parser.  I think I like the idea as it would allow for intuitive tab
completion (REPAIR <tab>) and mentally fit in with the rest of the
permission system, and be fairly obvious what permission relates to what
action.

cqlsh> GRANT INCREMENTAL REPAIR ON mykeyspace.mytable TO jon;

I realize the ability to grant permissions could be done for the stored
procedure syntax as well, but I think it's a bit more consistent to
represent it the same way as DDL and probably better for the end user.
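
The appeal of first-class commands for permissions is that the permission name and the statement name line up one to one. A small sketch of that idea follows; the `Grants` class, the `AdminPermission` enum, and the permission names are invented for illustration and do not correspond to Cassandra's actual permission model.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical model; Cassandra's real permission system is not shown here.
final class Grants
{
    // One permission per first-class admin statement (names are made up).
    enum AdminPermission { INCREMENTAL_REPAIR, FULL_REPAIR, COMPACT }

    // Key is "role|keyspace.table".
    private final Map<String, Set<AdminPermission>> granted = new HashMap<>();

    void grant(AdminPermission permission, String table, String role)
    {
        granted.computeIfAbsent(role + "|" + table, k -> EnumSet.noneOf(AdminPermission.class))
               .add(permission);
    }

    boolean isAllowed(String role, AdminPermission permission, String table)
    {
        return granted.getOrDefault(role + "|" + table, Set.of()).contains(permission);
    }

    public static void main(String[] args)
    {
        Grants grants = new Grants();
        // Mirrors: GRANT INCREMENTAL REPAIR ON mykeyspace.mytable TO jon;
        grants.grant(AdminPermission.INCREMENTAL_REPAIR, "mykeyspace.mytable", "jon");

        System.out.println(grants.isAllowed("jon", AdminPermission.INCREMENTAL_REPAIR, "mykeyspace.mytable"));
        System.out.println(grants.isAllowed("jon", AdminPermission.COMPACT, "mykeyspace.mytable"));
    }
}
```

With a stored procedure syntax the same check would need a separate mapping from procedure name to permission, which is the extra indirection the paragraph above argues against.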

Postgres seems to generally do admin stuff with SELECT function():
https://www.postgresql.org/docs/9.3/functions-admin.html.  It feels a bit
weird to me to use SELECT to do things like kill DB connections, but that
might just be b/c it's not how I typically work with a database.  VACUUM is
a standalone command though.

Curious to hear what people's thoughts are on this.

> I would like to see us move to decentralised structured settings
> management at the same time, so that we can set properties for the whole
> cluster, or data centres, or individual nodes via the same mechanism - all
> from any node in the cluster. I would be happy to help out with this work,
> if time permits.

This would be nice.  Spinnaker has this feature and I found it to be very
valuable at Netflix when making large changes.
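
One way to picture settings that can be set for the whole cluster, a data centre, or a single node is a lookup that prefers the most specific scope. The sketch below is only illustrative: the `ScopedSettings` class, its method names, and the node > data centre > cluster precedence are assumptions, not something the CEP or Cassandra defines.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical model of scoped settings; names and precedence are assumptions.
final class ScopedSettings
{
    private final Map<String, String> cluster = new HashMap<>();
    private final Map<String, Map<String, String>> perDatacenter = new HashMap<>();
    private final Map<String, Map<String, String>> perNode = new HashMap<>();

    void setForCluster(String name, String value)
    {
        cluster.put(name, value);
    }

    void setForDatacenter(String dc, String name, String value)
    {
        perDatacenter.computeIfAbsent(dc, k -> new HashMap<>()).put(name, value);
    }

    void setForNode(String node, String name, String value)
    {
        perNode.computeIfAbsent(node, k -> new HashMap<>()).put(name, value);
    }

    // Most specific scope wins: node, then data centre, then cluster.
    Optional<String> resolve(String node, String dc, String name)
    {
        String value = perNode.getOrDefault(node, Map.of()).get(name);
        if (value == null)
            value = perDatacenter.getOrDefault(dc, Map.of()).get(name);
        if (value == null)
            value = cluster.get(name);
        return Optional.ofNullable(value);
    }

    public static void main(String[] args)
    {
        ScopedSettings settings = new ScopedSettings();
        settings.setForCluster("compaction_throughput", "64");
        settings.setForDatacenter("dc1", "compaction_throughput", "128");
        settings.setForNode("node-a", "compaction_throughput", "256");

        // node-a has its own override, node-b inherits from dc1, node-c from the cluster.
        System.out.println(settings.resolve("node-a", "dc1", "compaction_throughput"));
        System.out.println(settings.resolve("node-b", "dc1", "compaction_throughput"));
        System.out.println(settings.resolve("node-c", "dc2", "compaction_throughput"));
    }
}
```

Keeping the three scopes in one structure is what would let a large change be rolled out to one node or data centre first and widened later.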

Regarding JMX - since it's about as close as we can get to "free", I don't
really consider it to be additional overhead.  It's a decent escape hatch,
and I can't see us removing any functionality that most teams would
consider critical.

> We need something that's available for use before the node comes fully
> online
>
> Supporting backwards compat, especially for automated ops (i.e. nodetool,
> JMX, etc), is crucial. Painful, but crucial.

I think there's no way we could rip out JMX; there are just too many benefits
to having it and effectively zero benefits to removing it.  Part of me wonders
if this is a bit of a hammer, and what we really want is "disable binary
for non-admins".  I'm not sure what the best path is to get there.  The
local unix socket might be the easiest path as it allows us to disable
network binary easily and still allow local admins, and allows the OS to
reject the incoming connections vs passing that work onto a connection
handler which would have to evaluate whether or not the user can connect.
If a node is already in a bad spot requiring disable binary, it's probably
not a good idea to have it get DDOS'ed as part of the remediation.

I think it's safe to say there's no appetite to remove JMX, at least not
for anyone that would have to rework their entire admin control plane, plus
whatever is out there in OSS provisioning tools like puppet / chef / etc
that rely on JMX.  I see no value whatsoever in removing it.

I should probably have phrased my earlier email a bit differently.  Maybe
this is better:

Fundamentally, I think it's better for the project if administration is
fully supported over CQL in addition to JMX, without introducing a
redundant third option, with the project's preference being CQL.


On Mon, Jan 8, 2024 at 9:10 AM Benedict Elliott Smith <[email protected]>
wrote:

> Syntactically, if we’re updating settings like compaction throughput, I
> would prefer to simply update a virtual settings table
>
> e.g. UPDATE system.settings SET compaction_throughput = 128
>
> Some operations will no doubt require a stored procedure syntax, but
> perhaps it would be a good idea to split the work into two: one part to
> address settings like those above, and another for maintenance operations
> such as triggering major compactions, repair and the like?
>
> I would like to see us move to decentralised structured settings
> management at the same time, so that we can set properties for the whole
> cluster, or data centres, or individual nodes via the same mechanism - all
> from any node in the cluster. I would be happy to help out with this work,
> if time permits.
>
>
> On 8 Jan 2024, at 11:42, Josh McKenzie <[email protected]> wrote:
>
> Fundamentally, I think it's better for the project if administration is
> fully done over CQL and we have a consistent, single way of doing things.
>
> Strongly agree here. With 2 caveats:
>
>    1. Supporting backwards compat, especially for automated ops (i.e.
>    nodetool, JMX, etc), is crucial. Painful, but crucial.
>    2. We need something that's available for use before the node comes
>    fully online; the point Jeff always brings up when we discuss moving away
>    from JMX. So long as we have some kind of "out-of-band" access to nodes or
>    accommodation for that, we should be good.
>
> For context on point 2, see slack:
> https://the-asf.slack.com/archives/CK23JSY2K/p1688745128122749?thread_ts=1688662169.018449&cid=CK23JSY2K
>
> I point out that JMX works before and after the native protocol is running
> (startup, shutdown, joining, leaving), and also it's semi-common for us to
> disable the native protocol in certain circumstances, so at the very least,
> we'd then need to implement a totally different cql protocol interface just
> for administration, which nobody has committed to building yet.
>
>
> I think this is a solvable problem, and I think the benefits of having a
> single, elegant way of interacting with a cluster and configuring it
> justifies the investment for us as a project. Assuming someone has the
> cycles to, you know, actually do the work. :D
>
> On Sun, Jan 7, 2024, at 10:41 PM, Jon Haddad wrote:
>
> I like the idea of the ability to execute certain commands via CQL, but I
> think it only makes sense for the nodetool commands that cause an action to
> take place, such as compact or repair.  We already have virtual tables, I
> don't think we need another layer to run informational queries.  I see
> little value in having the following (I'm using exec here for simplicity):
>
> cqlsh> exec tpstats
>
> which returns a string in addition to:
>
> cqlsh> select * from system_views.thread_pools
>
> which returns structured data.
>
> I'd also rather see updatable configuration virtual tables instead of
>
> cqlsh> exec setcompactionthroughput 128
>
> Fundamentally, I think it's better for the project if administration is
> fully done over CQL and we have a consistent, single way of doing things.
> I'm not dead set on it, I just think less is more in a lot of situations,
> this being one of them.
>
> Jon
>
>
> On Wed, Jan 3, 2024 at 2:56 PM Maxim Muzafarov <[email protected]> wrote:
>
> Happy New Year to everyone! I'd like to thank everyone for their
> questions, because answering them forces us to move towards the right
> solution, and I also like the ML discussions for the time they give to
> investigate the code :-)
>
> I'm deliberately trying to limit the scope of the initial solution
> (e.g. exclude the agent part) to keep the discussion short and clear,
> but it's also important to have a glimpse of what we can do next once
> we've finished with the topic.
>
> My view of the Command<> is that it is an abstraction in the broader
> sense of an operation that can be performed on the local node,
> involving one of a few internal components. This means that updating a
> property in the settings virtual table via an update statement, or
> executing e.g. the setconcurrentcompactors command are just aliases of
> the same internal command via different APIs. Another example is the
> netstats command, which simply aggregates the MessageService metrics
> and returns them in a human-readable format (just another way of
> looking at key-value metric pairs). More broadly, the command input is
> a Map<String, String> and the result is a String (or List<String>).
>
> As Abe mentioned, Command and CommandRegistry should be largely based
> on the nodetool command set at the beginning. We have a few options
> for how we can initially construct command metadata during the
> registry implementation (when moving command metadata from the
> nodetool to the core part), so I'm planning to consult the command
> representations of the k8ssandra project so that any further registry
> adoptions run into zero problems (by writing a test OpenAPI registry
> exporter and comparing the representation results).
>
> So, the MVP is the following:
> - Command
> - CommandRegistry
> - CQLCommandExporter
> - JMXCommandExporter
> - the nodetool uses the JMXCommandExporter
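
To make the MVP list above concrete, here is a rough sketch of how one registered command could be reached through two different front ends. Every type body here is a hypothetical stand-in written for illustration; only the names `Command`, `CommandRegistry`, `CQLCommandExporter` and `JMXCommandExporter` come from the list, and the real interfaces proposed in CEP-38 may look quite different. The `Map<String, String>` input and `String` result follow the description earlier in this message.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; none of these bodies are the actual CEP-38 interfaces.
final class CommandRegistryDemo
{
    public static void main(String[] args)
    {
        CommandRegistry registry = new CommandRegistry();
        int[] concurrentCompactors = { 2 }; // stand-in for node-local state

        registry.register("setconcurrentcompactors", cmdArgs -> {
            concurrentCompactors[0] = Integer.parseInt(cmdArgs.get("concurrent_compactors"));
            return "concurrent_compactors=" + concurrentCompactors[0];
        });

        // Both exporters end up running the same Command instance.
        System.out.println(new CQLCommandExporter(registry)
            .execute("setconcurrentcompactors", Map.of("concurrent_compactors", "5")));
        System.out.println(new JMXCommandExporter(registry)
            .invoke("setconcurrentcompactors", "concurrent_compactors=8"));
    }
}

@FunctionalInterface
interface Command
{
    String execute(Map<String, String> arguments);
}

final class CommandRegistry
{
    private final Map<String, Command> commands = new TreeMap<>();

    void register(String name, Command command)
    {
        commands.put(name, command);
    }

    Command lookup(String name)
    {
        Command command = commands.get(name);
        if (command == null)
            throw new IllegalArgumentException("Unknown command: " + name);
        return command;
    }
}

// Front end that would sit behind a CQL statement in this sketch.
final class CQLCommandExporter
{
    private final CommandRegistry registry;

    CQLCommandExporter(CommandRegistry registry)
    {
        this.registry = registry;
    }

    String execute(String name, Map<String, String> arguments)
    {
        return registry.lookup(name).execute(arguments);
    }
}

// Front end that would sit behind a JMX operation; takes one "key=value" string.
final class JMXCommandExporter
{
    private final CommandRegistry registry;

    JMXCommandExporter(CommandRegistry registry)
    {
        this.registry = registry;
    }

    String invoke(String name, String keyValue)
    {
        String[] parts = keyValue.split("=", 2);
        return registry.lookup(name).execute(Map.of(parts[0], parts[1]));
    }
}
```

The point of the sketch is only that the exporters translate their own input shape into the shared map and delegate, so the command logic lives in one place.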
>
>
> = Answers =
>
> > What do you have in mind specifically there? Do you plan on rewriting a
> > brand new implementation which would be partially inspired by our agent? Or
> > would the project integrate our agent code in-tree or as a dependency?
>
> Personally, I like the state of the k8ssandra project as it is now. My
> understanding is that the server part of a database always lags behind
> the client and sidecar parts in terms of the jdk version and the
> features it provides. In contrast, sidecars should always be on top of
> the market, so if we want to make an agent part in-tree, this should
> be carefully considered for the flexibility which we may lose, as we
> will not be able to change the agent part within the sidecar. The
> closest change I can see is that we can remove the interceptor part
> once the CQL command interface is available. I suggest we move the
> agent part to phase 2 and research it. wdyt?
>
>
> > How are the results of the commands expressed to the CQL client? Since
> > the command is being treated as CQL, I guess it will be rows, right? If
> > yes, some of the nodetool commands output are a bit hierarchical in nature
> > (e.g. cfstats, netstats etc...). How are these cases handled?
>
> I think the result of the execution should be a simple string (or set
> of strings), which by its nature matches the nodetool output. I would
> avoid building complex output or output schemas for now to simplify
> the initial changes.
>
>
> > Any changes expected at client/driver side?
>
> I'd like to keep the initial changes to a server part only, to avoid
> scope inflation. For the driver part, I have checked the ExecutionInfo
> interface provided by the java-driver, which should probably be used
> as a command execution status holder. We'd like to have a unique
> command execution id for each command that is executed on the node, so
> the ExecutionInfo should probably hold such an id. Currently it has
> the UUID getTracingId(), which is not well suited for our case and I
> think further changes and follow-ups will be required here (including
> the binary protocol, I think).
>
>
> > The term COMMAND is a bit abstract I feel (subjective)... And I also
> > feel the settings part is overlapping with virtual tables.
>
> I think we should keep the term Command as broad as possible. As
> long as we have a single implementation of a command, and the cost of
> maintaining that piece of the source code is low, it's even better if
> we have a few ways to achieve the same result using different APIs.
> Personally, the only thing I would vote for is the separation of
> command and metric terms (they shouldn't be mixed up).
>
>
> > How are the responses of different operations expressed through the
> > Command API? If the Command Registry Adapters depend upon the command
> > metadata for invoking/validating the command, then I think there has to be
> > a way for them to interpret the response format also, right?
>
> I'm not sure I've understood the question correctly. Are you talking
> about the command execution result schema and the validation of that
> schema?
>
> For now, I see the interface as follows, the result of the execution
> is a type that can be converted to the same string as the nodetool has
> for the corresponding command (so that the outputs match):
>
> interface Command<A, R>
> {
>     // Prints the result in the same form nodetool uses for this command.
>     void printResult(A argument, R result, Consumer<String> printer);
> }
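
As a rough illustration of the interface above, the sketch below implements it for a made-up command. The `execute` method, the `GetCompactionThroughput` class, and the output text are assumptions added for the example; only `printResult(A, R, Consumer<String>)` comes from the message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch; only printResult(...) is taken from the message above.
final class PrintResultDemo
{
    public static void main(String[] args)
    {
        GetCompactionThroughput command = new GetCompactionThroughput(128);
        Integer result = command.execute(null);

        // A nodetool-like caller can print straight to stdout...
        command.printResult(null, result, System.out::println);

        // ...while another API can collect the same lines and return them to its client.
        List<String> lines = new ArrayList<>();
        command.printResult(null, result, lines::add);
        System.out.println(lines.size() + " line(s) captured");
    }
}

interface Command<A, R>
{
    R execute(A argument); // assumed for the example; not part of the quoted interface

    void printResult(A argument, R result, Consumer<String> printer);
}

// Made-up command that takes no argument and reports a single number.
final class GetCompactionThroughput implements Command<Void, Integer>
{
    private final int throughputMbPerSec;

    GetCompactionThroughput(int throughputMbPerSec)
    {
        this.throughputMbPerSec = throughputMbPerSec;
    }

    @Override
    public Integer execute(Void argument)
    {
        return throughputMbPerSec;
    }

    @Override
    public void printResult(Void argument, Integer result, Consumer<String> printer)
    {
        printer.accept("Current compaction throughput: " + result + " MB/s");
    }
}
```

Passing the `Consumer<String>` in, rather than returning a string, lets each front end decide where the lines go without the command knowing about it.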
>
> On Tue, 5 Dec 2023 at 16:51, Abe Ratnofsky <[email protected]> wrote:
> >
> > Adding to Hari's comments:
> >
> > > Any changes expected at client/driver side? While using JMX/nodetool,
> > > it is clear that the command/operations are getting executed against which
> > > Cassandra node. But a client can connect to multiple hosts and trigger
> > > queries, then how can it ensure that commands are executed against the
> > > desired Cassandra instance?
> >
> > Clients are expected to set the node for the given CQL statement in
> > cases like this; see docstring for example:
> > https://github.com/apache/cassandra-java-driver/blob/4.x/core/src/main/java/com/datastax/oss/driver/api/core/cql/Statement.java#L124-L147
> >
> > > The term COMMAND is a bit abstract I feel (subjective). Some of the
> > > examples quoted are referring to updating settings (for example: EXECUTE
> > > COMMAND setconcurrentcompactors WITH concurrent_compactors=5;) and some are
> > > referring to operations. Updating settings and running operations are
> > > considerably different things. They may have to be handled in their own
> > > way. And I also feel the settings part is overlapping with virtual tables.
> > > If virtual tables support writes (at least the settings virtual table),
> > > then settings can be updated using the virtual table itself.
> >
> > I agree with this - I actually think it would be clearer if this was
> > referred to as nodetool, if the set of commands is going to be largely
> > based on nodetool at the beginning. There is a lot of documentation online
> > that references nodetool by name, and changing the nomenclature would make
> > that existing documentation harder to understand. If a user can understand
> > this as "nodetool, but better and over CQL not JMX" I think that's a
> > clearer transition than a new concept of "commands".
> >
> > I understand that this proposal includes more than just nodetool, but
> > there's a benefit to having a tool with a name, and a web search for
> > "cassandra commands" is going to have more competition and ambiguity.
>
>
>
