Re: [DISCUSSION] CEP-38: CQL Management API

Dinesh Joshi Thu, 19 Sep 2024 11:11:42 -0700

No. Maxim and I have had some offline discussions. We need to make some
changes before we can be ready to vote on it.


On Thu, Sep 19, 2024 at 11:09 AM Patrick McFadin <[email protected]> wrote:

> There is no VOTE thread for this CEP. Is this ready for one?
>
> On Tue, Jan 9, 2024 at 3:28 AM Maxim Muzafarov <[email protected]> wrote:
>
>> Jon,
>>
>> That sounds good.  Let's make these commands rely on the settings
>> virtual table and keep the initial changes as minimal as possible.
>>
>> We've also scheduled a Cassandra Contributor Meeting on January 30th
>> 2024, so I'll prepare some slides with everything we've got so far and
>> try to prepare some drafts to demonstrate the design.
>>
>> https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Contributor+Meeting
>>
>> On Tue, 9 Jan 2024 at 00:55, Jon Haddad <[email protected]> wrote:
>> >
>> > It's great to see where this is going and thanks for the discussion on
>> the ML.
>> >
>> > Personally, I think adding two new ways of accomplishing the same thing
>> is a net negative.  It means we need more documentation and creates
>> inconsistencies across tools and users.  The tradeoffs you've listed are
>> worth considering, but in my opinion adding 2 new ways to accomplish the
>> same thing hurts the project more than it helps.
>> >
>> > > - I'd like to see a symmetry between the JMX and CQL APIs, so that
>> users will have a sense of the commands they are using and are less
>> > likely to check the documentation;
>> >
>> > I've worked with a couple hundred teams and I can only think of a few
>> who use JMX directly.  It's done very rarely.  After 10 years, I still have
>> to look up the JMX syntax to do anything useful, especially if there's any
>> quoting involved.  Power users might know a handful of JMX commands by
>> heart, but I suspect most have a handful of bash scripts they use instead,
>> or have a sidecar.  I also think very few users will migrate their
>> management code from JMX to CQL, nor do I imagine we'll move our own tools
>> until the `disablebinary` problem is solved.
>> >
>> > > - It will be easier for us to move the nodetool from the jmx client
>> that is used under the hood to an implementation based on a java-driver and
>> use the CQL for the same;
>> >
>> > I can't imagine this would make a material difference.  If someone's
>> rewriting a nodetool command, how much time will be spent replacing the JMX
>> call with a CQL one?  Looking up a virtual table isn't going to be what
>> consumes someone's time in this process.  Again, this won't be done without
>> solving `nodetool disablebinary`.
>> >
>> > > if we have cassandra-15254 merged, it will cost almost nothing to
>> support the exec syntax for setting properties;
>> >
>> > My concern is more about the weird user experience of having two ways
>> of doing the same thing, less about the technical overhead of adding a
>> second implementation.  I propose we start simple, see if any of the
>> reasons you've listed are actually a real problem, then if they are,
>> address the issue in a follow up.
>> >
>> > If I'm wrong, it sounds like it's fairly easy to add `exec` for
>> changing configs.  If I'm right, we'll have two confusing syntaxes
>> forever.  It's a lot easier to add something later than take it away.
>> >
>> > How does that sound?
>> >
>> > Jon
>> >
>> >
>> >
>> >
>> > On Mon, Jan 8, 2024 at 7:55 PM Maxim Muzafarov <[email protected]>
>> wrote:
>> >>
>> >> > Some operations will no doubt require a stored procedure syntax, but
>> perhaps it would be a good idea to split the work into two:
>> >>
>> >> These are exactly the first steps I have in mind:
>> >>
>> >> [Ready for review]
>> >> Allow UPDATE on settings virtual table to change running configurations
>> >> https://issues.apache.org/jira/browse/CASSANDRA-15254
>> >>
>> >> This issue is specifically aimed at changing the configuration
>> >> properties we are talking about (value is in yaml format):
>> >> e.g. UPDATE system_views.settings SET compaction_throughput = 128Mb/s;
>> >>
>> >> [Ready for review]
>> >> Expose all table metrics in virtual table
>> >> https://issues.apache.org/jira/browse/CASSANDRA-14572
>> >>
>> >> This is to observe the running configuration and all available metrics:
>> >> e.g. select * from system_views.thread_pools;
>> >>
>> >>
>> >> I hope both of the issues above will become part of the trunk branch
>> >> before we move on to the CQL management commands. In this topic, I'd
>> >> like to discuss the design of the CQL API, and gather feedback, so
>> >> that I can prepare a draft of changes to look at without any
>> >> surprises, and that's exactly what this discussion is about.
>> >>
>> >>
>> >> cqlsh> UPDATE system_views.settings SET compaction_throughput = 128;
>> >> cqlsh> exec setcompactionthroughput 128
>> >>
>> >> I don't mind removing the exec command from the CQL command API which
>> >> is intended to change settings. Personally, I see the second option as
>> >> just an alias for the first command, and in fact, they will have the
>> >> same implementation under the hood, so please consider the rationale
>> >> below:
>> >>
>> >> - I'd like to see a symmetry between the JMX and CQL APIs, so that
>> >> users will have a sense of the commands they are using and are less
>> >> likely to check the documentation;
>> >> - It will be easier for us to move the nodetool from the jmx client
>> >> that is used under the hood to an implementation based on a
>> >> java-driver and use the CQL for the same;
>> >> - if we have cassandra-15254 merged, it will cost almost nothing to
>> >> support the exec syntax for setting properties;
>> >>
>> >> On Mon, 8 Jan 2024 at 20:13, Jon Haddad <[email protected]> wrote:
>> >> >
>> >> > Ugh, I moved some stuff around and 2 paragraphs got merged that
>> shouldn't have been.
>> >> >
>> >> > I think there's no way we could rip out JMX, there's just too many
>> benefits to having it and effectively zero benefits to removing.
>> >> >
>> >> > Regarding disablebinary, part of me wonders if this is a bit of a
>> hammer, and what we really want is "disable binary for non-admins".  I'm
>> not sure what the best path is to get there.  The local unix socket might
>> be the easiest path as it allows us to disable network binary easily and
>> still allow local admins, and allows the OS to reject the incoming
>> connections vs passing that work onto a connection handler which would have
>> to evaluate whether or not the user can connect.  If a node is already in a
>> bad spot requiring disable binary, it's probably not a good idea to have it
>> get DDOS'ed as part of the remediation.
>> >> >
>> >> > Sorry for multiple emails.
>> >> >
>> >> > Jon
>> >> >
>> >> > On Mon, Jan 8, 2024 at 4:11 PM Jon Haddad <[email protected]> wrote:
>> >> >>
>> >> >> > Syntactically, if we’re updating settings like compaction
>> throughput, I would prefer to simply update a virtual settings table
>> >> >> > e.g. UPDATE system.settings SET compaction_throughput = 128
>> >> >>
>> >> >> I agree with this, sorry if that wasn't clear in my previous email.
>> >> >>
>> >> >> > Some operations will no doubt require a stored procedure syntax,
>> >> >>
>> >> >> The alternative to the stored procedure syntax is to have first
>> class support for operations like REPAIR or COMPACT, which could be
>> interesting.  It might be a little nicer if the commands are first class
>> citizens. I'm not sure what the downside would be besides adding complexity
>> to the parser.  I think I like the idea as it would allow for intuitive tab
>> completion (REPAIR <tab>) and mentally fit in with the rest of the
>> permission system, and be fairly obvious what permission relates to what
>> action.
>> >> >>
>> >> >> cqlsh > GRANT INCREMENTAL REPAIR ON mykeyspace.mytable TO jon;
>> >> >>
>> >> >> I realize the ability to grant permissions could be done for the
>> stored procedure syntax as well, but I think it's a bit more consistent to
>> represent it the same way as DDL and probably better for the end user.
>> >> >>
>> >> >> Postgres seems to generally do admin stuff with SELECT function():
>> https://www.postgresql.org/docs/9.3/functions-admin.html.  It feels a
>> bit weird to me to use SELECT to do things like kill DB connections, but
>> that might just be b/c it's not how I typically work with a database.
>> VACUUM is a standalone command though.
>> >> >>
>> >> >> Curious to hear what people's thoughts are on this.
>> >> >>
>> >> >> > I would like to see us move to decentralised structured settings
>> management at the same time, so that we can set properties for the whole
>> cluster, or data centres, or individual nodes via the same mechanism - all
>> from any node in the cluster. I would be happy to help out with this work,
>> if time permits.
>> >> >>
>> >> >> This would be nice.  Spinnaker has this feature and I found it to
>> be very valuable at Netflix when making large changes.
>> >> >>
>> >> >> Regarding JMX - I think since it's about as close as we can get to
>> "free" I don't really consider it to be additional overhead, a decent
>> escape hatch, and I can't see us removing any functionality that most teams
>> would consider critical.
>> >> >>
>> >> >> > We need something that's available for use before the node comes
>> fully online
>> >> >> > Supporting backwards compat, especially for automated ops (i.e.
>> nodetool, JMX, etc), is crucial. Painful, but crucial.
>> >> >>
>> >> >> I think there's no way we could rip out JMX, there's just too many
>> benefits to having it and effectively zero benefits to removing.  Part of
>> me wonders if this is a bit of a hammer, and what we really want is
>> "disable binary for non-admins".  I'm not sure what the best path is to get
>> there.  The local unix socket might be the easiest path as it allows us to
>> disable network binary easily and still allow local admins, and allows the
>> OS to reject the incoming connections vs passing that work onto a
>> connection handler which would have to evaluate whether or not the user can
>> connect.  If a node is already in a bad spot requiring disable binary, it's
>> probably not a good idea to have it get DDOS'ed as part of the remediation.
>> >> >>
>> >> >> I think it's safe to say there's no appetite to remove JMX, at
>> least not for anyone that would have to rework their entire admin control
>> plane, plus whatever is out there in OSS provisioning tools like puppet /
>> chef / etc that rely on JMX.  I see no value whatsoever in removing it.
>> >> >>
>> >> >> I should probably have phrased my earlier email a bit differently.
>> Maybe this is better:
>> >> >>
>> >> >> Fundamentally, I think it's better for the project if
>> administration is fully supported over CQL in addition to JMX, without
>> introducing a redundant third option, with the project's preference being
>> CQL.
>> >> >>
>> >> >>
>> >> >> On Mon, Jan 8, 2024 at 9:10 AM Benedict Elliott Smith <
>> [email protected]> wrote:
>> >> >>>
>> >> >>> Syntactically, if we’re updating settings like compaction
>> throughput, I would prefer to simply update a virtual settings table
>> >> >>>
>> >> >>> e.g. UPDATE system.settings SET compaction_throughput = 128
>> >> >>>
>> >> >>> Some operations will no doubt require a stored procedure syntax,
>> but perhaps it would be a good idea to split the work into two: one part to
>> address settings like those above, and another for maintenance operations
>> such as triggering major compactions, repair and the like?
>> >> >>>
>> >> >>> I would like to see us move to decentralised structured settings
>> management at the same time, so that we can set properties for the whole
>> cluster, or data centres, or individual nodes via the same mechanism - all
>> from any node in the cluster. I would be happy to help out with this work,
>> if time permits.
>> >> >>>
>> >> >>>
>> >> >>> On 8 Jan 2024, at 11:42, Josh McKenzie <[email protected]>
>> wrote:
>> >> >>>
>> >> >>> Fundamentally, I think it's better for the project if
>> administration is fully done over CQL and we have a consistent, single way
>> of doing things.
>> >> >>>
>> >> >>> Strongly agree here. With 2 caveats:
>> >> >>>
>> >> >>> 1. Supporting backwards compat, especially for automated ops (i.e.
>> nodetool, JMX, etc), is crucial. Painful, but crucial.
>> >> >>> 2. We need something that's available for use before the node comes
>> fully online; the point Jeff always brings up when we discuss moving away
>> from JMX. So long as we have some kind of "out-of-band" access to nodes or
>> accommodation for that, we should be good.
>> >> >>>
>> >> >>> For context on point 2, see slack:
>> https://the-asf.slack.com/archives/CK23JSY2K/p1688745128122749?thread_ts=1688662169.018449&cid=CK23JSY2K
>> >> >>>
>> >> >>> I point out that JMX works before and after the native protocol is
>> running (startup, shutdown, joining, leaving), and also it's semi-common
>> for us to disable the native protocol in certain circumstances, so at the
>> very least, we'd then need to implement a totally different cql protocol
>> interface just for administration, which nobody has committed to building
>> yet.
>> >> >>>
>> >> >>>
>> >> >>> I think this is a solvable problem, and I think the benefits of
>> having a single, elegant way of interacting with a cluster and configuring
>> it justifies the investment for us as a project. Assuming someone has the
>> cycles to, you know, actually do the work. :D
>> >> >>>
>> >> >>> On Sun, Jan 7, 2024, at 10:41 PM, Jon Haddad wrote:
>> >> >>>
>> >> >>> I like the idea of the ability to execute certain commands via
>> CQL, but I think it only makes sense for the nodetool commands that cause
>> an action to take place, such as compact or repair.  We already have
>> virtual tables, I don't think we need another layer to run informational
>> queries.  I see little value in having the following (I'm using exec here
>> for simplicity):
>> >> >>>
>> >> >>> cqlsh> exec tpstats
>> >> >>>
>> >> >>> which returns a string in addition to:
>> >> >>>
>> >> >>> cqlsh> select * from system_views.thread_pools
>> >> >>>
>> >> >>> which returns structured data.
>> >> >>>
>> >> >>> I'd also rather see updatable configuration virtual tables instead
>> of
>> >> >>>
>> >> >>> cqlsh> exec setcompactionthroughput 128
>> >> >>>
>> >> >>> Fundamentally, I think it's better for the project if
>> administration is fully done over CQL and we have a consistent, single way
>> of doing things.  I'm not dead set on it, I just think less is more in a
>> lot of situations, this being one of them.
>> >> >>>
>> >> >>> Jon
>> >> >>>
>> >> >>>
>> >> >>> On Wed, Jan 3, 2024 at 2:56 PM Maxim Muzafarov <[email protected]>
>> wrote:
>> >> >>>
>> >> >>> Happy New Year to everyone! I'd like to thank everyone for their
>> >> >>> questions, because answering them forces us to move towards the
>> right
>> >> >>> solution, and I also like the ML discussions for the time they
>> give to
>> >> >>> investigate the code :-)
>> >> >>>
>> >> >>> I'm deliberately trying to limit the scope of the initial solution
>> >> >>> (e.g. exclude the agent part) to keep the discussion short and
>> clear,
>> >> >>> but it's also important to have a glimpse of what we can do next
>> once
>> >> >>> we've finished with the topic.
>> >> >>>
>> >> >>> My view of the Command<> is that it is an abstraction in the
>> broader
>> >> >>> sense of an operation that can be performed on the local node,
>> >> >>> involving one of a few internal components. This means that
>> updating a
>> >> >>> property in the settings virtual table via an update statement, or
>> >> >>> executing e.g. the setconcurrentcompactors command are just
>> aliases of
>> >> >>> the same internal command via different APIs. Another example is
>> the
>> >> >>> netstats command, which simply aggregates the MessageService
>> metrics
>> >> >>> and returns them in a human-readable format (just another way of
>> >> >>> looking at key-value metric pairs). More broadly, the command input is
>> >> >>> a Map<String, String> and the result is a String (or List<String>).
>> >> >>>
>> >> >>> As Abe mentioned, Command and CommandRegistry should be largely
>> based
>> >> >>> on the nodetool command set at the beginning. We have a few options
>> >> >>> for how we can initially construct command metadata during the
>> >> >>> registry implementation (when moving command metadata from the
>> >> >>> nodetool to the core part), so I'm planning to consult with the
>> >> >>> command representations of the k8ssandra project so that any
>> >> >>> further registry adoptions have zero problems (by writing a test
>> >> >>> openapi registry exporter and comparing the representation
>> results).
>> >> >>>
>> >> >>> So, the MVP is the following:
>> >> >>> - Command
>> >> >>> - CommandRegistry
>> >> >>> - CQLCommandExporter
>> >> >>> - JMXCommandExporter
>> >> >>> - the nodetool uses the JMXCommandExporter
>> >> >>>
>> >> >>>
>> >> >>> = Answers =
>> >> >>>
>> >> >>> > What do you have in mind specifically there? Do you plan on
>> rewriting a brand new implementation which would be partially inspired by
>> our agent? Or would the project integrate our agent code in-tree or as a
>> dependency?
>> >> >>>
>> >> >>> Personally, I like the state of the k8ssandra project as it is
>> now. My
>> >> >>> understanding is that the server part of a database always lags
>> behind
>> >> >>> the client and sidecar parts in terms of the jdk version and the
>> >> >>> features it provides. In contrast, sidecars should always be on
>> top of
>> >> >>> the market, so if we want to make an agent part in-tree, this
>> should
>> >> >>> be carefully considered for the flexibility which we may lose, as
>> we
>> >> >>> will not be able to change the agent part within the sidecar. The
>> >> >>> closest change I can see is that we can remove the interceptor part
>> >> >>> once the CQL command interface is available. I suggest we move the
>> >> >>> agent part to phase 2 and research it. wdyt?
>> >> >>>
>> >> >>>
>> >> >>> > How are the results of the commands expressed to the CQL client?
>> Since the command is being treated as CQL, I guess it will be rows, right?
>> If yes, some of the nodetool commands output are a bit hierarchical in
>> nature (e.g. cfstats, netstats etc...). How are these cases handled?
>> >> >>>
>> >> >>> I think the result of the execution should be a simple string (or
>> set
>> >> >>> of strings), which by its nature matches the nodetool output. I
>> would
>> >> >>> avoid building complex output or output schemas for now to simplify
>> >> >>> the initial changes.
>> >> >>>
>> >> >>>
>> >> >>> > Any changes expected at client/driver side?
>> >> >>>
>> >> >>> I'd like to keep the initial changes to a server part only, to
>> avoid
>> >> >>> scope inflation. For the driver part, I have checked the
>> ExecutionInfo
>> >> >>> interface provided by the java-driver, which should probably be
>> used
>> >> >>> as a command execution status holder. We'd like to have a unique
>> >> >>> command execution id for each command that is executed on the
>> node, so
>> >> >>> the ExecutionInfo should probably hold such an id. Currently it has
>> >> >>> the UUID getTracingId(), which is not well suited for our case and
>> I
>> >> >>> think further changes and follow-ups will be required here
>> (including
>> >> >>> the binary protocol, I think).
>> >> >>>
>> >> >>>
>> >> >>> > The term COMMAND is a bit abstract I feel (subjective)... And I
>> also feel the settings part is overlapping with virtual tables.
>> >> >>>
>> >> >>> I think we should keep the term Command as broad as possible. As
>> >> >>> long as we have a single implementation of a command, and the cost
>> of
>> >> >>> maintaining that piece of the source code is low, it's even better
>> if
>> >> >>> we have a few ways to achieve the same result using different APIs.
>> >> >>> Personally, the only thing I would vote for is the separation of
>> >> >>> command and metric terms (they shouldn't be mixed up).
>> >> >>>
>> >> >>>
>> >> >>> > How are the responses of different operations expressed through
>> the Command API? If the Command Registry Adapters depend upon the command
>> metadata for invoking/validating the command, then I think there has to be
>> a way for them to interpret the response format also, right?
>> >> >>>
>> >> >>> I'm not sure that I've understood the question correctly. Are you talking
>> >> >>> about the command execution result schema and the validation of
>> that
>> >> >>> schema?
>> >> >>>
>> >> >>> For now, I see the interface as follows, the result of the
>> execution
>> >> >>> is a type that can be converted to the same string as the nodetool
>> has
>> >> >>> for the corresponding command (so that the outputs match):
>> >> >>>
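>> >> >>> import java.util.function.Consumer;
>> >> >>>
>> >> >>> interface Command<A, R>
>> >> >>> {
>> >> >>>     // Renders the result as the same text nodetool prints for this command.
>> >> >>>     void printResult(A argument, R result, Consumer<String> printer);
>> >> >>> }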
>> >> >>>
>> >> >>> On Tue, 5 Dec 2023 at 16:51, Abe Ratnofsky <[email protected]> wrote:
>> >> >>> >
>> >> >>> > Adding to Hari's comments:
>> >> >>> >
>> >> >>> > > Any changes expected at client/driver side? While using
>> JMX/nodetool, it is clear that the command/operations are getting executed
>> against which Cassandra node. But a client can connect to multiple hosts
>> and trigger queries, then how can it ensure that commands are executed
>> against the desired Cassandra instance?
>> >> >>> >
>> >> >>> > Clients are expected to set the node for the given CQL statement
>> in cases like this; see docstring for example:
>> https://github.com/apache/cassandra-java-driver/blob/4.x/core/src/main/java/com/datastax/oss/driver/api/core/cql/Statement.java#L124-L147
>> >> >>> >
>> >> >>> > > The term COMMAND is a bit abstract I feel (subjective). Some
>> of the examples quoted are referring to updating settings (for example:
>> EXECUTE COMMAND setconcurrentcompactors WITH concurrent_compactors=5;) and
>> some are referring to operations. Updating settings and running operations
>> are considerably different things. They may have to be handled in their own
>> way. And I also feel the settings part is overlapping with virtual tables.
>> If virtual tables support writes (at least the settings virtual table),
>> then settings can be updated using the virtual table itself.
>> >> >>> >
>> >> >>> > I agree with this - I actually think it would be clearer if this
>> was referred to as nodetool, if the set of commands is going to be largely
>> based on nodetool at the beginning. There is a lot of documentation online
>> that references nodetool by name, and changing the nomenclature would make
>> that existing documentation harder to understand. If a user can understand
>> this as "nodetool, but better and over CQL not JMX" I think that's a
>> clearer transition than a new concept of "commands".
>> >> >>> >
>> >> >>> > I understand that this proposal includes more than just
>> nodetool, but there's a benefit to having a tool with a name, and a web
>> search for "cassandra commands" is going to have more competition and
>> ambiguity.
>> >> >>>
>> >> >>>
>>
>
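
For readers following the Command / CommandRegistry discussion quoted above, here is a minimal Java sketch of the idea that one command implementation can be reached through several front ends. Almost everything in it is an assumption made for illustration: CommandSketch, CommandRegistry and its run method, the parse and execute methods, the "value" argument key, and the in-memory throughput field are hypothetical and are not taken from the CEP-38 draft or the Cassandra code base. Only the printResult signature and the Map<String, String> input / String output shape come from the thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Illustrative only: none of these types are the actual CEP-38 API. */
public final class CommandSketch
{
    /** One operation on the local node: argument A in, result R out. */
    interface Command<A, R>
    {
        String name();
        A parse(Map<String, String> raw);   // hypothetical: raw key/value input to typed argument
        R execute(A argument);              // hypothetical: perform the operation
        void printResult(A argument, R result, Consumer<String> printer);
    }

    /** Hypothetical registry: a single lookup point that any exporter could share. */
    static final class CommandRegistry
    {
        private final Map<String, Command<?, ?>> commands = new LinkedHashMap<>();

        void register(Command<?, ?> command)
        {
            commands.put(command.name(), command);
        }

        /** Runs a command by name and returns its printed, nodetool-style output. */
        String run(String name, Map<String, String> raw)
        {
            Command<?, ?> command = commands.get(name);
            if (command == null)
                throw new IllegalArgumentException("Unknown command: " + name);
            return invoke(command, raw);
        }

        // Helper that captures the wildcard types so parse/execute/print line up.
        private static <A, R> String invoke(Command<A, R> command, Map<String, String> raw)
        {
            A argument = command.parse(raw);
            R result = command.execute(argument);
            StringBuilder out = new StringBuilder();
            command.printResult(argument, result, line -> out.append(line).append('\n'));
            return out.toString();
        }
    }

    /** Stand-in command: updates a field in memory instead of a real node setting. */
    static final class SetCompactionThroughput implements Command<Integer, Integer>
    {
        private int throughput = 64; // assumed starting value, for the demo only

        public String name() { return "setcompactionthroughput"; }

        public Integer parse(Map<String, String> raw) { return Integer.parseInt(raw.get("value")); }

        public Integer execute(Integer argument)
        {
            int previous = throughput;
            throughput = argument;
            return previous; // the result is the value that was replaced
        }

        public void printResult(Integer argument, Integer result, Consumer<String> printer)
        {
            printer.accept("compaction throughput: " + result + " -> " + argument);
        }
    }

    public static void main(String[] args)
    {
        CommandRegistry registry = new CommandRegistry();
        registry.register(new SetCompactionThroughput());
        // prints: compaction throughput: 64 -> 128
        System.out.print(registry.run("setcompactionthroughput", Map.of("value", "128")));
    }
}
```

In this reading, a CQL-facing or JMX-facing exporter would only translate its own request format into the (name, Map) pair passed to run. That is one way to picture why the thread describes an UPDATE on the settings virtual table and exec setcompactionthroughput as aliases of the same implementation.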
