No. Maxim and I have had some offline discussions. We need to make some changes before we can be ready to vote on it.
On Thu, Sep 19, 2024 at 11:09 AM Patrick McFadin <pmcfa...@gmail.com> wrote: > There is no VOTE thread for this CEP. Is this ready for one? > > On Tue, Jan 9, 2024 at 3:28 AM Maxim Muzafarov <mmu...@apache.org> wrote: > >> Jon, >> >> That sounds good. Let's make these commands rely on the settings >> virtual table and keep the initial changes as minimal as possible. >> >> We've also scheduled a Cassandra Contributor Meeting on January 30th >> 2024, so I'll prepare some slides with everything we've got so far and >> try to prepare some drafts to demonstrate the design. >> >> https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Contributor+Meeting >> >> On Tue, 9 Jan 2024 at 00:55, Jon Haddad <j...@jonhaddad.com> wrote: >> > >> > It's great to see where this is going and thanks for the discussion on >> the ML. >> > >> > Personally, I think adding two new ways of accomplishing the same thing >> is a net negative. It means we need more documentation and creates >> inconsistencies across tools and users. The tradeoffs you've listed are >> worth considering, but in my opinion adding 2 new ways to accomplish the >> same thing hurts the project more than it helps. >> > >> > > - I'd like to see a symmetry between the JMX and CQL APIs, so that >> users will have a sense of the commands they are using and are less >> > likely to check the documentation; >> > >> > I've worked with a couple hundred teams and I can only think of a few >> who use JMX directly. It's done very rarely. After 10 years, I still have >> to look up the JMX syntax to do anything useful, especially if there's any >> quoting involved. Power users might know a handful of JMX commands by >> heart, but I suspect most have a handful of bash scripts they use instead, >> or have a sidecar. I also think very few users will migrate their >> management code from JMX to CQL, nor do I imagine we'll move our own tools >> until the `disablebinary` problem is solved. 
>> > >> > > - It will be easier for us to move the nodetool from the jmx client >> that is used under the hood to an implementation based on a java-driver and >> use the CQL for the same; >> > >> > I can't imagine this would make a material difference. If someone's >> rewriting a nodetool command, how much time will be spent replacing the JMX >> call with a CQL one? Looking up a virtual table isn't going to be what >> consumes someone's time in this process. Again, this won't be done without >> solving `nodetool disablebinary`. >> > >> > > if we have cassandra-15254 merged, it will cost almost nothing to >> support the exec syntax for setting properties; >> > >> > My concern is more about the weird user experience of having two ways >> of doing the same thing, less about the technical overhead of adding a >> second implementation. I propose we start simple, see if any of the >> reasons you've listed are actually a real problem, then if they are, >> address the issue in a follow up. >> > >> > If I'm wrong, it sounds like it's fairly easy to add `exec` for >> changing configs. If I'm right, we'll have two confusing syntaxes >> forever. It's a lot easier to add something later than take it away. >> > >> > How does that sound? >> > >> > Jon >> > >> > >> > >> > >> > On Mon, Jan 8, 2024 at 7:55 PM Maxim Muzafarov <mmu...@apache.org> >> wrote: >> >> >> >> > Some operations will no doubt require a stored procedure syntax, but >> perhaps it would be a good idea to split the work into two: >> >> >> >> These are exactly the first steps I have in mind: >> >> >> >> [Ready for review] >> >> Allow UPDATE on settings virtual table to change running configurations >> >> https://issues.apache.org/jira/browse/CASSANDRA-15254 >> >> >> >> This issue is specifically aimed at changing the configuration >> >> properties we are talking about (value is in yaml format): >> >> e.g. 
UPDATE system_views.settings SET compaction_throughput = 128Mb/s; >> >> >> >> [Ready for review] >> >> Expose all table metrics in virtual table >> >> https://issues.apache.org/jira/browse/CASSANDRA-14572 >> >> >> >> This is to observe the running configuration and all available metrics: >> >> e.g. select * from system_views.thread_pools; >> >> >> >> >> >> I hope both of the issues above will become part of the trunk branch >> >> before we move on to the CQL management commands. In this topic, I'd >> >> like to discuss the design of the CQL API, and gather feedback, so >> >> that I can prepare a draft of changes to look at without any >> >> surprises, and that's exactly what this discussion is about. >> >> >> >> >> >> cqlsh> UPDATE system.settings SET compaction_throughput = 128; >> >> cqlsh> exec setcompactionthroughput 128 >> >> >> >> I don't mind removing the exec command from the CQL command API which >> >> is intended to change settings. Personally, I see the second option as >> >> just an alias for the first command, and in fact, they will have the >> >> same implementation under the hood, so please consider the rationale >> >> below: >> >> >> >> - I'd like to see a symmetry between the JMX and CQL APIs, so that >> >> users will have a sense of the commands they are using and are less >> >> likely to check the documentation; >> >> - It will be easier for us to move the nodetool from the jmx client >> >> that is used under the hood to an implementation based on a >> >> java-driver and use the CQL for the same; >> >> - if we have cassandra-15254 merged, it will cost almost nothing to >> >> support the exec syntax for setting properties; >> >> >> >> On Mon, 8 Jan 2024 at 20:13, Jon Haddad <j...@jonhaddad.com> wrote: >> >> > >> >> > Ugh, I moved some stuff around and 2 paragraphs got merged that >> shouldn't have been. 
>> >> > >> >> > I think there's no way we could rip out JMX, there's just too many >> benefits to having it and effectively zero benefits to removing. >> >> > >> >> > Regarding disablebinary, part of me wonders if this is a bit of a >> hammer, and what we really want is "disable binary for non-admins". I'm >> not sure what the best path is to get there. The local unix socket might >> be the easiest path as it allows us to disable network binary easily and >> still allow local admins, and allows the OS to reject the incoming >> connections vs passing that work onto a connection handler which would have >> to evaluate whether or not the user can connect. If a node is already in a >> bad spot requiring disable binary, it's probably not a good idea to have it >> get DDOS'ed as part of the remediation. >> >> > >> >> > Sorry for multiple emails. >> >> > >> >> > Jon >> >> > >> >> > On Mon, Jan 8, 2024 at 4:11 PM Jon Haddad <j...@jonhaddad.com> wrote: >> >> >> >> >> >> > Syntactically, if we’re updating settings like compaction >> throughput, I would prefer to simply update a virtual settings table >> >> >> > e.g. UPDATE system.settings SET compaction_throughput = 128 >> >> >> >> >> >> I agree with this, sorry if that wasn't clear in my previous email. >> >> >> >> >> >> > Some operations will no doubt require a stored procedure syntax, >> >> >> >> >> >> The alternative to the stored procedure syntax is to have first >> class support for operations like REPAIR or COMPACT, which could be >> interesting. It might be a little nicer if the commands are first class >> citizens. I'm not sure what the downside would be besides adding complexity >> to the parser. I think I like the idea as it would allow for intuitive tab >> completion (REPAIR <tab>) and mentally fit in with the rest of the >> permission system, and be fairly obvious what permission relates to what >> action. 
>> >> >> >> >> >> cqlsh > GRANT INCREMENTAL REPAIR ON mykeyspace.mytable TO jon; >> >> >> >> >> >> I realize the ability to grant permissions could be done for the >> stored procedure syntax as well, but I think it's a bit more consistent to >> represent it the same way as DDL and probably better for the end user. >> >> >> >> >> >> Postgres seems to generally do admin stuff with SELECT function(): >> https://www.postgresql.org/docs/9.3/functions-admin.html. It feels a >> bit weird to me to use SELECT to do things like kill DB connections, but >> that might just be b/c it's not how I typically work with a database. >> VACUUM is a standalone command though. >> >> >> >> >> >> Curious to hear what people's thoughts are on this. >> >> >> >> >> >> > I would like to see us move to decentralised structured settings >> management at the same time, so that we can set properties for the whole >> cluster, or data centres, or individual nodes via the same mechanism - all >> from any node in the cluster. I would be happy to help out with this work, >> if time permits. >> >> >> >> >> >> This would be nice. Spinnaker has this feature and I found it to >> be very valuable at Netflix when making large changes. >> >> >> >> >> >> Regarding JMX - I think since it's about as close as we can get to >> "free" I don't really consider it to be additional overhead, a decent >> escape hatch, and I can't see us removing any functionality that most teams >> would consider critical. >> >> >> >> >> >> > We need something that's available for use before the node comes >> fully online >> >> >> > Supporting backwards compat, especially for automated ops (i.e. >> nodetool, JMX, etc), is crucial. Painful, but crucial. >> >> >> >> >> >> I think there's no way we could rip out JMX, there's just too many >> benefits to having it and effectively zero benefits to removing. Part of >> me wonders if this is a bit of a hammer, and what we really want is >> "disable binary for non-admins". 
I'm not sure what the best path is to get >> there. The local unix socket might be the easiest path as it allows us to >> disable network binary easily and still allow local admins, and allows the >> OS to reject the incoming connections vs passing that work onto a >> connection handler which would have to evaluate whether or not the user can >> connect. If a node is already in a bad spot requiring disable binary, it's >> probably not a good idea to have it get DDOS'ed as part of the remediation. >> >> >> >> >> >> I think it's safe to say there's no appetite to remove JMX, at >> least not for anyone that would have to rework their entire admin control >> plane, plus whatever is out there in OSS provisioning tools like puppet / >> chef / etc that rely on JMX. I see no value whatsoever in removing it. >> >> >> >> >> >> I should probably have phrased my earlier email a bit differently. >> Maybe this is better: >> >> >> >> >> >> Fundamentally, I think it's better for the project if >> administration is fully supported over CQL in addition to JMX, without >> introducing a redundant third option, with the project's preference being >> CQL. >> >> >> >> >> >> >> >> >> On Mon, Jan 8, 2024 at 9:10 AM Benedict Elliott Smith < >> bened...@apache.org> wrote: >> >> >>> >> >> >>> Syntactically, if we’re updating settings like compaction >> throughput, I would prefer to simply update a virtual settings table >> >> >>> >> >> >>> e.g. UPDATE system.settings SET compaction_throughput = 128 >> >> >>> >> >> >>> Some operations will no doubt require a stored procedure syntax, >> but perhaps it would be a good idea to split the work into two: one part to >> address settings like those above, and another for maintenance operations >> such as triggering major compactions, repair and the like? 
>> >> >>> >> >> >>> I would like to see us move to decentralised structured settings >> management at the same time, so that we can set properties for the whole >> cluster, or data centres, or individual nodes via the same mechanism - all >> from any node in the cluster. I would be happy to help out with this work, >> if time permits. >> >> >>> >> >> >>> >> >> >>> On 8 Jan 2024, at 11:42, Josh McKenzie <jmcken...@apache.org> >> wrote: >> >> >>> >> >> >>> Fundamentally, I think it's better for the project if >> administration is fully done over CQL and we have a consistent, single way >> of doing things. >> >> >>> >> >> >>> Strongly agree here. With 2 caveats: >> >> >>> >> >> >>> Supporting backwards compat, especially for automated ops (i.e. >> nodetool, JMX, etc), is crucial. Painful, but crucial. >> >> >>> We need something that's available for use before the node comes >> fully online; the point Jeff always brings up when we discuss moving away >> from JMX. So long as we have some kind of "out-of-band" access to nodes or >> accommodation for that, we should be good. >> >> >>> >> >> >>> For context on point 2, see slack: >> https://the-asf.slack.com/archives/CK23JSY2K/p1688745128122749?thread_ts=1688662169.018449&cid=CK23JSY2K >> >> >>> >> >> >>> I point out that JMX works before and after the native protocol is >> running (startup, shutdown, joining, leaving), and also it's semi-common >> for us to disable the native protocol in certain circumstances, so at the >> very least, we'd then need to implement a totally different cql protocol >> interface just for administration, which nobody has committed to building >> yet. >> >> >>> >> >> >>> >> >> >>> I think this is a solvable problem, and I think the benefits of >> having a single, elegant way of interacting with a cluster and configuring >> it justifies the investment for us as a project. Assuming someone has the >> cycles to, you know, actually do the work. 
:D >> >> >>> >> >> >>> On Sun, Jan 7, 2024, at 10:41 PM, Jon Haddad wrote: >> >> >>> >> >> >>> I like the idea of the ability to execute certain commands via >> CQL, but I think it only makes sense for the nodetool commands that cause >> an action to take place, such as compact or repair. We already have >> virtual tables, I don't think we need another layer to run informational >> queries. I see little value in having the following (I'm using exec here >> for simplicity): >> >> >>> >> >> >>> cqlsh> exec tpstats >> >> >>> >> >> >>> which returns a string in addition to: >> >> >>> >> >> >>> cqlsh> select * from system_views.thread_pools >> >> >>> >> >> >>> which returns structured data. >> >> >>> >> >> >>> I'd also rather see updatable configuration virtual tables instead >> of >> >> >>> >> >> >>> cqlsh> exec setcompactionthroughput 128 >> >> >>> >> >> >>> Fundamentally, I think it's better for the project if >> administration is fully done over CQL and we have a consistent, single way >> of doing things. I'm not dead set on it, I just think less is more in a >> lot of situations, this being one of them. >> >> >>> >> >> >>> Jon >> >> >>> >> >> >>> >> >> >>> On Wed, Jan 3, 2024 at 2:56 PM Maxim Muzafarov <mmu...@apache.org> >> wrote: >> >> >>> >> >> >>> Happy New Year to everyone! I'd like to thank everyone for their >> >> >>> questions, because answering them forces us to move towards the >> right >> >> >>> solution, and I also like the ML discussions for the time they >> give to >> >> >>> investigate the code :-) >> >> >>> >> >> >>> I'm deliberately trying to limit the scope of the initial solution >> >> >>> (e.g. exclude the agent part) to keep the discussion short and >> clear, >> >> >>> but it's also important to have a glimpse of what we can do next >> once >> >> >>> we've finished with the topic. 
>> >> >>> >> >> >>> My view of the Command<> is that it is an abstraction in the >> broader >> >> >>> sense of an operation that can be performed on the local node, >> >> >>> involving one of a few internal components. This means that >> updating a >> >> >>> property in the settings virtual table via an update statement, or >> >> >>> executing e.g. the setconcurrentcompactors command are just >> aliases of >> >> >>> the same internal command via different APIs. Another example is >> the >> >> >>> netstats command, which simply aggregates the MessageService >> metrics >> >> >>> and returns them in a human-readable format (just another way of >> >> >>> looking at key-value metric pairs). More broadly, the command >> input is >> >> >>> Map<String, String> and String as the result (or List<String>). >> >> >>> >> >> >>> As Abe mentioned, Command and CommandRegistry should be largely >> based >> >> >>> on the nodetool command set at the beginning. We have a few options >> >> >>> for how we can initially construct command metadata during the >> >> >>> registry implementation (when moving command metadata from the >> >> >>> nodetool to the core part), so I'm planning to consult with the >> >> >>> command representations of the k8ssandra project so that >> any >> >> >>> further registry adoptions have zero problems (by writing a test >> >> >>> openapi registry exporter and comparing the representation >> results). >> >> >>> >> >> >>> So, the MVP is the following: >> >> >>> - Command >> >> >>> - CommandRegistry >> >> >>> - CQLCommandExporter >> >> >>> - JMXCommandExporter >> >> >>> - the nodetool uses the JMXCommandExporter >> >> >>> >> >> >>> >> >> >>> = Answers = >> >> >>> >> >> >>> > What do you have in mind specifically there? Do you plan on >> rewriting a brand new implementation which would be partially inspired by >> our agent? Or would the project integrate our agent code in-tree or as a >> dependency? 
>> >> >>> >> >> >>> Personally, I like the state of the k8ssandra project as it is >> now. My >> >> >>> understanding is that the server part of a database always lags >> behind >> >> >>> the client and sidecar parts in terms of the jdk version and the >> >> >>> features it provides. In contrast, sidecars should always be on >> top of >> >> >>> the market, so if we want to make an agent part in-tree, this >> should >> >> >>> be carefully considered for the flexibility which we may lose, as >> we >> >> >>> will not be able to change the agent part within the sidecar. The >> only >> >> >>> closest change I can see is that we can remove the interceptor part >> >> >>> once the CQL command interface is available. I suggest we move the >> >> >>> agent part to phase 2 and research it. wdyt? >> >> >>> >> >> >>> >> >> >>> > How are the results of the commands expressed to the CQL client? >> Since the command is being treated as CQL, I guess it will be rows, right? >> If yes, some of the nodetool commands output are a bit hierarchical in >> nature (e.g. cfstats, netstats etc...). How are these cases handled? >> >> >>> >> >> >>> I think the result of the execution should be a simple string (or >> set >> >> >>> of strings), which by its nature matches the nodetool output. I >> would >> >> >>> avoid building complex output or output schemas for now to simplify >> >> >>> the initial changes. >> >> >>> >> >> >>> >> >> >>> > Any changes expected at client/driver side? >> >> >>> >> >> >>> I'd like to keep the initial changes to a server part only, to >> avoid >> >> >>> scope inflation. For the driver part, I have checked the >> ExecutionInfo >> >> >>> interface provided by the java-driver, which should probably be >> used >> >> >>> as a command execution status holder. We'd like to have a unique >> >> >>> command execution id for each command that is executed on the >> node, so >> >> >>> the ExecutionInfo should probably hold such an id. 
Currently it has >> >> >>> the UUID getTracingId(), which is not well suited for our case and >> I >> >> >>> think further changes and follow-ups will be required here >> (including >> >> >>> the binary protocol, I think). >> >> >>> >> >> >>> >> >> >>> > The term COMMAND is a bit abstract I feel (subjective)... And I >> also feel the settings part is overlapping with virtual tables. >> >> >>> >> >> >>> I think we should keep the term Command as broad as possible. As >> >> >>> long as we have a single implementation of a command, and the cost >> of >> >> >>> maintaining that piece of the source code is low, it's even better >> if >> >> >>> we have a few ways to achieve the same result using different APIs. >> >> >>> Personally, the only thing I would vote for is the separation of >> >> >>> command and metric terms (they shouldn't be mixed up). >> >> >>> >> >> >>> >> >> >>> > How are the responses of different operations expressed through >> the Command API? If the Command Registry Adapters depend upon the command >> metadata for invoking/validating the command, then I think there has to be >> a way for them to interpret the response format also, right? >> >> >>> >> >> >>> I'm not sure that I've understood the question correctly. Are you talking >> >> >>> about the command execution result schema and the validation of >> that >> >> >>> schema? >> >> >>> >> >> >>> For now, I see the interface as follows, the result of the >> execution >> >> >>> is a type that can be converted to the same string as the nodetool >> has >> >> >>> for the corresponding command (so that the outputs match): >> >> >>> >> >> >>> Command<A, R> >> >> >>> { >> >> >>> printResult(A argument, R result, Consumer<String> printer); >> >> >>> } >> >> >>> >> >> >>> On Tue, 5 Dec 2023 at 16:51, Abe Ratnofsky <a...@aber.io> wrote: >> >> >>> > >> >> >>> > Adding to Hari's comments: >> >> >>> > >> >> >>> > > Any changes expected at client/driver side? 
While using >> JMX/nodetool, it is clear that the command/operations are getting executed >> against which Cassandra node. But a client can connect to multiple hosts >> and trigger queries, then how can it ensure that commands are executed >> against the desired Cassandra instance? >> >> >>> > >> >> >>> > Clients are expected to set the node for the given CQL statement >> in cases like this; see docstring for example: >> https://github.com/apache/cassandra-java-driver/blob/4.x/core/src/main/java/com/datastax/oss/driver/api/core/cql/Statement.java#L124-L147 >> >> >>> > >> >> >>> > > The term COMMAND is a bit abstract I feel (subjective). Some >> of the examples quoted are referring to updating settings (for example: >> EXECUTE COMMAND setconcurrentcompactors WITH concurrent_compactors=5;) and >> some are referring to operations. Updating settings and running operations >> are considerably different things. They may have to be handled in their own >> way. And I also feel the settings part is overlapping with virtual tables. >> If virtual tables support writes (at least the settings virtual table), >> then settings can be updated using the virtual table itself. >> >> >>> > >> >> >>> > I agree with this - I actually think it would be clearer if this >> was referred to as nodetool, if the set of commands is going to be largely >> based on nodetool at the beginning. There is a lot of documentation online >> that references nodetool by name, and changing the nomenclature would make >> that existing documentation harder to understand. If a user can understand >> this as "nodetool, but better and over CQL not JMX" I think that's a >> clearer transition than a new concept of "commands". >> >> >>> > >> >> >>> > I understand that this proposal includes more than just >> nodetool, but there's a benefit to having a tool with a name, and a web >> search for "cassandra commands" is going to have more competition and >> ambiguity. >> >> >>> >> >> >>> >> >
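For readers following the thread, here is a rough sketch of how the Command and CommandRegistry pieces described in Maxim's message above could fit together: a command takes a Map<String, String> as input, runs against local node state, and prints its result as plain strings. Only the names Command, CommandRegistry and printResult come from the thread. Everything else (name(), parseArgument(), execute(), invoke(), the SetConcurrentCompactors class, the output text, and the in-memory settings map standing in for the node's running configuration) is an assumption made for illustration and is not taken from the CEP or the Cassandra code base.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

public class CommandRegistrySketch {

    /** Hypothetical shape of a command: typed argument A, typed result R. */
    interface Command<A, R> {
        String name();
        A parseArgument(Map<String, String> rawInput);   // Map<String, String> in
        R execute(A argument);
        void printResult(A argument, R result, Consumer<String> printer); // strings out
    }

    /** Hypothetical registry: one implementation per command, looked up by name. */
    static final class CommandRegistry {
        private final Map<String, Command<?, ?>> commands = new LinkedHashMap<>();

        void register(Command<?, ?> command) {
            commands.put(command.name(), command);
        }

        List<String> invoke(String name, Map<String, String> rawInput) {
            Command<?, ?> command = commands.get(name);
            if (command == null)
                throw new IllegalArgumentException("Unknown command: " + name);
            return run(command, rawInput);
        }

        // Helper that captures the wildcard types so argument and result line up.
        private static <A, R> List<String> run(Command<A, R> command, Map<String, String> rawInput) {
            A argument = command.parseArgument(rawInput);
            R result = command.execute(argument);
            List<String> lines = new ArrayList<>();
            command.printResult(argument, result, lines::add);
            return lines;
        }
    }

    /** Illustrative command; the map is a stand-in for the node's running configuration. */
    static final class SetConcurrentCompactors implements Command<Integer, Integer> {
        private final Map<String, String> settings;

        SetConcurrentCompactors(Map<String, String> settings) {
            this.settings = settings;
        }

        @Override public String name() { return "setconcurrentcompactors"; }

        @Override public Integer parseArgument(Map<String, String> rawInput) {
            return Integer.parseInt(rawInput.get("concurrent_compactors"));
        }

        @Override public Integer execute(Integer value) {
            settings.put("concurrent_compactors", value.toString());
            return value;
        }

        @Override public void printResult(Integer argument, Integer result, Consumer<String> printer) {
            printer.accept("concurrent_compactors=" + result);
        }
    }

    /** Registers the example command, invokes it by name, and returns the printed lines. */
    static List<String> demo() {
        Map<String, String> settings = new HashMap<>();
        CommandRegistry registry = new CommandRegistry();
        registry.register(new SetConcurrentCompactors(settings));
        return registry.invoke("setconcurrentcompactors", Map.of("concurrent_compactors", "5"));
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

In this sketch a CQL exporter and a JMX exporter would both call the same invoke() entry point, which is one way to read the "aliases of the same internal command via different APIs" idea. Whether settings changes should be exposed that way at all, rather than only through the settings virtual table, is exactly what the thread is debating.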