First off - thanks so much for putting in this effort Maxim! This is excellent work.
Some thoughts on the CEP and responses in thread:
Considering that JMX is usually not used and disabled in production environments for various performance and security reasons, the operator may not see the same picture from various of Dropwizard's metrics exporters and integrations as Cassandra's JMX metrics provide [1][2].
I don't think this assertion is true. Cassandra is running in a lot of places in the world, and JMX has been in this ecosystem for a long time; we need data that is basically impossible to get to claim "JMX is usually not used in C* environments in prod".
I also wonder whether we should care about JMX? I know many wish to migrate away from JMX (it's going to be a very long time), so do we need a wrapper to make JMX and vtables consistent?
If we can move away from a bespoke vtable- or JMX-based implementation and instead have a templatized solution that each of these is generated from, that to me is the superior option. There's little harm in adding new JMX endpoints (or hell, other metrics framework integration?) as a byproduct of adding new vtable-exposed metrics; we have the same maintenance obligation to them as we have to the vtables, and if they're generated from the same base data, we shouldn't have any further maintenance burden due to their presence, right?
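To make "generated from the same base data" a bit more concrete, here is a rough sketch of the idea rather than anything taken from the CEP. The `MetricSource` class, its `define` and `rows` methods, and the `org.example.sketch` object name are all made up for illustration; only the `javax.management` calls are real JDK API. A single map of metric readers backs both a vtable-like row listing and a read-only dynamic MBean:

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;
import javax.management.Attribute;
import javax.management.AttributeList;
import javax.management.AttributeNotFoundException;
import javax.management.DynamicMBean;
import javax.management.MBeanAttributeInfo;
import javax.management.MBeanInfo;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical sketch: one set of metric definitions, two generated views.
class MetricSource implements DynamicMBean {
    // The single "base data": metric name -> reader for its current value.
    private final Map<String, Supplier<Long>> metrics = new LinkedHashMap<>();

    void define(String name, Supplier<Long> reader) {
        metrics.put(name, reader);
    }

    // View 1: vtable-like rows, one (name, value) row per metric.
    List<Map<String, Object>> rows() {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map.Entry<String, Supplier<Long>> e : metrics.entrySet()) {
            Map<String, Object> row = new LinkedHashMap<>();
            row.put("name", e.getKey());
            row.put("value", e.getValue().get());
            out.add(row);
        }
        return out;
    }

    // View 2: the same metrics exposed as read-only JMX attributes.
    @Override
    public Object getAttribute(String name) throws AttributeNotFoundException {
        Supplier<Long> reader = metrics.get(name);
        if (reader == null) throw new AttributeNotFoundException(name);
        return reader.get();
    }

    @Override
    public AttributeList getAttributes(String[] names) {
        AttributeList list = new AttributeList();
        for (String n : names) {
            Supplier<Long> reader = metrics.get(n);
            if (reader != null) list.add(new Attribute(n, reader.get()));
        }
        return list;
    }

    @Override
    public void setAttribute(Attribute a) throws AttributeNotFoundException {
        throw new AttributeNotFoundException("read-only: " + a.getName());
    }

    @Override
    public AttributeList setAttributes(AttributeList attributes) {
        return new AttributeList(); // nothing is writable
    }

    @Override
    public Object invoke(String action, Object[] params, String[] signature) {
        throw new UnsupportedOperationException(action);
    }

    @Override
    public MBeanInfo getMBeanInfo() {
        // The attribute list is derived from the definitions, not hand-written.
        MBeanAttributeInfo[] attrs = metrics.keySet().stream()
            .map(n -> new MBeanAttributeInfo(n, "java.lang.Long", n, true, false, false))
            .toArray(MBeanAttributeInfo[]::new);
        return new MBeanInfo(getClass().getName(), "generated view", attrs, null, null, null);
    }

    public static void main(String[] args) throws Exception {
        AtomicLong pending = new AtomicLong(7);
        MetricSource source = new MetricSource();
        source.define("pending", pending::get);

        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("org.example.sketch:type=Metrics");
        server.registerMBean(source, name);

        System.out.println(source.rows());                        // vtable-style view
        System.out.println(server.getAttribute(name, "pending")); // JMX view
    }
}
```

Adding a metric with `define` makes it appear in both views at once, which is the "no further maintenance burden" property argued for above. A real implementation would need typed columns, keys, and so on.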
we wish to move away from JMX
I do, and you do, and many people do, but I don't believe all people on the project do. The last time this came up in Slack the conclusion was "Josh should go draft a CEP to chart out a path to moving off JMX while maintaining backwards-compat w/existing JMX metrics for environments that are using them" (so I'm excited to see this CEP pop up before I got to it! ;)). Moving to a system that gives us a 0-cost way to keep JMX and vtable in sync over time on new metrics seems like a nice compromise for folks that have built out JMX-based maintenance infra on top of C*. Plus it removes the boilerplate toil on vtables. Win-win.
If we add a column to the end of the JMX row did we just break users?
I *think* this is arguably true for a vtable / CQL-based solution as well from the "you don't know how people are using your API" perspective. Unless we have clear guidelines about discretely selecting the columns you want from a vtable and trust users to follow them, if people have brittle greedy parsers pulling in all data from vtables, we could very well break them as well by adding a new column, right? Could be wrong here; I haven't written anything that consumes vtable metric data, and maybe the obvious idiom in the face of that is robust in the presence of column addition. /shrug
It's certainly more flexible and simpler to write to w/out detonating compared to JMX, but it's still an API we'd be revving.
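To illustrate the "brittle greedy parser" worry, here is a small hypothetical sketch. The `ColumnAccess` class, the column names, and the list-based row shape are invented for the example and are not a real driver or vtable API. It contrasts a consumer that assumes the exact shape of a result with one that picks out the column it needs by name:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical consumer code; rows are modelled as plain lists for illustration.
class ColumnAccess {
    // Brittle: assumes exactly two columns, with the value in the second one.
    static long greedyValue(List<String> columns, List<Object> row) {
        if (columns.size() != 2) {
            throw new IllegalStateException("unexpected shape: " + columns);
        }
        return (Long) row.get(1);
    }

    // More robust: look up the one column we need by name, ignore the rest.
    static long namedValue(List<String> columns, List<Object> row, String column) {
        int i = columns.indexOf(column);
        if (i < 0) {
            throw new IllegalArgumentException("missing column: " + column);
        }
        return (Long) row.get(i);
    }

    public static void main(String[] args) {
        // The same metric after a release that added a "unit" column.
        List<String> columns = Arrays.asList("name", "unit", "value");
        List<Object> row = Arrays.asList("pending", "tasks", 7L);

        System.out.println(namedValue(columns, row, "value"));
        try {
            greedyValue(columns, row);
        } catch (IllegalStateException e) {
            System.out.println("greedy parser broke: " + e.getMessage());
        }
    }
}
```

The named lookup keeps working when a column is added, while the positional one fails. That is the case for documenting "select the columns you want" as the supported idiom.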
On Sat, Jan 28, 2023, at 4:24 PM, Ekaterina Dimitrova wrote:
Overall I have similar thoughts and questions as David.
I just wanted to add a reminder about this thread from last summer[1]. We already have issues with the alignment of JMX and Settings Virtual Table. I guess this is how Maxim got inspired to suggest this framework proposal which I want to thank him for! (I noticed he assigned CASSANDRA-15254)
Not to open Pandora's box, but to me the most important thing here is to come to an agreement about the future of JMX and what we will or will not do as a community. Also, how much time people are able to invest. I guess this will influence any directions to be taken here.
I took a look and I see the result is an interface that looks like the vtable interface, that is then used by vtables and JMX? My first thought is why not just use the vtable logic?
I also wonder whether we should care about JMX? I know many wish to migrate away from JMX (it's going to be a very long time), so do we need a wrapper to make JMX and vtables consistent? I am cool with something like the following
registerWithJMX(jmxName, query("SELECT * FROM system_views.streaming"));
So if we want to have a JMX view that matches the table then that’s cool by me, but one thing that has been brought up in reviews is backwards compatibility with regard to adding columns… If we add a column to the end of the JMX row did we just break users?
Considering that JMX is usually not used and disabled in production environments for various performance and security reasons, the operator may not see the same picture from various of Dropwizard's metrics exporters
If this is a real problem people are hitting, we can always add the ability to push metrics to common systems, with a pluggable way to add non-standard solutions. Dropwizard already supports this, so it would be low-hanging fruit to address.
To make the proposed changes backwards compatible with the previous version of Cassandra, all MBeans and Virtual Tables we already have will remain unchanged
If this is for new JMX endpoints moving forward, I am not sure of the benefit for the same reason listed above; we wish to move away from JMX
Hello Cassandra Community,
I've been faced with a number of inconsistencies in the user APIs of
the internal data collections representation exposed through the
Cassandra monitoring interfaces that need to be fully aligned from an
operator perspective. First of all, I'm highlighting JMX, Dropwizard
Metrics, and Virtual Tables user interfaces. In order to address all
these inconsistencies, I have created a draft enhancement proposal
that describes everything I have found and how we can fix it once and
for all.
I'd like to hear your opinion and thoughts on it. Please take a look:
--
Maxim Muzafarov