> Having something like a registry and standardizing/enforcing all metric types is something we should be sure to maintain.

A registry w/documentation on each metric indicating *what it's actually measuring and what it means* would be great for our users.
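As a concrete illustration of the kind of documented registry being asked for here, a minimal sketch (the class, the field names and the example metric are hypothetical, not an existing Cassandra API):

    // Hypothetical sketch only: a registry entry that keeps the documentation
    // (what the metric measures, and in which unit) next to the metric itself,
    // so it can be surfaced to users alongside the value.
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.function.Supplier;

    public final class DocumentedMetricRegistry
    {
        public record Descriptor(String name, String description, String unit, Supplier<? extends Number> value) {}

        private final Map<String, Descriptor> metrics = new ConcurrentHashMap<>();

        public void register(Descriptor descriptor)
        {
            metrics.put(descriptor.name(), descriptor);
        }

        public Map<String, Descriptor> all()
        {
            return Map.copyOf(metrics);
        }
    }

Registering a descriptor such as ("client_request.write.count", "Number of local write requests handled by this coordinator", "{request}", someCounter::sum) would make the metric self-describing wherever the registry is exported (JMX, virtual tables, a reporter); all of those names are examples only.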
On Mon, Mar 10, 2025, at 3:46 PM, Chris Lohfink wrote:
> Just something to be mindful about what we had *before* codahale in Cassandra and avoid that again. Pre-1.1 it was pretty much impossible to collect metrics without looking at code (there were efficient custom-made things, but each metric was reported differently), and that stuck through until 2.2 days. Having something like a registry and standardizing/enforcing all metric types is something we should be sure to maintain.
>
> Chris
>
> On Fri, Mar 7, 2025 at 1:33 PM Jon Haddad <j...@rustyrazorblade.com> wrote:
>> As long as operators are able to use all the OTel tooling, I'm happy. I'm not looking to try to decide what the metrics API looks like, although I think trying to plan for 15 years out is a bit unnecessary. A lot of the DB will be replaced by then. That said, I'm mostly hands off on code and you guys are more than capable of making the smart decision here.
>>
>> Regarding virtual tables, I'm looking at writing a custom OTel receiver [1] to ingest them. I was really impressed with the performance work you did there and it got my wheels turning on how to best make use of it. I am planning on using it with easy-cass-lab to pull DB metrics and logs down to my local machine along with kernel metrics via eBPF.
>>
>> Jon
>>
>> [1] https://opentelemetry.io/docs/collector/building/receiver/
>>
>> On Wed, Mar 5, 2025 at 1:06 PM Maxim Muzafarov <mmu...@apache.org> wrote:
>>> If we do swap, we may run into the same issues with third-party metrics libraries in the next 10-15 years that we are discussing now with the Codahale we added ~10-15 years ago, and given that the proposed new API is quite small, my personal feeling is that it would be our best choice for the metrics.
>>>
>>> Having our own API also doesn't prevent us from having all the integrations with new third-party libraries the world will develop in the future, just by writing custom adapters to our own API -- this will be possible for Codahale (with some suboptimal considerations), where we have to support backwards compatibility, and for OpenTelemetry as well. We already have the CEP-32 [1] proposal to instrument metrics; in this sense, it doesn't change much for us.
>>>
>>> Another point in favour of having our own API is the virtual tables we have -- it gives us enough flexibility and latitude to export the metrics efficiently via the virtual tables by implementing the access patterns we consider important.
>>>
>>> [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=255071749#CEP32:(DRAFT)OpenTelemetryintegration-ExportingMetricsthroughOpenTelemetry
>>> [2] https://opentelemetry.io/docs/languages/java/instrumentation/
>>>
>>> On Wed, 5 Mar 2025 at 21:35, Jeff Jirsa <jji...@gmail.com> wrote:
>>> >
>>> > I think it's widely accepted that OTel in general has won this stage of observability, as most metrics systems allow it and most SaaS providers support it. So Jon’s point there is important.
>>> >
>>> > The promise of unifying logs/traces/metrics (aka wide events) is usually far more important on the tracing side of our observability than in the areas where we use Codahale/DropWizard.
>>> >
>>> > Scott: if we swap, we can (probably should) deprecate like everything else, and run both side by side for a release so people don’t lose metrics entirely on bounce? Feature-flag both, to control the doubled cost during the transition.
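A rough sketch of the "run both side by side behind feature flags" idea above, under stated assumptions: the flag names, the metric name and the LongAdder standing in for the replacement counter are all placeholders, not actual Cassandra code.

    import com.codahale.metrics.Meter;
    import com.codahale.metrics.MetricRegistry;
    import java.util.concurrent.atomic.LongAdder;

    public final class DualWriteMetrics
    {
        // Hypothetical flags; the real ones would presumably live in cassandra.yaml.
        private static final boolean LEGACY_ENABLED =
            Boolean.parseBoolean(System.getProperty("cassandra.metrics.codahale_enabled", "true"));
        private static final boolean REPLACEMENT_ENABLED =
            Boolean.parseBoolean(System.getProperty("cassandra.metrics.replacement_enabled", "false"));

        private final MetricRegistry legacyRegistry = new MetricRegistry();
        private final Meter legacyWrites = legacyRegistry.meter("ClientRequest.Writes"); // illustrative name
        private final LongAdder replacementWrites = new LongAdder(); // stand-in for the new implementation

        public void markWrite()
        {
            // The same event is recorded on both sides, so existing dashboards keep
            // working during the release where both are available; turning one flag
            // off removes the doubled cost once operators have migrated.
            if (LEGACY_ENABLED)
                legacyWrites.mark();
            if (REPLACEMENT_ENABLED)
                replacementWrites.increment();
        }
    }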
>>> >
>>> > On Mar 5, 2025, at 8:21 PM, C. Scott Andreas <sc...@paradoxica.net> wrote:
>>> >
>>> > No strong opinion on the particular choice of metrics library.
>>> >
>>> > My primary feedback is that if we swap metrics implementations and the new values are *different*, we can anticipate broad user confusion/interest.
>>> >
>>> > In particular, if latency stats are reported higher post-upgrade, we should expect users to interpret this as a performance regression, dedicate significant resources to investigating the change, and expend credibility with stakeholders in their systems.
>>> >
>>> > - Scott
>>> >
>>> > On Mar 5, 2025, at 11:57 AM, Benedict <bened...@apache.org> wrote:
>>> >
>>> > I really like the idea of integrating tracing, metrics and logging frameworks.
>>> >
>>> > I would like to have the time to look closely at the API before we decide to adopt it, though. I agree that a widely deployed API has inherent benefits, but any API we adopt also shapes future evolution of our capabilities. Hopefully this is also a good API that allows us plenty of evolutionary headroom.
>>> >
>>> > On 5 Mar 2025, at 19:45, Josh McKenzie <jmcken...@apache.org> wrote:
>>> >
>>> > if the plan is to rip out something old and unmaintained and replace with something new, I think there's a huge win to be had by implementing the standard that everyone's using now.
>>> >
>>> > Strong +1 on anything that's an ecosystem integration inflection point. The added benefit here is that if we architect ourselves to gracefully integrate with whatever systems are ubiquitous today, we'll inherit the migration work that any new industry-wide replacement system would need to do to become the new de facto standard.
>>> >
>>> > On Wed, Mar 5, 2025, at 2:23 PM, Jon Haddad wrote:
>>> >
>>> > Thank you for the replies.
>>> >
>>> > Dmitry: Based on some other patches you've worked on and your explanation here, it looks like you're optimizing the front-door portion of the write path - very cool. Testing it in isolation with those settings makes sense if your goal is to push write throughput as far as you can, something I'm very much on board with and a key component of pushing density and reducing cost. I'm spinning up a 5.0 cluster now to run a test, so I'll run a load test similar to what you've done and try to reproduce your results. I'll also review the JIRA to get more familiar with what you're working on.
>>> >
>>> > Benedict: I agree with your line of thinking around optimizing the cost of metrics. As we push both density and multi-tenancy, there's going to be more and more demand for clusters with hundreds or thousands of tables. Maybe tens of thousands. Reducing overhead for something that's O(N * M) (multiple counters per table) will definitely be a welcome improvement. There's always more stuff that's going to get in the way, but it's an elephant and I appreciate every bite.
>>> >
>>> > My main concern with metrics isn't really compatibility, and I don't have any real investment in DropWizard. I don't know if there's any real value in putting in effort to maintain compatibility, but I'm just one sample, so I won't make a strong statement here.
>>> >
>>> > It would be *very nice* if we moved to metrics which implement the OpenTelemetry Metrics API [1], which I think solves multiple issues at once:
>>> >
>>> > * We can use either one of the existing implementations (OTel SDK) or our own
>>> > * We get a "free" upgrade that lets people tap into the OTel ecosystem
>>> > * It paves the way for OTel traces with Zipkin [2] / Jaeger [3]
>>> > * We can use the ubiquitous OTel instrumentation agent to send metrics to the OTel collector, meaning people can collect at a much higher frequency than today
>>> > * OTel logging is a significant improvement over Logback; you can correlate metrics + traces + logs together.
>>> >
>>> > Anyways, if the plan is to rip out something old and unmaintained and replace with something new, I think there's a huge win to be had by implementing the standard that everyone's using now.
>>> >
>>> > All this is very exciting and I appreciate the discussion!
>>> >
>>> > Jon
>>> >
>>> > [1] https://opentelemetry.io/docs/languages/java/api/
>>> > [2] https://zipkin.io/
>>> > [3] https://www.jaegertracing.io/
>>> >
>>> > On Wed, Mar 5, 2025 at 2:58 AM Dmitry Konstantinov <netud...@gmail.com> wrote:
>>> >
>>> > Hi Jon
>>> >
>>> > >> Is there a specific workload you're running where you're seeing it take up a significant % of CPU time? Could you share some metrics, profile data, or a workload so I can try to reproduce your findings?
>>> >
>>> > Yes, I have shared the workload generation command (sorry, it is cassandra-stress; I have not yet adopted your tool but want to do so soon :-) ), setup details and an async-profiler CPU profile in CASSANDRA-20250.
>>> >
>>> > A summary:
>>> > - it is a plain insert-only workload to assess the max throughput capacity of a single node: ./tools/bin/cassandra-stress "write n=10m" -rate threads=100 -node myhost
>>> > - a small amount of data per row is inserted and local SSD disks are used, so CPU is the primary bottleneck in this scenario (while it is quite synthetic, in my real business cases CPU is the primary bottleneck as well)
>>> > - I used the 5.1 trunk version (I have similar results for 5.0, observed while checking CASSANDRA-20165)
>>> > - I enabled trie memtables + offheap objects mode
>>> > - I disabled compaction
>>> > - a recent nightly build of async-profiler is used
>>> > - my hardware is quite old: on-premise VM, Linux 4.18.0-240.el8.x86_64, OpenJDK 11.0.26+4, Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz, 16 cores
>>> > - link to CPU profile ("codahale" code: 8.65%)
>>> > - the -XX:+DebugNonSafepoints option is enabled to improve the profile precision
>>> >
>>> > On Wed, 5 Mar 2025 at 12:38, Benedict Elliott Smith <bened...@apache.org> wrote:
>>> >
>>> > Some quick thoughts of my own…
>>> >
>>> > === Performance ===
>>> > - I have seen heap dumps with > 1GiB dedicated to metric counters. This patch should improve this, while opening up room to cut it further, steeply.
>>> > - The performance improvement in relative terms for the metrics being replaced is rather dramatic - about 80%. We can also improve this further.
>>> > - Cheaper metrics (in terms of both CPU and memory) mean we can readily have more of them, exposing finer-grained details. It is hard to overstate the value of this.
>>> >
>>> > === Reporting ===
>>> > - We’re already non-standard for our most important metrics, because we had to replace the Codahale histogram years ago
>>> > - We can continue implementing the Codahale interfaces, so that exporting libraries have minimal work to support us
>>> > - We can probably push patches upstream to a couple of selected libraries we consider important
>>> > - I would anyway also support picking a new reporting framework to support, but I would like us to do this with great care to avoid repeating our mistakes. I won’t have cycles to actually implement this, so it would be down to others to decide if they are willing to undertake this work
>>> >
>>> > I think the fallback option for now, however, is to abuse Unsafe to allow us to override the implementation details of Codahale metrics. So we can decouple the performance discussion for now from the deprecation discussion, but I think we should have a target of deprecating Codahale/DropWizard for the reasons Dmitry outlines, however we decide to do it.
>>> >
>>> > On 4 Mar 2025, at 21:17, Jon Haddad <j...@rustyrazorblade.com> wrote:
>>> >
>>> > I've got a few thoughts...
>>> >
>>> > On the performance side, I took a look at a few CPU profiles from past benchmarks and I'm seeing DropWizard taking ~3% of CPU time. Is there a specific workload you're running where you're seeing it take up a significant % of CPU time? Could you share some metrics, profile data, or a workload so I can try to reproduce your findings? In my testing I've found the majority of the overhead from metrics to come from JMX, not DropWizard.
>>> >
>>> > On the operator side, inventing our own metrics lib risks making it harder to instrument Cassandra. There are libraries out there that allow you to tap into DropWizard metrics directly. For example, Sarma Pydipally did a presentation on this last year [1] based on some code I threw together.
>>> >
>>> > If you're planning on making it easier to instrument C* by supporting sending metrics to the OTel collector [2], then I could see the change being a net win as long as the perf is no worse than the status quo.
>>> >
>>> > It's hard to know the full extent of what you're planning and the impact, so I'll save any opinions till I know more about the plan.
>>> >
>>> > Thanks for bringing this up!
>>> > Jon
>>> >
>>> > [1] https://planetcassandra.org/leaf/apache-cassandra-lunch-62-grafana-dashboard-for-apache-cassandra-business-platform-team/
>>> > [2] https://opentelemetry.io/docs/collector/
>>> >
>>> > On Tue, Mar 4, 2025 at 12:40 PM Dmitry Konstantinov <netud...@gmail.com> wrote:
>>> >
>>> > Hi all,
>>> >
>>> > After a long conversation with Benedict and Maxim in CASSANDRA-20250, I would like to raise and discuss a proposal to deprecate Dropwizard/Codahale metrics usage in the next major release of the Cassandra server and drop it in the following major release. Instead, our own Java API and implementation should be introduced. For the next major release, the Dropwizard/Codahale API is still planned to be supported by extending the Codahale implementations, to give potential users of this API enough time for the transition.
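A rough sketch of what "supporting the Codahale API by extending the Codahale implementations" could look like; the class name and the AtomicLong backing are illustrative placeholders, not the actual CASSANDRA-20250 code.

    // Illustrative sketch only. The Codahale Counter class is not final and its
    // methods can be overridden, so a replacement implementation can still be
    // handed to code (exporters, agents) that expects com.codahale.metrics.Counter.
    // Note the parent still allocates its own unused LongAdder, which is part of
    // why proper interfaces (see dropwizard/metrics#2186) would have been preferable.
    import com.codahale.metrics.Counter;
    import java.util.concurrent.atomic.AtomicLong;

    public class CompatCounter extends Counter
    {
        private final AtomicLong count = new AtomicLong(); // placeholder backing store

        @Override
        public void inc(long n)
        {
            count.addAndGet(n);
        }

        @Override
        public void dec(long n)
        {
            count.addAndGet(-n);
        }

        @Override
        public long getCount()
        {
            return count.get();
        }
    }

Such an instance can still be added to a Codahale MetricRegistry via register(name, counter) and will pass the registry's instanceof Counter checks, which is what keeps existing exporters working during the deprecation window.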
>>> > The proposal does not affect the JMX API for metrics; it is only about local Java API changes within the Cassandra server classpath, so it only concerns cases where somebody outside of the Cassandra server code relies on the Codahale API in some kind of extension or agent.
>>> >
>>> > Reasons:
>>> > 1) The Codahale metrics implementation is not very efficient from a CPU and memory usage point of view. In the past we already replaced the default Codahale Reservoir implementation with our own custom one, and now in CASSANDRA-20250 we (Benedict and I) want to add a more efficient implementation of the Counter and Meter logic. So, in total we do not have much logic left from the original library (mostly the MetricRegistry as a container for metrics) and the majority of the logic is implemented by ourselves.
>>> > We use metrics a lot along the read and write paths and they contribute a visible overhead (for example, for a plain write load it is about 9-11% according to an async-profiler CPU profile), so we want them to be highly optimized.
>>> > From a memory perspective, Counter and Meter are built on top of LongAdder and they are quite heavy for the quantities which we create and use.
>>> >
>>> > 2) Codahale metrics does not provide any way to replace the Counter and Meter implementations. There are no fully functional interfaces for these entities, and MetricRegistry has casts/checks against the concrete implementations and cannot work with anything else.
>>> > I looked through the already reported issues and found the following similar and unsuccessful attempt to introduce interfaces for metrics:
>>> > https://github.com/dropwizard/metrics/issues/2186
>>> > as well as other older attempts:
>>> > https://github.com/dropwizard/metrics/issues/252
>>> > https://github.com/dropwizard/metrics/issues/264
>>> > https://github.com/dropwizard/metrics/issues/703
>>> > https://github.com/dropwizard/metrics/pull/487
>>> > https://github.com/dropwizard/metrics/issues/479
>>> > https://github.com/dropwizard/metrics/issues/253
>>> >
>>> > So, the option of requesting extensibility from Codahale metrics does not look realistic.
>>> >
>>> > 3) It looks like the library is in maintenance mode now: the 5.x version is on hold and many integrations are also no longer active.
>>> > The main benefit of using Codahale metrics should be the huge number of reporters/integrations, but if we check carefully the list of reporters mentioned here:
>>> > https://metrics.dropwizard.io/4.2.0/manual/third-party.html#reporters
>>> > we can see that almost all of them are dead/archived.
>>> >
>>> > 4) In general, exposing other third-party libraries as our own public API frequently creates too many limitations and issues (Guava is another typical example which I have seen previously: it is easy to start with, but later you struggle more and more).
>>> >
>>> > Does anyone have any questions or concerns regarding this suggestion?
>>> > --
>>> > Dmitry Konstantinov
>>> >
>>> > --
>>> > Dmitry Konstantinov
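For reference, a minimal sketch of what recording a metric through the OpenTelemetry Metrics API that Jon mentions looks like in Java. The instrumentation scope, metric name, description and attributes are illustrative, and without an SDK or agent installed these API calls are no-ops.

    import io.opentelemetry.api.GlobalOpenTelemetry;
    import io.opentelemetry.api.common.Attributes;
    import io.opentelemetry.api.metrics.LongCounter;
    import io.opentelemetry.api.metrics.Meter;

    public final class OtelMetricsExample
    {
        public static void main(String[] args)
        {
            // The API is a thin facade over whatever implementation is registered,
            // which is the substitution point behind "use the OTel SDK or our own".
            Meter meter = GlobalOpenTelemetry.getMeter("org.apache.cassandra"); // scope name is illustrative
            LongCounter writes = meter.counterBuilder("cassandra.client_request.writes") // name is illustrative
                                      .setDescription("Number of local write requests")
                                      .setUnit("{request}")
                                      .build();
            writes.add(1, Attributes.builder().put("keyspace", "ks1").build());
        }
    }

Because the same call sites can be backed by the OTel SDK, the OTel Java agent, or a project-specific implementation, code written against this API stays neutral about which exporter or collector pipeline operators attach to it.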