Sooooo. That's a +1 from you, Jon? Just want to make sure.

On Thu, Oct 3, 2024 at 7:17 AM Jon Haddad <j...@rustyrazorblade.com> wrote:

> I love that we're having a discussion about observability.  A HUGE thank
> you to anyone willing to invest time improving it in Cassandra.
>
> I'd really, really like to see us ship a Prom compatible metrics endpoint
> out of the box in C* that has low overhead.  All the current OSS metrics
> exporters that I've seen have massive overhead.  I'm specifically looking
> for sub-10s collection on clusters with a thousand nodes and 500+ tables.
> That means going directly to DropWizard and skipping JMX.
>
> I put together a POC of it a while ago here:
> https://github.com/rustyrazorblade/cassandra-prometheus-exporter.  Please
> use commit 434be099d5983d537e2c70aad745194e575bc49a as a reference.  I
> wasn't expecting anyone to actually care about the repo and the last commit
> broke it.  There are some optimizations that could be done to further improve
> the exporter; I was working on that when I broke the repo :/
>
> For industry comparison the following DBs either ship entire monitoring
> stacks or provide strong recommendations / solutions:
>
> * ScyllaDB: https://www.scylladb.com/product/scylladb-monitoring-stack/
> * Cockroach:
> https://www.cockroachlabs.com/docs/v24.2/ui-overview-dashboard
> * Aerospike:
> https://aerospike.com/docs/monitorstack/new/components-of-monitoring-stack
> * MongoDB:
> https://www.mongodb.com/products/platform/atlas-charts/dashboard
> * Elastic:
> https://www.elastic.co/guide/en/elasticsearch/reference/8.15/monitoring-production.html
> * Redis: https://grafana.com/grafana/dashboards/12776-redis/
>
> Re: Logs - I wouldn't write off OTel logging [1].  OTel logs can be tagged
> with metadata including the span allowing you to do some really useful
> diagnostics.  It's a significant improvement over standard logging.
>
> Anyways - I don't have a strong opinion on how the CEPs are done.
> Different ones or together, whichever works.  I hope we can finally get a
> good metrics solution because that's an area of significant pain for end
> users.  A lot of teams don't even have Cassandra dashboards because we
> currently provide zero direction.
>
> Jon
>
> [1] https://opentelemetry.io/docs/specs/otel/logs/
>
> Logs can be correlated with the rest of observability data in a few
> dimensions:
>
> * By the time of execution. Logs, traces and metrics can record the moment
> of time or the range of time the execution took place. This is the most
> basic form of correlation.
>
> * By the execution context, also known as the trace context. It is a
> standard practice to record the execution context (trace and span ids as
> well as user-defined context) in the spans. OpenTelemetry extends this
> practice to logs where possible by including TraceId and SpanId in the
> LogRecords. This allows to directly correlate logs and traces that
> correspond to the same execution context. It also allows to correlate logs
> from different components of a distributed system that participated in the
> particular request execution.
>
> * By the origin of the telemetry, also known as the Resource context.
> OpenTelemetry traces and metrics contain information about the Resource
> they come from. We extend this practice to logs by including the Resource
> in LogRecords.
>
>
>
> On Thu, Oct 3, 2024 at 6:11 AM João Reis <joaor...@apache.org> wrote:
>
>> Reducing the scope of CEP-32 to OpenTelemetry Tracing is a good idea (or
>> creating a new one). We recently added OpenTelemetry Tracing support to the
>> C# driver [1] and we also decided to not include Metrics and Logs in this
>> initiative because the driver already provides a way to collect metrics and
>> logs so it's not as important.
>>
>> I believe there are also efforts to add OpenTelemetry support to the Java
>> driver, but I'm not sure if it's limited to Tracing or if they include
>> metrics and logs.
>>
>> [1]
>> https://github.com/datastax/csharp-driver/tree/master/doc/features/opentelemetry#readme
>>
>> On Tuesday, Oct 1, 2024 at 07:13, Yuki Morishita <mor.y...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> Since I have limited time to work on CEP-32, I'd appreciate
>>> collaboration to make this CEP a reality.
>>>
>>> Another thing I'm thinking of is to reduce its scope to only the
>>> OpenTelemetry configuration and the way to work only with OpenTelemetry
>>> Tracing.
>>>
>>> If it's possible to create sub-CEPs, I will create one each for tracing,
>>> metrics, and logs. Otherwise, I can rewrite the current CEP-32 to focus only
>>> on OpenTelemetry Tracing.
>>> Or maybe scrap CEP-32 and create a new one for Tracing.
>>>
>>>
>>> On Mon, Sep 23, 2024 at 11:47 AM Saranya Krishnakumar <
>>> saran.krishna...@gmail.com> wrote:
>>>
>>>> Hi Patrick,
>>>>
>>>> I am interested in working on this CEP in collaboration with Yuki. I
>>>> recently worked on adding a metrics framework to the Apache Cassandra
>>>> Sidecar project.
>>>>
>>>> Best,
>>>> Saranya Krishnakumar
>>>>
>>>> On Thu, Sep 19, 2024 at 10:57 AM Patrick McFadin <pmcfa...@gmail.com>
>>>> wrote:
>>>>
>>>>> Here's another stalled CEP. In this case, no discuss thread or Jira.
>>>>>
>>>>> Yuki (or anyone else) know the status of this CEP?
>>>>>
>>>>>
>>>>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-32%3A+%28DRAFT%29+OpenTelemetry+integration
>>>>>
>>>>> Patrick
>>>>>
>>>>
