SiyaoIsHiding commented on code in PR #1994: URL: https://github.com/apache/cassandra-java-driver/pull/1994#discussion_r1901589099
##########
proposals/open-telemetry/tracing.md:
##########
@@ -0,0 +1,263 @@
+- Feature Name: Add OpenTelemetry Traces
+- Start Date: 2024-12-05
+
+# Summary
+[summary]: #summary
+
+[OpenTelemetry](https://opentelemetry.io/docs/what-is-opentelemetry/) is a comprehensive collection of APIs, SDKs, and tools designed to instrument, generate, collect, and export telemetry data (metrics, logs, and traces) to analyze software performance and behavior.
+This document outlines the necessary steps to integrate OpenTelemetry tracing into the Apache Cassandra Java driver.
+
+# Motivation
+[motivation]: #motivation
+
+OpenTelemetry has become the industry standard for telemetry data aggregation, encompassing logs, metrics, and traces.
+Tracing, in particular, enables developers to track the full "path" a request takes through the application, providing deep insights into services.
+[OpenTelemetry's auto-instrumentation](https://github.com/open-telemetry/opentelemetry-java-instrumentation/tree/main/instrumentation/cassandra/cassandra-4.4/library) of the Apache Cassandra Java Driver (via the Java agent) already supports basic traces, logs, and metrics. However, this proposal to include tracing directly in the native Apache Cassandra Java Driver will eliminate the need for a Java agent and provide more detailed information, including individual Cassandra calls due to retry or speculative execution.
+
+# Guide-level explanation
+[guide-level-explanation]: #guide-level-explanation
+
+## [Traces](https://opentelemetry.io/docs/concepts/signals/traces/)
+
+Traces allow developers to understand the complete flow of a request through the system, navigating across services. Each trace consists of multiple [Spans](https://opentelemetry.io/docs/concepts/signals/traces/#spans), which represent units of work within the system. Each span includes the following details:
+
+- Name
+- Parent span ID (empty for root spans)
+- Start and End Timestamps
+- [Span Context](https://opentelemetry.io/docs/concepts/signals/traces/#span-context)
+- [Attributes](https://opentelemetry.io/docs/concepts/signals/traces/#attributes)
+- [Span Events](https://opentelemetry.io/docs/concepts/signals/traces/#span-events)
+- [Span Links](https://opentelemetry.io/docs/concepts/signals/traces/#span-links)
+- [Span Status](https://opentelemetry.io/docs/concepts/signals/traces/#span-status)
+
+Spans can be correlated using [context propagation](https://opentelemetry.io/docs/concepts/signals/traces/#context-propagation).
+
+### Example of a trace in a microservice architecture
+
+
+
+## OpenTelemetry Semantic Conventions
+[opentelemetry-semantic-conventions]: #opentelemetry-semantic-conventions
+
+### Span name
+
+[OpenTelemetry Trace Semantic Conventions](https://opentelemetry.io/docs/specs/semconv/general/trace/) (at the time of this writing, it's on version 1.29.0) standardizes naming conventions for various components.
+For the Apache Cassandra Java Driver, the focus is on:
+* [Database Client Call Conventions](https://opentelemetry.io/docs/specs/semconv/database/database-spans/)
+* [Cassandra\-Specific Conventions](https://opentelemetry.io/docs/specs/semconv/database/cassandra/)
+
+The span name for Cassandra will follow this convention: `<db.operation> <db.name>` if the keyspace name is available. If not, it will be `<db.operation>`.
+
+### Span attributes
+
+This implementation will include, by default, the **required** attributes for Database, and Cassandra spans.
+`server.address`, `server.port`, and `db.query.text`, despite only **recommended**, are included to give information regarding the client connection.
+
+| Attribute | Description | Type | Level | Required | Supported Values |
+|-------------------|-----------------------------------------------------------------------------|--------|------------|-----------------------------------------------|--------------------------------------------|
+| db.system | An identifier for the database management system (DBMS) product being used. | string | Connection | true | cassandra |
+| db.namespace | The keyspace name in Cassandra. | string | Call | conditionally true [1] | *keyspace in use* |
+| db.operation.name | The name of the operation being executed. | string | Call | true if `db.statement` is not applicable. [2] | _Session Request_ or _Node Request_ |
+| db.query.text | The database statement being executed. | string | Call | false | *database statement in use* [3] |
+| server.address | Name of the database host. | string | Connection | true | e.g.: example.com; 10.1.2.80; /tmp/my.sock |
+| server.port | Server port number. Used in case the port being used is not the default. | int | Connection | false | e.g.: 9445 |

Review Comment:
Thanks for your review! I updated the proposal. Details of retries and speculative executions, e.g. `db.query.text` and `server.address`, will appear as attributes on `Node Request` spans. However, we cannot tell which `Node Request` span is a retry or a speculative execution, because the current `RequestTracker` interface does not expose that information.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscr...@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

For additional commands, e-mail: pr-h...@cassandra.apache.org