Yes, we should aim for interoperability with OTel, the existing de facto standard, and make it easy for users to integrate Polaris into their observability platforms.
On Wed, Oct 22, 2025 at 7:05 PM Dmitri Bourlatchkov <[email protected]> wrote:
>
> Hi Alex and All,
>
> I certainly support the idea of following OTel standards for achieving
> "correlation" wrt Polaris requests and/or events.
>
> As to what form the correlation data should take, I believe it is
> conceptually what the OTel "context" represents. So, I believe it makes
> sense for Polaris to support standard context propagators at the API layer.
>
> If the incoming request has OTel context information, then returning any
> other "correlation" data in the response is redundant, I think.
>
> If the incoming request does not have OTel context info, what is the
> purpose of generating a Polaris-specific "correlation ID"? How is it
> envisioned to be used?
>
> If the intention is to correlate a Polaris response (operation) with events
> that might have resulted from its execution, I believe a more robust
> approach would be to propagate the OTel trace info (starting a new trace if
> necessary) into event data. Then, Polaris can also return the trace context
> in the API response (top span). It's a bit awkward from the OTel
> perspective, but might be an option for supporting custom correlators. It
> could be covered by a feature flag. The header name could be
> "polaris-traceparent" for W3C Trace Context.
>
> Custom correlation code will be able to extract the trace ID from the
> response and from events and find related data. Granted, it will require a
> bit more effort for the custom code to decode trace IDs from the OTel
> context, but the format is standard and not complex. The benefit for
> Polaris, though, is that it can easily integrate with OTel-compatible
> observability platforms regardless of whether any particular deployment
> uses custom correlators or not.
>
> WDYT?
>
> Thanks,
> Dmitri.
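To illustrate how little decoding a custom correlator would need, here is a rough Python sketch that pulls the trace ID out of a W3C Trace Context value. It assumes the response carries the "polaris-traceparent" header suggested above (nothing like it exists in Polaris today), that the headers arrive as a mapping with lowercase keys, and that only the version 00 layout needs to be handled. The function name is hypothetical.

```python
import re
from typing import Mapping, Optional

# W3C Trace Context (version 00 layout): version-traceid-parentid-flags,
# all lowercase hex with fixed widths of 2, 32, 16 and 2 characters.
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def extract_trace_id(
    headers: Mapping[str, str],
    header_name: str = "polaris-traceparent",  # assumed name, per the suggestion above
) -> Optional[str]:
    """Return the 32-character trace ID, or None if absent or malformed."""
    value = headers.get(header_name)
    if value is None:
        return None
    match = _TRACEPARENT.match(value.strip())
    if match is None:
        return None
    version, trace_id, parent_id, _flags = match.groups()
    # The spec treats version "ff" and all-zero IDs as invalid.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return trace_id
```

The same function could be applied to the trace context stored in event data, so that a response and its events can be joined on the trace ID.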
>
> On Wed, Oct 22, 2025 at 6:03 AM Alexandre Dutra <[email protected]> wrote:
>
> > Hi all,
> >
> > Today, Polaris has the notion of "request ID", but its purpose is not
> > entirely clear. It seems to serve as an observability feature to
> > facilitate correlation. A pending PR aims to rename it to "correlation
> > ID" for better alignment with industry standards [1].
> >
> > However, this PR has brought to light overlaps with core telemetry
> > features: when OpenTelemetry (OTel) is enabled in Polaris, each
> > request already has a trace ID and span ID, making a separate
> > correlation ID redundant.
> >
> > Moreover, using the OTel trace ID and span ID in Polaris events,
> > rather than the generated correlation ID, would significantly simplify
> > correlation of events with other traces.
> >
> > Therefore, I propose the following changes:
> >
> > 1. If OTel is enabled, use the trace ID and span ID as the correlation
> > ID for the request, instead of generating a random correlation ID.
> > 2. Otherwise, if a (Polaris-specific) correlation ID header is present
> > in the request, use it.
> > 3. If neither of the above conditions is met, generate a random
> > correlation ID.
> >
> > I am somewhat undecided on the best approach when a correlation ID
> > header is present in the request. However, I believe it would be more
> > sensible to disregard it if OTel is enabled, as OTel offers a more
> > robust solution for client-to-server trace propagation, e.g. W3C Trace
> > Context propagation [2].
> >
> > Please share your thoughts!
> >
> > Thanks,
> > Alex
> >
> > [1]: https://github.com/apache/polaris/pull/2757
> > [2]: https://www.w3.org/TR/trace-context
> >
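For reference, the three-step resolution order from the original proposal could be sketched roughly as follows. This is only an illustration in Python, not Polaris code: the header name "polaris-correlation-id", the "traceid-spanid" concatenation, and the fall-through when OTel is enabled but no span is active are all assumptions made for the example.

```python
import uuid
from typing import Mapping, Optional


def resolve_correlation_id(
    otel_enabled: bool,
    trace_id: Optional[str],
    span_id: Optional[str],
    headers: Mapping[str, str],
    header_name: str = "polaris-correlation-id",  # hypothetical header name
) -> str:
    """Pick a correlation ID following the proposed precedence."""
    # 1. OTel enabled: derive the ID from the current trace and span.
    #    Any client-supplied correlation header is deliberately ignored here.
    if otel_enabled and trace_id and span_id:
        return f"{trace_id}-{span_id}"  # assumed format
    # 2. Otherwise, honor a correlation ID sent by the client, if any.
    client_value = headers.get(header_name)
    if client_value:
        return client_value
    # 3. Otherwise, generate a random ID.
    return uuid.uuid4().hex
```

Step 1 taking precedence over the client header reflects the preference stated in the proposal; swapping the first two checks would give the alternative behavior.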
