Hi Alex and All,

I certainly support the idea of following OTel standards for achieving
"correlation" wrt Polaris requests and/or events.

As to what form the correlation data should take, I believe it is
conceptually what the OTel "context" represents. So, I believe it makes
sense for Polaris to support standard context propagators at the API
layer.

If the incoming request has OTel context information, then returning any
other "correlation" data in the response is redundant, I think.

If the incoming request does not have OTel context info, what is the
purpose of generating a Polaris-specific "correlation ID"? How is it
envisioned to be used?

If the intention is to correlate a Polaris response (operation) with
events that might have resulted from its execution, I believe a more
robust approach would be to propagate the OTel trace info (starting a new
trace if necessary) into event data.

Then, Polaris can also return the trace context in the API response (top
span). It's a bit awkward from the OTel perspective, but might be an
option for supporting custom correlators. It could be covered by a
feature flag. The header name could be "polaris-traceparent" for W3C
Trace Context.

Custom correlation code will be able to extract the trace ID from the
response and from events and find related data. Granted, it will require
a bit more effort for the custom code to decode trace IDs from the OTel
context, but the format is standard and not complex.

The benefit for Polaris, though, is that it can easily integrate with
OTel-compatible observability platforms regardless of whether any
particular deployment uses custom correlators or not.

WDYT?

Thanks,
Dmitri.

On Wed, Oct 22, 2025 at 6:03 AM Alexandre Dutra <[email protected]> wrote:

> Hi all,
>
> Today, Polaris has the notion of "request ID", but its purpose is not
> entirely clear. It seems to serve as an observability feature to
> facilitate correlation. A pending PR aims to rename it to "correlation
> ID" for better alignment with industry standards [1].
>
> However, this PR has brought to light overlaps with core telemetry
> features: when OpenTelemetry (OTel) is enabled in Polaris, each
> request already has a trace ID and span ID, making a separate
> correlation ID redundant.
>
> Moreover, using the OTel trace ID and span ID in Polaris events,
> rather than the generated correlation ID, would significantly simplify
> correlation of events with other traces.
>
> Therefore, I propose the following changes:
>
> 1. If OTel is enabled, use the trace ID and span ID as the correlation
>    ID for the request, instead of generating a random correlation ID.
> 2. Otherwise, if a (Polaris-specific) correlation ID header is present
>    in the request, use it.
> 3. If neither of the above conditions is met, generate a random
>    correlation ID.
>
> I am somewhat undecided on the best approach when a correlation ID
> header is present in the request. However, I believe it would be more
> sensible to disregard it if OTel is enabled, as OTel offers a more
> robust solution for client-to-server trace propagation, e.g. W3C Trace
> Context propagation [2].
>
> Please share your thoughts!
>
> Thanks,
> Alex
>
> [1]: https://github.com/apache/polaris/pull/2757
> [2]: https://www.w3.org/TR/trace-context
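
To illustrate the "decode trace IDs from the OTel context" step mentioned
in the reply above, here is a minimal Python sketch of what a custom
correlator might do. It assumes the W3C Trace Context `traceparent` layout
(version-traceid-parentid-flags) and only accepts version 00. The function
names, and the idea that an event carries the same kind of string as the
proposed "polaris-traceparent" response header, are assumptions made for
illustration, not existing Polaris APIs.

```python
import re
from typing import NamedTuple, Optional

# W3C Trace Context, version 00:
#   2 hex (version) - 32 hex (trace-id) - 16 hex (parent-id) - 2 hex (flags)
# All hex digits must be lowercase.
_TRACEPARENT = re.compile(
    r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})"
)


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    span_id: str
    flags: str


def parse_traceparent(value: str) -> Optional[TraceParent]:
    """Parse a traceparent string; return None if it is not valid version 00."""
    match = _TRACEPARENT.fullmatch(value.strip())
    if match is None:
        return None
    version, trace_id, span_id, flags = match.groups()
    if version != "00":
        return None  # this sketch does not handle other versions
    if trace_id == "0" * 32 or span_id == "0" * 16:
        return None  # all-zero IDs are invalid per the spec
    return TraceParent(version, trace_id, span_id, flags)


def same_trace(response_value: str, event_value: str) -> bool:
    """Hypothetical helper: do a response and an event share a trace ID?"""
    response_tp = parse_traceparent(response_value)
    event_tp = parse_traceparent(event_value)
    if response_tp is None or event_tp is None:
        return False
    return response_tp.trace_id == event_tp.trace_id
```

A correlator would call `same_trace` with the header value taken from the
API response and the value stored with an event. The span IDs will usually
differ, so only the trace ID is compared.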
