Hi Alex and All,

I certainly support the idea of following OTel standards for achieving
"correlation" of Polaris requests and/or events.

As to what form the correlation data should take, I believe it is
conceptually what the OTel "context" represents. So, I believe it makes
sense for Polaris to support standard context propagators at the API layer.

If the incoming request has OTel context information, then returning any
other "correlation" data in the response is redundant, I think.

If the incoming request does not have OTel context info, what is the
purpose of generating a Polaris-specific "correlation ID"? How is it
envisioned to be used?

If the intention is to correlate a Polaris response (operation) with events
that might have resulted from its execution, I believe a more robust
approach would be to propagate the OTel trace info (starting a new trace if
necessary) into event data. Polaris could then also return the trace
context of the top span in the API response. That is a bit awkward from the
OTel perspective, but it might be an option for supporting custom
correlators, and it could be put behind a feature flag. The header name
could be "polaris-traceparent", carrying a W3C Trace Context value.

Custom correlation code will be able to extract the trace ID from the
response and from events and find related data. Granted, it will require a
bit more effort for the custom code to decode trace IDs from the OTel
context, but the format is standard and not complex. The benefit for
Polaris, though, is that it can easily integrate with OTel-compatible
observability platforms regardless of whether any particular deployment
uses custom correlators or not.
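
For illustration only, here is a minimal sketch of what such custom
correlation code might look like. It assumes the response carries the
proposed "polaris-traceparent" header and that each event exposes its trace
context in a field named "traceparent"; that field name, the lowercase
header keys, and both function names are hypothetical and not an existing
Polaris API. Only the version 00 layout of W3C Trace Context
(version-traceid-parentid-flags) is handled.

```python
import re
from typing import Optional

# W3C Trace Context, version 00: "00-<32 hex trace-id>-<16 hex parent-id>-<2 hex flags>"
_TRACEPARENT = re.compile(r"00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def trace_id_from_traceparent(value: str) -> Optional[str]:
    """Return the trace ID from a version-00 traceparent value, or None if invalid."""
    match = _TRACEPARENT.fullmatch(value.strip())
    if match is None:
        return None
    trace_id, parent_id, _flags = match.groups()
    # All-zero trace and parent IDs are invalid per the W3C spec.
    if trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return trace_id


def events_for_response(response_headers: dict, events: list) -> list:
    """Select the events that share a trace ID with the API response.

    Assumes header names were lowercased by the caller, and that each event
    is a dict with a hypothetical "traceparent" field.
    """
    header = response_headers.get("polaris-traceparent")
    trace_id = trace_id_from_traceparent(header) if header else None
    if trace_id is None:
        return []
    return [
        event
        for event in events
        if trace_id_from_traceparent(event.get("traceparent", "")) == trace_id
    ]
```

The span IDs will generally differ between the response and the events, so
the sketch matches on the trace ID only.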

WDYT?

Thanks,
Dmitri.

On Wed, Oct 22, 2025 at 6:03 AM Alexandre Dutra <[email protected]> wrote:

> Hi all,
>
> Today, Polaris has the notion of "request ID", but its purpose is not
> entirely clear. It seems to serve as an observability feature to
> facilitate correlation. A pending PR aims to rename it to "correlation
> ID" for better alignment with industry standards [1].
>
> However, this PR has brought to light overlaps with core telemetry
> features: when OpenTelemetry (OTel) is enabled in Polaris, each
> request already has a trace ID and span ID, making a separate
> correlation ID redundant.
>
> Moreover, using the OTel trace ID and span ID in Polaris events,
> rather than the generated correlation ID, would significantly simplify
> correlation of events with other traces.
>
> Therefore, I propose the following changes:
>
> 1. If OTel is enabled, use the trace ID and span ID as the correlation
> ID for the request, instead of generating a random correlation ID.
> 2. Otherwise, if a (Polaris-specific) correlation ID header is present
> in the request, use it.
> 3. If neither of the above conditions is met, generate a random
> correlation ID.
>
> I am somewhat undecided on the best approach when a correlation ID
> header is present in the request. However, I believe it would be more
> sensible to disregard it if OTel is enabled, as OTel offers a more
> robust solution for client-to-server trace propagation, e.g. W3C Trace
> Context propagation [2].
>
> Please share your thoughts!
>
> Thanks,
> Alex
>
> [1]: https://github.com/apache/polaris/pull/2757
> [2]: https://www.w3.org/TR/trace-context
>
