Hi Michael,

Logging x-request-id headers makes sense.

Just to confirm: if / when we restore that, Polaris will NOT generate new
IDs in case the header is not present in the request, correct?

I believe x-request-id can co-exist with OTel.

What about adding request IDs to events [1][2]? What's the intended use
case for that? Could you share some context here too?

Side note: I proposed [2877] flagging event persistence as "beta" in
1.2.0... This discussion adds another point towards that, I think.

[1] https://github.com/apache/polaris/blob/2f0c7a43d446452004ea51196b618de9bdf0e25b/runtime/service/src/main/java/org/apache/polaris/service/events/listeners/inmemory/InMemoryBufferEventListener.java#L97

[2] https://github.com/apache/polaris/blob/19742cc20f4bc0b7e5a315a62f89c6085ad81b7d/runtime/service/src/main/java/org/apache/polaris/service/events/listeners/PolarisPersistenceEventListener.java#L66

[2877] https://github.com/apache/polaris/pull/2877#discussion_r2456300613

Thanks,
Dmitri.

On Thu, Oct 23, 2025 at 2:23 PM Michael Collado <[email protected]> wrote:

> Hey Dmitri
>
> The generating a request id is new code that was added after the original
> x-request-id support. You can see the state from ~1 year ago, we hard-coded
> request_id as the header we used for the MDC -
> https://github.com/apache/polaris/blob/a6197bd7d8cb5551253fa427e4373897205ecece/polaris-service/src/main/java/org/apache/polaris/service/PolarisApplication.java#L415-L416
> . At some point, it was changed to be configurable, then the
> ContextResolverFilter filter was refactored/eliminated and the
> RequestIdFilter took its responsibility, but lost some of its original
> functionality.
>
> My personal opinion is that restoring support for the x-request-id header
> is something that we should do, but if the header isn't present, falling
> back on simply using OTel trace ids is good enough (better, even) than
> generating another random request id.
>
> Mike
>
> On Thu, Oct 23, 2025 at 10:47 AM Dmitri Bourlatchkov <[email protected]>
> wrote:
>
> > Hi Michael,
> >
> > Thanks for the info!
> >
> > Working with Envoy's tracing headers makes sense to me. However, I wonder:
> > why would Polaris need to generate a new request ID inside its code?..
> > and return it in response headers?
> >
> > How important is it to propagate this ID to Polaris Events?
> >
> > I'm just trying to understand the full context :)
> >
> > Thanks,
> > Dmitri.
> >
> > On Thu, Oct 23, 2025 at 1:29 PM Michael Collado <[email protected]>
> > wrote:
> >
> > > I think the original intention for this requestId field was to support
> > > request propagation from load balancers, like Envoy (
> > > https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/observability/tracing
> > > ), which is distinct from OTEL. Don't ask me why the default
> > > is Polaris-Request-Id - I think it was originally a custom thing, but
> > > then we changed to integrate with existing conventions. Unfortunately,
> > > looking through the code, I think that the actual functional plumbing
> > > has been lost in the course of multiple refactors around the call
> > > context and resolver. I don't see references to that property or the
> > > underlying header.
> > >
> > > Support for the unofficial x-request-id header feels like something we
> > > should definitely keep, especially when Polaris is one service in a
> > > mesh of services that maybe don't have OTel integration. I'm a fan of
> > > the OTel standard, but it's not entirely ubiquitous and there are many
> > > middleware layers that know how to forward on the x-request-id header.
> > >
> > > Mike
> > >
> > > On Thu, Oct 23, 2025 at 3:00 AM Robert Stupp <[email protected]> wrote:
> > >
> > > > Yes, we should aim for interoperability with the existing de-facto
> > > > standard OTel and make it easy for users to integrate into their
> > > > observability platforms.
> > > >
> > > > On Wed, Oct 22, 2025 at 7:05 PM Dmitri Bourlatchkov <[email protected]>
> > > > wrote:
> > > >
> > > > > Hi Alex and All,
> > > > >
> > > > > I certainly support the idea of following OTel standards for
> > > > > achieving "correlation" wrt Polaris requests and/or events.
> > > > >
> > > > > As to what form the correlation data should take, I believe it is
> > > > > conceptually what the OTel "context" represents. So, I believe it
> > > > > makes sense for Polaris to support standard context propagators at
> > > > > the API layer.
> > > > >
> > > > > If the incoming request has OTel context information, then
> > > > > returning any other "correlation" data in the response is
> > > > > redundant, I think.
> > > > >
> > > > > If the incoming request does not have OTel context info, what is
> > > > > the purpose of generating a Polaris-specific "correlation ID"? How
> > > > > is it envisioned to be used?
> > > > >
> > > > > If the intention is to correlate a Polaris response (operation)
> > > > > with events that might have resulted from its execution, I believe
> > > > > a more robust approach would be to propagate the OTel trace info
> > > > > (starting a new trace if necessary) into event data. Then, Polaris
> > > > > can also return the trace context in the API response (top span).
> > > > > It's a bit awkward from the OTel perspective, but might be an
> > > > > option for supporting custom correlators. It could be covered by a
> > > > > feature flag. The header name could be "polaris-traceparent" for
> > > > > W3C Trace Context.
> > > > >
> > > > > Custom correlation code will be able to extract the trace ID from
> > > > > the response and from events and find related data. Granted, it
> > > > > will require a bit more effort for the custom code to decode trace
> > > > > IDs from the OTel context, but the format is standard and not
> > > > > complex.
> > > > > The benefit for Polaris, though, is that it can easily integrate
> > > > > with OTel-compatible observability platforms regardless of whether
> > > > > any particular deployment uses custom correlators or not.
> > > > >
> > > > > WDYT?
> > > > >
> > > > > Thanks,
> > > > > Dmitri.
> > > > >
> > > > > On Wed, Oct 22, 2025 at 6:03 AM Alexandre Dutra <[email protected]>
> > > > > wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > Today, Polaris has the notion of "request ID", but its purpose
> > > > > > is not entirely clear. It seems to serve as an observability
> > > > > > feature to facilitate correlation. A pending PR aims to rename
> > > > > > it to "correlation ID" for better alignment with industry
> > > > > > standards [1].
> > > > > >
> > > > > > However, this PR has brought to light overlaps with core
> > > > > > telemetry features: when OpenTelemetry (OTel) is enabled in
> > > > > > Polaris, each request already has a trace ID and span ID, making
> > > > > > a separate correlation ID redundant.
> > > > > >
> > > > > > Moreover, using the OTel trace ID and span ID in Polaris events,
> > > > > > rather than the generated correlation ID, would significantly
> > > > > > simplify correlation of events with other traces.
> > > > > >
> > > > > > Therefore, I propose the following changes:
> > > > > >
> > > > > > 1. If OTel is enabled, use the trace ID and span ID as the
> > > > > > correlation ID for the request, instead of generating a random
> > > > > > correlation ID.
> > > > > > 2. Otherwise, if a (Polaris-specific) correlation ID header is
> > > > > > present in the request, use it.
> > > > > > 3. If neither of the above conditions is met, generate a random
> > > > > > correlation ID.
> > > > > >
> > > > > > I am somewhat undecided on the best approach when a correlation
> > > > > > ID header is present in the request.
> > > > > > However, I believe it would be more sensible to disregard it if
> > > > > > OTel is enabled, as OTel offers a more robust solution for
> > > > > > client-to-server trace propagation, e.g. W3C Trace Context
> > > > > > propagation [2].
> > > > > >
> > > > > > Please share your thoughts!
> > > > > >
> > > > > > Thanks,
> > > > > > Alex
> > > > > >
> > > > > > [1]: https://github.com/apache/polaris/pull/2757
> > > > > > [2]: https://www.w3.org/TR/trace-context
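
For illustration only, the three-step order from Alex's proposal and the
trace ID decoding that Dmitri mentions could be sketched roughly as below.
This is not Polaris code (Polaris is Java); it is a standalone Python sketch
under several assumptions: the function names, the active_trace parameter
(standing in for whatever the OTel SDK would report when enabled), joining
trace ID and span ID with a hyphen, the x-request-id default, and a UUID as
the random fallback are all hypothetical. The parser only handles the
four-field traceparent layout and rejects anything else.

```python
import re
import uuid
from typing import Mapping, Optional, Tuple

# W3C Trace Context "traceparent": version-traceid-parentid-flags, lowercase hex.
_TRACEPARENT = re.compile(r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def parse_traceparent(value: str) -> Optional[Tuple[str, str]]:
    """Return (trace_id, span_id), or None if the header value is not valid."""
    match = _TRACEPARENT.fullmatch(value.strip())
    if match is None:
        return None
    version, trace_id, span_id, _flags = match.groups()
    # Version "ff" and all-zero trace or span IDs are invalid per the spec.
    if version == "ff" or trace_id == "0" * 32 or span_id == "0" * 16:
        return None
    return trace_id, span_id


def resolve_correlation_id(
    headers: Mapping[str, str],
    active_trace: Optional[Tuple[str, str]] = None,  # hypothetical: None means OTel is off
    header_name: str = "x-request-id",  # assumed default, configurable
) -> str:
    """Apply the proposed precedence: OTel IDs, then the request header, then random."""
    # 1. OTel enabled: use trace ID and span ID (the hyphen join is an assumption).
    if active_trace is not None:
        trace_id, span_id = active_trace
        return f"{trace_id}-{span_id}"
    # 2. Otherwise use the incoming header if present (header names are case-insensitive).
    lowered = {name.lower(): value for name, value in headers.items()}
    incoming = lowered.get(header_name.lower())
    if incoming:
        return incoming
    # 3. Neither is available: generate a random ID.
    return str(uuid.uuid4())
```

Michael's variant from later in the thread would presumably check the header
first, fall back to the OTel trace ID, and drop step 3 entirely.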
