Hi All,

Many thanks for the background info, Adnan!

+1 to the action items proposed by Alex!

Re: (3) Can we abstract request/trace info into a separate object without
exposing those accessors on the Event class directly? OTel defines
trace/span concepts, but a request ID is a bit foreign to OTel. Keeping
tracing / request ID data isolated in Java could help with maintaining it
and potentially with supporting other (custom) tracing methods.

Re: (4) I'd like to propose storing OTel correlation data in the form of a
standard context propagation string, e.g. W3C trace-context [1] (the same
value as its HTTP header), so the column could be called w3c_trace_context
or simply trace_context.

Open question: do we need to write a separate, individual trace ID field in
SQL? I suppose it is not very useful, since correlating it to other trace
data already requires understanding OTel context propagation, and a query
against trace_context can still be made using string-matching clauses. We
could probably (additionally) store the trace ID in the request_id column
if the Polaris-specific request ID header is not set.

As for the span ID, I do not really see a use case for persisting it
individually. It is very specific to OTel trace data construction.

Actually, using the W3C trace context [1] encoding probably makes sense in
the Java event representation too. Interested callers can easily decode
this information since the format is well-defined.

As a side benefit, this opens opportunities for downstream event consumers
to connect (propagate context) to the traces that produced events based on
the event data itself, without relying on the intermediate frameworks. This
may be desirable since the current Polaris event persistence impl. writes
events in batches, so the association with the individual requests that
produced those events is lost. Whether to perform this kind of context
propagation will be at the discretion of the event consumer, of course
(outside of Polaris code).

[1] https://www.w3.org/TR/trace-context/

WDYT?

Thanks,
Dmitri.
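
For illustration only, here is a minimal Java sketch of the two ideas above:
a correlation object kept separate from the event class, and the W3C
trace-context string as the stored encoding. The names TraceContext,
EventCorrelation and requestIdColumnValue() are hypothetical and not part of
Polaris or OTel. The sketch assumes only the four-field traceparent layout
(version, trace-id, parent-id, trace-flags) described in [1] and ignores
tracestate.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical value type for a decoded W3C traceparent string (not a Polaris or OTel API). */
record TraceContext(String version, String traceId, String parentId, String flags) {

  // Four lowercase-hex fields separated by dashes: version, trace-id, parent-id, trace-flags.
  private static final Pattern TRACEPARENT =
      Pattern.compile("([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})");

  /** Decodes a traceparent value; returns empty if it is missing or malformed. */
  static Optional<TraceContext> parse(String traceparent) {
    if (traceparent == null) {
      return Optional.empty();
    }
    Matcher m = TRACEPARENT.matcher(traceparent.trim());
    if (!m.matches()) {
      return Optional.empty();
    }
    TraceContext ctx = new TraceContext(m.group(1), m.group(2), m.group(3), m.group(4));
    // The spec treats version "ff" and all-zero trace-id / parent-id values as invalid.
    boolean invalid =
        ctx.version().equals("ff")
            || ctx.traceId().equals("0".repeat(32))
            || ctx.parentId().equals("0".repeat(16));
    return invalid ? Optional.empty() : Optional.of(ctx);
  }

  /** True if the "sampled" bit (lowest bit of trace-flags) is set. */
  boolean sampled() {
    return (Integer.parseInt(flags, 16) & 0x01) != 0;
  }

  /** Re-encodes the value as it could be stored in a trace_context column. */
  String encode() {
    return String.join("-", version, traceId, parentId, flags);
  }
}

/** Hypothetical correlation holder kept separate from the event class itself. */
record EventCorrelation(Optional<String> requestId, Optional<TraceContext> traceContext) {

  /** Client-provided request ID if present, otherwise the trace ID as a fallback. */
  Optional<String> requestIdColumnValue() {
    return requestId.or(() -> traceContext.map(TraceContext::traceId));
  }
}
```

With something along these lines, an event consumer could call
TraceContext.parse(...) on the stored trace_context value and extract the
trace ID only when it actually needs it, while the event class itself
carries a single correlation object.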
On Fri, Oct 24, 2025 at 7:18 AM Alexandre Dutra <[email protected]> wrote:

> Hi all,
>
> Thank you for chiming in; the context around request IDs is now clear.
>
> Trying to summarize this thread into actionable items, here's what I
> propose:
>
> 1. Restore the original functionality for request IDs.
>    - Change the default header name back to x-request-id (despite the
>      x- prefix being deprecated), but keep it configurable as today.
> 2. Remove RequestIdGenerator and related functionality.
>    - Do not generate a request ID if the REST client doesn't provide one.
> 3. Update PolarisEvent:
>    - Expose new requestId(), traceId(), spanId() methods, all nullable.
>    - This would align with the emerging consensus around including
>      contextual information in PolarisEvent [1].
> 4. Update events table SQL schema:
>    - Insert the client-provided request ID into the request_id
>      column; otherwise, insert null.
>    - Add two new nullable columns, trace_id and span_id, and populate
>      them if OTel is enabled.
>
> From our discussions, I think it's important to not conflate OTel
> tracing fields with Envoy tracing fields, which is why I suggest we
> use separate fields / columns for them.
>
> Would the above plan work for everyone?
>
> Thanks,
> Alex
>
> [1]: https://lists.apache.org/thread/rl5cpcft16sn5n00mfkmx9ldn3gsqtfy
>
>
> On Fri, Oct 24, 2025 at 9:33 AM Adnan Hemani
> <[email protected]> wrote:
> >
> > Hi all,
> >
> > Thanks to Alex for starting this thread - because of this, I’m just
> > coming to the realization that OTel Trace and Span IDs are coming
> > built-in with Quarkus and my previous work to generate a
> > Request/Correlation ID is likely not needed as a result. My original
> > motivation for generation of a Request/Correlation ID was to ensure
> > that any client can uniquely identify a request made to Polaris, which
> > would be especially useful for debugging failing requests or
> > identifying call patterns.
> >
> > As a result, I’m a +1 on Michael’s opinion: we should remove the
> > Request/Correlation ID generation and always use the OTel trace/span
> > IDs (which come for free with Quarkus) instead for the Correlation ID
> > unless a valid header is present, which would take over as the
> > Correlation ID instead.
> >
> > --
> >
> > To answer Dmitri’s question re Polaris Events: The intended use case is
> > to provide some sort of correlation between events that have occurred
> > as part of the same request. For example, if a user makes an
> > CommitTransaction request, it would be helpful to see all UpdateTable
> > calls that were made as part of that one user request.
> >
> > Best,
> > Adnan Hemani
> >
> > > On Oct 23, 2025, at 12:15 PM, Dmitri Bourlatchkov <[email protected]>
> > > wrote:
> > >
> > > Hi Michael,
> > >
> > > Logging x-request-id headers makes sense.
> > >
> > > Just to confirm: if / when we restore that, Polaris will NOT generate
> > > new IDs in case the header is not present in the request, correct?
> > >
> > > I believe x-request-id can co-exist with OTel.
> > >
> > > What about adding request IDs to events [1][2]? What's the intended use
> > > case for that? Could you share some context here too?
> > >
> > > Side note: I proposed [2877] flagging event persistence as "beta" in
> > > 1.2.0... This discussion adds another point towards that, I think.
> > >
> > > [1]
> > > https://www.google.com/url?q=https://github.com/apache/polaris/blob/2f0c7a43d446452004ea51196b618de9bdf0e25b/runtime/service/src/main/java/org/apache/polaris/service/events/listeners/inmemory/InMemoryBufferEventListener.java%23L97&source=gmail-imap&ust=1761851831000000&usg=AOvVaw1WfUaXLp6z_M87iAXEqSUw
> > > [2]
> > > https://www.google.com/url?q=https://github.com/apache/polaris/blob/19742cc20f4bc0b7e5a315a62f89c6085ad81b7d/runtime/service/src/main/java/org/apache/polaris/service/events/listeners/PolarisPersistenceEventListener.java%23L66&source=gmail-imap&ust=1761851831000000&usg=AOvVaw12-7e3ahm2sLSkLSNqTecm
> > >
> > > [2877]
> > > https://www.google.com/url?q=https://github.com/apache/polaris/pull/2877%23discussion_r2456300613&source=gmail-imap&ust=1761851831000000&usg=AOvVaw3TuYbkzwnLx3QEVIM8oDda
> > >
> > > Thanks,
> > > Dmitri.
> > >
> > > On Thu, Oct 23, 2025 at 2:23 PM Michael Collado <[email protected]>
> > > wrote:
> > >
> > >> Hey Dmitri
> > >>
> > >> The generating a request id is new code that was added after the
> > >> original x-request-id support. You can see the state from ~1 year
> > >> ago, we hard-coded request_id as the header we used for the MDC -
> > >>
> > >> https://www.google.com/url?q=https://github.com/apache/polaris/blob/a6197bd7d8cb5551253fa427e4373897205ecece/polaris-service/src/main/java/org/apache/polaris/service/PolarisApplication.java%23L415-L416&source=gmail-imap&ust=1761851831000000&usg=AOvVaw35Q1A_2avAiSYlYVAnZxBb
> > >> . At some point, it was changed to be configurable, then the
> > >> ContextResolverFilter filter was refactored/eliminated and the
> > >> RequestIdFilter took its responsibility, but lost some of its original
> > >> functionality.
> > >>
> > >> My personal opinion is that restoring support for the x-request-id
> > >> header is something that we should do, but if the header isn't
> > >> present, falling back on simply using OTel trace ids is good enough
> > >> (better, even) than generating another random request id.
> > >>
> > >> Mike
> > >>
> > >> On Thu, Oct 23, 2025 at 10:47 AM Dmitri Bourlatchkov <[email protected]>
> > >> wrote:
> > >>
> > >>> Hi Michael,
> > >>>
> > >>> Thanks for the info!
> > >>>
> > >>> Working with Envoy's tracing headers makes sense to me. However, I
> > >>> wonder: why would Polaris need to generate a new request ID inside
> > >>> its code?.. and return it in response headers?
> > >>>
> > >>> How important is it to propagate this ID to Polaris Events?
> > >>>
> > >>> I'm just trying to understand the full context :)
> > >>>
> > >>> Thanks,
> > >>> Dmitri.
> > >>>
> > >>> On Thu, Oct 23, 2025 at 1:29 PM Michael Collado <[email protected]>
> > >>> wrote:
> > >>>
> > >>>> I think the original intention for this requestId field was to
> > >>>> support request propagation from load balancers, like Envoy (
> > >>>> https://www.google.com/url?q=https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/observability/tracing&source=gmail-imap&ust=1761851831000000&usg=AOvVaw1RnuM8mViV-j7jvuxq74Aw
> > >>>> ), which is distinct from OTEL. Don't ask me why the default
> > >>>> is Polaris-Request-Id - I think it was originally a custom thing,
> > >>>> but then we changed to integrate with existing conventions.
> > >>>> Unfortunately, looking through the code, I think that the actual
> > >>>> functional plumbing has been lost in the course of multiple
> > >>>> refactors around the call context and resolver. I don't see
> > >>>> references to that property or the underlying header.
> > >>>>
> > >>>> Support for the unofficial x-request-id header feels like something
> > >>>> we should definitely keep, especially when Polaris is one service in
> > >>>> a mesh of services that maybe don't have OTel integration. I'm a fan
> > >>>> of the OTel standard, but it's not entirely ubiquitous and there are
> > >>>> many middleware layers that know how to forward on the x-request-id
> > >>>> header.
> > >>>>
> > >>>> Mike
> > >>>>
> > >>>> On Thu, Oct 23, 2025 at 3:00 AM Robert Stupp <[email protected]> wrote:
> > >>>>
> > >>>>> Yes, we should aim for interoperability with the existing de-facto
> > >>>>> standard OTel and make it easy for users to integrate into their
> > >>>>> observability platforms.
> > >>>>>
> > >>>>> On Wed, Oct 22, 2025 at 7:05 PM Dmitri Bourlatchkov <[email protected]>
> > >>>>> wrote:
> > >>>>>>
> > >>>>>> Hi Alex and All,
> > >>>>>>
> > >>>>>> I certainly support the idea of following OTel standards for
> > >>>>>> achieving "correlation" wrt Polaris requests and/or events.
> > >>>>>>
> > >>>>>> As to what form the correlation data should take, I believe it is
> > >>>>>> conceptually what the OTel "context" represents. So, I believe it
> > >>>>>> makes sense for Polaris to support standard context propagators at
> > >>>>>> the API layer.
> > >>>>>>
> > >>>>>> If the incoming request has OTel context information, then
> > >>>>>> returning any other "correlation" data in the response is
> > >>>>>> redundant, I think.
> > >>>>>>
> > >>>>>> If the incoming request does not have OTel context info, what is
> > >>>>>> the purpose of generating a Polaris-specific "correlation ID"? How
> > >>>>>> is it envisioned to be used?
> > >>>>>>
> > >>>>>> If the intention is to correlate a Polaris response (operation)
> > >>>>>> with events that might have resulted from its execution, I believe
> > >>>>>> a more robust approach would be to propagate the OTel trace info
> > >>>>>> (starting a new trace if necessary) into event data. Then, Polaris
> > >>>>>> can also return the trace context in the API response (top span).
> > >>>>>> It's a bit awkward from the OTel perspective, but might be an
> > >>>>>> option for supporting custom correlators. It could be covered by a
> > >>>>>> feature flag. The header name could be "polaris-traceparent" for
> > >>>>>> W3C Trace Context.
> > >>>>>>
> > >>>>>> Custom correlation code will be able to extract the trace ID from
> > >>>>>> the response and from events and find related data. Granted, it
> > >>>>>> will require a bit more effort for the custom code to decode trace
> > >>>>>> IDs from the OTel context, but the format is standard and not
> > >>>>>> complex. The benefit for Polaris, though, is that it can easily
> > >>>>>> integrate with OTel-compatible observability platforms regardless
> > >>>>>> of whether any particular deployment uses custom correlators or
> > >>>>>> not.
> > >>>>>>
> > >>>>>> WDYT?
> > >>>>>>
> > >>>>>> Thanks,
> > >>>>>> Dmitri.
> > >>>>>>
> > >>>>>> On Wed, Oct 22, 2025 at 6:03 AM Alexandre Dutra <[email protected]>
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>>> Hi all,
> > >>>>>>>
> > >>>>>>> Today, Polaris has the notion of "request ID", but its purpose is
> > >>>>>>> not entirely clear. It seems to serve as an observability feature
> > >>>>>>> to facilitate correlation. A pending PR aims to rename it to
> > >>>>>>> "correlation ID" for better alignment with industry standards [1].
> > >>>>>>>
> > >>>>>>> However, this PR has brought to light overlaps with core
> > >>>>>>> telemetry features: when OpenTelemetry (OTel) is enabled in
> > >>>>>>> Polaris, each request already has a trace ID and span ID, making
> > >>>>>>> a separate correlation ID redundant.
> > >>>>>>>
> > >>>>>>> Moreover, using the OTel trace ID and span ID in Polaris events,
> > >>>>>>> rather than the generated correlation ID, would significantly
> > >>>>>>> simplify correlation of events with other traces.
> > >>>>>>>
> > >>>>>>> Therefore, I propose the following changes:
> > >>>>>>>
> > >>>>>>> 1. If OTel is enabled, use the trace ID and span ID as the
> > >>>>>>> correlation ID for the request, instead of generating a random
> > >>>>>>> correlation ID.
> > >>>>>>> 2. Otherwise, if a (Polaris-specific) correlation ID header is
> > >>>>>>> present in the request, use it.
> > >>>>>>> 3. If neither of the above conditions is met, generate a random
> > >>>>>>> correlation ID.
> > >>>>>>>
> > >>>>>>> I am somewhat undecided on the best approach when a correlation
> > >>>>>>> ID header is present in the request. However, I believe it would
> > >>>>>>> be more sensible to disregard it if OTel is enabled, as OTel
> > >>>>>>> offers a more robust solution for client-to-server trace
> > >>>>>>> propagation, e.g. W3C Trace Context propagation [2].
> > >>>>>>>
> > >>>>>>> Please share your thoughts!
> > >>>>>>>
> > >>>>>>> Thanks,
> > >>>>>>> Alex
> > >>>>>>>
> > >>>>>>> [1]:
> > >>>>>>> https://www.google.com/url?q=https://github.com/apache/polaris/pull/2757&source=gmail-imap&ust=1761851831000000&usg=AOvVaw1-kAWfEk4tmsEg0q0GZBCn
> > >>>>>>> [2]:
> > >>>>>>> https://www.google.com/url?q=https://www.w3.org/TR/trace-context&source=gmail-imap&ust=1761851831000000&usg=AOvVaw22nMyOS7pbJ69XrBo5kHQS
> > >>>>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>
> > >
