Hi all, +1 to all of Alex’s AIs with Dmitri’s suggested changes as well. Great find, Dmitri!
I’m still debating with myself on whether we need to store the `x-request-id` field as part of the Events persistence. Can we think of a good use case where this is more helpful to the user than the OTel Trace/Span IDs? I am making the assumption here that those are still being returned back to the client. Best, Adnan Hemani > On Oct 24, 2025, at 9:12 AM, Alexandre Dutra <[email protected]> wrote: > > Hi Dmitri, > > Yes, I think your suggestion to use just one field w3c_trace_context > makes more sense than two fields (span_id and trace_id). > > With that, I also think that we are slowly drifting into > implementation considerations; let's get consensus on the general > design first, and we can certainly fine-tune the actual Java methods > and SQL table columns in the future PR. WDYT? > > Thanks, > Alex > > On Fri, Oct 24, 2025 at 5:14 PM Dmitri Bourlatchkov <[email protected]> wrote: >> >> Hi All, >> >> Many thanks for the background info, Adnan! >> >> +1 to action items proposed by Alex! >> >> Re: (3) Can we abstract request/trace info into a separate object without >> exposing those accessors on the Event class directly? OTel defines >> trace/span concepts, but in request ID is a bit foreign to OTel. Having >> tracing / request ID isolated in java could help with maintaining it and >> potentially supporting other (custom) tracing methods. >> >> Re: (4) I'd like to propose storing OTel correlation data in the form of a >> standard context propagation string, e.g. W3C trace-context [1] (same value >> as its HTTP header), so the column could be called w3c_trace_context or >> simply trace_context. >> >> Open question: do we need to write a separate, individual trace ID field in >> SQL? I suppose it is not very useful since correlating it to other trace >> data already requires understanding OTel context propagation and a query >> against trace_context can still be made using string-matching clauses. We >> could probably (additionally) store it in the request_id column if the >> Polaris-specific request ID header is not set. >> >> As for span ID, I do not really see a use case for persisting it >> individually. It is very specific to OTel trace data construction. >> >> Actually, using the W3C trace context [1] encoding probably makes sense in >> the java event representation too. Interested callers can easily decode >> this information since the format is well-defined. As a side benefit, this >> opens opportunities for downstream event consumers to connect (propagate >> context) to traces that produced events based on the event data itself, >> without relying on the intermediate frameworks. This may be desirable since >> the current Polaris event persistence impl. writes events in batches, so >> the association to individual requests that produced those events is lost. >> Whether to perform this kind of context propagation will be at the >> discretion of the event consumer, of course (outside of Polaris code). >> >> [1] >> https://www.google.com/url?q=https://www.w3.org/TR/trace-context/&source=gmail-imap&ust=1761927167000000&usg=AOvVaw2fn0lRsTx-f9r4PCz8wmJK >> >> WDYT? >> >> Thanks, >> Dmitri. >> >> On Fri, Oct 24, 2025 at 7:18 AM Alexandre Dutra <[email protected]> wrote: >> >>> Hi all, >>> >>> Thank you for chiming in; the context around request IDs is now clear. >>> >>> Trying to summarize this thread into actionable items, here's what I >>> propose: >>> >>> 1. Restore the original functionality for request IDs. >>> - Change the default header name back to x-request-id (despite the >>> x- prefix being deprecated), but keep it configurable as today. >>> 2. Remove RequestIdGenerator and related functionality. >>> - Do not generate a request ID if the REST client doesn't provide one. >>> 3. Update PolarisEvent: >>> - Expose new requestId(), traceId(), spanId() methods, all nullable. >>> - This would align with the emerging consensus around including >>> contextual information in PolarisEvent [1]. >>> 4. Update events table SQL schema: >>> - Insert the client-provided request ID into the request_id >>> column; otherwise, insert null. >>> - Add two new nullable columns, trace_id and span_id, and populate >>> them if OTel is enabled. >>> >>> From our discussions, I think it's important to not conflate OTel >>> tracing fields with Envoy tracing fields, which is why I suggest we >>> use separate fields / columns for them. >>> >>> Would the above plan work for everyone? >>> >>> Thanks, >>> Alex >>> >>> [1]: >>> https://www.google.com/url?q=https://lists.apache.org/thread/rl5cpcft16sn5n00mfkmx9ldn3gsqtfy&source=gmail-imap&ust=1761927167000000&usg=AOvVaw02wPvb0qxRzYAKEP0h8l9T >>> >>> >>> On Fri, Oct 24, 2025 at 9:33 AM Adnan Hemani >>> <[email protected]> wrote: >>>> >>>> Hi all, >>>> >>>> Thanks to Alex for starting this thread - because of this, I’m just >>> coming to the realization that OTel Trace and Span IDs are coming built-in >>> with Quarkus and my previous work to generate a Request/Correlation ID is >>> likely not needed as a result. My original motivation for generation of a >>> Request/Correlation ID was to ensure that any client can uniquely identify >>> a request made to Polaris, which would be especially useful for debugging >>> failing requests or identifying call patterns. >>>> >>>> As a result, I’m a +1 on Michael’s opinion: we should remove the >>> Request/Correlation ID generation and always use the OTel trace/span IDs >>> (which come for free with Quarkus) instead for the Correlation ID unless a >>> valid header is present, which would take over as the Correlation ID >>> instead. >>>> >>>> — >>>> >>>> To answer Dmitri’s question re Polaris Events: The intended use case is >>> to provide some sort of correlation between events that have occurred as >>> part of the same request. For example, if a user makes an CommitTransaction >>> request, it would be helpful to see all UpdateTable calls that were made as >>> part of that one user request. >>>> >>>> Best, >>>> Adnan Hemani >>>> >>>>> On Oct 23, 2025, at 12:15 PM, Dmitri Bourlatchkov <[email protected]> >>> wrote: >>>>> >>>>> Hi Michael, >>>>> >>>>> Logging x-request-id headers makes sense. >>>>> >>>>> Just to confirm: if / when we restore that, Polaris will NOT generate >>> new >>>>> IDs in case the header is not present in the request, correct? >>>>> >>>>> I believe x-request-id can co-exist with OTel. >>>>> >>>>> What about adding request IDs to events [1][2]? What's the intended use >>>>> case for that? Could you share some context here too? >>>>> >>>>> Side note: I proposed [2877] flagging event persistence as "beta" in >>>>> 1.2.0... This discussion adds another point towards that, I think. >>>>> >>>>> [1] >>>>> >>> https://www.google.com/url?q=https://www.google.com/url?q%3Dhttps://github.com/apache/polaris/blob/2f0c7a43d446452004ea51196b618de9bdf0e25b/runtime/service/src/main/java/org/apache/polaris/service/events/listeners/inmemory/InMemoryBufferEventListener.java%2523L97%26source%3Dgmail-imap%26ust%3D1761851831000000%26usg%3DAOvVaw1WfUaXLp6z_M87iAXEqSUw&source=gmail-imap&ust=1761927167000000&usg=AOvVaw1Pioys4ROm8vYM_mdx4ygH >>>>> [2] >>>>> >>> https://www.google.com/url?q=https://www.google.com/url?q%3Dhttps://github.com/apache/polaris/blob/19742cc20f4bc0b7e5a315a62f89c6085ad81b7d/runtime/service/src/main/java/org/apache/polaris/service/events/listeners/PolarisPersistenceEventListener.java%2523L66%26source%3Dgmail-imap%26ust%3D1761851831000000%26usg%3DAOvVaw12-7e3ahm2sLSkLSNqTecm&source=gmail-imap&ust=1761927167000000&usg=AOvVaw1PGW6uFwtUN8F1dOlJwrpm >>>>> >>>>> [2877] >>> https://www.google.com/url?q=https://www.google.com/url?q%3Dhttps://github.com/apache/polaris/pull/2877%2523discussion_r2456300613%26source%3Dgmail-imap%26ust%3D1761851831000000%26usg%3DAOvVaw3TuYbkzwnLx3QEVIM8oDda&source=gmail-imap&ust=1761927167000000&usg=AOvVaw2Tfk7wAM1MaB5z9Dvw5X5H >>>>> >>>>> Thanks, >>>>> Dmitri. >>>>> >>>>> On Thu, Oct 23, 2025 at 2:23 PM Michael Collado < >>> [email protected]> >>>>> wrote: >>>>> >>>>>> Hey Dmitri >>>>>> >>>>>> The generating a request id is new code that was added after the >>> original >>>>>> x-request-id support. You can see the state from ~1 year ago, we >>> hard-coded >>>>>> request_id as the header we used for the MDC - >>>>>> >>>>>> >>> https://www.google.com/url?q=https://www.google.com/url?q%3Dhttps://github.com/apache/polaris/blob/a6197bd7d8cb5551253fa427e4373897205ecece/polaris-service/src/main/java/org/apache/polaris/service/PolarisApplication.java%2523L415-L416%26source%3Dgmail-imap%26ust%3D1761851831000000%26usg%3DAOvVaw35Q1A_2avAiSYlYVAnZxBb&source=gmail-imap&ust=1761927167000000&usg=AOvVaw34DksOw7DSHJu8PfQQ5bT6 >>>>>> . At some point, it was changed to be configurable, then the >>>>>> ContextResolverFilter filter was refactored/eliminated and the >>>>>> RequestIdFilter took its responsibility, but lost some of its original >>>>>> functionality. >>>>>> >>>>>> My personal opinion is that restoring support for the x-request-id >>> header >>>>>> is something that we should do, but if the header isn't present, >>> falling >>>>>> back on simply using OTel trace ids is good enough (better, even) than >>>>>> generating another random request id. >>>>>> >>>>>> Mike >>>>>> >>>>>> On Thu, Oct 23, 2025 at 10:47 AM Dmitri Bourlatchkov < >>> [email protected]> >>>>>> wrote: >>>>>> >>>>>>> Hi Michael, >>>>>>> >>>>>>> Thanks for the info! >>>>>>> >>>>>>> Working with Envoy's tracing headers makes sense to me. However, I >>>>>> wonder: >>>>>>> why would Polaris need to generate a new request ID inside its >>> code?.. >>>>>>> and return it in response headers? >>>>>>> >>>>>>> How important is it to propagate this ID to Polaris Events? >>>>>>> >>>>>>> I'm just trying to understand the full context :) >>>>>>> >>>>>>> Thanks, >>>>>>> Dmitri. >>>>>>> >>>>>>> On Thu, Oct 23, 2025 at 1:29 PM Michael Collado < >>> [email protected]> >>>>>>> wrote: >>>>>>> >>>>>>>> I think the original intention for this requestId field was to >>> support >>>>>>>> request propagation from load balancers, like Envoy ( >>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>> https://www.google.com/url?q=https://www.google.com/url?q%3Dhttps://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/observability/tracing%26source%3Dgmail-imap%26ust%3D1761851831000000%26usg%3DAOvVaw1RnuM8mViV-j7jvuxq74Aw&source=gmail-imap&ust=1761927167000000&usg=AOvVaw3LNqtilZ2OoJfE7yEwraZa >>>>>>>> ), which is distinct from OTEL. Don't ask me why the default >>>>>>>> is Polaris-Request-Id - I think it was originally a custom thing, >>> but >>>>>>> then >>>>>>>> we changed to integrate with existing conventions. Unfortunately, >>>>>> looking >>>>>>>> through the code, I think that the actual functional plumbing has >>> been >>>>>>> lost >>>>>>>> in the course of multiple refactors around the call context and >>>>>>> resolver. I >>>>>>>> don't see references to that property or the underlying header. >>>>>>>> >>>>>>>> Support for the unofficial x-request-id header feels like something >>> we >>>>>>>> should definitely keep, especially when Polaris is one service in a >>>>>> mesh >>>>>>> of >>>>>>>> services that maybe don't have OTel integration. I'm a fan of the >>> OTel >>>>>>>> standard, but it's not entirely ubiquitous and there are many >>>>>> middleware >>>>>>>> layers that know how to forward on the x-request-id header. >>>>>>>> >>>>>>>> Mike >>>>>>>> >>>>>>>> On Thu, Oct 23, 2025 at 3:00 AM Robert Stupp <[email protected]> >>> wrote: >>>>>>>> >>>>>>>>> Yes, we should aim for interoperability with the existing de-facto >>>>>>>>> standard OTel and make it easy for users to integrate into their >>>>>>>>> observability platforms. >>>>>>>>> >>>>>>>>> On Wed, Oct 22, 2025 at 7:05 PM Dmitri Bourlatchkov < >>>>>> [email protected]> >>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> Hi Alex and All, >>>>>>>>>> >>>>>>>>>> I certainly support the idea of following OTel standards for >>>>>>> achieving >>>>>>>>>> "correlation" wrt Polaris requests and/or events. >>>>>>>>>> >>>>>>>>>> As to what form the correlation data should take, I believe it is >>>>>>>>>> conceptually what the OTel "context" represents. So, I believe it >>>>>>> makes >>>>>>>>>> sense for Polaris to support standard context propagators at the >>>>>> API >>>>>>>>> layer. >>>>>>>>>> >>>>>>>>>> If the incoming request has OTel context information, then >>>>>> returning >>>>>>>> any >>>>>>>>>> other "correlation" data in the response is redundant, I think. >>>>>>>>>> >>>>>>>>>> If the incoming request does not have OTel context info, what is >>>>>> the >>>>>>>>>> purpose of generating a Polaris-specific "correlation ID"? How is >>>>>> it >>>>>>>>>> envisioned to be used? >>>>>>>>>> >>>>>>>>>> If the intention is to correlate a Polaris response (operation) >>>>>> with >>>>>>>>> events >>>>>>>>>> that might have resulted from its execution, I believe a more >>>>>> robust >>>>>>>>>> approach would be to propagate the OTel trace info (starting a new >>>>>>>> trace >>>>>>>>> if >>>>>>>>>> necessary) into event data. Then, Polaris can also return the >>> trace >>>>>>>>> context >>>>>>>>>> in the API response (top span). It's a bit awkward from the OTel >>>>>>>>>> perspective, but might be an option for supporting custom >>>>>>> correlators. >>>>>>>> It >>>>>>>>>> could be covered by a feature flag. The header name could be >>>>>>>>>> "polaris-traceparent" for W3C Trace Context. >>>>>>>>>> >>>>>>>>>> Custom correlation code will be able to extract the trace ID from >>>>>> the >>>>>>>>>> response and from events and find related data. Granted, it will >>>>>>>> require >>>>>>>>> a >>>>>>>>>> bit more effort for the custom code to decode trace IDs from the >>>>>> OTel >>>>>>>>>> context, but the format is standard and not complex. The benefit >>>>>> for >>>>>>>>>> Polaris, though, is that it can easily integrate with >>>>>> OTel-compatible >>>>>>>>>> observability platforms regardless of whether any particular >>>>>>> deployment >>>>>>>>>> uses custom correlators or not. >>>>>>>>>> >>>>>>>>>> WDYT? >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Dmitri. >>>>>>>>>> >>>>>>>>>> On Wed, Oct 22, 2025 at 6:03 AM Alexandre Dutra < >>> [email protected] >>>>>>> >>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Hi all, >>>>>>>>>>> >>>>>>>>>>> Today, Polaris has the notion of "request ID", but its purpose is >>>>>>> not >>>>>>>>>>> entirely clear. It seems to serve as an observability feature to >>>>>>>>>>> facilitate correlation. A pending PR aims to rename it to >>>>>>>> "correlation >>>>>>>>>>> ID" for better alignment with industry standards [1]. >>>>>>>>>>> >>>>>>>>>>> However, this PR has brought to light overlaps with core >>>>>> telemetry >>>>>>>>>>> features: when OpenTelemetry (OTel) is enabled in Polaris, each >>>>>>>>>>> request already has a trace ID and span ID, making a separate >>>>>>>>>>> correlation ID redundant. >>>>>>>>>>> >>>>>>>>>>> Moreover, using the OTel trace ID and span ID in Polaris events, >>>>>>>>>>> rather than the generated correlation ID, would significantly >>>>>>>> simplify >>>>>>>>>>> correlation of events with other traces. >>>>>>>>>>> >>>>>>>>>>> Therefore, I propose the following changes: >>>>>>>>>>> >>>>>>>>>>> 1. If OTel is enabled, use the trace ID and span ID as the >>>>>>>> correlation >>>>>>>>>>> ID for the request, instead of generating a random correlation >>>>>> ID. >>>>>>>>>>> 2. Otherwise, if a (Polaris-specific) correlation ID header is >>>>>>>> present >>>>>>>>>>> in the request, use it. >>>>>>>>>>> 3. If neither of the above conditions is met, generate a random >>>>>>>>>>> correlation ID. >>>>>>>>>>> >>>>>>>>>>> I am somewhat undecided on the best approach when a correlation >>>>>> ID >>>>>>>>>>> header is present in the request. However, I believe it would be >>>>>>> more >>>>>>>>>>> sensible to disregard it if OTel is enabled, as OTel offers a >>>>>> more >>>>>>>>>>> robust solution for client-to-server trace propagation, e.g. W3C >>>>>>>> Trace >>>>>>>>>>> Context propagation [2]. >>>>>>>>>>> >>>>>>>>>>> Please share your thoughts! >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Alex >>>>>>>>>>> >>>>>>>>>>> [1]: >>> https://www.google.com/url?q=https://www.google.com/url?q%3Dhttps://github.com/apache/polaris/pull/2757%26source%3Dgmail-imap%26ust%3D1761851831000000%26usg%3DAOvVaw1-kAWfEk4tmsEg0q0GZBCn&source=gmail-imap&ust=1761927167000000&usg=AOvVaw1Oe-25vtt4gLVSMFMtSVNg >>>>>>>>>>> [2]: >>> https://www.google.com/url?q=https://www.google.com/url?q%3Dhttps://www.w3.org/TR/trace-context%26source%3Dgmail-imap%26ust%3D1761851831000000%26usg%3DAOvVaw22nMyOS7pbJ69XrBo5kHQS&source=gmail-imap&ust=1761927167000000&usg=AOvVaw1TRdkzABc_7U-_KZ1MZ59v >>>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>> >>>
