Hi Yeah it looks good to me and I agree we have a consensus.
Thanks ! Regards JB Le dim. 26 oct. 2025 à 14:29, Alexandre Dutra <[email protected]> a écrit : > Hi all, > > Thanks for the good discussion so far. I think we have a broad consensus. > > Unless anybody disagrees I am going to move forward with the the > following revised plan, incorporating the latest feedback: > > 1. Restore the original functionality for request IDs, change the > default header name back to x-request-id > > 2. Remove RequestIdGenerator and related functionality. > > 3. Update PolarisEvent, expose request ID if available, expose OTel > context if available. > > 4. Update events table SQL schema: insert request ID if available, > insert OTel context if available. > > Thanks, > Alex > > On Sat, Oct 25, 2025 at 11:12 AM Jean-Baptiste Onofré <[email protected]> > wrote: > > > > Hi > > > > I think it's better to use w3c_trace_context (better than two fields). > > > > Regarding x-request-id, I think it's for a different purpose, > > specifically for persistence. Personally, I think we can achieve the > > same with span and events. > > If we want to "distinguish" the two layers, it makes sense to have > > x-request-id (not sure it's super helpful). Else, I think we can > > achieve the same with span/event. > > > > Just my $0.01 > > > > Regards > > JB > > > > On Sat, Oct 25, 2025 at 10:42 AM Adnan Hemani > > <[email protected]> wrote: > > > > > > Hi all, > > > > > > +1 to all of Alex’s AIs with Dmitri’s suggested changes as well. Great > find, Dmitri! > > > > > > I’m still debating with myself on whether we need to store the > `x-request-id` field as part of the Events persistence. Can we think of a > good use case where this is more helpful to the user than the OTel > Trace/Span IDs? I am making the assumption here that those are still being > returned back to the client. > > > > > > Best, > > > Adnan Hemani > > > > > > > On Oct 24, 2025, at 9:12 AM, Alexandre Dutra <[email protected]> > wrote: > > > > > > > > Hi Dmitri, > > > > > > > > Yes, I think your suggestion to use just one field w3c_trace_context > > > > makes more sense than two fields (span_id and trace_id). > > > > > > > > With that, I also think that we are slowly drifting into > > > > implementation considerations; let's get consensus on the general > > > > design first, and we can certainly fine-tune the actual Java methods > > > > and SQL table columns in the future PR. WDYT? > > > > > > > > Thanks, > > > > Alex > > > > > > > > On Fri, Oct 24, 2025 at 5:14 PM Dmitri Bourlatchkov < > [email protected]> wrote: > > > >> > > > >> Hi All, > > > >> > > > >> Many thanks for the background info, Adnan! > > > >> > > > >> +1 to action items proposed by Alex! > > > >> > > > >> Re: (3) Can we abstract request/trace info into a separate object > without > > > >> exposing those accessors on the Event class directly? OTel defines > > > >> trace/span concepts, but in request ID is a bit foreign to OTel. > Having > > > >> tracing / request ID isolated in java could help with maintaining > it and > > > >> potentially supporting other (custom) tracing methods. > > > >> > > > >> Re: (4) I'd like to propose storing OTel correlation data in the > form of a > > > >> standard context propagation string, e.g. W3C trace-context [1] > (same value > > > >> as its HTTP header), so the column could be called > w3c_trace_context or > > > >> simply trace_context. > > > >> > > > >> Open question: do we need to write a separate, individual trace ID > field in > > > >> SQL? I suppose it is not very useful since correlating it to other > trace > > > >> data already requires understanding OTel context propagation and a > query > > > >> against trace_context can still be made using string-matching > clauses. We > > > >> could probably (additionally) store it in the request_id column if > the > > > >> Polaris-specific request ID header is not set. > > > >> > > > >> As for span ID, I do not really see a use case for persisting it > > > >> individually. It is very specific to OTel trace data construction. > > > >> > > > >> Actually, using the W3C trace context [1] encoding probably makes > sense in > > > >> the java event representation too. Interested callers can easily > decode > > > >> this information since the format is well-defined. As a side > benefit, this > > > >> opens opportunities for downstream event consumers to connect > (propagate > > > >> context) to traces that produced events based on the event data > itself, > > > >> without relying on the intermediate frameworks. This may be > desirable since > > > >> the current Polaris event persistence impl. writes events in > batches, so > > > >> the association to individual requests that produced those events > is lost. > > > >> Whether to perform this kind of context propagation will be at the > > > >> discretion of the event consumer, of course (outside of Polaris > code). > > > >> > > > >> [1] > https://www.google.com/url?q=https://www.w3.org/TR/trace-context/&source=gmail-imap&ust=1761927167000000&usg=AOvVaw2fn0lRsTx-f9r4PCz8wmJK > > > >> > > > >> WDYT? > > > >> > > > >> Thanks, > > > >> Dmitri. > > > >> > > > >> On Fri, Oct 24, 2025 at 7:18 AM Alexandre Dutra <[email protected]> > wrote: > > > >> > > > >>> Hi all, > > > >>> > > > >>> Thank you for chiming in; the context around request IDs is now > clear. > > > >>> > > > >>> Trying to summarize this thread into actionable items, here's what > I > > > >>> propose: > > > >>> > > > >>> 1. Restore the original functionality for request IDs. > > > >>> - Change the default header name back to x-request-id (despite > the > > > >>> x- prefix being deprecated), but keep it configurable as today. > > > >>> 2. Remove RequestIdGenerator and related functionality. > > > >>> - Do not generate a request ID if the REST client doesn't > provide one. > > > >>> 3. Update PolarisEvent: > > > >>> - Expose new requestId(), traceId(), spanId() methods, all > nullable. > > > >>> - This would align with the emerging consensus around including > > > >>> contextual information in PolarisEvent [1]. > > > >>> 4. Update events table SQL schema: > > > >>> - Insert the client-provided request ID into the request_id > > > >>> column; otherwise, insert null. > > > >>> - Add two new nullable columns, trace_id and span_id, and > populate > > > >>> them if OTel is enabled. > > > >>> > > > >>> From our discussions, I think it's important to not conflate OTel > > > >>> tracing fields with Envoy tracing fields, which is why I suggest we > > > >>> use separate fields / columns for them. > > > >>> > > > >>> Would the above plan work for everyone? > > > >>> > > > >>> Thanks, > > > >>> Alex > > > >>> > > > >>> [1]: > https://www.google.com/url?q=https://lists.apache.org/thread/rl5cpcft16sn5n00mfkmx9ldn3gsqtfy&source=gmail-imap&ust=1761927167000000&usg=AOvVaw02wPvb0qxRzYAKEP0h8l9T > > > >>> > > > >>> > > > >>> On Fri, Oct 24, 2025 at 9:33 AM Adnan Hemani > > > >>> <[email protected]> wrote: > > > >>>> > > > >>>> Hi all, > > > >>>> > > > >>>> Thanks to Alex for starting this thread - because of this, I’m > just > > > >>> coming to the realization that OTel Trace and Span IDs are coming > built-in > > > >>> with Quarkus and my previous work to generate a > Request/Correlation ID is > > > >>> likely not needed as a result. My original motivation for > generation of a > > > >>> Request/Correlation ID was to ensure that any client can uniquely > identify > > > >>> a request made to Polaris, which would be especially useful for > debugging > > > >>> failing requests or identifying call patterns. > > > >>>> > > > >>>> As a result, I’m a +1 on Michael’s opinion: we should remove the > > > >>> Request/Correlation ID generation and always use the OTel > trace/span IDs > > > >>> (which come for free with Quarkus) instead for the Correlation ID > unless a > > > >>> valid header is present, which would take over as the Correlation > ID > > > >>> instead. > > > >>>> > > > >>>> — > > > >>>> > > > >>>> To answer Dmitri’s question re Polaris Events: The intended use > case is > > > >>> to provide some sort of correlation between events that have > occurred as > > > >>> part of the same request. For example, if a user makes an > CommitTransaction > > > >>> request, it would be helpful to see all UpdateTable calls that > were made as > > > >>> part of that one user request. > > > >>>> > > > >>>> Best, > > > >>>> Adnan Hemani > > > >>>> > > > >>>>> On Oct 23, 2025, at 12:15 PM, Dmitri Bourlatchkov < > [email protected]> > > > >>> wrote: > > > >>>>> > > > >>>>> Hi Michael, > > > >>>>> > > > >>>>> Logging x-request-id headers makes sense. > > > >>>>> > > > >>>>> Just to confirm: if / when we restore that, Polaris will NOT > generate > > > >>> new > > > >>>>> IDs in case the header is not present in the request, correct? > > > >>>>> > > > >>>>> I believe x-request-id can co-exist with OTel. > > > >>>>> > > > >>>>> What about adding request IDs to events [1][2]? What's the > intended use > > > >>>>> case for that? Could you share some context here too? > > > >>>>> > > > >>>>> Side note: I proposed [2877] flagging event persistence as > "beta" in > > > >>>>> 1.2.0... This discussion adds another point towards that, I > think. > > > >>>>> > > > >>>>> [1] > > > >>>>> > > > >>> > https://www.google.com/url?q=https://www.google.com/url?q%3Dhttps://github.com/apache/polaris/blob/2f0c7a43d446452004ea51196b618de9bdf0e25b/runtime/service/src/main/java/org/apache/polaris/service/events/listeners/inmemory/InMemoryBufferEventListener.java%2523L97%26source%3Dgmail-imap%26ust%3D1761851831000000%26usg%3DAOvVaw1WfUaXLp6z_M87iAXEqSUw&source=gmail-imap&ust=1761927167000000&usg=AOvVaw1Pioys4ROm8vYM_mdx4ygH > > > >>>>> [2] > > > >>>>> > > > >>> > https://www.google.com/url?q=https://www.google.com/url?q%3Dhttps://github.com/apache/polaris/blob/19742cc20f4bc0b7e5a315a62f89c6085ad81b7d/runtime/service/src/main/java/org/apache/polaris/service/events/listeners/PolarisPersistenceEventListener.java%2523L66%26source%3Dgmail-imap%26ust%3D1761851831000000%26usg%3DAOvVaw12-7e3ahm2sLSkLSNqTecm&source=gmail-imap&ust=1761927167000000&usg=AOvVaw1PGW6uFwtUN8F1dOlJwrpm > > > >>>>> > > > >>>>> [2877] > > > >>> > https://www.google.com/url?q=https://www.google.com/url?q%3Dhttps://github.com/apache/polaris/pull/2877%2523discussion_r2456300613%26source%3Dgmail-imap%26ust%3D1761851831000000%26usg%3DAOvVaw3TuYbkzwnLx3QEVIM8oDda&source=gmail-imap&ust=1761927167000000&usg=AOvVaw2Tfk7wAM1MaB5z9Dvw5X5H > > > >>>>> > > > >>>>> Thanks, > > > >>>>> Dmitri. > > > >>>>> > > > >>>>> On Thu, Oct 23, 2025 at 2:23 PM Michael Collado < > > > >>> [email protected]> > > > >>>>> wrote: > > > >>>>> > > > >>>>>> Hey Dmitri > > > >>>>>> > > > >>>>>> The generating a request id is new code that was added after the > > > >>> original > > > >>>>>> x-request-id support. You can see the state from ~1 year ago, we > > > >>> hard-coded > > > >>>>>> request_id as the header we used for the MDC - > > > >>>>>> > > > >>>>>> > > > >>> > https://www.google.com/url?q=https://www.google.com/url?q%3Dhttps://github.com/apache/polaris/blob/a6197bd7d8cb5551253fa427e4373897205ecece/polaris-service/src/main/java/org/apache/polaris/service/PolarisApplication.java%2523L415-L416%26source%3Dgmail-imap%26ust%3D1761851831000000%26usg%3DAOvVaw35Q1A_2avAiSYlYVAnZxBb&source=gmail-imap&ust=1761927167000000&usg=AOvVaw34DksOw7DSHJu8PfQQ5bT6 > > > >>>>>> . At some point, it was changed to be configurable, then the > > > >>>>>> ContextResolverFilter filter was refactored/eliminated and the > > > >>>>>> RequestIdFilter took its responsibility, but lost some of its > original > > > >>>>>> functionality. > > > >>>>>> > > > >>>>>> My personal opinion is that restoring support for the > x-request-id > > > >>> header > > > >>>>>> is something that we should do, but if the header isn't present, > > > >>> falling > > > >>>>>> back on simply using OTel trace ids is good enough (better, > even) than > > > >>>>>> generating another random request id. > > > >>>>>> > > > >>>>>> Mike > > > >>>>>> > > > >>>>>> On Thu, Oct 23, 2025 at 10:47 AM Dmitri Bourlatchkov < > > > >>> [email protected]> > > > >>>>>> wrote: > > > >>>>>> > > > >>>>>>> Hi Michael, > > > >>>>>>> > > > >>>>>>> Thanks for the info! > > > >>>>>>> > > > >>>>>>> Working with Envoy's tracing headers makes sense to me. > However, I > > > >>>>>> wonder: > > > >>>>>>> why would Polaris need to generate a new request ID inside its > > > >>> code?.. > > > >>>>>>> and return it in response headers? > > > >>>>>>> > > > >>>>>>> How important is it to propagate this ID to Polaris Events? > > > >>>>>>> > > > >>>>>>> I'm just trying to understand the full context :) > > > >>>>>>> > > > >>>>>>> Thanks, > > > >>>>>>> Dmitri. > > > >>>>>>> > > > >>>>>>> On Thu, Oct 23, 2025 at 1:29 PM Michael Collado < > > > >>> [email protected]> > > > >>>>>>> wrote: > > > >>>>>>> > > > >>>>>>>> I think the original intention for this requestId field was to > > > >>> support > > > >>>>>>>> request propagation from load balancers, like Envoy ( > > > >>>>>>>> > > > >>>>>>>> > > > >>>>>>> > > > >>>>>> > > > >>> > https://www.google.com/url?q=https://www.google.com/url?q%3Dhttps://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/observability/tracing%26source%3Dgmail-imap%26ust%3D1761851831000000%26usg%3DAOvVaw1RnuM8mViV-j7jvuxq74Aw&source=gmail-imap&ust=1761927167000000&usg=AOvVaw3LNqtilZ2OoJfE7yEwraZa > > > >>>>>>>> ), which is distinct from OTEL. Don't ask me why the default > > > >>>>>>>> is Polaris-Request-Id - I think it was originally a custom > thing, > > > >>> but > > > >>>>>>> then > > > >>>>>>>> we changed to integrate with existing conventions. > Unfortunately, > > > >>>>>> looking > > > >>>>>>>> through the code, I think that the actual functional plumbing > has > > > >>> been > > > >>>>>>> lost > > > >>>>>>>> in the course of multiple refactors around the call context > and > > > >>>>>>> resolver. I > > > >>>>>>>> don't see references to that property or the underlying > header. > > > >>>>>>>> > > > >>>>>>>> Support for the unofficial x-request-id header feels like > something > > > >>> we > > > >>>>>>>> should definitely keep, especially when Polaris is one > service in a > > > >>>>>> mesh > > > >>>>>>> of > > > >>>>>>>> services that maybe don't have OTel integration. I'm a fan of > the > > > >>> OTel > > > >>>>>>>> standard, but it's not entirely ubiquitous and there are many > > > >>>>>> middleware > > > >>>>>>>> layers that know how to forward on the x-request-id header. > > > >>>>>>>> > > > >>>>>>>> Mike > > > >>>>>>>> > > > >>>>>>>> On Thu, Oct 23, 2025 at 3:00 AM Robert Stupp <[email protected]> > > > >>> wrote: > > > >>>>>>>> > > > >>>>>>>>> Yes, we should aim for interoperability with the existing > de-facto > > > >>>>>>>>> standard OTel and make it easy for users to integrate into > their > > > >>>>>>>>> observability platforms. > > > >>>>>>>>> > > > >>>>>>>>> On Wed, Oct 22, 2025 at 7:05 PM Dmitri Bourlatchkov < > > > >>>>>> [email protected]> > > > >>>>>>>>> wrote: > > > >>>>>>>>>> > > > >>>>>>>>>> Hi Alex and All, > > > >>>>>>>>>> > > > >>>>>>>>>> I certainly support the idea of following OTel standards for > > > >>>>>>> achieving > > > >>>>>>>>>> "correlation" wrt Polaris requests and/or events. > > > >>>>>>>>>> > > > >>>>>>>>>> As to what form the correlation data should take, I believe > it is > > > >>>>>>>>>> conceptually what the OTel "context" represents. So, I > believe it > > > >>>>>>> makes > > > >>>>>>>>>> sense for Polaris to support standard context propagators > at the > > > >>>>>> API > > > >>>>>>>>> layer. > > > >>>>>>>>>> > > > >>>>>>>>>> If the incoming request has OTel context information, then > > > >>>>>> returning > > > >>>>>>>> any > > > >>>>>>>>>> other "correlation" data in the response is redundant, I > think. > > > >>>>>>>>>> > > > >>>>>>>>>> If the incoming request does not have OTel context info, > what is > > > >>>>>> the > > > >>>>>>>>>> purpose of generating a Polaris-specific "correlation ID"? > How is > > > >>>>>> it > > > >>>>>>>>>> envisioned to be used? > > > >>>>>>>>>> > > > >>>>>>>>>> If the intention is to correlate a Polaris response > (operation) > > > >>>>>> with > > > >>>>>>>>> events > > > >>>>>>>>>> that might have resulted from its execution, I believe a > more > > > >>>>>> robust > > > >>>>>>>>>> approach would be to propagate the OTel trace info > (starting a new > > > >>>>>>>> trace > > > >>>>>>>>> if > > > >>>>>>>>>> necessary) into event data. Then, Polaris can also return > the > > > >>> trace > > > >>>>>>>>> context > > > >>>>>>>>>> in the API response (top span). It's a bit awkward from the > OTel > > > >>>>>>>>>> perspective, but might be an option for supporting custom > > > >>>>>>> correlators. > > > >>>>>>>> It > > > >>>>>>>>>> could be covered by a feature flag. The header name could be > > > >>>>>>>>>> "polaris-traceparent" for W3C Trace Context. > > > >>>>>>>>>> > > > >>>>>>>>>> Custom correlation code will be able to extract the trace > ID from > > > >>>>>> the > > > >>>>>>>>>> response and from events and find related data. Granted, it > will > > > >>>>>>>> require > > > >>>>>>>>> a > > > >>>>>>>>>> bit more effort for the custom code to decode trace IDs > from the > > > >>>>>> OTel > > > >>>>>>>>>> context, but the format is standard and not complex. The > benefit > > > >>>>>> for > > > >>>>>>>>>> Polaris, though, is that it can easily integrate with > > > >>>>>> OTel-compatible > > > >>>>>>>>>> observability platforms regardless of whether any particular > > > >>>>>>> deployment > > > >>>>>>>>>> uses custom correlators or not. > > > >>>>>>>>>> > > > >>>>>>>>>> WDYT? > > > >>>>>>>>>> > > > >>>>>>>>>> Thanks, > > > >>>>>>>>>> Dmitri. > > > >>>>>>>>>> > > > >>>>>>>>>> On Wed, Oct 22, 2025 at 6:03 AM Alexandre Dutra < > > > >>> [email protected] > > > >>>>>>> > > > >>>>>>>>> wrote: > > > >>>>>>>>>> > > > >>>>>>>>>>> Hi all, > > > >>>>>>>>>>> > > > >>>>>>>>>>> Today, Polaris has the notion of "request ID", but its > purpose is > > > >>>>>>> not > > > >>>>>>>>>>> entirely clear. It seems to serve as an observability > feature to > > > >>>>>>>>>>> facilitate correlation. A pending PR aims to rename it to > > > >>>>>>>> "correlation > > > >>>>>>>>>>> ID" for better alignment with industry standards [1]. > > > >>>>>>>>>>> > > > >>>>>>>>>>> However, this PR has brought to light overlaps with core > > > >>>>>> telemetry > > > >>>>>>>>>>> features: when OpenTelemetry (OTel) is enabled in Polaris, > each > > > >>>>>>>>>>> request already has a trace ID and span ID, making a > separate > > > >>>>>>>>>>> correlation ID redundant. > > > >>>>>>>>>>> > > > >>>>>>>>>>> Moreover, using the OTel trace ID and span ID in Polaris > events, > > > >>>>>>>>>>> rather than the generated correlation ID, would > significantly > > > >>>>>>>> simplify > > > >>>>>>>>>>> correlation of events with other traces. > > > >>>>>>>>>>> > > > >>>>>>>>>>> Therefore, I propose the following changes: > > > >>>>>>>>>>> > > > >>>>>>>>>>> 1. If OTel is enabled, use the trace ID and span ID as the > > > >>>>>>>> correlation > > > >>>>>>>>>>> ID for the request, instead of generating a random > correlation > > > >>>>>> ID. > > > >>>>>>>>>>> 2. Otherwise, if a (Polaris-specific) correlation ID > header is > > > >>>>>>>> present > > > >>>>>>>>>>> in the request, use it. > > > >>>>>>>>>>> 3. If neither of the above conditions is met, generate a > random > > > >>>>>>>>>>> correlation ID. > > > >>>>>>>>>>> > > > >>>>>>>>>>> I am somewhat undecided on the best approach when a > correlation > > > >>>>>> ID > > > >>>>>>>>>>> header is present in the request. However, I believe it > would be > > > >>>>>>> more > > > >>>>>>>>>>> sensible to disregard it if OTel is enabled, as OTel > offers a > > > >>>>>> more > > > >>>>>>>>>>> robust solution for client-to-server trace propagation, > e.g. W3C > > > >>>>>>>> Trace > > > >>>>>>>>>>> Context propagation [2]. > > > >>>>>>>>>>> > > > >>>>>>>>>>> Please share your thoughts! > > > >>>>>>>>>>> > > > >>>>>>>>>>> Thanks, > > > >>>>>>>>>>> Alex > > > >>>>>>>>>>> > > > >>>>>>>>>>> [1]: > > > >>> > https://www.google.com/url?q=https://www.google.com/url?q%3Dhttps://github.com/apache/polaris/pull/2757%26source%3Dgmail-imap%26ust%3D1761851831000000%26usg%3DAOvVaw1-kAWfEk4tmsEg0q0GZBCn&source=gmail-imap&ust=1761927167000000&usg=AOvVaw1Oe-25vtt4gLVSMFMtSVNg > > > >>>>>>>>>>> [2]: > > > >>> > https://www.google.com/url?q=https://www.google.com/url?q%3Dhttps://www.w3.org/TR/trace-context%26source%3Dgmail-imap%26ust%3D1761851831000000%26usg%3DAOvVaw22nMyOS7pbJ69XrBo5kHQS&source=gmail-imap&ust=1761927167000000&usg=AOvVaw1TRdkzABc_7U-_KZ1MZ59v > > > >>>>>>>>>>> > > > >>>>>>>>> > > > >>>>>>>> > > > >>>>>>> > > > >>>>>> > > > >>>> > > > >>> > > > >
