Hi all,

Thanks for the good discussion so far. I think we have a broad consensus.

Unless anybody disagrees I am going to move forward with the the
following revised plan, incorporating the latest feedback:

1. Restore the original functionality for request IDs, change the
default header name back to x-request-id

2. Remove RequestIdGenerator and related functionality.

3. Update PolarisEvent, expose request ID if available, expose OTel
context if available.

4. Update events table SQL schema: insert request ID if available,
insert OTel context if available.

Thanks,
Alex

On Sat, Oct 25, 2025 at 11:12 AM Jean-Baptiste Onofré <[email protected]> wrote:
>
> Hi
>
> I think it's better to use w3c_trace_context (better than two fields).
>
> Regarding x-request-id, I think it's for a different purpose,
> specifically for persistence. Personally, I think we can achieve the
> same with span and events.
> If we want to "distinguish" the two layers, it makes sense to have
> x-request-id (not sure it's super helpful). Else, I think we can
> achieve the same with span/event.
>
> Just my $0.01
>
> Regards
> JB
>
> On Sat, Oct 25, 2025 at 10:42 AM Adnan Hemani
> <[email protected]> wrote:
> >
> > Hi all,
> >
> > +1 to all of Alex’s AIs with Dmitri’s suggested changes as well. Great 
> > find, Dmitri!
> >
> > I’m still debating with myself on whether we need to store the 
> > `x-request-id` field as part of the Events persistence. Can we think of a 
> > good use case where this is more helpful to the user than the OTel 
> > Trace/Span IDs? I am making the assumption here that those are still being 
> > returned back to the client.
> >
> > Best,
> > Adnan Hemani
> >
> > > On Oct 24, 2025, at 9:12 AM, Alexandre Dutra <[email protected]> wrote:
> > >
> > > Hi Dmitri,
> > >
> > > Yes, I think your suggestion to use just one field w3c_trace_context
> > > makes more sense than two fields (span_id and trace_id).
> > >
> > > With that, I also think that we are slowly drifting into
> > > implementation considerations; let's get consensus on the general
> > > design first, and we can certainly fine-tune the actual Java methods
> > > and SQL table columns in the future PR. WDYT?
> > >
> > > Thanks,
> > > Alex
> > >
> > > On Fri, Oct 24, 2025 at 5:14 PM Dmitri Bourlatchkov <[email protected]> 
> > > wrote:
> > >>
> > >> Hi All,
> > >>
> > >> Many thanks for the background info, Adnan!
> > >>
> > >> +1 to action items proposed by Alex!
> > >>
> > >> Re: (3) Can we abstract request/trace info into a separate object without
> > >> exposing those accessors on the Event class directly? OTel defines
> > >> trace/span concepts, but in request ID is a bit foreign to OTel. Having
> > >> tracing / request ID isolated in java could help with maintaining it and
> > >> potentially supporting other (custom) tracing methods.
> > >>
> > >> Re: (4) I'd like to propose storing OTel correlation data in the form of 
> > >> a
> > >> standard context propagation string, e.g. W3C trace-context [1] (same 
> > >> value
> > >> as its HTTP header), so the column could be called w3c_trace_context or
> > >> simply trace_context.
> > >>
> > >> Open question: do we need to write a separate, individual trace ID field 
> > >> in
> > >> SQL? I suppose it is not very useful since correlating it to other trace
> > >> data already requires understanding OTel context propagation and a query
> > >> against trace_context can still be made using string-matching clauses. We
> > >> could probably (additionally) store it in the request_id column if the
> > >> Polaris-specific request ID header is not set.
> > >>
> > >> As for span ID, I do not really see a use case for persisting it
> > >> individually. It is very specific to OTel trace data construction.
> > >>
> > >> Actually, using the W3C trace context [1] encoding probably makes sense 
> > >> in
> > >> the java event representation too. Interested callers can easily decode
> > >> this information since the format is well-defined. As a side benefit, 
> > >> this
> > >> opens opportunities for downstream event consumers to connect (propagate
> > >> context) to traces that produced events based on the event data itself,
> > >> without relying on the intermediate frameworks. This may be desirable 
> > >> since
> > >> the current Polaris event persistence impl. writes events in batches, so
> > >> the association to individual requests that produced those events is 
> > >> lost.
> > >> Whether to perform this kind of context propagation will be at the
> > >> discretion of the event consumer, of course (outside of Polaris code).
> > >>
> > >> [1] 
> > >> https://www.google.com/url?q=https://www.w3.org/TR/trace-context/&source=gmail-imap&ust=1761927167000000&usg=AOvVaw2fn0lRsTx-f9r4PCz8wmJK
> > >>
> > >> WDYT?
> > >>
> > >> Thanks,
> > >> Dmitri.
> > >>
> > >> On Fri, Oct 24, 2025 at 7:18 AM Alexandre Dutra <[email protected]> 
> > >> wrote:
> > >>
> > >>> Hi all,
> > >>>
> > >>> Thank you for chiming in; the context around request IDs is now clear.
> > >>>
> > >>> Trying to summarize this thread into actionable items, here's what I
> > >>> propose:
> > >>>
> > >>> 1. Restore the original functionality for request IDs.
> > >>>    - Change the default header name back to x-request-id (despite the
> > >>> x- prefix being deprecated), but keep it configurable as today.
> > >>> 2. Remove RequestIdGenerator and related functionality.
> > >>>    - Do not generate a request ID if the REST client doesn't provide 
> > >>> one.
> > >>> 3. Update PolarisEvent:
> > >>>    - Expose new requestId(), traceId(), spanId() methods, all nullable.
> > >>>    - This would align with the emerging consensus around including
> > >>> contextual information in PolarisEvent [1].
> > >>> 4. Update events table SQL schema:
> > >>>    - Insert the client-provided request ID into the request_id
> > >>> column; otherwise, insert null.
> > >>>    - Add two new nullable columns, trace_id and span_id, and populate
> > >>> them if OTel is enabled.
> > >>>
> > >>> From our discussions, I think it's important to not conflate OTel
> > >>> tracing fields with Envoy tracing fields, which is why I suggest we
> > >>> use separate fields / columns for them.
> > >>>
> > >>> Would the above plan work for everyone?
> > >>>
> > >>> Thanks,
> > >>> Alex
> > >>>
> > >>> [1]: 
> > >>> https://www.google.com/url?q=https://lists.apache.org/thread/rl5cpcft16sn5n00mfkmx9ldn3gsqtfy&source=gmail-imap&ust=1761927167000000&usg=AOvVaw02wPvb0qxRzYAKEP0h8l9T
> > >>>
> > >>>
> > >>> On Fri, Oct 24, 2025 at 9:33 AM Adnan Hemani
> > >>> <[email protected]> wrote:
> > >>>>
> > >>>> Hi all,
> > >>>>
> > >>>> Thanks to Alex for starting this thread - because of this, I’m just
> > >>> coming to the realization that OTel Trace and Span IDs are coming 
> > >>> built-in
> > >>> with Quarkus and my previous work to generate a Request/Correlation ID 
> > >>> is
> > >>> likely not needed as a result. My original motivation for generation of 
> > >>> a
> > >>> Request/Correlation ID was to ensure that any client can uniquely 
> > >>> identify
> > >>> a request made to Polaris, which would be especially useful for 
> > >>> debugging
> > >>> failing requests or identifying call patterns.
> > >>>>
> > >>>> As a result, I’m a +1 on Michael’s opinion: we should remove the
> > >>> Request/Correlation ID generation and always use the OTel trace/span IDs
> > >>> (which come for free with Quarkus) instead for the Correlation ID 
> > >>> unless a
> > >>> valid header is present, which would take over as the Correlation ID
> > >>> instead.
> > >>>>
> > >>>> —
> > >>>>
> > >>>> To answer Dmitri’s question re Polaris Events: The intended use case is
> > >>> to provide some sort of correlation between events that have occurred as
> > >>> part of the same request. For example, if a user makes an 
> > >>> CommitTransaction
> > >>> request, it would be helpful to see all UpdateTable calls that were 
> > >>> made as
> > >>> part of that one user request.
> > >>>>
> > >>>> Best,
> > >>>> Adnan Hemani
> > >>>>
> > >>>>> On Oct 23, 2025, at 12:15 PM, Dmitri Bourlatchkov <[email protected]>
> > >>> wrote:
> > >>>>>
> > >>>>> Hi Michael,
> > >>>>>
> > >>>>> Logging x-request-id headers makes sense.
> > >>>>>
> > >>>>> Just to confirm: if / when we restore that, Polaris will NOT generate
> > >>> new
> > >>>>> IDs in case the header is not present in the request, correct?
> > >>>>>
> > >>>>> I believe x-request-id can co-exist with OTel.
> > >>>>>
> > >>>>> What about adding request IDs to events [1][2]? What's the intended 
> > >>>>> use
> > >>>>> case for that? Could you share some context here too?
> > >>>>>
> > >>>>> Side note: I proposed [2877] flagging event persistence as "beta" in
> > >>>>> 1.2.0... This discussion adds another point towards that, I think.
> > >>>>>
> > >>>>> [1]
> > >>>>>
> > >>> https://www.google.com/url?q=https://www.google.com/url?q%3Dhttps://github.com/apache/polaris/blob/2f0c7a43d446452004ea51196b618de9bdf0e25b/runtime/service/src/main/java/org/apache/polaris/service/events/listeners/inmemory/InMemoryBufferEventListener.java%2523L97%26source%3Dgmail-imap%26ust%3D1761851831000000%26usg%3DAOvVaw1WfUaXLp6z_M87iAXEqSUw&source=gmail-imap&ust=1761927167000000&usg=AOvVaw1Pioys4ROm8vYM_mdx4ygH
> > >>>>> [2]
> > >>>>>
> > >>> https://www.google.com/url?q=https://www.google.com/url?q%3Dhttps://github.com/apache/polaris/blob/19742cc20f4bc0b7e5a315a62f89c6085ad81b7d/runtime/service/src/main/java/org/apache/polaris/service/events/listeners/PolarisPersistenceEventListener.java%2523L66%26source%3Dgmail-imap%26ust%3D1761851831000000%26usg%3DAOvVaw12-7e3ahm2sLSkLSNqTecm&source=gmail-imap&ust=1761927167000000&usg=AOvVaw1PGW6uFwtUN8F1dOlJwrpm
> > >>>>>
> > >>>>> [2877]
> > >>> https://www.google.com/url?q=https://www.google.com/url?q%3Dhttps://github.com/apache/polaris/pull/2877%2523discussion_r2456300613%26source%3Dgmail-imap%26ust%3D1761851831000000%26usg%3DAOvVaw3TuYbkzwnLx3QEVIM8oDda&source=gmail-imap&ust=1761927167000000&usg=AOvVaw2Tfk7wAM1MaB5z9Dvw5X5H
> > >>>>>
> > >>>>> Thanks,
> > >>>>> Dmitri.
> > >>>>>
> > >>>>> On Thu, Oct 23, 2025 at 2:23 PM Michael Collado <
> > >>> [email protected]>
> > >>>>> wrote:
> > >>>>>
> > >>>>>> Hey Dmitri
> > >>>>>>
> > >>>>>> The generating a request id is new code that was added after the
> > >>> original
> > >>>>>> x-request-id support. You can see the state from ~1 year ago, we
> > >>> hard-coded
> > >>>>>> request_id as the header we used for the MDC -
> > >>>>>>
> > >>>>>>
> > >>> https://www.google.com/url?q=https://www.google.com/url?q%3Dhttps://github.com/apache/polaris/blob/a6197bd7d8cb5551253fa427e4373897205ecece/polaris-service/src/main/java/org/apache/polaris/service/PolarisApplication.java%2523L415-L416%26source%3Dgmail-imap%26ust%3D1761851831000000%26usg%3DAOvVaw35Q1A_2avAiSYlYVAnZxBb&source=gmail-imap&ust=1761927167000000&usg=AOvVaw34DksOw7DSHJu8PfQQ5bT6
> > >>>>>> . At some point, it was changed to be configurable, then the
> > >>>>>> ContextResolverFilter filter was refactored/eliminated and the
> > >>>>>> RequestIdFilter took its responsibility, but lost some of its 
> > >>>>>> original
> > >>>>>> functionality.
> > >>>>>>
> > >>>>>> My personal opinion is that restoring support for the x-request-id
> > >>> header
> > >>>>>> is something that we should do, but if the header isn't present,
> > >>> falling
> > >>>>>> back on simply using OTel trace ids is good enough (better, even) 
> > >>>>>> than
> > >>>>>> generating another random request id.
> > >>>>>>
> > >>>>>> Mike
> > >>>>>>
> > >>>>>> On Thu, Oct 23, 2025 at 10:47 AM Dmitri Bourlatchkov <
> > >>> [email protected]>
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>>> Hi Michael,
> > >>>>>>>
> > >>>>>>> Thanks for the info!
> > >>>>>>>
> > >>>>>>> Working with Envoy's tracing headers makes sense to me. However, I
> > >>>>>> wonder:
> > >>>>>>> why would Polaris need to generate a new request ID inside its
> > >>> code?..
> > >>>>>>> and return it in response headers?
> > >>>>>>>
> > >>>>>>> How important is it to propagate this ID to Polaris Events?
> > >>>>>>>
> > >>>>>>> I'm just trying to understand the full context :)
> > >>>>>>>
> > >>>>>>> Thanks,
> > >>>>>>> Dmitri.
> > >>>>>>>
> > >>>>>>> On Thu, Oct 23, 2025 at 1:29 PM Michael Collado <
> > >>> [email protected]>
> > >>>>>>> wrote:
> > >>>>>>>
> > >>>>>>>> I think the original intention for this requestId field was to
> > >>> support
> > >>>>>>>> request propagation from load balancers, like Envoy (
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>> https://www.google.com/url?q=https://www.google.com/url?q%3Dhttps://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/observability/tracing%26source%3Dgmail-imap%26ust%3D1761851831000000%26usg%3DAOvVaw1RnuM8mViV-j7jvuxq74Aw&source=gmail-imap&ust=1761927167000000&usg=AOvVaw3LNqtilZ2OoJfE7yEwraZa
> > >>>>>>>> ), which is distinct from OTEL. Don't ask me why the default
> > >>>>>>>> is Polaris-Request-Id - I think it was originally a custom thing,
> > >>> but
> > >>>>>>> then
> > >>>>>>>> we changed to integrate with existing conventions. Unfortunately,
> > >>>>>> looking
> > >>>>>>>> through the code, I think that the actual functional plumbing has
> > >>> been
> > >>>>>>> lost
> > >>>>>>>> in the course of multiple refactors around the call context and
> > >>>>>>> resolver. I
> > >>>>>>>> don't see references to that property or the underlying header.
> > >>>>>>>>
> > >>>>>>>> Support for the unofficial x-request-id header feels like something
> > >>> we
> > >>>>>>>> should definitely keep, especially when Polaris is one service in a
> > >>>>>> mesh
> > >>>>>>> of
> > >>>>>>>> services that maybe don't have OTel integration. I'm a fan of the
> > >>> OTel
> > >>>>>>>> standard, but it's not entirely ubiquitous and there are many
> > >>>>>> middleware
> > >>>>>>>> layers that know how to forward on the x-request-id header.
> > >>>>>>>>
> > >>>>>>>> Mike
> > >>>>>>>>
> > >>>>>>>> On Thu, Oct 23, 2025 at 3:00 AM Robert Stupp <[email protected]>
> > >>> wrote:
> > >>>>>>>>
> > >>>>>>>>> Yes, we should aim for interoperability with the existing de-facto
> > >>>>>>>>> standard OTel and make it easy for users to integrate into their
> > >>>>>>>>> observability platforms.
> > >>>>>>>>>
> > >>>>>>>>> On Wed, Oct 22, 2025 at 7:05 PM Dmitri Bourlatchkov <
> > >>>>>> [email protected]>
> > >>>>>>>>> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>> Hi Alex and All,
> > >>>>>>>>>>
> > >>>>>>>>>> I certainly support the idea of following OTel standards for
> > >>>>>>> achieving
> > >>>>>>>>>> "correlation" wrt Polaris requests and/or events.
> > >>>>>>>>>>
> > >>>>>>>>>> As to what form the correlation data should take, I believe it is
> > >>>>>>>>>> conceptually what the OTel "context" represents. So, I believe it
> > >>>>>>> makes
> > >>>>>>>>>> sense for Polaris to support standard context propagators at the
> > >>>>>> API
> > >>>>>>>>> layer.
> > >>>>>>>>>>
> > >>>>>>>>>> If the incoming request has OTel context information, then
> > >>>>>> returning
> > >>>>>>>> any
> > >>>>>>>>>> other "correlation" data in the response is redundant, I think.
> > >>>>>>>>>>
> > >>>>>>>>>> If the incoming request does not have OTel context info, what is
> > >>>>>> the
> > >>>>>>>>>> purpose of generating a Polaris-specific "correlation ID"? How is
> > >>>>>> it
> > >>>>>>>>>> envisioned to be used?
> > >>>>>>>>>>
> > >>>>>>>>>> If the intention is to correlate a Polaris response (operation)
> > >>>>>> with
> > >>>>>>>>> events
> > >>>>>>>>>> that might have resulted from its execution, I believe a more
> > >>>>>> robust
> > >>>>>>>>>> approach would be to propagate the OTel trace info (starting a 
> > >>>>>>>>>> new
> > >>>>>>>> trace
> > >>>>>>>>> if
> > >>>>>>>>>> necessary) into event data. Then, Polaris can also return the
> > >>> trace
> > >>>>>>>>> context
> > >>>>>>>>>> in the API response (top span). It's a bit awkward from the OTel
> > >>>>>>>>>> perspective, but might be an option for supporting custom
> > >>>>>>> correlators.
> > >>>>>>>> It
> > >>>>>>>>>> could be covered by a feature flag. The header name could be
> > >>>>>>>>>> "polaris-traceparent" for W3C Trace Context.
> > >>>>>>>>>>
> > >>>>>>>>>> Custom correlation code will be able to extract the trace ID from
> > >>>>>> the
> > >>>>>>>>>> response and from events and find related data. Granted, it will
> > >>>>>>>> require
> > >>>>>>>>> a
> > >>>>>>>>>> bit more effort for the custom code to decode trace IDs from the
> > >>>>>> OTel
> > >>>>>>>>>> context, but the format is standard and not complex. The benefit
> > >>>>>> for
> > >>>>>>>>>> Polaris, though, is that it can easily integrate with
> > >>>>>> OTel-compatible
> > >>>>>>>>>> observability platforms regardless of whether any particular
> > >>>>>>> deployment
> > >>>>>>>>>> uses custom correlators or not.
> > >>>>>>>>>>
> > >>>>>>>>>> WDYT?
> > >>>>>>>>>>
> > >>>>>>>>>> Thanks,
> > >>>>>>>>>> Dmitri.
> > >>>>>>>>>>
> > >>>>>>>>>> On Wed, Oct 22, 2025 at 6:03 AM Alexandre Dutra <
> > >>> [email protected]
> > >>>>>>>
> > >>>>>>>>> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> Hi all,
> > >>>>>>>>>>>
> > >>>>>>>>>>> Today, Polaris has the notion of "request ID", but its purpose 
> > >>>>>>>>>>> is
> > >>>>>>> not
> > >>>>>>>>>>> entirely clear. It seems to serve as an observability feature to
> > >>>>>>>>>>> facilitate correlation. A pending PR aims to rename it to
> > >>>>>>>> "correlation
> > >>>>>>>>>>> ID" for better alignment with industry standards [1].
> > >>>>>>>>>>>
> > >>>>>>>>>>> However, this PR has brought to light overlaps with core
> > >>>>>> telemetry
> > >>>>>>>>>>> features: when OpenTelemetry (OTel) is enabled in Polaris, each
> > >>>>>>>>>>> request already has a trace ID and span ID, making a separate
> > >>>>>>>>>>> correlation ID redundant.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Moreover, using the OTel trace ID and span ID in Polaris events,
> > >>>>>>>>>>> rather than the generated correlation ID, would significantly
> > >>>>>>>> simplify
> > >>>>>>>>>>> correlation of events with other traces.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Therefore, I propose the following changes:
> > >>>>>>>>>>>
> > >>>>>>>>>>> 1. If OTel is enabled, use the trace ID and span ID as the
> > >>>>>>>> correlation
> > >>>>>>>>>>> ID for the request, instead of generating a random correlation
> > >>>>>> ID.
> > >>>>>>>>>>> 2. Otherwise, if a (Polaris-specific) correlation ID header is
> > >>>>>>>> present
> > >>>>>>>>>>> in the request, use it.
> > >>>>>>>>>>> 3. If neither of the above conditions is met, generate a random
> > >>>>>>>>>>> correlation ID.
> > >>>>>>>>>>>
> > >>>>>>>>>>> I am somewhat undecided on the best approach when a correlation
> > >>>>>> ID
> > >>>>>>>>>>> header is present in the request. However, I believe it would be
> > >>>>>>> more
> > >>>>>>>>>>> sensible to disregard it if OTel is enabled, as OTel offers a
> > >>>>>> more
> > >>>>>>>>>>> robust solution for client-to-server trace propagation, e.g. W3C
> > >>>>>>>> Trace
> > >>>>>>>>>>> Context propagation [2].
> > >>>>>>>>>>>
> > >>>>>>>>>>> Please share your thoughts!
> > >>>>>>>>>>>
> > >>>>>>>>>>> Thanks,
> > >>>>>>>>>>> Alex
> > >>>>>>>>>>>
> > >>>>>>>>>>> [1]:
> > >>> https://www.google.com/url?q=https://www.google.com/url?q%3Dhttps://github.com/apache/polaris/pull/2757%26source%3Dgmail-imap%26ust%3D1761851831000000%26usg%3DAOvVaw1-kAWfEk4tmsEg0q0GZBCn&source=gmail-imap&ust=1761927167000000&usg=AOvVaw1Oe-25vtt4gLVSMFMtSVNg
> > >>>>>>>>>>> [2]:
> > >>> https://www.google.com/url?q=https://www.google.com/url?q%3Dhttps://www.w3.org/TR/trace-context%26source%3Dgmail-imap%26ust%3D1761851831000000%26usg%3DAOvVaw22nMyOS7pbJ69XrBo5kHQS&source=gmail-imap&ust=1761927167000000&usg=AOvVaw1TRdkzABc_7U-_KZ1MZ59v
> > >>>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>
> > >>>
> >

Reply via email to