Hi all,

+1 to all of Alex’s action items, with Dmitri’s suggested changes as well.
Great find, Dmitri!

I’m still undecided on whether we need to store the `x-request-id` field as
part of the Events persistence. Can we think of a good use case where it is
more helpful to the user than the OTel trace/span IDs? I am assuming here that
those are still returned to the client.

Best,
Adnan Hemani

> On Oct 24, 2025, at 9:12 AM, Alexandre Dutra <[email protected]> wrote:
> 
> Hi Dmitri,
> 
> Yes, I think your suggestion to use just one field w3c_trace_context
> makes more sense than two fields (span_id and trace_id).
> 
> With that, I also think that we are slowly drifting into
> implementation considerations; let's get consensus on the general
> design first, and we can certainly fine-tune the actual Java methods
> and SQL table columns in the future PR. WDYT?
> 
> Thanks,
> Alex
> 
> On Fri, Oct 24, 2025 at 5:14 PM Dmitri Bourlatchkov <[email protected]> wrote:
>> 
>> Hi All,
>> 
>> Many thanks for the background info, Adnan!
>> 
>> +1 to action items proposed by Alex!
>> 
>> Re: (3) Can we abstract request/trace info into a separate object without
>> exposing those accessors on the Event class directly? OTel defines
>> trace/span concepts, but a request ID is a bit foreign to OTel. Keeping
>> tracing / request ID data isolated in Java could help with maintaining it
>> and potentially supporting other (custom) tracing methods.
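
To make point (3) concrete, here is a minimal sketch of what such a separate object could look like. `EventCorrelation` and `CorrelatedEvent` are hypothetical names used only for illustration, not existing Polaris types, and keeping the tracing payload as one opaque string is an assumption meant to leave room for custom tracing methods.

```java
import java.util.Optional;

/** Hypothetical value object grouping correlation data; not an existing Polaris type. */
record EventCorrelation(Optional<String> requestId, Optional<String> traceContext) {

  /** Wraps nullable inputs so callers never handle raw nulls. */
  static EventCorrelation of(String requestId, String traceContext) {
    return new EventCorrelation(
        Optional.ofNullable(requestId), Optional.ofNullable(traceContext));
  }
}

/** Hypothetical event shape: one accessor instead of per-ID methods on the event. */
interface CorrelatedEvent {
  EventCorrelation correlation();
}
```

With this shape, adding or dropping a correlation field would only touch the value object, not every event class.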
>> 
>> Re: (4) I'd like to propose storing OTel correlation data in the form of a
>> standard context propagation string, e.g. W3C trace-context [1] (same value
>> as its HTTP header), so the column could be called w3c_trace_context or
>> simply trace_context.
>> 
>> Open question: do we need to write a separate, individual trace ID field in
>> SQL? I suppose it is not very useful since correlating it to other trace
>> data already requires understanding OTel context propagation and a query
>> against trace_context can still be made using string-matching clauses. We
>> could probably (additionally) store it in the request_id column if the
>> Polaris-specific request ID header is not set.
>> 
>> As for span ID, I do not really see a use case for persisting it
>> individually. It is very specific to OTel trace data construction.
>> 
>> Actually, using the W3C trace context [1] encoding probably makes sense in
>> the Java event representation too. Interested callers can easily decode
>> this information since the format is well-defined. As a side benefit, this
>> opens opportunities for downstream event consumers to connect (propagate
>> context) to traces that produced events based on the event data itself,
>> without relying on the intermediate frameworks. This may be desirable since
>> the current Polaris event persistence impl. writes events in batches, so
>> the association to individual requests that produced those events is lost.
>> Whether to perform this kind of context propagation will be at the
>> discretion of the event consumer, of course (outside of Polaris code).
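
As an illustration of how an interested caller might decode the value, here is a minimal sketch of a parser for the `traceparent` string. `TraceParent` is a hypothetical type, not part of Polaris or the OTel SDK, and the parser is deliberately strict: it only accepts the four-field `version-traceid-parentid-flags` layout defined for version 00.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical holder for decoded traceparent fields; not a Polaris or OTel type. */
record TraceParent(String version, String traceId, String parentId, String flags) {

  // Four lowercase-hex fields: version (2), trace-id (32), parent-id (16), flags (2).
  private static final Pattern FORMAT =
      Pattern.compile("^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$");

  /** Returns the decoded fields, or empty if the value is missing or malformed. */
  static Optional<TraceParent> parse(String value) {
    if (value == null) {
      return Optional.empty();
    }
    Matcher m = FORMAT.matcher(value.trim());
    if (!m.matches()) {
      return Optional.empty();
    }
    TraceParent tp = new TraceParent(m.group(1), m.group(2), m.group(3), m.group(4));
    // The W3C spec treats version ff and all-zero IDs as invalid.
    boolean invalid =
        tp.version().equals("ff")
            || tp.traceId().equals("0".repeat(32))
            || tp.parentId().equals("0".repeat(16));
    return invalid ? Optional.empty() : Optional.of(tp);
  }

  /** True when the "sampled" bit (lowest bit of the flags byte) is set. */
  boolean sampled() {
    return (Integer.parseInt(flags, 16) & 0x01) != 0;
  }
}
```

A consumer reading a persisted event could call `TraceParent.parse(...)` on the stored string and use `traceId()` to look up related trace data.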
>> 
>> [1] https://www.w3.org/TR/trace-context/
>> 
>> WDYT?
>> 
>> Thanks,
>> Dmitri.
>> 
>> On Fri, Oct 24, 2025 at 7:18 AM Alexandre Dutra <[email protected]> wrote:
>> 
>>> Hi all,
>>> 
>>> Thank you for chiming in; the context around request IDs is now clear.
>>> 
>>> Trying to summarize this thread into actionable items, here's what I
>>> propose:
>>> 
>>> 1. Restore the original functionality for request IDs.
>>>    - Change the default header name back to x-request-id (despite the
>>> x- prefix being deprecated), but keep it configurable as today.
>>> 2. Remove RequestIdGenerator and related functionality.
>>>    - Do not generate a request ID if the REST client doesn't provide one.
>>> 3. Update PolarisEvent:
>>>    - Expose new requestId(), traceId(), spanId() methods, all nullable.
>>>    - This would align with the emerging consensus around including
>>> contextual information in PolarisEvent [1].
>>> 4. Update events table SQL schema:
>>>    - Insert the client-provided request ID into the request_id
>>> column; otherwise, insert null.
>>>    - Add two new nullable columns, trace_id and span_id, and populate
>>> them if OTel is enabled.
>>> 
>>> From our discussions, I think it's important to not conflate OTel
>>> tracing fields with Envoy tracing fields, which is why I suggest we
>>> use separate fields / columns for them.
>>> 
>>> Would the above plan work for everyone?
>>> 
>>> Thanks,
>>> Alex
>>> 
>>> [1]: https://lists.apache.org/thread/rl5cpcft16sn5n00mfkmx9ldn3gsqtfy
>>> 
>>> 
>>> On Fri, Oct 24, 2025 at 9:33 AM Adnan Hemani
>>> <[email protected]> wrote:
>>>> 
>>>> Hi all,
>>>> 
>>>> Thanks to Alex for starting this thread - because of this, I’m just now
>>>> realizing that OTel Trace and Span IDs come built in with Quarkus, so my
>>>> previous work to generate a Request/Correlation ID is likely not needed.
>>>> My original motivation for generating a Request/Correlation ID was to
>>>> ensure that any client can uniquely identify a request made to Polaris,
>>>> which would be especially useful for debugging failing requests or
>>>> identifying call patterns.
>>>> 
>>>> As a result, I’m a +1 on Michael’s opinion: we should remove the
>>>> Request/Correlation ID generation and always use the OTel trace/span IDs
>>>> (which come for free with Quarkus) as the Correlation ID, unless a valid
>>>> header is present, in which case the header value takes over as the
>>>> Correlation ID.
>>>> 
>>>> —
>>>> 
>>>> To answer Dmitri’s question re Polaris Events: The intended use case is
>>>> to provide some sort of correlation between events that have occurred as
>>>> part of the same request. For example, if a user makes a CommitTransaction
>>>> request, it would be helpful to see all UpdateTable calls that were made as
>>>> part of that one user request.
>>>> 
>>>> Best,
>>>> Adnan Hemani
>>>> 
>>>>> On Oct 23, 2025, at 12:15 PM, Dmitri Bourlatchkov <[email protected]> wrote:
>>>>> 
>>>>> Hi Michael,
>>>>> 
>>>>> Logging x-request-id headers makes sense.
>>>>> 
>>>>> Just to confirm: if / when we restore that, Polaris will NOT generate new
>>>>> IDs in case the header is not present in the request, correct?
>>>>> 
>>>>> I believe x-request-id can co-exist with OTel.
>>>>> 
>>>>> What about adding request IDs to events [1][2]? What's the intended use
>>>>> case for that? Could you share some context here too?
>>>>> 
>>>>> Side note: I proposed [2877] flagging event persistence as "beta" in
>>>>> 1.2.0... This discussion adds another point towards that, I think.
>>>>> 
>>>>> [1] https://github.com/apache/polaris/blob/2f0c7a43d446452004ea51196b618de9bdf0e25b/runtime/service/src/main/java/org/apache/polaris/service/events/listeners/inmemory/InMemoryBufferEventListener.java#L97
>>>>> [2] https://github.com/apache/polaris/blob/19742cc20f4bc0b7e5a315a62f89c6085ad81b7d/runtime/service/src/main/java/org/apache/polaris/service/events/listeners/PolarisPersistenceEventListener.java#L66
>>>>> 
>>>>> [2877] https://github.com/apache/polaris/pull/2877#discussion_r2456300613
>>>>> 
>>>>> Thanks,
>>>>> Dmitri.
>>>>> 
>>>>> On Thu, Oct 23, 2025 at 2:23 PM Michael Collado <[email protected]> wrote:
>>>>> 
>>>>>> Hey Dmitri
>>>>>> 
>>>>>> Generating a request ID is new code that was added after the original
>>>>>> x-request-id support. You can see the state from ~1 year ago, when we
>>>>>> hard-coded request_id as the header we used for the MDC:
>>>>>> 
>>>>>> https://github.com/apache/polaris/blob/a6197bd7d8cb5551253fa427e4373897205ecece/polaris-service/src/main/java/org/apache/polaris/service/PolarisApplication.java#L415-L416
>>>>>> 
>>>>>> At some point, it was changed to be configurable, then the
>>>>>> ContextResolverFilter filter was refactored/eliminated and the
>>>>>> RequestIdFilter took its responsibility, but lost some of its original
>>>>>> functionality.
>>>>>> 
>>>>>> My personal opinion is that restoring support for the x-request-id header
>>>>>> is something that we should do, but if the header isn't present, falling
>>>>>> back on simply using OTel trace IDs is good enough, and even better than
>>>>>> generating another random request ID.
>>>>>> 
>>>>>> Mike
>>>>>> 
>>>>>> On Thu, Oct 23, 2025 at 10:47 AM Dmitri Bourlatchkov <[email protected]> wrote:
>>>>>> 
>>>>>>> Hi Michael,
>>>>>>> 
>>>>>>> Thanks for the info!
>>>>>>> 
>>>>>>> Working with Envoy's tracing headers makes sense to me. However, I wonder:
>>>>>>> why would Polaris need to generate a new request ID inside its code,
>>>>>>> and return it in response headers?
>>>>>>> 
>>>>>>> How important is it to propagate this ID to Polaris Events?
>>>>>>> 
>>>>>>> I'm just trying to understand the full context :)
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Dmitri.
>>>>>>> 
>>>>>>> On Thu, Oct 23, 2025 at 1:29 PM Michael Collado <[email protected]> wrote:
>>>>>>> 
>>>>>>>> I think the original intention for this requestId field was to support
>>>>>>>> request propagation from load balancers, like Envoy
>>>>>>>> (https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/observability/tracing),
>>>>>>>> which is distinct from OTel. Don't ask me why the default
>>>>>>>> is Polaris-Request-Id - I think it was originally a custom thing, but then
>>>>>>>> we changed to integrate with existing conventions. Unfortunately, looking
>>>>>>>> through the code, I think that the actual functional plumbing has been lost
>>>>>>>> in the course of multiple refactors around the call context and resolver. I
>>>>>>>> don't see references to that property or the underlying header.
>>>>>>>> 
>>>>>>>> Support for the unofficial x-request-id header feels like something we
>>>>>>>> should definitely keep, especially when Polaris is one service in a mesh of
>>>>>>>> services that maybe don't have OTel integration. I'm a fan of the OTel
>>>>>>>> standard, but it's not entirely ubiquitous and there are many middleware
>>>>>>>> layers that know how to forward on the x-request-id header.
>>>>>>>> 
>>>>>>>> Mike
>>>>>>>> 
>>>>>>>> On Thu, Oct 23, 2025 at 3:00 AM Robert Stupp <[email protected]> wrote:
>>>>>>>> 
>>>>>>>>> Yes, we should aim for interoperability with the existing de-facto
>>>>>>>>> standard OTel and make it easy for users to integrate into their
>>>>>>>>> observability platforms.
>>>>>>>>> 
>>>>>>>>> On Wed, Oct 22, 2025 at 7:05 PM Dmitri Bourlatchkov <[email protected]> wrote:
>>>>>>>>>> 
>>>>>>>>>> Hi Alex and All,
>>>>>>>>>> 
>>>>>>>>>> I certainly support the idea of following OTel standards for achieving
>>>>>>>>>> "correlation" wrt Polaris requests and/or events.
>>>>>>>>>> 
>>>>>>>>>> As to what form the correlation data should take, I believe it is
>>>>>>>>>> conceptually what the OTel "context" represents. So, I believe it makes
>>>>>>>>>> sense for Polaris to support standard context propagators at the API layer.
>>>>>>>>>> 
>>>>>>>>>> If the incoming request has OTel context information, then returning any
>>>>>>>>>> other "correlation" data in the response is redundant, I think.
>>>>>>>>>> 
>>>>>>>>>> If the incoming request does not have OTel context info, what is the
>>>>>>>>>> purpose of generating a Polaris-specific "correlation ID"? How is it
>>>>>>>>>> envisioned to be used?
>>>>>>>>>> 
>>>>>>>>>> If the intention is to correlate a Polaris response (operation) with events
>>>>>>>>>> that might have resulted from its execution, I believe a more robust
>>>>>>>>>> approach would be to propagate the OTel trace info (starting a new trace if
>>>>>>>>>> necessary) into event data. Then, Polaris can also return the trace context
>>>>>>>>>> in the API response (top span). It's a bit awkward from the OTel
>>>>>>>>>> perspective, but might be an option for supporting custom correlators. It
>>>>>>>>>> could be covered by a feature flag. The header name could be
>>>>>>>>>> "polaris-traceparent" for W3C Trace Context.
>>>>>>>>>> 
>>>>>>>>>> Custom correlation code will be able to extract the trace ID from the
>>>>>>>>>> response and from events and find related data. Granted, it will require a
>>>>>>>>>> bit more effort for the custom code to decode trace IDs from the OTel
>>>>>>>>>> context, but the format is standard and not complex. The benefit for
>>>>>>>>>> Polaris, though, is that it can easily integrate with OTel-compatible
>>>>>>>>>> observability platforms regardless of whether any particular deployment
>>>>>>>>>> uses custom correlators or not.
>>>>>>>>>> 
>>>>>>>>>> WDYT?
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Dmitri.
>>>>>>>>>> 
>>>>>>>>>> On Wed, Oct 22, 2025 at 6:03 AM Alexandre Dutra <[email protected]> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Hi all,
>>>>>>>>>>> 
>>>>>>>>>>> Today, Polaris has the notion of "request ID", but its purpose is not
>>>>>>>>>>> entirely clear. It seems to serve as an observability feature to
>>>>>>>>>>> facilitate correlation. A pending PR aims to rename it to "correlation
>>>>>>>>>>> ID" for better alignment with industry standards [1].
>>>>>>>>>>> 
>>>>>>>>>>> However, this PR has brought to light overlaps with core telemetry
>>>>>>>>>>> features: when OpenTelemetry (OTel) is enabled in Polaris, each
>>>>>>>>>>> request already has a trace ID and span ID, making a separate
>>>>>>>>>>> correlation ID redundant.
>>>>>>>>>>> 
>>>>>>>>>>> Moreover, using the OTel trace ID and span ID in Polaris events,
>>>>>>>>>>> rather than the generated correlation ID, would significantly simplify
>>>>>>>>>>> correlation of events with other traces.
>>>>>>>>>>> 
>>>>>>>>>>> Therefore, I propose the following changes:
>>>>>>>>>>> 
>>>>>>>>>>> 1. If OTel is enabled, use the trace ID and span ID as the correlation
>>>>>>>>>>> ID for the request, instead of generating a random correlation ID.
>>>>>>>>>>> 2. Otherwise, if a (Polaris-specific) correlation ID header is present
>>>>>>>>>>> in the request, use it.
>>>>>>>>>>> 3. If neither of the above conditions is met, generate a random
>>>>>>>>>>> correlation ID.
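
The three-step order above can be sketched roughly as follows. `CorrelationIdResolver` is a hypothetical helper, not existing Polaris code; joining the trace and span ID with a dash and using a random UUID as the fallback are both assumptions made only for illustration.

```java
import java.util.UUID;

/** Hypothetical helper illustrating the proposed fallback order; not Polaris code. */
final class CorrelationIdResolver {

  private CorrelationIdResolver() {}

  /**
   * @param otelTraceId trace ID of the current request, or null when OTel is disabled
   * @param otelSpanId span ID of the current request, or null when OTel is disabled
   * @param headerValue value of the correlation ID request header, or null if absent
   */
  static String resolve(String otelTraceId, String otelSpanId, String headerValue) {
    // 1. OTel enabled: derive the ID from the trace and span IDs.
    if (otelTraceId != null && otelSpanId != null) {
      return otelTraceId + "-" + otelSpanId; // joined format is an assumption
    }
    // 2. Otherwise use the client-provided header, if any.
    if (headerValue != null && !headerValue.isBlank()) {
      return headerValue;
    }
    // 3. Last resort: generate a random ID.
    return UUID.randomUUID().toString();
  }
}
```

Note that in this order a client-provided header is ignored whenever OTel is enabled, which is the open question raised in the next paragraph.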
>>>>>>>>>>> 
>>>>>>>>>>> I am somewhat undecided on the best approach when a correlation ID
>>>>>>>>>>> header is present in the request. However, I believe it would be more
>>>>>>>>>>> sensible to disregard it if OTel is enabled, as OTel offers a more
>>>>>>>>>>> robust solution for client-to-server trace propagation, e.g. W3C Trace
>>>>>>>>>>> Context propagation [2].
>>>>>>>>>>> 
>>>>>>>>>>> Please share your thoughts!
>>>>>>>>>>> 
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Alex
>>>>>>>>>>> 
>>>>>>>>>>> [1]: https://github.com/apache/polaris/pull/2757
>>>>>>>>>>> [2]: https://www.w3.org/TR/trace-context
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>> 
>>> 
