[ 
https://issues.apache.org/jira/browse/CAMEL-23708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18086794#comment-18086794
 ] 

Claus Ibsen commented on CAMEL-23708:
-------------------------------------

Investigation update: the missing RECEIVED spans are NOT caused by a framework 
bug in the OTel span lifecycle.

A reproducer unit test (MulticastSedaTest) was created that faithfully 
reproduces the topology of the production route-topology example:
{code}
direct → direct → direct → stub:orders → multicast → 
stub:fulfillment/notifications → stub:warehouse/email
{code}

All 4 test scenarios pass — every RECEIVED span is properly created and ended, 
including those in nested async paths (seda consumer → multicast → seda/stub → 
leaf consumer routes).

*Root cause: DevSpanExporter capacity eviction*

The DevSpanExporter has a {{LinkedBlockingQueue}} with capacity 500 and FIFO 
eviction. With the route-topology example:
- Timer fires every 5 seconds generating ~33 spans per trace (with core 
processors enabled)
- 500 / 33 ≈ 15 complete traces fit in the queue
- After ~75 seconds, the queue is full and starts evicting from the head 
(oldest spans)
- The RECEIVED spans for leaf routes (fulfillment, notification) complete LAST 
in each trace due to async seda queueing, so they are the last to enter the 
exporter. Eviction removes individual spans rather than whole traces, so the 
earlier spans of older traces are dropped to make room, leaving those traces 
incomplete
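
The arithmetic above can be checked with a small standalone model. This is 
only a sketch of the behavior described in this comment, not the actual 
DevSpanExporter code: the class {{FifoEvictionSketch}} and its {{simulate}} 
method are hypothetical names, and each span is reduced to the index of the 
trace it belongs to. It assumes a full queue drops its head to admit a new 
span.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical model of a bounded span queue with FIFO eviction.
class FifoEvictionSketch {

    /**
     * Feeds `traces` traces of `spansPerTrace` spans each through a bounded
     * queue that drops its head when full, then counts the spans still
     * retained per trace index.
     */
    static Map<Integer, Integer> simulate(int capacity, int spansPerTrace, int traces) {
        // Each queue element is just the index of the trace the span belongs to.
        LinkedBlockingQueue<Integer> queue = new LinkedBlockingQueue<>(capacity);
        for (int trace = 0; trace < traces; trace++) {
            for (int span = 0; span < spansPerTrace; span++) {
                while (!queue.offer(trace)) {
                    queue.poll(); // queue is full: evict the oldest span
                }
            }
        }
        // Sorted by trace index so the oldest surviving trace comes first.
        Map<Integer, Integer> retained = new TreeMap<>();
        for (int trace : queue) {
            retained.merge(trace, 1, Integer::sum);
        }
        return retained;
    }

    public static void main(String[] args) {
        // 20 traces of 33 spans against a capacity of 500.
        simulate(500, 33, 20).forEach((trace, spans) ->
                System.out.println("trace " + trace + ": " + spans + "/33 spans retained"));
    }
}
```

With these numbers, traces 0 to 3 are evicted entirely, trace 4 keeps only 
its last 5 spans, and traces 5 to 19 are complete. At one trace every 5 
seconds, the 15 complete traces correspond to the ~75 seconds noted above.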

*Action needed:* Reclassify this from a framework instrumentation bug to a 
DevSpanExporter improvement. Options:
# Increase capacity (e.g. 2000) or make it configurable
# Switch to trace-aware eviction: evict entire traces instead of individual 
spans, keeping the N most recent complete traces
# Add a "spans evicted" counter to the dev console response so the TUI can warn 
the user
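
Options 2 and 3 could be combined roughly as follows. This is a minimal 
sketch under stated assumptions, not a proposed patch: 
{{TraceAwareBufferSketch}} and all of its members are hypothetical names, 
spans are reduced to plain strings, and the buffer is bounded by number of 
traces rather than number of spans.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;

// Hypothetical buffer that evicts whole traces and counts the spans it drops.
class TraceAwareBufferSketch {

    private final int maxTraces;
    // Insertion-ordered map: the first entry is always the oldest trace.
    private final LinkedHashMap<String, List<String>> traces = new LinkedHashMap<>();
    private long evictedSpans;

    TraceAwareBufferSketch(int maxTraces) {
        this.maxTraces = maxTraces;
    }

    /** Stores a span under its trace, evicting the oldest traces if over the limit. */
    synchronized void add(String traceId, String spanName) {
        traces.computeIfAbsent(traceId, id -> new ArrayList<>()).add(spanName);
        while (traces.size() > maxTraces) {
            Iterator<List<String>> oldest = traces.values().iterator();
            evictedSpans += oldest.next().size(); // count every span of the dropped trace
            oldest.remove();                      // drop the whole trace at once
        }
    }

    /** Total number of spans dropped so far, e.g. for a console warning. */
    synchronized long evictedSpans() {
        return evictedSpans;
    }

    synchronized int traceCount() {
        return traces.size();
    }

    /** Returns a copy of the spans held for a trace, or an empty list if unknown. */
    synchronized List<String> spansOf(String traceId) {
        List<String> spans = traces.get(traceId);
        return spans == null ? List.of() : List.copyOf(spans);
    }
}
```

One limitation of this sketch: a span that arrives after its trace has been 
evicted would start a new, partial entry for that trace id, so a real 
implementation would need to decide how to handle late spans.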

The reproducer test ({{MulticastSedaTest}}) has been added to 
{{camel-opentelemetry2}} as a baseline regression test for multicast+seda/stub 
span completeness.

> OpenTelemetry2 - Missing RECEIVED spans for routes consuming from stub/seda 
> in nested async paths
> -------------------------------------------------------------------------------------------------
>
>                 Key: CAMEL-23708
>                 URL: https://issues.apache.org/jira/browse/CAMEL-23708
>             Project: Camel
>          Issue Type: Bug
>          Components: camel-opentelemetry
>            Reporter: Claus Ibsen
>            Priority: Major
>
> When using the stub component to simulate Kafka (e.g. with {{--observe}} / 
> dev profile), route consumer RECEIVED spans are not exported for routes that 
> consume from stub endpoints when the producer side is inside a multicast on a 
> seda consumer thread.
> h3. Steps to reproduce
> Run the route-topology example with OpenTelemetry dev mode:
> {code}
> camel run route-topology.camel.yaml --observe
> {code}
> The example has this flow:
> {code}
> timer -> order-generator -> direct:process-order -> process-order -> 
> kafka:orders
>   -> order-dispatcher (multicast)
>     -> kafka:fulfillment -> fulfillment route -> kafka:warehouse-shipments
>     -> kafka:notifications -> notification route -> kafka:email-outbox
> {code}
> h3. Expected behavior
> Each trace should have 27 spans including EVENT_RECEIVED spans for all 6 
> routes:
> - timer (order-generator) - OK
> - direct:process-order (process-order) - OK
> - direct:validate-order (validate-order) - OK
> - kafka:orders via stub (order-dispatcher) - OK
> - kafka:fulfillment via stub (fulfillment) - MISSING
> - kafka:notifications via stub (notification) - MISSING
> h3. Actual behavior
> Each trace has exactly 25 spans. The EVENT_RECEIVED spans for the fulfillment 
> and notification routes are never exported to the DevSpanExporter.
> The RECEIVED spans ARE created (processor spans log4, to7, log5, to8 
> correctly reference them as parentSpanId), but {{span.end()}} is apparently 
> never called, so {{SimpleSpanProcessor.onEnd()}} never fires and they never 
> reach the exporter.
> h3. Analysis
> The issue is specific to nested async routing: the fulfillment and 
> notification routes consume from stub endpoints whose producers are inside a 
> multicast within the order-dispatcher route, which is itself on a seda 
> consumer thread. The {{onExchangeDone}} route policy callback 
> ({{TracingRoutePolicy}} in {{camel-telemetry}}) does not fire for these 
> exchanges, so the RECEIVED span is started but never ended/exported.
> The kafka:orders RECEIVED span (order-dispatcher route) works correctly - 
> this is a direct stub producer to seda consumer path without the nested 
> multicast.
> Key code path:
> - {{camel-telemetry}}: {{Tracer.TracingRoutePolicy.onExchangeDone()}} -> 
> {{endEventSpan()}} -> {{spanLifecycleManager.deactivate(span)}} -> 
> {{otelSpan.end()}}
> - {{camel-seda}}: {{SedaProducer.process()}} (InOnly path) -> 
> {{addToQueue(exchange, true)}} -> async consumer pickup
> - {{camel-stub}}: extends seda, same behavior
> h3. Impact
> - Orphan spans in trace visualization (Jaeger, TUI waterfall) - processor 
> spans reference non-existent parent
> - Missing routeId propagation for orphaned spans (dev console cannot walk 
> parent chain)
> - Affects any scenario with stub/seda consumers inside multicast or other 
> async EIPs
> h3. How this was discovered
> Using the TUI MCP {{tui_get_spans}} tool to fetch raw span JSON data from a 
> running Camel TUI session, analyzing parent chain integrity across 20 
> consecutive traces (500 spans). Every trace showed the identical orphan 
> pattern.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
