[
https://issues.apache.org/jira/browse/CAMEL-23708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18086794#comment-18086794
]
Claus Ibsen commented on CAMEL-23708:
-------------------------------------
Investigation update: the missing RECEIVED spans are NOT caused by a framework
bug in the OTel span lifecycle.
A reproducer unit test (MulticastSedaTest) was created that faithfully
reproduces the topology of the route-topology example:
{code}
direct → direct → direct → stub:orders → multicast →
stub:fulfillment/notifications → stub:warehouse/email
{code}
All 4 test scenarios pass — every RECEIVED span is properly created and ended,
including those in nested async paths (seda consumer → multicast → seda/stub →
leaf consumer routes).
*Root cause: DevSpanExporter capacity eviction*
The DevSpanExporter has a {{LinkedBlockingQueue}} with capacity 500 and FIFO
eviction. With the route-topology example:
- Timer fires every 5 seconds generating ~33 spans per trace (with core
processors enabled)
- 500 / 33 ≈ 15 complete traces fit in the queue
- After ~75 seconds (15 traces × 5 seconds), the queue is full and starts
evicting from the head (oldest spans)
- The RECEIVED spans for leaf routes (fulfillment, notification) complete LAST
in each trace due to async seda queueing, so they are the last to enter the
exporter. Meanwhile, the earliest spans of older traces are evicted to make
room, which leaves incomplete traces in the queue
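The arithmetic above can be checked with a small simulation. This is only a sketch under stated assumptions: {{FifoEvictionSketch}} and its methods are hypothetical names, the span ids are plain strings, and the loop models a bounded {{LinkedBlockingQueue}} that drops its head when full. It is not the actual DevSpanExporter code.

```java
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical simulation of a bounded FIFO span queue; not the DevSpanExporter source.
class FifoEvictionSketch {

    // Adds one span id, dropping the oldest entry from the head while the queue is full.
    // Returns true if at least one older span had to be evicted.
    static boolean addWithHeadEviction(LinkedBlockingQueue<String> queue, String spanId) {
        boolean evicted = false;
        while (!queue.offer(spanId)) {
            queue.poll(); // remove the oldest span to make room
            evicted = true;
        }
        return evicted;
    }

    // Returns the 1-based number of the first trace whose spans trigger an eviction.
    // Assumes capacity >= 1 and spansPerTrace >= 1.
    static int firstEvictingTrace(int capacity, int spansPerTrace) {
        LinkedBlockingQueue<String> queue = new LinkedBlockingQueue<>(capacity);
        for (int trace = 1; ; trace++) {
            for (int span = 0; span < spansPerTrace; span++) {
                if (addWithHeadEviction(queue, "trace-" + trace + "/span-" + span)) {
                    return trace;
                }
            }
        }
    }

    public static void main(String[] args) {
        int capacity = 500;     // queue capacity quoted in this comment
        int spansPerTrace = 33; // approximate span count per trace quoted in this comment
        int periodSeconds = 5;  // timer period quoted in this comment

        int completeTraces = capacity / spansPerTrace;
        System.out.println("complete traces that fit: " + completeTraces);
        System.out.println("first trace causing eviction: "
                + firstEvictingTrace(capacity, spansPerTrace));
        System.out.println("approx. seconds until eviction starts: "
                + completeTraces * periodSeconds);
    }
}
```

With the numbers quoted here, 15 complete traces fit and eviction begins partway through the 16th trace, which matches the ~75 second estimate.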
*Action needed:* Reclassify this from a framework instrumentation bug to a
DevSpanExporter improvement. Options:
# Increase capacity (e.g. 2000) or make it configurable
# Switch to trace-aware eviction: evict entire traces instead of individual
spans, keeping the N most recent complete traces
# Add a "spans evicted" counter to the dev console response so the TUI can warn
the user
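As a rough illustration of options 2 and 3, the sketch below groups spans by trace id, evicts whole traces (oldest first) once more than N traces are held, and counts the evicted spans. Everything in it is an assumption made for illustration: {{TraceAwareSpanBuffer}}, its methods, and the use of plain string ids are hypothetical and not part of Camel or the OpenTelemetry SDK.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical trace-aware buffer; names and structure are illustrative only.
class TraceAwareSpanBuffer {
    private final int maxTraces;
    // Insertion-ordered map: the first entry is the trace that was seen first.
    private final LinkedHashMap<String, List<String>> spansByTrace = new LinkedHashMap<>();
    private long evictedSpans;

    TraceAwareSpanBuffer(int maxTraces) {
        if (maxTraces < 1) {
            throw new IllegalArgumentException("maxTraces must be >= 1");
        }
        this.maxTraces = maxTraces;
    }

    // Stores a span under its trace, then evicts whole oldest traces while over the limit.
    synchronized void add(String traceId, String spanId) {
        spansByTrace.computeIfAbsent(traceId, k -> new ArrayList<>()).add(spanId);
        Iterator<Map.Entry<String, List<String>>> it = spansByTrace.entrySet().iterator();
        while (spansByTrace.size() > maxTraces && it.hasNext()) {
            evictedSpans += it.next().getValue().size(); // count every span of the dropped trace
            it.remove();
        }
    }

    // Number of spans dropped so far; a dev console could report this value.
    synchronized long evictedSpans() {
        return evictedSpans;
    }

    synchronized int traceCount() {
        return spansByTrace.size();
    }

    // Returns a snapshot of the spans held for a trace, or an empty list if none are held.
    synchronized List<String> spans(String traceId) {
        List<String> spans = spansByTrace.get(traceId);
        return spans == null ? List.of() : List.copyOf(spans);
    }

    public static void main(String[] args) {
        TraceAwareSpanBuffer buffer = new TraceAwareSpanBuffer(2);
        buffer.add("t1", "a");
        buffer.add("t1", "b");
        buffer.add("t2", "c");
        buffer.add("t3", "d"); // a third trace arrives, so t1 is evicted as a whole
        System.out.println("traces held: " + buffer.traceCount());
        System.out.println("spans evicted: " + buffer.evictedSpans());
    }
}
```

One limitation of this sketch: a span that arrives late for an already evicted trace would recreate that trace as the newest entry, so a real implementation would likely need extra handling for late async spans.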
The reproducer test ({{MulticastSedaTest}}) has been added to
{{camel-opentelemetry2}} as a baseline regression test for multicast+seda/stub
span completeness.
> OpenTelemetry2 - Missing RECEIVED spans for routes consuming from stub/seda
> in nested async paths
> -------------------------------------------------------------------------------------------------
>
> Key: CAMEL-23708
> URL: https://issues.apache.org/jira/browse/CAMEL-23708
> Project: Camel
> Issue Type: Bug
> Components: camel-opentelemetry
> Reporter: Claus Ibsen
> Priority: Major
>
> When using the stub component to simulate Kafka (e.g. with {{--observe}} /
> dev profile), route consumer RECEIVED spans are not exported for routes that
> consume from stub endpoints when the producer side is inside a multicast on a
> seda consumer thread.
> h3. Steps to reproduce
> Run the route-topology example with OpenTelemetry dev mode:
> {code}
> camel run route-topology.camel.yaml --observe
> {code}
> The example has this flow:
> {code}
> timer -> order-generator -> direct:process-order -> process-order ->
> kafka:orders
> -> order-dispatcher (multicast)
> -> kafka:fulfillment -> fulfillment route -> kafka:warehouse-shipments
> -> kafka:notifications -> notification route -> kafka:email-outbox
> {code}
> h3. Expected behavior
> Each trace should have 27 spans including EVENT_RECEIVED spans for all 6
> routes:
> - timer (order-generator) - OK
> - direct:process-order (process-order) - OK
> - direct:validate-order (validate-order) - OK
> - kafka:orders via stub (order-dispatcher) - OK
> - kafka:fulfillment via stub (fulfillment) - MISSING
> - kafka:notifications via stub (notification) - MISSING
> h3. Actual behavior
> Each trace has exactly 25 spans. The EVENT_RECEIVED spans for the fulfillment
> and notification routes are never exported to the DevSpanExporter.
> The RECEIVED spans ARE created (processor spans log4, to7, log5, to8
> correctly reference them as parentSpanId), but {{span.end()}} is apparently
> never called, so {{SimpleSpanProcessor.onEnd()}} never fires and they never
> reach the exporter.
> h3. Analysis
> The issue is specific to nested async routing: the fulfillment and
> notification routes consume from stub endpoints whose producers are inside a
> multicast within the order-dispatcher route, which is itself on a seda
> consumer thread. The {{onExchangeDone}} route policy callback
> ({{TracingRoutePolicy}} in {{camel-telemetry}}) does not fire for these
> exchanges, so the RECEIVED span is started but never ended/exported.
> The kafka:orders RECEIVED span (order-dispatcher route) works correctly -
> this is a direct stub producer to seda consumer path without the nested
> multicast.
> Key code path:
> - {{camel-telemetry}}: {{Tracer.TracingRoutePolicy.onExchangeDone()}} ->
> {{endEventSpan()}} -> {{spanLifecycleManager.deactivate(span)}} ->
> {{otelSpan.end()}}
> - {{camel-seda}}: {{SedaProducer.process()}} (InOnly path) ->
> {{addToQueue(exchange, true)}} -> async consumer pickup
> - {{camel-stub}}: extends seda, same behavior
> h3. Impact
> - Orphan spans in trace visualization (Jaeger, TUI waterfall) - processor
> spans reference non-existent parent
> - Missing routeId propagation for orphaned spans (dev console cannot walk
> parent chain)
> - Affects any scenario with stub/seda consumers inside multicast or other
> async EIPs
> h3. How this was discovered
> Using the TUI MCP {{tui_get_spans}} tool to fetch raw span JSON data from a
> running Camel TUI session, analyzing parent chain integrity across 20
> consecutive traces (500 spans). Every trace showed the identical orphan
> pattern.