[ 
https://issues.apache.org/jira/browse/CAMEL-21302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17886925#comment-17886925
 ] 

Freeman Yue Fang commented on CAMEL-21302:
------------------------------------------

A quick update on this issue.

I noticed this error in the log:
{code}
java.lang.IllegalStateException: Thread [ForkJoinPool-1-worker-1] opened scope, but thread [default-workqueue-1] closed it
        at io.opentelemetry.context.StrictContextStorage$StrictScope.close(StrictContextStorage.java:205) ~[opentelemetry-context-1.42.1.jar:1.42.1]
        at org.apache.camel.tracing.ActiveSpanManager$Holder.closeScope(ActiveSpanManager.java:140) ~[camel-tracing-4.9.0-SNAPSHOT.jar:4.9.0-SNAPSHOT]
        at org.apache.camel.tracing.ActiveSpanManager.deactivate(ActiveSpanManager.java:80) ~[camel-tracing-4.9.0-SNAPSHOT.jar:4.9.0-SNAPSHOT]
        at org.apache.camel.tracing.Tracer$TracingEventNotifier.notify(Tracer.java:266) ~[camel-tracing-4.9.0-SNAPSHOT.jar:4.9.0-SNAPSHOT]
        at org.apache.camel.support.EventHelper.doNotifyEvent(EventHelper.java:1495) ~[camel-support-4.9.0-SNAPSHOT.jar:4.9.0-SNAPSHOT]
        at org.apache.camel.support.EventHelper.notifyExchangeSent(EventHelper.java:969) ~[camel-support-4.9.0-SNAPSHOT.jar:4.9.0-SNAPSHOT]
        at org.apache.camel.processor.SendProcessor.lambda$process$0(SendProcessor.java:193) ~[camel-core-processor-4.9.0-SNAPSHOT.jar:4.9.0-SNAPSHOT]
        at org.apache.camel.component.cxf.jaxrs.CxfRsProducer$CxfInvocationCallback.completed(CxfRsProducer.java:732) [camel-cxf-rest-4.9.0-SNAPSHOT.jar:4.9.0-SNAPSHOT]
        at org.apache.camel.component.cxf.jaxrs.CxfRsProducer$CxfInvocationCallback.completed(CxfRsProducer.java:680) [camel-cxf-rest-4.9.0-SNAPSHOT.jar:4.9.0-SNAPSHOT]
{code}

Otel expects a scope to be opened and closed on the same thread; when the two 
happen on different threads, we get the exception above. I strongly believe 
this is the root cause: it messes up the scope/span stacks and causes the same 
traceId to be used in different invocations.
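
To make the check behind this exception concrete, here is a minimal JDK-only 
sketch. It is not the OpenTelemetry implementation: StrictScope, 
closeOnOtherThread and the thread names sender-1/responder-1 are hypothetical 
names made up for illustration. It only models the idea that a scope remembers 
the thread that opened it and complains when another thread closes it.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class StrictScopeSketch {

    /** Hypothetical stand-in for a strict scope: it remembers which thread opened it. */
    static final class StrictScope implements AutoCloseable {
        private final Thread opener = Thread.currentThread();

        @Override
        public void close() {
            Thread closer = Thread.currentThread();
            if (closer != opener) {
                throw new IllegalStateException("Thread [" + opener.getName()
                        + "] opened scope, but thread [" + closer.getName() + "] closed it");
            }
        }
    }

    /** Opens a scope on one pool and closes it on another; returns the error message, or null if none. */
    static String closeOnOtherThread() throws InterruptedException, ExecutionException {
        ExecutorService sender = Executors.newSingleThreadExecutor(r -> new Thread(r, "sender-1"));
        ExecutorService responder = Executors.newSingleThreadExecutor(r -> new Thread(r, "responder-1"));
        try {
            Callable<StrictScope> open = StrictScope::new;
            StrictScope scope = sender.submit(open).get();   // opened on sender-1
            Runnable close = scope::close;
            try {
                responder.submit(close).get();               // closed on responder-1
                return null;
            } catch (ExecutionException e) {
                return e.getCause().getMessage();
            }
        } finally {
            sender.shutdown();
            responder.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // prints "Thread [sender-1] opened scope, but thread [responder-1] closed it"
        System.out.println(closeOnOtherThread());
    }
}
```

The two single-thread executors play the roles of the sending pool and the 
response-handling pool; the failure does not depend on which libraries own 
them, only on the open and close happening on different threads.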

For the camel-cxf producer endpoint, if async is used, the cxf client (code in 
org.apache.cxf.transport.http.HTTPConduit) will handleResponseOnWorkqueue. So 
the camel-cxf producer opens the otel scope and sends the request in one 
thread, which comes from a camel managed threadpool, but handles the response 
and closes the otel scope in another thread, which is managed by a cxf 
threadpool. Here cxf creates an org.apache.cxf.workqueue.WorkQueue (which 
extends java.util.concurrent.Executor) if it can't find an Executor instance 
on the cxf Exchange.

So, back to the exception:
{code}
java.lang.IllegalStateException: Thread [ForkJoinPool-1-worker-1] opened scope, but thread [default-workqueue-1] closed it
{code}

ForkJoinPool-1-worker-1 is from the camel managed threadpool, but 
default-workqueue-1 is from the cxf threadpool.

I think this is a pretty common scenario for an async client (send and receive 
using different threads), and Otel should be able to accommodate it. I found 
related otel documentation on propagating the Otel Context across threads, 
which seems promising.
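
As a rough illustration of what propagating a context across threads means, 
here is a JDK-only sketch. CURRENT, wrap and the "trace-A" value are 
hypothetical stand-ins, not OpenTelemetry or Camel APIs: the context is 
captured on the sending thread, then activated and deactivated entirely on the 
thread that handles the response, so open and close always happen on the same 
thread. OpenTelemetry's Java API offers comparable wrappers such as 
Context.wrap(Runnable); whether that approach fits the camel-tracing code here 
is still an open question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ContextPropagationSketch {

    // Hypothetical stand-in for the per-thread "current context" (here just a trace id).
    static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    /** Captures the caller's context now and activates it around the task on the thread that runs it. */
    static <T> Callable<T> wrap(Callable<T> task) {
        String captured = CURRENT.get();
        return () -> {
            String previous = CURRENT.get();
            CURRENT.set(captured);       // "scope" opened on the worker thread
            try {
                return task.call();
            } finally {
                CURRENT.set(previous);   // and closed on that same worker thread
            }
        };
    }

    /** Returns what the responder thread sees: without wrapping, with wrapping, and afterwards. */
    static List<String> demo() throws Exception {
        ExecutorService responder = Executors.newSingleThreadExecutor();
        Callable<String> readCurrent = CURRENT::get;
        try {
            CURRENT.set("trace-A");      // set on the sending thread
            String unwrapped = responder.submit(readCurrent).get();
            String wrapped = responder.submit(wrap(readCurrent)).get();
            String afterwards = responder.submit(readCurrent).get();
            return Arrays.asList(unwrapped, wrapped, afterwards);
        } finally {
            CURRENT.remove();
            responder.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo()); // prints [null, trace-A, null]
    }
}
```

The third value being null again shows that nothing leaks into the responder 
thread once the wrapped task finishes, which is the property the traceId 
problem in this issue is missing.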

Another option is that we can probably grab the camel managed threadpool and 
set it on the cxf exchange in async mode, so that a shared threadpool is used.

Still experimenting...

> camel-opentelemetry context leak with cxf async producer
> --------------------------------------------------------
>
>                 Key: CAMEL-21302
>                 URL: https://issues.apache.org/jira/browse/CAMEL-21302
>             Project: Camel
>          Issue Type: Bug
>          Components: camel-cxf, camel-opentelemetry
>            Reporter: John Poth
>            Priority: Major
>
> There seems to be an Otel context leak when using a CXF producer in async 
> mode. This causes different requests to have the same _traceId._ As a 
> workaround, setting _synchronous=true_ on the CXF producer resolves the 
> issue. Here's a reproducer:
> {code:java}
> @Override
> protected RoutesBuilder createRouteBuilder() {
>     return new RouteBuilder() {
>         @Override
>         public void configure() {
>             from("direct:start").routeId("myRoute")
>                     .to("direct:send")
>                     .end();
>             from("direct:send")
>                     .log("message")
>                     .to("cxfrs:http://localhost:" + port1
>                         + "/rest/helloservice/sayHello?synchronous=false");
>             // setting synchronous to 'true' resolves the issue
>             restConfiguration()
>                     .port(port1);
>             rest("/rest/helloservice")
>                     .post("/sayHello").routeId("rest-GET-say-hi")
>                     .to("direct:sayHi");
>             from("direct:sayHi")
>                     .routeId("mock-GET-say-hi")
>                     .log("example")
>                     .to("mock:end");
>         }
>     };
> }
> {code}
>  
> I've added the complete unit here: 
> https://github.com/apache/camel/blob/7d83a62b8e442dc9ac6fd79b153192add940301e/components/camel-opentelemetry/src/test/java/org/apache/camel/opentelemetry/AsyncCxfTest.java



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
