[ 
https://issues.apache.org/jira/browse/CAMEL-17544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krzysztof Jamróz updated CAMEL-17544:
-------------------------------------
          Component/s: came-core
    Affects Version/s: 3.18.0

I just tested, and the same problem still occurs in 3.18.0. As I wrote 
before, {{SimpleLRUCache}} is not thread-safe but is sometimes used from 
multiple threads. This can be a source of hard-to-reproduce errors.

I think this should either be fixed (by using a thread-safe LRU cache) or 
documented as an (IMO risky) optimization. In the latter case, the 
documentation should state what to avoid in the default configuration, e.g. 
that you should not use a dynamic recipient list from multiple threads when 
not using {{caffeine-lrucache}} (and some other cases).
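
To illustrate what "use a thread-safe LRU cache" could mean in the simplest form, here is a minimal sketch. It is not Camel's {{SimpleLRUCache}} and not a proposed patch; the class name {{SynchronizedLruSketch}} and the method {{create}} are hypothetical, and the approach (an access-ordered {{LinkedHashMap}} wrapped in {{Collections.synchronizedMap}}) is only an assumption about one possible fix.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Camel.
class SynchronizedLruSketch {

    // Builds an access-ordered LinkedHashMap that evicts its eldest entry once
    // it grows beyond maxSize, wrapped so that every call (including get,
    // which reorders entries) runs under a single lock.
    static <K, V> Map<K, V> create(final int maxSize) {
        Map<K, V> lru = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxSize;
            }
        };
        return Collections.synchronizedMap(lru);
    }

    public static void main(String[] args) {
        Map<String, Integer> cache = create(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // "a" becomes the most recently used entry
        cache.put("c", 3); // evicts "b", the least recently used entry
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```

Single calls such as {{get}} and {{put}} are safe with this wrapper, but iterating over the map (or copying it to an array) still requires the caller to synchronize on the returned map. A single lock may also cost throughput on hot paths, which is presumably why the unsynchronized variant was chosen as an optimization.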

> ServicePool.doStop still hangs during shutdown
> ----------------------------------------------
>
>                 Key: CAMEL-17544
>                 URL: https://issues.apache.org/jira/browse/CAMEL-17544
>             Project: Camel
>          Issue Type: Bug
>          Components: came-core
>    Affects Versions: 3.14.0, 3.18.0
>            Reporter: Krzysztof Jamróz
>            Priority: Major
>         Attachments: ServicePoolShutdownTest.java
>
>
> {{ServicePool.doStop}} still hangs during shutdown with the optimized fix for 
> CAMEL-17536. The {{LinkedHashMap}} in the cache is corrupted not only because 
> of a race condition between {{acquire}} and {{doStop}} but also between 
> concurrent invocations of {{acquire}} during parallel routing of exchanges 
> (see the attached {{repeatTestRecipientList}}).
>  
> The root cause is that the {{LinkedHashMap}} used by {{SimpleLRUCache}} is 
> *not* thread-safe, yet access to it is not synchronized. Even reads may modify 
> it because it has an LRU policy. And access to 
> {{LinkedHashMap}}/{{SimpleLRUCache}} is possible during routing (so 
> concurrently), not only during start/stop.
>  
> Uses of {{SimpleLRUCache}} in other places in Camel may also exhibit race 
> conditions. It is hard to demonstrate them reliably, but I have another 
> example ({{repeatEndpointsTest}}). This one usually crashes with 
> {{OutOfMemoryError}} when converting the corrupted (looping) 
> {{SimpleLRUCache}} in {{EndpointRegistry}} to an array, but I have also seen 
> other exceptions.
>  
> {noformat}
> java.lang.OutOfMemoryError: Java heap space
>     at java.base/java.util.Arrays.copyOf(Arrays.java:3480)
>     at java.base/java.util.AbstractCollection.finishToArray(AbstractCollection.java:227)
>     at java.base/java.util.AbstractCollection.toArray(AbstractCollection.java:148)
>     at java.base/java.util.ArrayList.<init>(ArrayList.java:181)
>     at org.apache.camel.impl.engine.AbstractCamelContext.shutdownServices(AbstractCamelContext.java:3582)
>     at org.apache.camel.impl.engine.AbstractCamelContext.shutdownServices(AbstractCamelContext.java:3576)
>     at org.apache.camel.impl.engine.AbstractCamelContext.doStop(AbstractCamelContext.java:3414)
>     at org.apache.camel.support.service.BaseService.stop(BaseService.java:160)
>     at org.apache.camel.impl.engine.AbstractCamelContext.stop(AbstractCamelContext.java:2658)
>     at org.apache.camel.component.http.ServicePoolShutdownTest.testEndpoints(ServicePoolShutdownTest.java:129)
>     at org.apache.camel.component.http.ServicePoolShutdownTest.repeatEndpointsTest(ServicePoolShutdownTest.java:135)
> {noformat}
>  
> Caffeine, which was the default cache implementation until CAMEL-16093, is 
> thread-safe. Enabling {{caffeine-lrucache}} might be a workaround.
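
The point in the description that even reads may modify the map can be shown with plain JDK classes. This is a minimal single-threaded sketch, not code from the attached test; the class name {{AccessOrderDemo}} and the method {{keysAfterRead}} are hypothetical. It assumes the cache uses a {{LinkedHashMap}} in access order, as an LRU policy implies.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class for illustration only; not part of Camel.
class AccessOrderDemo {

    // Returns the key iteration order after a single get("a")
    // on an access-ordered LinkedHashMap.
    static List<String> keysAfterRead() {
        // The third constructor argument (true) selects access order.
        Map<String, Integer> map = new LinkedHashMap<>(16, 0.75f, true);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        map.get("a"); // a read, but it relinks "a" to the tail of the internal list
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        System.out.println(keysAfterRead()); // prints [b, c, a]
    }
}
```

Because {{get}} rewrites the internal linked list, two threads that only read can still interleave those pointer updates. That is consistent with the looping structure behind the {{OutOfMemoryError}} above, although this sketch does not reproduce the race itself.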



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
