[
https://issues.apache.org/jira/browse/CAMEL-20835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Arthur Naseef updated CAMEL-20835:
----------------------------------
Description:
Out Of Memory (OOM) occurs when using the Recipient List with a large number of
dynamic URLs. For example:
.recipientList(simple("http://{{downstream-server}}/employee/${header.emplId}"))
with a large number of values for ${header.emplId} leads to the OOM.
*REPRODUCER:*
*=============*
[https://github.com/artnaseef/camel-recipient-list-oom-reproducer]
See the README.md for instructions to reproduce and detect the problem
*DETAILS*
*=======*
The MulticastProcessor, which RecipientListProcessor extends, has the
following "unlimited" cache:
private final ConcurrentMap<Processor, Processor> errorHandlers = new ConcurrentHashMap<>();
Entries are added to this map for every unique processor created - every unique
URL generates a unique processor. The entries themselves are wrapped processor
instances for error handling IIUC (to support the custom error handling used by
multicast and recipient-list). Entries are only removed from this map on
shutdown. Ironically, there is an LRUCache for the processors themselves, with
a default maximum size of 1000, so the wrapped processors may get recreated
even though the error handler remains in the map indefinitely.
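To illustrate the mismatch described above, here is a minimal stand-alone sketch of a bounded LRU processor cache next to an unbounded error handler map. It does not use any Camel classes: FakeProcessor, ErrorHandlerLeakSketch, and send() are hypothetical stand-ins, and only the limit of 1000 is taken from the description. It is a model of the suspected behavior, not the actual MulticastProcessor code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Stand-alone model of the suspected leak; no Camel classes are used. */
final class ErrorHandlerLeakSketch {

    /** Hypothetical stand-in for the processor created per unique URL (identity equality). */
    static final class FakeProcessor {
        final String url;
        FakeProcessor(String url) { this.url = url; }
    }

    // Mirrors the default LRU size mentioned in the description.
    static final int MAX_PROCESSORS = 1000;

    // Bounded, access-ordered cache: drops the least recently used processor
    // once more than MAX_PROCESSORS entries are present.
    final Map<String, FakeProcessor> processorCache =
        new LinkedHashMap<String, FakeProcessor>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, FakeProcessor> eldest) {
                return size() > MAX_PROCESSORS;
            }
        };

    // Unbounded map: nothing removes entries here when a processor is evicted above.
    final ConcurrentMap<FakeProcessor, FakeProcessor> errorHandlers = new ConcurrentHashMap<>();

    /** Simulates routing one exchange to the given URL (assumed single-threaded here). */
    void send(String url) {
        FakeProcessor processor = processorCache.get(url);
        if (processor == null) {
            processor = new FakeProcessor(url);
            processorCache.put(url, processor);
        }
        // One wrapped "error handler" is kept per processor instance, indefinitely.
        errorHandlers.computeIfAbsent(processor, p -> new FakeProcessor("wrapped:" + p.url));
    }

    public static void main(String[] args) {
        ErrorHandlerLeakSketch sketch = new ErrorHandlerLeakSketch();
        for (int emplId = 0; emplId < 5000; emplId++) {
            sketch.send("http://downstream-server/employee/" + emplId);
        }
        // Prints: processorCache=1000 errorHandlers=5000
        System.out.println("processorCache=" + sketch.processorCache.size()
            + " errorHandlers=" + sketch.errorHandlers.size());
    }
}
```

In this model the processor cache stays at 1000 entries while the error handler map grows with every unique URL, which matches the growth pattern reported here.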
*IMPACT VERSIONS:*
*================*
Appears to impact versions >= 3.10.0
*COMMIT: 0d9227ff16fb00e047fdd087740c87cce01bb545*
*=======*
It appears this commit introduced the use of the errorHandlers "unlimited"
cache for recipient lists.
*FOLLOW-UP*
*==========*
I have ideas and questions for implementing a fix:
- IDEA 1: We can use an LRUCache for this data structure as well.
- Does it make more sense to remove the entries from errorHandlers when the
related Processor entry is removed from its LRUCache?
- IDEA 2: setting on recipient list to disable the errorHandler cache (for
dynamic URLs with little chance of duplicates, this could be the best)
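As a rough sketch of how the ideas above could fit together, the following stand-alone model drops the error handler entry whenever its processor is evicted from the LRU cache, and adds a flag that skips the error handler cache entirely. All names (BoundedErrorHandlerSketch, FakeProcessor, cacheErrorHandlers, send, wrap) are hypothetical and no Camel API is used; this is not the POC code and not a proposal for the exact Camel change.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Stand-alone model of the proposed fixes; no Camel classes are used. */
final class BoundedErrorHandlerSketch {

    /** Hypothetical stand-in for a processor (identity equality). */
    static final class FakeProcessor {
        final String url;
        FakeProcessor(String url) { this.url = url; }
    }

    // IDEA 2 (assumed setting name): when false, error handlers are never cached.
    final boolean cacheErrorHandlers;

    final Map<FakeProcessor, FakeProcessor> errorHandlers = new HashMap<>();
    final Map<String, FakeProcessor> processorCache;

    BoundedErrorHandlerSketch(final int maxSize, boolean cacheErrorHandlers) {
        this.cacheErrorHandlers = cacheErrorHandlers;
        this.processorCache = new LinkedHashMap<String, FakeProcessor>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, FakeProcessor> eldest) {
                if (size() > maxSize) {
                    // Tie the error handler's lifetime to its processor:
                    // remove it when the processor leaves the LRU cache.
                    errorHandlers.remove(eldest.getValue());
                    return true;
                }
                return false;
            }
        };
    }

    static FakeProcessor wrap(FakeProcessor processor) {
        return new FakeProcessor("wrapped:" + processor.url);
    }

    /** Returns the wrapped processor for the URL; synchronized to keep both maps consistent. */
    synchronized FakeProcessor send(String url) {
        FakeProcessor processor = processorCache.get(url);
        if (processor == null) {
            processor = new FakeProcessor(url);
            processorCache.put(url, processor); // may evict the eldest entry
        }
        if (!cacheErrorHandlers) {
            return wrap(processor); // rebuilt on every call, nothing retained
        }
        return errorHandlers.computeIfAbsent(processor, BoundedErrorHandlerSketch::wrap);
    }

    public static void main(String[] args) {
        BoundedErrorHandlerSketch bounded = new BoundedErrorHandlerSketch(1000, true);
        BoundedErrorHandlerSketch disabled = new BoundedErrorHandlerSketch(1000, false);
        for (int emplId = 0; emplId < 5000; emplId++) {
            bounded.send("http://downstream-server/employee/" + emplId);
            disabled.send("http://downstream-server/employee/" + emplId);
        }
        System.out.println("bounded errorHandlers=" + bounded.errorHandlers.size());
        System.out.println("disabled errorHandlers=" + disabled.errorHandlers.size());
    }
}
```

Evicting in step with the processor cache keeps the two structures the same size, while the flag trades repeated wrapping cost for zero retention, which may suit URLs that rarely repeat.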
was:
Out Of Memory (OOM) occurs when using the Recipient List with a large number of
dynamic URLs. For example:
.recipientList(simple("http://{{downstream-server}}/employee/${header.emplId}"))
with a large number of values for ${header.emplId} leads to the OOM.
REPRODUCER:
=============
[https://github.com/artnaseef/camel-recipient-list-oom-reproducer]
- See the README.md for instructions to reproduce and detect the problem
DETAILS
=======
The MulticastProcessor, which RecipientListProcessor extends, has the
following "unlimited" cache:
private final ConcurrentMap<Processor, Processor> errorHandlers = new
ConcurrentHashMap<>();
Entries are added to this map for every unique processor created - every unique
URL generates a unique processor. The entries themselves are wrapped processor
instances for error handling IIUC (to support the custom error handling used by
multicast and recipient-list). Entries are only removed from this map on
shutdown. Ironically, there is an LRUCache for the processors themselves, with
a default maximum size of 1000, so the wrapped processors may get recreated
even though the error handler remains in the map indefinitely.
*IMPACT VERSIONS:*
*================*
Appears to impact versions >= 3.10.0
*COMMIT: 0d9227ff16fb00e047fdd087740c87cce01bb545*
*=======*
It appears this commit introduced the use of the errorHandlers "unlimited"
cache for recipient lists.
*FOLLOW-UP*
*==========*
I have ideas and questions for implementing a fix:
- IDEA 1: We can use an LRUCache for this data structure as well.
- Does it make more sense to remove the entries from errorHandlers when the
related Processor entry is removed from its LRUCache?
- IDEA 2: setting on recipient list to disable the errorHandler cache (for
dynamic URLs with little chance of duplicates, this could be the best)
POC commits can be found here:
https://github.com/apache/camel/compare/main...artnaseef:camel:asn/disable-error-handler-cache-setting
> OOM using RecipientList
> -----------------------
>
> Key: CAMEL-20835
> URL: https://issues.apache.org/jira/browse/CAMEL-20835
> Project: Camel
> Issue Type: Bug
> Components: camel-core
> Affects Versions: 3.10.0, 4.6.0
> Reporter: Arthur Naseef
> Priority: Critical
>
> Out Of Memory (OOM) occurs when using the Recipient List with a large number
> of dynamic URLs. For example:
>
>
> .recipientList(simple("http://{{downstream-server}}/employee/${header.emplId}"))
>
> with a large number of values for ${header.emplId} leads to the OOM.
>
> *REPRODUCER:*
> *=============*
> [https://github.com/artnaseef/camel-recipient-list-oom-reproducer]
>
> See the README.md for instructions to reproduce and detect the problem
>
> *DETAILS*
> *=======*
> The MulticastProcessor, which RecipientListProcessor extends, has the
> following "unlimited" cache:
>
> private final ConcurrentMap<Processor, Processor> errorHandlers = new
> ConcurrentHashMap<>();
>
> Entries are added to this map for every unique processor created - every
> unique URL generates a unique processor. The entries themselves are wrapped
> processor instances for error handling IIUC (to support the custom error
> handling used by multicast and recipient-list). Entries are only removed
> from this map on shutdown. Ironically, there is an LRUCache for the
> processors themselves, with a default maximum size of 1000, so the wrapped
> processors may get recreated even though the error handler remains in the map
> indefinitely.
>
> *IMPACT VERSIONS:*
> *================*
> Appears to impact versions >= 3.10.0
> *COMMIT: 0d9227ff16fb00e047fdd087740c87cce01bb545*
> *=======*
> It appears this commit introduced the use of the errorHandlers "unlimited"
> cache for recipient lists.
>
> *FOLLOW-UP*
> *==========*
> I have ideas and questions for implementing a fix:
> - IDEA 1: We can use an LRUCache for this data structure as well.
> - Does it make more sense to remove the entries from errorHandlers when
> the related Processor entry is removed from its LRUCache?
> - IDEA 2: setting on recipient list to disable the errorHandler cache
> (for dynamic URLs with little chance of duplicates, this could be the best)