[ 
https://issues.apache.org/jira/browse/CAMEL-23686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18086308#comment-18086308
 ] 

Guillaume Nodet commented on CAMEL-23686:
-----------------------------------------

_Claude Code on behalf of Guillaume Nodet_

h3. Benchmark results after applying all optimization PRs

Test setup: JDK Temurin 21.0.9, Camel 4.21.0-SNAPSHOT, timer(period=1ms) + 
split(1000 tokens) routes, 60s runs, heap histogram captured at 35s.

PRs included: #23738 (CoW headers), #23766 (O(1) CaseInsensitiveMap), #23779 
(optimized ASCII hash), #23769 (profile fix), #23770 (clock fix), #23771 (UoW 
lighten).

h4. Heap usage

||Route||Before||After||Change||
|Baseline (prod+pooled)|75 MB|44 MB|*-41%*|
|Pipeline (dev default)|3,194 MB|2,392 MB|*-25%*|
|HTTP (prod+pooled)|95 MB|100 MB|+5% (~same)|
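
For reference, the Change column is the relative difference between the two heap readings. A minimal sketch of that arithmetic in Java; {{HeapChange}} and {{percentChange}} are illustrative names, not part of Camel or of the benchmark harness:

```java
final class HeapChange {
    /** Relative change from before to after, in percent (negative means a reduction). */
    static double percentChange(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        // Heap figures (MB) copied from the heap usage table.
        System.out.printf("Baseline: %.0f%%%n", percentChange(75, 44));
        System.out.printf("Pipeline: %.0f%%%n", percentChange(3194, 2392));
        System.out.printf("HTTP:     %.0f%%%n", percentChange(95, 100));
    }
}
```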

h4. Key object counts — Pipeline (dev profile)

||Object||Before||After||Change||
|DefaultMessageHistory|3,616,340|2,541,752|*-30%*|
|ConcurrentLinkedDeque$Node|1,036,646|5,276|*-99.5%*|
|ConcurrentLinkedDeque|516,743|eliminated|*-100%*|
|DefaultExchange|1,033,952|728,740|-30%|
|DefaultUnitOfWork|516,715|363,175|-30%|

h4. Key object counts — Pipeline (prod+pooled)

||Object||Before||After||Change||
|ConcurrentLinkedDeque$Node|1,048,208 (25 MB)|eliminated|*-100%*|
|ConcurrentLinkedDeque|524,086 (12 MB)|eliminated|*-100%*|
|DefaultExchange|703,892|505,210|-28%|
|DefaultPooledExchange|524,091|376,701|-28%|
|EnumMap|2,455,973|1,763,829|-28%|

h4. Highlights

* *ConcurrentLinkedDeque completely eliminated* from the UnitOfWork route stack 
(PR #23771), saving 37 MB (25 MB of nodes + 12 MB of deques) in the prod+pooled 
pipeline test
* *Baseline heap down 41%* (75 MB → 44 MB) with all optimizations combined
* *MessageHistory count down 30%* in dev mode after PR #23769 profile fix
* Thread count and metaspace unchanged — no regressions

> Reduce Exchange memory pressure and fix pooled exchange issues
> --------------------------------------------------------------
>
>                 Key: CAMEL-23686
>                 URL: https://issues.apache.org/jira/browse/CAMEL-23686
>             Project: Camel
>          Issue Type: Improvement
>          Components: camel-core
>            Reporter: Guillaume Nodet
>            Assignee: Guillaume Nodet
>            Priority: Major
>              Labels: memory, performance
>
> h2. Context
> Profiling Camel routes under high throughput (timer + splitter producing ~1M 
> msg/s) reveals several memory and allocation inefficiencies in the Exchange 
> lifecycle. This issue tracks concrete improvements identified through heap 
> histogram analysis and JFR profiling.
> h2. Findings
> h3. 1. Dev profile forces messageHistory=true, overriding user properties (CRITICAL)
> {{ProfileConfigurer.configureCommon()}} (line 102) unconditionally sets 
> {{messageHistory=true}} in dev mode. This overrides any user setting of 
> {{camel.main.messageHistory=false}} in application.properties, since the 
> profile configurer runs after property loading.
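
One way to avoid this kind of override is to treat the profile value as a default that applies only when the user has not set the key. The sketch below uses plain {{java.util.Properties}} to show the idea. The class and method names are hypothetical, and this is not necessarily how {{ProfileConfigurer}} or PR #23769 handles it; only the property key comes from the issue text.

```java
import java.util.Properties;

final class DevProfileDefaults {
    // Property key quoted in the issue text; the helper itself is hypothetical.
    static final String MESSAGE_HISTORY = "camel.main.messageHistory";

    /** Applies the dev-profile default only when the user has not set the key. */
    static void applyDevDefaults(Properties props) {
        if (props.getProperty(MESSAGE_HISTORY) == null) {
            props.setProperty(MESSAGE_HISTORY, "true");
        }
    }

    public static void main(String[] args) {
        Properties user = new Properties();
        user.setProperty(MESSAGE_HISTORY, "false"); // explicit user choice
        applyDevDefaults(user);
        System.out.println(user.getProperty(MESSAGE_HISTORY));
    }
}
```

With this ordering the explicit user setting survives, while an unset key still picks up the dev default.
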
> *Impact:* With pooled exchanges in dev mode, {{DefaultMessageHistory}} 
> instances accumulate without bound: in our test, 2.95M instances consumed 
> 236 MB, ballooning heap from 75 MB to 696 MB. The history list grows because 
> pooled exchanges recycle the exchange object, but the debugger re-creates 
> message history entries on each reuse.
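
The growth pattern can be pictured with a toy model. It assumes (and this is an assumption, not something verified against Camel's code) that the history list is not cleared when the object returns to the pool. {{PooledItem}} is a stand-in and not {{DefaultPooledExchange}}.

```java
import java.util.ArrayList;
import java.util.List;

final class PooledHistoryDemo {
    /** Toy stand-in for a pooled exchange; not Camel's DefaultPooledExchange. */
    static final class PooledItem {
        final List<String> history = new ArrayList<>();

        /** Records one history entry per processing step. */
        void process(String node) {
            history.add(node);
        }

        /** Called when the item goes back to the pool. */
        void reset(boolean clearHistory) {
            if (clearHistory) {
                history.clear();
            }
        }
    }

    /** Reuses one pooled item the given number of times and returns the final history size. */
    static int historySizeAfter(int reuses, boolean clearHistory) {
        PooledItem item = new PooledItem();
        for (int i = 0; i < reuses; i++) {
            item.process("route-step");
            item.reset(clearHistory);
        }
        return item.history.size();
    }
}
```

In this model the list grows by one entry per reuse unless the reset path clears it, which matches the shape of the accumulation seen in the histogram.
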
> h3. 2. Exchange pooling only covers consumer exchanges (~40%)
> The {{PooledExchangeFactory}} only provides pooled exchanges for the 
> consumer's initial exchange. Sub-exchanges created by Splitter, Multicast, 
> and RecipientList use regular {{DefaultExchange}} instances.
> In a pipeline route with a splitter, 524K out of 1.23M exchanges were pooled 
> (42%). The remaining 703K (58%) were regular {{DefaultExchange}} instances, 
> bypassing the pool entirely.
> h3. 3. Per-exchange allocation is ~550 bytes across 10-12 objects
> Each exchange allocates:
> ||Object||Bytes||
> |DefaultExchange / DefaultPooledExchange|64-80|
> |ExtendedExchangeExtension|80|
> |EnumMap x2 (properties + internal)|80|
> |DefaultMessage|48|
> |CopyOnWriteHeadersMap|24|
> |CaseInsensitiveMap|48|
> |DefaultUnitOfWork|56|
> |ReentrantLock + NonfairSync|48|
> |ConcurrentLinkedDeque (routes)|24|
> |*Total*|*~552 bytes*|
> At 1M exchanges/s, this amounts to an allocation rate of ~552 MB/s for 
> exchange infrastructure alone.
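
The two figures combine as follows. A small sketch of the arithmetic, assuming the timer fires at its nominal 1 ms period and using decimal megabytes; the class and method names are illustrative only:

```java
final class AllocationRate {
    /** Messages per second for a timer firing every periodMillis, each firing split into tokens. */
    static long messagesPerSecond(long periodMillis, long tokensPerSplit) {
        return (1000 / periodMillis) * tokensPerSplit;
    }

    /** Allocation rate in decimal megabytes per second (1 MB = 1,000,000 bytes). */
    static double megabytesPerSecond(long bytesPerExchange, long exchangesPerSecond) {
        return bytesPerExchange * (double) exchangesPerSecond / 1_000_000.0;
    }

    public static void main(String[] args) {
        long rate = messagesPerSecond(1, 1000);   // timer(period=1ms) + split(1000 tokens)
        System.out.println(rate + " msg/s");
        System.out.println(megabytesPerSecond(552, rate) + " MB/s");
    }
}
```
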
> h3. 4. UnitOfWork is overweight for common single-route exchanges
> * {{ConcurrentLinkedDeque<Route> routes}}: eagerly allocated, but typically 
> holds only 1 entry. A plain field with a lazy upgrade to a deque would avoid 
> that allocation in the common case.
> * {{ReentrantLock}}: allocated per UoW even though most exchanges are 
> single-threaded. It could be created lazily, only when threading is detected.
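
The "simple field with lazy upgrade" idea for the route stack could look roughly like the sketch below. It is a hypothetical illustration, not the code in PR #23771: it assumes non-null entries and single-threaded access, and it uses {{ArrayDeque}} where the real structure is a {{ConcurrentLinkedDeque}}. The lazy lock is not covered here.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical route stack: a plain field for the first entry, a deque only when nested. */
final class LazyRouteStack<R> {
    private R single;          // covers the common one-route case without extra allocation
    private Deque<R> overflow; // created only when a second entry is pushed

    /** Pushes a non-null route; not thread-safe. */
    void push(R route) {
        if (single == null && overflow == null) {
            single = route;
            return;
        }
        if (overflow == null) {
            // Second entry: upgrade to a deque and move the first entry into it.
            overflow = new ArrayDeque<>();
            overflow.push(single);
            single = null;
        }
        overflow.push(route);
    }

    /** Removes and returns the most recent route, or null when empty. */
    R pop() {
        if (overflow != null) {
            return overflow.poll();
        }
        R r = single;
        single = null;
        return r;
    }

    /** Returns the most recent route without removing it, or null when empty. */
    R peek() {
        return overflow != null ? overflow.peek() : single;
    }

    boolean usesDeque() {
        return overflow != null;
    }
}
```

An exchange that only ever visits one route never allocates the deque; the upgrade cost is paid only by nested routes.
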
> h3. 5. ExtendedExchangeExtension always allocated
> {{ExtendedExchangeExtension}} (80 bytes) is created for every exchange, even 
> though most exchanges never use extended features. Lazy initialization would 
> save 80 bytes per exchange.
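
Lazy initialization here means leaving the field null until something actually needs the extension, and answering read-only queries from defaults. A minimal sketch with toy classes; {{ToyExchange}}, {{ToyExtension}} and the {{transacted}} flag are stand-ins chosen for illustration, and the sketch is not thread-safe:

```java
final class ToyExchange {
    /** Hypothetical stand-in for the extension object. */
    static final class ToyExtension {
        boolean transacted;
    }

    private ToyExtension extension; // stays null until first write access

    /** Creates the extension on first use. */
    ToyExtension getExtension() {
        if (extension == null) {
            extension = new ToyExtension();
        }
        return extension;
    }

    /** Read path answers from the default without forcing an allocation. */
    boolean isTransacted() {
        return extension != null && extension.transacted;
    }

    boolean hasExtension() {
        return extension != null;
    }
}
```
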
> h2. Test Environment
> * JDK: Temurin 21.0.9
> * Camel: 4.21.0-SNAPSHOT (with PR #23738 CoW headers + PR #23766 O(1) 
> CaseInsensitiveMap)
> * Route: Timer(period=1) -> Split(1000 tokens) -> Direct -> CBR -> Direct -> 
> mock
> * Duration: 60s, heap histogram captured at 35s
> h2. Benchmark Results
> ||Route||Profile||Heap Used||Metaspace||Threads||
> |Baseline|prod + pooled|75 MB|42 MB|34|
> |Baseline|dev + pooled|696 MB|43 MB|35|
> |Pipeline|dev default|3,194 MB|45 MB|35|
> |Pipeline|prod + pooled|1,270 MB|45 MB|34|
> |HTTP|prod + pooled|95 MB|56 MB|58|
> h2. Positive findings
> * PR #23738 (CopyOnWriteHeadersMap) is working correctly — header copies are 
> avoided
> * PR #23766 (O(1) CaseInsensitiveMap) is active — TreeMap entries seen in 
> histograms are from JMX infrastructure, not headers
> * HTTP component adds a fixed 24 threads + 14MB metaspace (per-component, not 
> per-exchange)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
