arturobernalg commented on PR #578:
URL: https://github.com/apache/httpcomponents-core/pull/578#issuecomment-4099208257

Hi Oleg,

I've been benchmarking different ByteBufferAllocator strategies for the
classic-over-async facade with 1MB responses at concurrency 50.

Key finding: A simple ThreadLocalByteBufferAllocator that caches the
single largest released buffer per thread outperforms the baseline in every
test, and beats the bucketed PooledByteBufferAllocator both at 1KB buffers
and end to end.

JMH microbenchmark results (HEAP, ops/ms, higher is better):

┌─────────────┬───────────────────┬────────┬─────────────┐
│ Buffer Size │ Simple (baseline) │ Pooled │ ThreadLocal │
├─────────────┼───────────────────┼────────┼─────────────┤
│ 1KB         │ 174               │ 260    │ 367         │
├─────────────┼───────────────────┼────────┼─────────────┤
│ 8KB         │ 22                │ 88     │ 90          │
├─────────────┼───────────────────┼────────┼─────────────┤
│ 64KB        │ 2.6               │ 11.5   │ 11.6        │
└─────────────┴───────────────────┴────────┴─────────────┘

Both Pooled and ThreadLocal show essentially no GC activity on the hot path
(gc.count ≈ 0).

End-to-end benchmark (50K GET requests, 1MB response, c=50, 4 rounds with
alternating order):

┌─────────────┬─────────────┬─────────────┐
│ Agent       │ Avg req/sec │ vs Baseline │
├─────────────┼─────────────┼─────────────┤
│ Baseline    │ 1422        │ -           │
├─────────────┼─────────────┼─────────────┤
│ Pooled      │ 1442        │ +1.4%       │
├─────────────┼─────────────┼─────────────┤
│ ThreadLocal │ 1567        │ +10.2%      │
└─────────────┴─────────────┴─────────────┘

Why ThreadLocal wins: The IO reactor uses a small fixed thread pool. After
the first request each thread's buffer stabilizes at the final size. Subsequent
requests get a direct cache hit: no expandCapacity() chains, no intermediate
copies, no garbage. The PooledByteBufferAllocator achieves the same reuse but
pays for bucket lookups and CAS operations that the ThreadLocal approach
avoids entirely.
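
To give a feel for what an expansion chain costs, here is a rough sketch that counts the allocations and copied bytes needed to grow a buffer up to a target size. The doubling policy and the 4KB starting capacity are assumptions made for illustration only; the actual growth policy of expandCapacity() is not stated here, and `GrowthChainSketch` is a hypothetical name.

```java
/**
 * Illustrative only: models a buffer that doubles its capacity each time it
 * fills up. The doubling factor is an assumption, not the documented
 * behaviour of expandCapacity().
 */
final class GrowthChainSketch {

    private GrowthChainSketch() {
    }

    /**
     * Returns {allocations, bytesCopied} for growing from `initial` capacity
     * until the capacity is at least `target`.
     */
    static long[] simulate(final int initial, final int target) {
        if (initial <= 0) {
            throw new IllegalArgumentException("initial capacity must be positive");
        }
        long capacity = initial;
        long allocations = 1;   // the initial buffer itself
        long bytesCopied = 0;
        while (capacity < target) {
            // The full old buffer is copied into the new, larger one.
            bytesCopied += capacity;
            capacity *= 2;
            allocations++;
        }
        return new long[] {allocations, bytesCopied};
    }
}
```

Under these assumptions, `simulate(4096, 1 << 20)` reports 9 allocations and 1044480 copied bytes for a single 1MB response, while a buffer that already has 1MB capacity (`simulate(1 << 20, 1 << 20)`) needs no growth step and copies nothing.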
                                                                                
                                                                                
                                                       
The implementation is ~30 lines: one ThreadLocal<ByteBuffer>, a
keep-largest-on-release policy, no locks, no contention.
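
For readers who want to picture the idea before the PR is up, a minimal sketch of a keep-largest-on-release cache follows. It is not the code from the PR: the class name `ThreadLocalBufferCacheSketch` and the static `allocate`/`release` methods are hypothetical, and it handles heap buffers only.

```java
import java.nio.ByteBuffer;

/**
 * Hypothetical sketch of a per-thread, single-slot buffer cache.
 * Names and method shapes are assumptions, not the actual PR code.
 */
final class ThreadLocalBufferCacheSketch {

    // One cached heap buffer per thread; empty until the first release.
    private static final ThreadLocal<ByteBuffer> CACHE = new ThreadLocal<>();

    private ThreadLocalBufferCacheSketch() {
    }

    /** Returns a buffer with capacity of at least `size` bytes. */
    static ByteBuffer allocate(final int size) {
        final ByteBuffer cached = CACHE.get();
        if (cached != null && cached.capacity() >= size) {
            // Cache hit: empty the slot so the buffer cannot be handed out twice.
            CACHE.remove();
            cached.clear();
            return cached;
        }
        // Cache miss: fall back to a plain heap allocation.
        return ByteBuffer.allocate(size);
    }

    /** Offers a buffer back to the current thread's cache. */
    static void release(final ByteBuffer buffer) {
        if (buffer == null || buffer.isDirect()) {
            return; // this sketch caches heap buffers only
        }
        final ByteBuffer cached = CACHE.get();
        // Keep-largest policy: replace the slot only with a bigger buffer.
        if (cached == null || buffer.capacity() > cached.capacity()) {
            CACHE.set(buffer);
        }
    }
}
```

Since the slot is only ever touched by its owning thread, no locking or CAS is needed. One consequence of this design is that a cache hit may return a buffer larger than requested, so callers must not assume that capacity() equals the requested size.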
                                                                    
      
I'm going to prepare a PR with the ThreadLocalByteBufferAllocator and
the full benchmark results.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


