arturobernalg commented on PR #578:
URL:
https://github.com/apache/httpcomponents-core/pull/578#issuecomment-4106364938
Hi @ok2c ,
I've spent considerable time benchmarking the classic-over-async facade
trying to reach the 5% improvement threshold. Here's a summary of what I found.
Benchmark setup: 50,000 requests, concurrency 200, 1MB response bodies,
HTTP/1.1, 6 rounds per configuration, A/B comparison in the same session where
possible.
What I tried:
1. `PooledByteBufferAllocator` wired through `ClassicToAsyncAdaptor` →
`SharedInputBuffer`: Result within noise (±2%). During buffer expansion
(2KB→4KB→...→2MB), only the intermediate buffers are recycled. The final
~2MB buffer cannot be safely returned to the pool because
`releaseResources()` races with async framework callbacks (risking
use-after-free). So every request still allocates a fresh final buffer.
2. Content-Length pre-sizing (read the Content-Length header and pre-allocate
`SharedInputBuffer` to the right size): Actually 16% slower. With 200
concurrent connections, pre-allocating 200×1MB buffers (about 200MB) up front
overwhelms the system and disrupts the flow control dynamics (the capacity
channel reports 1MB available instead of 2KB).
3. `SharedInputBuffer` micro-optimizations (`signalAll()` → `signal()`,
`AtomicInteger` → plain `int` for the capacity increment, since it is always
accessed under the lock): Semantically correct improvements, but not
measurable (within noise).
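For reference, the shape of the change in item 3 can be sketched as below. This is a minimal illustration, not the actual `SharedInputBuffer` code: the class `CapacitySketch` and its methods are hypothetical names, and it assumes at most one thread ever waits on the condition, which is what makes `signal()` sufficient in place of `signalAll()`.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch; names are illustrative and not taken from httpcomponents-core.
final class CapacitySketch {

    private final ReentrantLock lock = new ReentrantLock();
    private final Condition changed = lock.newCondition();

    // Plain int instead of AtomicInteger: every read and write below
    // happens while holding the lock, so no extra atomicity is needed.
    private int capacityIncrement;

    // Called by the thread that frees up space.
    void addCapacity(final int n) {
        lock.lock();
        try {
            capacityIncrement += n;
            // signal() wakes one waiter; enough under the single-waiter assumption.
            changed.signal();
        } finally {
            lock.unlock();
        }
    }

    // Blocks until some capacity has accumulated, then returns it and resets the counter.
    int awaitAndDrain() throws InterruptedException {
        lock.lock();
        try {
            while (capacityIncrement == 0) {
                changed.await();
            }
            final int n = capacityIncrement;
            capacityIncrement = 0;
            return n;
        } finally {
            lock.unlock();
        }
    }

    public static void main(final String[] args) throws InterruptedException {
        final CapacitySketch sketch = new CapacitySketch();
        sketch.addCapacity(3);
        sketch.addCapacity(4);
        System.out.println(sketch.awaitAndDrain());
    }
}
```

If more than one thread could wait on the same condition, `signalAll()` (or separate conditions) would be the safer choice.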
Conclusion: At concurrency 200 with 1MB responses, the system is
transferring ~2.5GB/sec of content. The bottleneck is memory bandwidth and
network I/O, not buffer allocation or lock contention in
`SharedInputBuffer`. The buffer management overhead is a rounding error
compared to the cost of moving data through the network stack.
The `PooledByteBufferAllocator` itself performs well in isolation (JMH:
471 ops/ms vs 194 for `SimpleByteBufferAllocator` at 1KB HEAP), but the
classic-over-async facade's one-way buffer growth pattern prevents the pool
from being effective. The allocator would likely show better results in code
paths where buffers are allocated and released at the same size repeatedly,
such as async framework internals or H2 frame handling.
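To make the one-way growth pattern concrete, here is a minimal sketch that assumes strict doubling from 2KB to 2MB, as in the expansion sequence in item 1. `GrowthSketch` and `doublingSizes` are hypothetical names used only for this illustration. It counts the allocations per request and how many bytes fall into the recyclable intermediate buffers versus the final buffer that cannot be returned to the pool.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; assumes strict doubling, which may not match the real expansion policy exactly.
final class GrowthSketch {

    // Returns every buffer size allocated while doubling from initial until target is reached.
    static List<Integer> doublingSizes(final int initial, final int target) {
        final List<Integer> sizes = new ArrayList<>();
        int size = initial;
        sizes.add(size);
        while (size < target) {
            size *= 2;
            sizes.add(size);
        }
        return sizes;
    }

    // Sums all sizes except the last one (the intermediate buffers).
    static long intermediateBytes(final List<Integer> sizes) {
        long total = 0;
        for (int i = 0; i < sizes.size() - 1; i++) {
            total += sizes.get(i);
        }
        return total;
    }

    public static void main(final String[] args) {
        final List<Integer> sizes = doublingSizes(2 * 1024, 2 * 1024 * 1024);
        System.out.println("allocations per request: " + sizes.size());              // 11
        System.out.println("recyclable intermediate bytes: " + intermediateBytes(sizes)); // 2095104
        System.out.println("final buffer bytes: " + sizes.get(sizes.size() - 1));    // 2097152
    }
}
```

Under this assumption the ten intermediate buffers together are slightly smaller than the single final buffer, so pooling can recycle at most about half of the bytes allocated per request, and never the largest allocation.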
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]