merlimat opened a new issue, #3559:
URL: https://github.com/apache/bookkeeper/issues/3559
The BK data path is very efficient when processing large entries and it's
generally able to saturate the disk and network IO in these cases.
By contrast, when handling a large number of very small entries there are
several inefficiencies that cause the CPU to become the bottleneck, because of
the per-entry overhead.
There is some low-hanging fruit to tackle to improve performance:
#### Reduce contention in message passing
Reduce contention in journal & force-write queues:
- [ ] #3544
- [x] #3545
Improve the OrderedExecutor performance:
- [ ] #3546
#### Reduce the number of buffers allocated per entry written/read
For each entry being written in a ledger we are using 4 `ByteBuf` instances:
1. The entry payload (this gets passed in to BK client)
2. The checksum
3. The serialized `AddRequest`
4. The 4 byte size header
These buffers are passed to Netty, which will do a gathering `writev`, though
it still has to pass all of the buffers to it.
Allocating and managing all these buffers is expensive. There is overhead in:
* Refcounting
* Recycler to get the `ByteBuf` instances and put them back in the pool
* ByteBuf pool arena to handle allocations/deallocation
* Inter-thread synchronization: these buffers are normally allocated in one
thread and deallocated from a different thread
To make matters worse, while the checksum is computed only once, the
`AddRequest` is serialized each time we write it on a connection.
E.g., if we have write-quorum=3, it would mean we are using (2 * 3) + 1 = 7
`ByteBuf` instances per entry.
Finally, while for big entries it is very important to avoid copying the
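
The arithmetic above can be written out as a small sketch. It assumes that the two per-connection buffers are the serialized `AddRequest` and the size header, and that the one shared buffer is the checksum; the class and constant names are hypothetical and not part of BookKeeper.

```java
public class BufferCountSketch {
    // Assumption: two buffers (serialized request + size header) are created per connection.
    static final int PER_CONNECTION_BUFFERS = 2;
    // Assumption: one buffer (the checksum) is created once and shared by all connections.
    static final int SHARED_BUFFERS = 1;

    // Number of buffers used for one entry at the given write quorum.
    static int buffersPerEntry(int writeQuorum) {
        return PER_CONNECTION_BUFFERS * writeQuorum + SHARED_BUFFERS;
    }

    public static void main(String[] args) {
        for (int wq = 1; wq <= 3; wq++) {
            System.out.println("write-quorum=" + wq + " -> " + buffersPerEntry(wq) + " buffers");
        }
    }
}
```

Under these assumptions the count grows linearly with the write quorum, which is why serializing only once matters most for replicated writes.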
payload, for small entries the overhead of maintaining the `ByteBufList` is
greater than just copying the payload into a single buffer.
To address this, we should:
1. If the entry is big -> keep using `ByteBufList`, with 1 buffer for
all the headers and the 2nd buffer referencing the payload, with no copy.
2. If the entry is small -> allocate a single buffer to contain all the headers
and the payload, and copy into it.
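
The two cases could look roughly like the sketch below. It uses plain `java.nio.ByteBuffer` rather than Netty's `ByteBuf` so that it is self-contained, and the threshold value, class name, and method name are all assumptions made for illustration; the issue does not specify a cut-off.

```java
import java.nio.ByteBuffer;

public class EntryFramingSketch {
    // Hypothetical cut-off between "small" and "big" entries.
    static final int SMALL_ENTRY_THRESHOLD = 1024;

    // Returns one buffer for small entries, or {headers, payload} for big ones.
    static ByteBuffer[] frame(ByteBuffer headers, ByteBuffer payload) {
        if (payload.remaining() <= SMALL_ENTRY_THRESHOLD) {
            // Small entry: copy headers and payload into one contiguous buffer.
            ByteBuffer single = ByteBuffer.allocate(headers.remaining() + payload.remaining());
            single.put(headers.duplicate()).put(payload.duplicate());
            single.flip();
            return new ByteBuffer[] { single };
        }
        // Big entry: keep two buffers; the payload is referenced, not copied.
        return new ByteBuffer[] { headers.duplicate(), payload.duplicate() };
    }
}
```

The trade-off is a small memory copy in exchange for one allocation and one buffer to track instead of several, which is likely to pay off only below some payload size that would need to be measured.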
Pending changes:
- [ ] Add the 4 bytes frame size header when serializing the request,
instead of relying on a separate Netty filter
- [ ] Consolidate buffer for small entries on read-response
- [ ] Serialize only once and consolidate small entries for add requests
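
As a rough illustration of the first and last items, the sketch below writes the 4-byte frame size in the same pass as the request body and then hands each connection a view over the same bytes. The field layout (`ledgerId`, `entryId`, payload) is a made-up stand-in and not the actual BookKeeper wire format, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;

public class SerializeOnceSketch {
    static final int FRAME_SIZE_BYTES = 4;

    // Serializes the request once, including the 4-byte frame size prefix.
    static ByteBuffer serialize(long ledgerId, long entryId, byte[] payload) {
        int bodySize = Long.BYTES * 2 + payload.length;
        ByteBuffer frame = ByteBuffer.allocate(FRAME_SIZE_BYTES + bodySize);
        frame.putInt(bodySize);   // frame size header; does not count itself
        frame.putLong(ledgerId);  // illustrative header field
        frame.putLong(entryId);   // illustrative header field
        frame.put(payload);       // small payload copied into the same buffer
        frame.flip();
        return frame.asReadOnlyBuffer();
    }

    // One view per connection; all views share the same underlying bytes,
    // but each has its own position and limit.
    static ByteBuffer[] viewsForQuorum(ByteBuffer frame, int writeQuorum) {
        ByteBuffer[] views = new ByteBuffer[writeQuorum];
        for (int i = 0; i < writeQuorum; i++) {
            views[i] = frame.duplicate();
        }
        return views;
    }
}
```

With Netty the equivalent sharing would presumably rely on reference-counted duplicates rather than `ByteBuffer.duplicate()`, but the idea is the same: one serialization pass and one allocation per entry, regardless of the write quorum.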