Hi guys! I've pushed a PR (https://github.com/apache/activemq-artemis/pull/1256) to improve the durable write performance of the main file-based journals: ASYNCIO and NIO.
The changes are already summarized in the PR itself, but in this post I want to give more context about them and about the performance improvements that can be expected.

The main change, which affects both journals (NIO and ASYNCIO), is the new TimedBuffer. The original one was designed to batch different durable write requests using a fixed time budget computed during the tuning of the broker. Its algorithm was pretty effective, saturating even very fast disks (with ASYNCIO in particular)... so what could be changed in it? The only room for improvement I've found was in 2 things:

1) going off-heap with the batch buffer and using only memcpy/bulk copy operations while filling it
2) minimizing the pauses between batch writes while maintaining the same batching effectiveness (maximizing batches)

To achieve the latter I've built a heuristic that estimates an optimal write batch size, to be used in addition to the given time budget. When that batch size is reached, the flush can be issued well before the time budget expires, leading to lower latency between the writes and better overall throughput.

This has improved the write performance by >=30% for ASYNCIO, while for NIO the improvement is >=100% (> 2x). The more writers there are, the more visible the improvement over the original version.

For NIO the improvement is far more pronounced due to other optimisations, such as:

- direct byte buffer pooling using TLAB allocations (i.e. no more minor GCs caused by journal writes)
- using only direct byte buffers to write against FileChannel, avoiding the additional copy performed internally by NIO

The reduction in garbage produced and in buffer copies is the other reason for that big improvement.

I've measured the improvements with 2, 8, 16, 32, 64 and 128 writers and different message sizes (100, 1000, 4000, 10000 bytes) on a Xeon with 16 physical processors (32 threads), a fast SSD with write cache disabled and a modern Linux kernel (>= 4.0.x).
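
To make the "batch size in addition to the time budget" idea described above more concrete, here is a minimal single-threaded sketch. It is not the code from the PR: the class name, the constructor parameters and the sink callback are all hypothetical, and the batch-size threshold is simply passed in rather than estimated by the heuristic. It only shows the two flush triggers (threshold reached, or time budget expired) on top of an off-heap buffer filled with bulk copies.

```java
import java.nio.ByteBuffer;
import java.util.function.Consumer;

// Hypothetical sketch, NOT the Artemis TimedBuffer implementation.
// Single-threaded on purpose: a real buffer shared by many writers
// would need synchronization around write/flush.
final class BatchingBufferSketch {
    private final ByteBuffer batch;            // off-heap batch buffer
    private final int flushThresholdBytes;     // assumed to come from some estimator
    private final long timeBudgetNanos;        // fixed time budget for a batch
    private final Consumer<ByteBuffer> sink;   // receives each batch, ready to be read
    private long firstWriteNanos;              // arrival time of the first record in the batch

    BatchingBufferSketch(int capacityBytes, int flushThresholdBytes,
                         long timeBudgetNanos, Consumer<ByteBuffer> sink) {
        this.batch = ByteBuffer.allocateDirect(capacityBytes);
        this.flushThresholdBytes = flushThresholdBytes;
        this.timeBudgetNanos = timeBudgetNanos;
        this.sink = sink;
    }

    // Copies one record into the batch and flushes early once the
    // batch-size threshold is reached, without waiting for the timer.
    void write(ByteBuffer record, long nowNanos) {
        if (record.remaining() > batch.capacity()) {
            throw new IllegalArgumentException("record larger than the batch buffer");
        }
        if (record.remaining() > batch.remaining()) {
            flush(); // make room for the incoming record
        }
        if (batch.position() == 0) {
            firstWriteNanos = nowNanos; // the time budget starts with the first record
        }
        batch.put(record); // bulk copy into the off-heap buffer
        if (batch.position() >= flushThresholdBytes) {
            flush();
        }
    }

    // Meant to be called periodically (e.g. by a timer thread): flushes a
    // partially filled batch once its time budget has expired.
    void checkTimeout(long nowNanos) {
        if (batch.position() > 0 && nowNanos - firstWriteNanos >= timeBudgetNanos) {
            flush();
        }
    }

    // Hands the accumulated bytes to the sink and resets the buffer.
    void flush() {
        if (batch.position() == 0) {
            return;
        }
        batch.flip();
        sink.accept(batch);
        batch.clear();
    }
}
```

In a journal-like setup the sink would presumably write the direct buffer to a FileChannel and then force it to disk; passing a direct buffer is what lets NIO skip its internal copy. Timestamps are passed in explicitly only to keep the sketch easy to reason about and to test.
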
The broker is configured with the default values. The kernel version is pretty important (it needs to be >= 3.x), for NIO in particular, because of the unification of the Linux page cache, which makes the fsync/write operations much cheaper.

The write improvements on NIO will also help the write performance of paging, but I need to quantify that with further tests.

If any of you is curious enough to check out the PR branch and try the improvements (if any!) for yourself, I'll be glad to hear other numbers too!

Thanks,
Francesco

--
View this message in context: http://activemq.2283324.n4.nabble.com/Adapting-TimedBuffer-and-NIO-Buffer-Pooling-tp4725727.html
Sent from the ActiveMQ - Dev mailing list archive at Nabble.com.
