Hi Michael! About the heuristic:
> As such by flushing early to try reduce the latency of one write may
> impact the latency of the next

You're right, I need to explain more about it. The rationale of the heuristic is pretty subtle: it "embraces" the estimation error (i.e. the budget-time expiration in the original TimedBuffer), making use of it on the assumption that the next sync-write load (in terms of concurrent requests/size) won't be very different from the previous one. In the worst case it behaves no differently from the original version, because the budget-time expiration is always taken into account (i.e. the IOPS bound): it uses the queuing effect (due to I/O backpressure) of the synced write requests as a means to learn the best load that satisfies the given budget time.

The latency is reduced not from the point of view of the write time, which, as you rightly pointed out, is a physical limit: it aims to reduce the pause between the end of the flush and the computed expiration of the budget time, leaving more room for the broker to alert the writers (producers/consumers) to issue the next writes. If the configured budget time is already optimal, that pause won't exist and the heuristic can't give any benefit (but it won't hurt either); in the more common case, where a single value can't be 100% optimal for all use cases, it can improve not only latency but overall throughput too, thanks to the "anticipated" next writes.

> Predictable latency I think is more important here

The max latency will be the same (actually a little better, due to the use of LockSupport.parkNanos, measured using HdrHistogram), because at worst it behaves exactly like the original TimedBuffer.
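To make the idea concrete, here is a minimal, hypothetical sketch of the heuristic, not the actual Artemis TimedBuffer code: the class and method names (AdaptiveTimedBuffer, awaitFlush, onFlushed) are invented for illustration. It remembers the load that the previous budget window sustained and flushes early once the current batch reaches it, while still falling back to the plain budget-time expiration, so at worst it behaves like the original:

```java
import java.util.concurrent.locks.LockSupport;

// Hypothetical sketch (names invented): learn the batch size that the last
// budget window sustained, and flush as soon as the current batch reaches it
// instead of always waiting for the budget time to expire.
final class AdaptiveTimedBuffer {
    private final long budgetNanos;                  // configured max wait (IOPS bound)
    private long learnedBatchBytes = Long.MAX_VALUE; // start pessimistic: time-only behaviour
    private long pendingBytes = 0;

    AdaptiveTimedBuffer(long budgetNanos) {
        this.budgetNanos = budgetNanos;
    }

    void addBytes(long bytes) {
        pendingBytes += bytes;
    }

    /** Returns true when a flush should happen, waiting at most budgetNanos. */
    boolean awaitFlush() {
        final long deadline = System.nanoTime() + budgetNanos;
        while (System.nanoTime() < deadline) {
            if (pendingBytes >= learnedBatchBytes) {
                return true;               // early flush: load matched the learned batch
            }
            LockSupport.parkNanos(1_000L); // fine-grained wait instead of a coarse sleep
        }
        return pendingBytes > 0;           // budget expired: same as the original TimedBuffer
    }

    /** After each flush, remember the load that the budget window accumulated. */
    void onFlushed() {
        learnedBatchBytes = Math.max(1, pendingBytes);
        pendingBytes = 0;
    }
}
```

The early-flush branch is what shrinks the pause between the end of a write and the budget expiration; the fallback branch is why the worst-case latency cannot get worse than the original.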
In other branches I've tried something like https://mechanical-sympathy.blogspot.it/2011/10/smart-batching.html to improve latency and throughput, but it didn't work very well because of the way the synced writers in a broker block on each sync; in the end I came up with this solution, which works even better when using proper queues/ring buffers :)

> Also be good to have some stats where feature 1 is implemented in isolation,
> then again stats with feature 2. Obviously having the standard histogram of
> latency and throughput.

I can make the heuristic configurable in order to measure its impact in isolation, and I've built a tool to measure latencies: https://github.com/franz1981/artemis-load-generator. It can measure real-time producer/consumer latencies (not only histograms) and performs responsiveness-under-load benchmarks (i.e. with a target throughput).

I can provide a separate benchmark for the journal alone too, similar to this one: https://github.com/franz1981/activemq-artemis/blob/batch_buffer/artemis-journal/src/test/java/org/apache/activemq/artemis/core/io/JournalTptBenchmark.java
But to make it more realistic it would need to simulate the pauses between writes, mimicking the WAN latencies of a real broker: that seems like an entire PR by itself!

--
View this message in context: http://activemq.2283324.n4.nabble.com/Adapting-TimedBuffer-and-NIO-Buffer-Pooling-tp4725727p4725735.html
Sent from the ActiveMQ - Dev mailing list archive at Nabble.com.
