On Wed, Sep 08, 2021 at 10:30:09PM -0400, Daniele Buono wrote: > Stefan, the patch looks great. > Thank you for debugging the performance issue that was happening with > SafeStack. > > On 9/2/2021 4:10 AM, Stefan Hajnoczi wrote: > > On Wed, Sep 01, 2021 at 05:09:23PM +0100, Stefan Hajnoczi wrote: > > > It was reported that enabling SafeStack reduces IOPS significantly > > > (>25%) with the following fio benchmark on virtio-blk using a NVMe host > > > block device: > > > > > > # fio --rw=randrw --bs=4k --iodepth=64 --runtime=1m --direct=1 \ > > > --filename=/dev/vdb --name=job1 --ioengine=libaio --thread \ > > > --group_reporting --numjobs=16 --time_based \ > > > --output=/tmp/fio_result > > > > > > Serge Guelton and I found that SafeStack is not really at fault, it just > > > increases the cost of coroutine creation. This fio workload exhausts the > > > coroutine pool and coroutine creation becomes a bottleneck. Previous > > > work by Honghao Wang also pointed to excessive coroutine creation. > > > > > > Creating new coroutines is expensive due to allocating new stacks with > > > mmap(2) and mprotect(2). Currently there are thread-local and global > > > pools that recycle old Coroutine objects and their stacks but the > > > hardcoded size limit of 64 for thread-local pools and 128 for the global > > > pool is insufficient for the fio benchmark shown above. > > > > > > This patch changes the coroutine pool algorithm to a simple thread-local > > > pool without a size limit. Threads periodically shrink the pool down to > > > a size sufficient for the maximum observed number of coroutines. > > > > > > This is a very simple algorithm. Fancier things could be done like > > > keeping a minimum number of coroutines around to avoid latency when a > > > new coroutine is created after a long period of inactivity. Another > > > thought is to stop the timer when the pool size is zero for power saving > > > on threads that aren't using coroutines. However, I'd rather not add > > > bells and whistles unless they are really necessary. > > I agree we should aim at something that is as simple as possible. > > However keeping a minumum number of coroutines could be useful to avoid > performance regressions from the previous version. > > I wouldn't say restore the global pool - that's way too much trouble - > but keeping the old "pool size" configuration parameter as the miniumum > pool size could be a good idea to maintain the previous performance in > cases where the dynamic pool shrinks too much.
Good point. We're taking a risk by freeing all coroutines when the timer expires. It's safer to stick to the old pool size limit to avoid regressions. I'll send another revision. Stefan
signature.asc
Description: PGP signature