On Wed, Nov 26, 2025 at 06:14:43PM +0000, Jon Kohler wrote:
> 
> 
> > On Nov 6, 2025, at 10:53 AM, Daniel P. Berrangé <[email protected]> wrote:
> > 
> > On Thu, Nov 06, 2025 at 09:31:43AM -0700, Jon Kohler wrote:
> >> Increase MAX_MEM_PREALLOC_THREAD_COUNT from 16 to 32. This was last
> >> touched in 2017 [1] and, since then, physical machine sizes and VMs
> >> therein have continued to get even bigger, both on average and on the
> >> extremes.
> >> 
> >> For very large VMs, using 16 threads to preallocate memory can be a
> >> non-trivial bottleneck during VM start-up and migration. Increasing
> >> this limit to 32 threads reduces the time taken for these operations.
> >> 
> >> Test results from a quad-socket Intel 8490H (4x 60 cores) show a fairly
> >> linear gain: start-up time drops by roughly 50% with the 2x thread
> >> count increase.
> >> 
> >> ---------------------------------------------
> >> Idle Guest w/ 2M HugePages   | Start-up time
> >> ---------------------------------------------
> >> 240 vCPU, 7.5TB (16 threads) | 2m41.955s
> >> ---------------------------------------------
> >> 240 vCPU, 7.5TB (32 threads) | 1m19.404s
> >> ---------------------------------------------
> >> 
> >> Note: Going above 32 threads appears to have diminishing returns,
> >> with memory bandwidth and context switching costs apparently becoming
> >> the limiting factors to linear scaling. For posterity, on the same
> >> system as above:
> >> - 32 threads: 1m19s
> >> - 48 threads: 1m4s
> >> - 64 threads: 59s
> >> - 240 threads: 50s
> >> 
> >> Additional thread counts also get less interesting as the amount of
> >> memory to be preallocated gets smaller. Putting that all together,
> >> 32 threads appears to be a sane number with a solid speedup on fairly
> >> modern hardware. To go faster, we'd either need to improve the hardware
> >> (CPU/memory) itself or improve clear_pages_*() on the kernel side to
> >> be more efficient.
> >> 
> >> [1] 1e356fc14bea ("mem-prealloc: reduce large guest start-up and
> >>     migration time.")
> >> 
> >> Signed-off-by: Jon Kohler <[email protected]>
> >> ---
> >> util/oslib-posix.c | 2 +-
> >> 1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > Reviewed-by: Daniel P. Berrangé <[email protected]>
> 
> Thanks, Daniel!
> 
> Is there anything else we need on this one? Want to
> make sure it doesn’t get lost.

Paolo (CC'd) is the primary maintainer for this code and should queue it.

> >> diff --git a/util/oslib-posix.c b/util/oslib-posix.c
> >> index 3c14b72665..dc001da66d 100644
> >> --- a/util/oslib-posix.c
> >> +++ b/util/oslib-posix.c
> >> @@ -61,7 +61,7 @@
> >> #include "qemu/memalign.h"
> >> #include "qemu/mmap-alloc.h"
> >> 
> >> -#define MAX_MEM_PREALLOC_THREAD_COUNT 16
> >> +#define MAX_MEM_PREALLOC_THREAD_COUNT 32
> >> 
> >> struct MemsetThread;


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

