On Thu, Apr 24, 2025 at 01:22:46PM +0200, Felix Huettner via Devel wrote: > In case of a host that has a large number of cpus offline the count of > host cpus and the last bit set in the virHostCPUGetOnlineBitmap might > diverge significantly. This can e.g. be the case when disabeling smt via > /sys/devices/system/cpu/smt/control. > > On the host this looks like: > ``` > $ cat /sys/devices/system/cpu/present > 0-255 > $ cat /sys/devices/system/cpu/online > 0-127 > ``` > > However in this case virBitmapToData previously only allocated 16 bytes > for the output bitmap. This is becase the last set bit is on the 15th > byte. > > Users of virHostCPUGetMap however rely on the "cpumap" containing enough > space for all existing cpus (so they would expect 32 bytes in this case). > E.g. cmdNodeCpuMap relies on this for its output. It will then actually > read 32 bytes from the start of the "cpumap" address where in this case > the last 16 of these bytes are uninitialized. > > This manifests itself in flapping outputs of "virsh nodecpumap --pretty" like: > ``` > $ virsh nodecpumap --pretty > CPUs present: 256 > CPUs online: 128 > CPU map: 0-127,192,194,202 > > $ virsh nodecpumap --pretty > CPUs present: 256 > CPUs online: 128 > CPU map: 0-127,192,194,197 > > $ virsh nodecpumap --pretty > CPUs present: 256 > CPUs online: 128 > CPU map: 0-127,192,194,196-197 > ``` > > This in turn potentially causes users of this data to report wrong cpu > counts. > > Note that this only seems to happen with at least 256 phyiscal cpus > where at least 128 are offline. > > We fix this by preallocating the expected bitmap size. > > Signed-off-by: Felix Huettner <felix.huettner@stackit.cloud> > --- > src/util/virhostcpu.c | 14 +++++++++----- > 1 file changed, 9 insertions(+), 5 deletions(-) > > diff --git a/src/util/virhostcpu.c b/src/util/virhostcpu.c > index 5dbcc8987c..626faa88cf 100644 > --- a/src/util/virhostcpu.c > +++ b/src/util/virhostcpu.c > @@ -1091,22 +1091,26 @@ virHostCPUGetMap(unsigned char **cpumap, > { > g_autoptr(virBitmap) cpus = NULL; > int ret = -1; > - int dummy; > > virCheckFlags(0, -1); > > + ret = virHostCPUGetCount();
This sets 'ret' to a positive value.... > + > if (!cpumap && !online) > - return virHostCPUGetCount(); > + return ret; > > if (!(cpus = virHostCPUGetOnlineBitmap())) > goto cleanup; ...in the failure scenario we now jump to 'cleanup' with 'ret' still positive. Better to use a different pattern with separate variables int ret = -1; int ncpus = virHostCPUGetCount(); ...do stuff... ret = ncpus; cleanup: if (ret < 0) ... return ret; > > - if (cpumap) > - virBitmapToData(cpus, cpumap, &dummy); > + if (cpumap) { > + int len = (ret + CHAR_BIT) / CHAR_BIT; > + *cpumap = g_new0(unsigned char, len); > + virBitmapToDataBuf(cpus, *cpumap, len); > + } > + > if (online) > *online = virBitmapCountBits(cpus); > > - ret = virHostCPUGetCount(); > > cleanup: > if (ret < 0 && cpumap) > > base-commit: c5a73f75bc6cae4f466d0a6344d5b3277ac9c2f4 > -- > 2.43.0 > With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|