On Thu, Apr 24, 2025 at 02:07:17PM +0100, Daniel P. Berrangé wrote: > > On Thu, Apr 24, 2025 at 01:22:46PM +0200, Felix Huettner via Devel wrote: > > In case of a host that has a large number of cpus offline the count of > > host cpus and the last bit set in the virHostCPUGetOnlineBitmap might > > diverge significantly. This can e.g. be the case when disabeling smt via > > /sys/devices/system/cpu/smt/control. > > > > On the host this looks like: > > ``` > > $ cat /sys/devices/system/cpu/present > > 0-255 > > $ cat /sys/devices/system/cpu/online > > 0-127 > > ``` > > > > However in this case virBitmapToData previously only allocated 16 bytes > > for the output bitmap. This is becase the last set bit is on the 15th > > byte. > > > > Users of virHostCPUGetMap however rely on the "cpumap" containing enough > > space for all existing cpus (so they would expect 32 bytes in this case). > > E.g. cmdNodeCpuMap relies on this for its output. It will then actually > > read 32 bytes from the start of the "cpumap" address where in this case > > the last 16 of these bytes are uninitialized. > > > > This manifests itself in flapping outputs of "virsh nodecpumap --pretty" > > like: > > ``` > > $ virsh nodecpumap --pretty > > CPUs present: 256 > > CPUs online: 128 > > CPU map: 0-127,192,194,202 > > > > $ virsh nodecpumap --pretty > > CPUs present: 256 > > CPUs online: 128 > > CPU map: 0-127,192,194,197 > > > > $ virsh nodecpumap --pretty > > CPUs present: 256 > > CPUs online: 128 > > CPU map: 0-127,192,194,196-197 > > ``` > > > > This in turn potentially causes users of this data to report wrong cpu > > counts. > > > > Note that this only seems to happen with at least 256 phyiscal cpus > > where at least 128 are offline. > > > > We fix this by preallocating the expected bitmap size. > > > > Signed-off-by: Felix Huettner <felix.huettner@stackit.cloud> > > --- > > src/util/virhostcpu.c | 14 +++++++++----- > > 1 file changed, 9 insertions(+), 5 deletions(-) > > > > diff --git a/src/util/virhostcpu.c b/src/util/virhostcpu.c > > index 5dbcc8987c..626faa88cf 100644 > > --- a/src/util/virhostcpu.c > > +++ b/src/util/virhostcpu.c > > @@ -1091,22 +1091,26 @@ virHostCPUGetMap(unsigned char **cpumap, > > { > > g_autoptr(virBitmap) cpus = NULL; > > int ret = -1; > > - int dummy; > > > > virCheckFlags(0, -1); > > > > + ret = virHostCPUGetCount(); > > This sets 'ret' to a positive value.... > > > + > > if (!cpumap && !online) > > - return virHostCPUGetCount(); > > + return ret; > > > > if (!(cpus = virHostCPUGetOnlineBitmap())) > > goto cleanup; > > ...in the failure scenario we now jump to 'cleanup' > with 'ret' still positive. > > Better to use a different pattern with separate variables > > int ret = -1; > int ncpus = virHostCPUGetCount(); > > ...do stuff... > > ret = ncpus; > cleanup: > if (ret < 0) > ... > return ret;
Hi Daniel, thanks for the review. I will post a fixed v2. I decided to get rid of the "goto cleanup" completely and just return -1 instead. "cleanup" seems to only have existed to deallocate cpumap, but by the time we jump there cpumap can not possibly be allocated. So i hope this makes it more easy to read. Thanks a lot, Felix > > > > > - if (cpumap) > > - virBitmapToData(cpus, cpumap, &dummy); > > + if (cpumap) { > > + int len = (ret + CHAR_BIT) / CHAR_BIT; > > + *cpumap = g_new0(unsigned char, len); > > + virBitmapToDataBuf(cpus, *cpumap, len); > > + } > > + > > if (online) > > *online = virBitmapCountBits(cpus); > > > > - ret = virHostCPUGetCount(); > > > > cleanup: > > if (ret < 0 && cpumap) > > > > base-commit: c5a73f75bc6cae4f466d0a6344d5b3277ac9c2f4 > > -- > > 2.43.0 > > > > With regards, > Daniel > -- > |: https://berrange.com/ > |: https://libvirt.org/ > |: https://entangle-photo.org/ >