On 09/08/22 16:37, Daniel P. Berrangé wrote:
> On Mon, Sep 05, 2022 at 01:25:27PM +0200, Laszlo Ersek wrote:
>> Currently, the CPU topology for the converted domain is determined as
>> follows:
>>
>> (1) main() [main.c]
>>
>> (2) set_config_defaults() [main.c]
>>       vcpus   <-- sysconf (_SC_NPROCESSORS_ONLN)
>>       get_cpu_config() [cpuid.c]
>>         get_topology() [cpuid.c]
>>           sockets <-- lscpu (physical)
>>           cores   <-- lscpu (physical)
>>           threads <-- lscpu (physical)
>>
>> (3) update_config_from_kernel_cmdline() [kernel-config.c]
>>       vcpus   <-- kernel cmdline
>>       sockets <-- kernel cmdline
>>       cores   <-- kernel cmdline
>>       threads <-- kernel cmdline
>>
>> (4) opt#1: kernel_conversion() [kernel.c]
>>       no change to topology info
>>     opt#2: gui_conversion() [gui.c]
>>       vcpus <-- GUI
>>
>> (5) ...
>>     start_conversion() [conversion.c]
>>       generate_physical_xml() [physical-xml.c]
>>         format vcpus, sockets, cores, threads
>>
>> In (2), we initialize the topology information from the physical
>> machine. In (3), we override each property in isolation from the
>> kernel command line (this is done for both kernel conversion and GUI
>> conversion; in the former case, it is the only way for the user to
>> influence each element of the topology). In (4), in case we picked GUI
>> conversion, the VCPU number can be overridden yet another time on the
>> GUI. In (5), we format the topology information into the domain XML.
>>
>> The problem is that it's easy to create inconsistent topologies that
>> libvirt complains about. One example is that in step (4) on the GUI,
>> if the flat VCPU count is reduced by the user, it can result in
>> sockets partially populated with cores or threads, or in cores
>> partially populated with threads. Another example (even if the user
>> does not touch the VCPU count) is a partially onlined CPU topology on
>> the physical machine: _SC_NPROCESSORS_ONLN does not reflect any
>> topology information, and if the flat number it returns is less than
>> the (sockets * cores * threads) product from "lscpu", then the online
>> logical processors will be "compressed to the left" by libvirt just
>> the same as after the manual VCPU count reduction.
>>
>> An over-complicated way to fix this would be the following:
>>
>> - parse the output of "lscpu -p" (which is available since RHEL6 --
>>   "lscpu" is altogether missing in RHEL5), retrieving online-ness
>>   combined with full topology information
>>
>> - expose complete topology info on the GUI
>>
>> - format the complete topology information for libvirt.
>>
>> A simpler way (which is what this patch series chooses) is to remove
>> some flexibility from virt-p2v's configuration interface. Namely, only
>> allow the user to choose between (a) copying the physical CPU
>> topology, *fully populated*, (b) specifying the core count N for a
>> "1 socket * N cores/socket * 1 thread/core" topology.
>>
>> Option (a) eliminates the "partially onlined physical CPU topology"
>> scenario; it produces a topology that exists in the physical world,
>> and does not require p2v to maintain more expressive topology
>> information than it currently does.
>>
>> Option (b) eliminates any conflicts between the physical (or any
>> user-configured) sockets:cores:threads hierarchy and a manually set
>> logical processor count. Option (b) produces the kind of flat CPU
>> topology that QEMU has defaulted to since version 6.2, with a simple
>> "-smp" option. Option (b) preserves the simple p2v user interface for
>> entering only a logical processor count.
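>>
>> For concreteness, on a hypothetical 2-socket, 4-cores/socket,
>> 2-threads/core machine (the numbers are made up for the example),
>> option (a) would have generate_physical_xml() format something like
>>
>>   <vcpu>16</vcpu>
>>   <cpu>
>>     <topology sockets='2' cores='4' threads='2'/>
>>   </cpu>
>>
>> while option (b), with the core count set to 16, would format
>>
>>   <vcpu>16</vcpu>
>>   <cpu>
>>     <topology sockets='1' cores='16' threads='1'/>
>>   </cpu>
>>
>> and in neither case can the flat VCPU count disagree with the
>> (sockets * cores * threads) product.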
>
> I would further say that option (b) is the *only* one that makes
> sense from a performance POV, if you are *not* doing 1:1 pCPU:vCPU
> pinning.
>
> The guest kernel scheduler will take different placement decisions
> based on the topology, and if guest CPUs are floating across
> arbitrary host CPUs, those scheduler decisions are meaningless and
> potentially even harmful to performance.
>
> IOW, I'd say option (b) should be the default in the absence of any
> explicit override from the user.
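Right, and that matches QEMU's own "-smp" handling: since 6.2, an
otherwise unqualified

  -smp 8

is filled out by preferring cores over sockets, i.e. as sockets=1,
cores=8, threads=1 -- exactly the shape that option (b) produces.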
That's how this patch (#2) introduces the new knob, but then patch #5
makes option (a) the default. However, that's precisely why I made
patch #5 a separate patch ("copy fully populated vCPU topology by
default",
<https://listman.redhat.com/archives/libguestfs/2022-September/029811.html>)
-- so that it would be easy to drop, if necessary. Thus, I can
certainly drop it from v2.

Rich, do you agree?

(I had made extra sure that the GUI would work fine with either option
set as the default.)

Thanks,
Laszlo

> When libvirt integrates support for core scheduling, then it becomes
> more reasonable to consider setting different topologies even when
> not pinning.
>
> With regards,
> Daniel