ive been running multi-cpu/core openbsd VMs on esxi on top of amd
epyc cpus, and have noticed for a long time now that for some reason
openbsd decides that all the virtual cpus are threads on the one core.
this is annoying, cos setting hw.smt=1 feels dirty and wrong.

i spent the weekend figuring this out, and came up with the following.

our current code assumes that some cpuid fields on recent CPUs contain
information about package and core topologies which we base the package
and core ids on. turns out we're pretty much alone in this assumption. if
you're running on real hardware, they do tend to contain useful info,
but virtual machines don't fill them in properly.

every other operating system seems to rely on the one mechanism across
all families. specifically, the initial local apic id provides a
globally unique identifier that has bits representing at least the
package (socket) the logical thread is on, plus the core, and if
supported, the smt id. multi socket cpus provide information about
how many bits there are before you get to the ones identifying the
package, so we use that everywhere.

only the most recent generation of cpu (zen) supports SMT, but the cpuid
that provides that information is available on at least the previous
gen. fortunately they made it so reading the number of threads per
core falls back gracefully. this means we can default to there being
one thread per core, and increase that if the cpuid is available and
appropriately set.

this seems to fix my vmware problem, but also still seems to work on my
physical epyc boxes. unfortunately i do not have any other amd systems
up and running at the moment, so i would appreciate testing on any and
every amd based amd64 system.

tests please. ok?

Index: identcpu.c
===================================================================
RCS file: /cvs/src/sys/arch/amd64/amd64/identcpu.c,v
retrieving revision 1.113
diff -u -p -r1.113 identcpu.c
--- identcpu.c  14 Jun 2019 18:13:55 -0000      1.113
+++ identcpu.c  9 Mar 2020 06:44:41 -0000
@@ -824,37 +824,31 @@ cpu_topology(struct cpu_info *ci)
        apicid = (ebx >> 24) & 0xff;
 
        if (strcmp(cpu_vendor, "AuthenticAMD") == 0) {
+               uint32_t nthreads = 1; /* per core */
+               uint32_t thread_id; /* within a package */
+
                /* We need at least apicid at CPUID 0x80000008 */
                if (ci->ci_pnfeatset < 0x80000008)
                        goto no_topology;
 
-               if (ci->ci_pnfeatset >= 0x8000001e) {
-                       struct cpu_info *ci_other;
-                       CPU_INFO_ITERATOR cii;
+               CPUID(0x80000008, eax, ebx, ecx, edx);
+               core_bits = (ecx >> 12) & 0xf;
 
+               if (ci->ci_pnfeatset >= 0x8000001e) {
                        CPUID(0x8000001e, eax, ebx, ecx, edx);
-                       ci->ci_core_id = ebx & 0xff;
-                       ci->ci_pkg_id = ecx & 0xff;
-                       ci->ci_smt_id = 0;
-                       CPU_INFO_FOREACH(cii, ci_other) {
-                               if (ci != ci_other &&
-                                   ci_other->ci_core_id == ci->ci_core_id &&
-                                   ci_other->ci_pkg_id == ci->ci_pkg_id)
-                                       ci->ci_smt_id++;
-                       }
-               } else {
-                       CPUID(0x80000008, eax, ebx, ecx, edx);
-                       core_bits = (ecx >> 12) & 0xf;
-                       if (core_bits == 0)
-                               goto no_topology;
-                       /* So coreidsize 2 gives 3, 3 gives 7... */
-                       core_mask = (1 << core_bits) - 1;
-                       /* Core id is the least significant considering mask */
-                       ci->ci_core_id = apicid & core_mask;
-                       /* Pkg id is the upper remaining bits */
-                       ci->ci_pkg_id = apicid & ~core_mask;
-                       ci->ci_pkg_id >>= core_bits;
+                       nthreads = ((ebx >> 12) & 0xf) + 1;
                }
+
+               /* Shift the core_bits off to get at the pkg bits */
+               ci->ci_pkg_id = apicid >> core_bits;
+
+               /* Get rid of the package bits */
+               core_mask = (1 << core_bits) - 1;
+               thread_id = apicid & core_mask;
+
+               /* Cut logical thread_id into core id, and smt id in a core */
+               ci->ci_core_id = thread_id / nthreads;
+               ci->ci_smt_id = thread_id % nthreads;
        } else if (strcmp(cpu_vendor, "GenuineIntel") == 0) {
                /* We only support leaf 1/4 detection */
                if (cpuid_level < 4)

Reply via email to