On 8/15/22 20:43, Aaron Conole wrote: > Ilya Maximets <[email protected]> writes: > >> On 8/9/22 09:18, Michael Santana wrote: >>> The handler and CPU mapping in upcalls are incorrect, and this is >>> specially noticeable systems with cpu isolation enabled. >>> >>> Say we have a 12 core system where only every even number CPU is enabled >>> C0, C2, C4, C6, C8, C10 >>> >>> This means we will create an array of size 6 that will be sent to >>> kernel that is populated with sockets [S0, S1, S2, S3, S4, S5] >>> >>> The problem is when the kernel does an upcall it checks the socket array >>> via the index of the CPU, effectively adding additional load on some >>> CPUs while leaving no work on other CPUs. >>> >>> e.g. >>> >>> C0 indexes to S0 >>> C2 indexes to S2 (should be S1) >>> C4 indexes to S4 (should be S2) >>> >>> Modulo of 6 (size of socket array) is applied, so we wrap back to S0 >>> C6 indexes to S0 (should be S3) >>> C8 indexes to S2 (should be S4) >>> C10 indexes to S4 (should be S5) >>> >>> Effectively sockets S0, S2, S4 get overloaded while sockets S1, S3, S5 >>> get no work assigned to them >>> >>> This leads to the kernel to throw the following message: >>> "openvswitch: cpu_id mismatch with handler threads" >>> >>> Instead we will send the kernel a corrected array of sockets the size >>> of all CPUs in the system, or the largest core_id on the system, which >>> ever one is greatest. This is to take care of systems with non-continous >>> core cpus. >>> >>> In the above example we would create a >>> corrected array in a round-robin(assuming prime bias) fashion as follows: >>> [S0, S1, S2, S3, S4, S5, S6, S0, S1, S2, S3, S4] >>> >>> Fixes: b1e517bd2f81 ("dpif-netlink: Introduce per-cpu upcall dispatch.") >>> >>> Co-authored-by: Aaron Conole <[email protected]> >>> signed-off-by: Aaron Conole <[email protected]> >>> Signed-off-by: Michael Santana <[email protected]> >>> --- >>> lib/dpif-netlink.c | 31 +++++++++++++++++++++++++------ >>> lib/ovs-numa.c | 28 ++++++++++++++++++++++++++++ >>> lib/ovs-numa.h | 1 + >>> 3 files changed, 54 insertions(+), 6 deletions(-) >>> >>> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c >>> index 90be7c52d..76ff84050 100644 >>> --- a/lib/dpif-netlink.c >>> +++ b/lib/dpif-netlink.c >>> @@ -801,13 +801,26 @@ dpif_netlink_set_handler_pids(struct dpif *dpif_, >>> const uint32_t *upcall_pids, >>> uint32_t n_upcall_pids) >>> { >>> struct dpif_netlink *dpif = dpif_netlink_cast(dpif_); >>> + int largest_cpu_id = ovs_numa_get_largest_core_id(); >>> struct dpif_netlink_dp request, reply; >>> struct ofpbuf *bufp; >>> - int error; >>> - int n_cores; >>> >>> - n_cores = count_cpu_cores(); >>> - ovs_assert(n_cores == n_upcall_pids); >>> + uint32_t *corrected; >>> + int error, i, n_cores; >>> + >>> + if (largest_cpu_id == OVS_NUMA_UNSPEC) { >>> + largest_cpu_id = -1; >>> + } >>> + >>> + /* Some systems have non-continuous cpu cores ids. count_total_cores() >>> + * would return an accurate number, however, this number cannot be >>> used. >>> + * e.g. If the largest core_id of a system is cpu9, but the system only >>> + * has 4 cpus then the OVS kernel module would throw a "CPU mismatch" >>> + * warning. With the MAX() in place in this example we send an array >>> of >>> + * size 10 and prevent the warning. This has no bearing on the number >>> of >>> + * threads created. >>> + */ >>> + n_cores = MAX(count_total_cores(), largest_cpu_id + 1); >>> VLOG_DBG("Dispatch mode(per-cpu): Number of CPUs is %d", n_cores); >> >> Hi, Michael and Aaron. This version looks fine to me, except for the log >> message here. Aaron, I think, you pointed this out in reply to v6 already, >> we need to change the log message since it's not accurate. It's not really >> the number of CPUs, at least not in case of non-continuous core numbering. >> >> Suggesting the following change: >> >> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c >> index cc77b7ba8..9f9d10055 100644 >> --- a/lib/dpif-netlink.c >> +++ b/lib/dpif-netlink.c >> @@ -821,7 +821,8 @@ dpif_netlink_set_handler_pids(struct dpif *dpif_, const >> uint32_t *upcall_pids, >> * threads created. >> */ >> n_cores = MAX(count_total_cores(), largest_cpu_id + 1); >> - VLOG_DBG("Dispatch mode(per-cpu): Number of CPUs is %d", n_cores); >> + VLOG_DBG("Dispatch mode(per-cpu): Setting up handler PIDs for %d cores", >> + n_cores); >> >> dpif_netlink_dp_init(&request); >> request.cmd = OVS_DP_CMD_SET; >> --- >> >> If that looks good to you, I can fold that in on commit. I'm testing the >> series right now. > > LGTM - thanks, Ilya!
Thanks! With that and a couple of cosmetic fixes, applied. Backported down to 2.16. Best regards, Ilya Maximets. _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
