Ilya Maximets <[email protected]> writes:

> On 8/9/22 09:18, Michael Santana wrote:
>> The handler and CPU mapping in upcalls are incorrect, and this is
>> specially noticeable systems with cpu isolation enabled.
>> 
>> Say we have a 12 core system where only every even number CPU is enabled
>> C0, C2, C4, C6, C8, C10
>> 
>> This means we will create an array of size 6 that will be sent to
>> kernel that is populated with sockets [S0, S1, S2, S3, S4, S5]
>> 
>> The problem is when the kernel does an upcall it checks the socket array
>> via the index of the CPU, effectively adding additional load on some
>> CPUs while leaving no work on other CPUs.
>> 
>> e.g.
>> 
>> C0  indexes to S0
>> C2  indexes to S2 (should be S1)
>> C4  indexes to S4 (should be S2)
>> 
>> Modulo of 6 (size of socket array) is applied, so we wrap back to S0
>> C6  indexes to S0 (should be S3)
>> C8  indexes to S2 (should be S4)
>> C10 indexes to S4 (should be S5)
>> 
>> Effectively sockets S0, S2, S4 get overloaded while sockets S1, S3, S5
>> get no work assigned to them
>> 
>> This leads to the kernel to throw the following message:
>> "openvswitch: cpu_id mismatch with handler threads"
>> 
>> Instead we will send the kernel a corrected array of sockets the size
>> of all CPUs in the system, or the largest core_id on the system, which
>> ever one is greatest. This is to take care of systems with non-continous
>> core cpus.
>> 
>> In the above example we would create a
>> corrected array in a round-robin(assuming prime bias) fashion as follows:
>> [S0, S1, S2, S3, S4, S5, S6, S0, S1, S2, S3, S4]
>> 
>> Fixes: b1e517bd2f81 ("dpif-netlink: Introduce per-cpu upcall dispatch.")
>> 
>> Co-authored-by: Aaron Conole <[email protected]>
>> signed-off-by: Aaron Conole <[email protected]>
>> Signed-off-by: Michael Santana <[email protected]>
>> ---
>>  lib/dpif-netlink.c | 31 +++++++++++++++++++++++++------
>>  lib/ovs-numa.c     | 28 ++++++++++++++++++++++++++++
>>  lib/ovs-numa.h     |  1 +
>>  3 files changed, 54 insertions(+), 6 deletions(-)
>> 
>> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
>> index 90be7c52d..76ff84050 100644
>> --- a/lib/dpif-netlink.c
>> +++ b/lib/dpif-netlink.c
>> @@ -801,13 +801,26 @@ dpif_netlink_set_handler_pids(struct dpif *dpif_, 
>> const uint32_t *upcall_pids,
>>                                uint32_t n_upcall_pids)
>>  {
>>      struct dpif_netlink *dpif = dpif_netlink_cast(dpif_);
>> +    int largest_cpu_id = ovs_numa_get_largest_core_id();
>>      struct dpif_netlink_dp request, reply;
>>      struct ofpbuf *bufp;
>> -    int error;
>> -    int n_cores;
>>  
>> -    n_cores = count_cpu_cores();
>> -    ovs_assert(n_cores == n_upcall_pids);
>> +    uint32_t *corrected;
>> +    int error, i, n_cores;
>> +
>> +    if (largest_cpu_id == OVS_NUMA_UNSPEC) {
>> +        largest_cpu_id = -1;
>> +    }
>> +
>> +    /* Some systems have non-continuous cpu cores ids.  count_total_cores()
>> +     * would return an accurate number, however, this number cannot be used.
>> +     * e.g. If the largest core_id of a system is cpu9, but the system only
>> +     * has 4 cpus then the OVS kernel module would throw a "CPU mismatch"
>> +     * warning.  With the MAX() in place in this example we send an array of
>> +     * size 10 and prevent the warning.  This has no bearing on the number 
>> of
>> +     * threads created.
>> +     */
>> +    n_cores = MAX(count_total_cores(), largest_cpu_id + 1);
>>      VLOG_DBG("Dispatch mode(per-cpu): Number of CPUs is %d", n_cores);
>
> Hi, Michael and Aaron.  This version looks fine to me, except for the log
> message here.  Aaron, I think, you pointed this out in reply to v6 already,
> we need to change the log message since it's not accurate.  It's not really
> the number of CPUs, at least not in case of non-continuous core numbering.
>
> Suggesting the following change:
>
> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
> index cc77b7ba8..9f9d10055 100644
> --- a/lib/dpif-netlink.c
> +++ b/lib/dpif-netlink.c
> @@ -821,7 +821,8 @@ dpif_netlink_set_handler_pids(struct dpif *dpif_, const 
> uint32_t *upcall_pids,
>       * threads created.
>       */
>      n_cores = MAX(count_total_cores(), largest_cpu_id + 1);
> -    VLOG_DBG("Dispatch mode(per-cpu): Number of CPUs is %d", n_cores);
> +    VLOG_DBG("Dispatch mode(per-cpu): Setting up handler PIDs for %d cores",
> +             n_cores);
>  
>      dpif_netlink_dp_init(&request);
>      request.cmd = OVS_DP_CMD_SET;
> ---
>
> If that looks good to you, I can fold that in on commit.  I'm testing the
> series right now.

LGTM - thanks, Ilya!

> Best regards, Ilya Maximets.

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to