On 8/15/22 20:43, Aaron Conole wrote:
> Ilya Maximets <[email protected]> writes:
> 
>> On 8/9/22 09:18, Michael Santana wrote:
>>> The handler and CPU mapping in upcalls are incorrect, and this is
>>> specially noticeable systems with cpu isolation enabled.
>>>
>>> Say we have a 12 core system where only every even number CPU is enabled
>>> C0, C2, C4, C6, C8, C10
>>>
>>> This means we will create an array of size 6 that will be sent to
>>> kernel that is populated with sockets [S0, S1, S2, S3, S4, S5]
>>>
>>> The problem is when the kernel does an upcall it checks the socket array
>>> via the index of the CPU, effectively adding additional load on some
>>> CPUs while leaving no work on other CPUs.
>>>
>>> e.g.
>>>
>>> C0  indexes to S0
>>> C2  indexes to S2 (should be S1)
>>> C4  indexes to S4 (should be S2)
>>>
>>> Modulo of 6 (size of socket array) is applied, so we wrap back to S0
>>> C6  indexes to S0 (should be S3)
>>> C8  indexes to S2 (should be S4)
>>> C10 indexes to S4 (should be S5)
>>>
>>> Effectively sockets S0, S2, S4 get overloaded while sockets S1, S3, S5
>>> get no work assigned to them
>>>
>>> This leads to the kernel to throw the following message:
>>> "openvswitch: cpu_id mismatch with handler threads"
>>>
>>> Instead we will send the kernel a corrected array of sockets the size
>>> of all CPUs in the system, or the largest core_id on the system, which
>>> ever one is greatest. This is to take care of systems with non-continous
>>> core cpus.
>>>
>>> In the above example we would create a
>>> corrected array in a round-robin(assuming prime bias) fashion as follows:
>>> [S0, S1, S2, S3, S4, S5, S6, S0, S1, S2, S3, S4]
>>>
>>> Fixes: b1e517bd2f81 ("dpif-netlink: Introduce per-cpu upcall dispatch.")
>>>
>>> Co-authored-by: Aaron Conole <[email protected]>
>>> signed-off-by: Aaron Conole <[email protected]>
>>> Signed-off-by: Michael Santana <[email protected]>
>>> ---
>>>  lib/dpif-netlink.c | 31 +++++++++++++++++++++++++------
>>>  lib/ovs-numa.c     | 28 ++++++++++++++++++++++++++++
>>>  lib/ovs-numa.h     |  1 +
>>>  3 files changed, 54 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
>>> index 90be7c52d..76ff84050 100644
>>> --- a/lib/dpif-netlink.c
>>> +++ b/lib/dpif-netlink.c
>>> @@ -801,13 +801,26 @@ dpif_netlink_set_handler_pids(struct dpif *dpif_, 
>>> const uint32_t *upcall_pids,
>>>                                uint32_t n_upcall_pids)
>>>  {
>>>      struct dpif_netlink *dpif = dpif_netlink_cast(dpif_);
>>> +    int largest_cpu_id = ovs_numa_get_largest_core_id();
>>>      struct dpif_netlink_dp request, reply;
>>>      struct ofpbuf *bufp;
>>> -    int error;
>>> -    int n_cores;
>>>  
>>> -    n_cores = count_cpu_cores();
>>> -    ovs_assert(n_cores == n_upcall_pids);
>>> +    uint32_t *corrected;
>>> +    int error, i, n_cores;
>>> +
>>> +    if (largest_cpu_id == OVS_NUMA_UNSPEC) {
>>> +        largest_cpu_id = -1;
>>> +    }
>>> +
>>> +    /* Some systems have non-continuous cpu cores ids.  count_total_cores()
>>> +     * would return an accurate number, however, this number cannot be 
>>> used.
>>> +     * e.g. If the largest core_id of a system is cpu9, but the system only
>>> +     * has 4 cpus then the OVS kernel module would throw a "CPU mismatch"
>>> +     * warning.  With the MAX() in place in this example we send an array 
>>> of
>>> +     * size 10 and prevent the warning.  This has no bearing on the number 
>>> of
>>> +     * threads created.
>>> +     */
>>> +    n_cores = MAX(count_total_cores(), largest_cpu_id + 1);
>>>      VLOG_DBG("Dispatch mode(per-cpu): Number of CPUs is %d", n_cores);
>>
>> Hi, Michael and Aaron.  This version looks fine to me, except for the log
>> message here.  Aaron, I think, you pointed this out in reply to v6 already,
>> we need to change the log message since it's not accurate.  It's not really
>> the number of CPUs, at least not in case of non-continuous core numbering.
>>
>> Suggesting the following change:
>>
>> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
>> index cc77b7ba8..9f9d10055 100644
>> --- a/lib/dpif-netlink.c
>> +++ b/lib/dpif-netlink.c
>> @@ -821,7 +821,8 @@ dpif_netlink_set_handler_pids(struct dpif *dpif_, const 
>> uint32_t *upcall_pids,
>>       * threads created.
>>       */
>>      n_cores = MAX(count_total_cores(), largest_cpu_id + 1);
>> -    VLOG_DBG("Dispatch mode(per-cpu): Number of CPUs is %d", n_cores);
>> +    VLOG_DBG("Dispatch mode(per-cpu): Setting up handler PIDs for %d cores",
>> +             n_cores);
>>  
>>      dpif_netlink_dp_init(&request);
>>      request.cmd = OVS_DP_CMD_SET;
>> ---
>>
>> If that looks good to you, I can fold that in on commit.  I'm testing the
>> series right now.
> 
> LGTM - thanks, Ilya!

Thanks!  With that and a couple of cosmetic fixes, applied.
Backported down to 2.16.

Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to