On 5/8/26 1:30 PM, zwtop wrote:
> count_cpu_cores__() allocates the CPU affinity mask based on the
> number of online CPUs returned by sysconf(_SC_NPROCESSORS_ONLN).
> On Linux, sched_getaffinity() returns EINVAL if the supplied mask is
> smaller than the kernel affinity mask.
> 
> This can happen when the possible CPU range exceeds the online CPU
> count, for example on systems or VMs configured for CPU hotplug.  In
> that case OVS falls back to the online CPU count and may ignore the
> process affinity mask when sizing handler threads.
> 
> Retry sched_getaffinity() with larger dynamically allocated CPU masks
> on EINVAL, while preserving the existing fallback behavior for other
> failures.  This follows the sched_getaffinity(2) guidance for large
> CPU affinity masks: probe with dynamically allocated masks of increasing
> size until the call no longer fails with EINVAL.
> 
> Testing:
> - Ran make check TESTSUITEFLAGS=-j4: 2751 tests passed, 6 skipped.
> - Built an Open vSwitch RPM with this patch.

No need to put these into commit message.  These are basic testing
steps that every patch should go through.

> - Installed the RPM on a Rocky Linux 9.7 VM running Linux 6.6.119,
>   with CPUs 0-159 possible, CPUs 0-19 online, and ovs-vswitchd
>   limited to CPUs 6-9; verified that per-cpu handler threads dropped
>   from 20 to 5 after restart, following the process affinity mask.
> 
> Assisted-by: GPT-5, Codex
> Signed-off-by: zwtop <[email protected]>
> ---
>  lib/ovs-thread.c | 28 ++++++++++++++++++++++++----
>  1 file changed, 24 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/ovs-thread.c b/lib/ovs-thread.c
> index 243791de8..141721e8b 100644
> --- a/lib/ovs-thread.c
> +++ b/lib/ovs-thread.c
> @@ -647,15 +647,35 @@ count_cpu_cores__(void)
>  #endif
>  #ifdef __linux__
>      if (n_cores > 0) {
> -        cpu_set_t *set = CPU_ALLOC(n_cores);
> -
> -        if (set) {
> -            size_t size = CPU_ALLOC_SIZE(n_cores);
> +        enum { CPU_ALLOC_INITIAL = 1024 };
> +        enum { CPU_ALLOC_MAX = 1024 << 8 };

These are used only once, there is no need to define enums.  Also, better
to use '1 <<' instead of '1024 <<'.

> +        size_t n_alloc = MAX((size_t) n_cores, (size_t) CPU_ALLOC_INITIAL);

We should just start with the n_cores, whatever it is.

> +        size_t max_n_alloc = MAX(n_alloc, (size_t) CPU_ALLOC_MAX);

There are too many casts here, I'm not sure they are necessary.

> +
> +        /* sched_getaffinity() returns EINVAL when 'size' is smaller than the
> +         * kernel affinity mask.  This can happen when the possible CPU range
> +         * exceeds the online CPU count. */
> +        for (;;) {

Instead of breaking and checking in multiple places inside the loop, it may be
better to define the error variable outside and have a while loop like this:

  while (error == EINVAL && n_alloc <= max_n_alloc) {
  }


The !set break is fine to be inside the loop.

> +            cpu_set_t *set = CPU_ALLOC(n_alloc);
> +            size_t size = CPU_ALLOC_SIZE(n_alloc);

Should be defined in reverse x-mass tree order, i.e. shorter lines at the end.

> +
> +            if (!set) {
> +                break;
> +            }

Empty line here.

> +            CPU_ZERO_S(size, set);
>  

And remove this empty line.

>              if (!sched_getaffinity(0, size, set)) {
>                  n_cores = CPU_COUNT_S(size, set);
> +                CPU_FREE(set);
> +                break;
>              }
> +
> +            int error = errno;
>              CPU_FREE(set);
> +            if (error != EINVAL || n_alloc >= max_n_alloc) {
> +                break;
> +            }
> +            n_alloc = MIN(n_alloc << 2, max_n_alloc);

This MIN looks strange.  Also, should just use '* 2' instead.  And no
need to multiply by 4 from the beginning.  There shouldn't be too many
iterations and it's not a performance critical path.

Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to