What would be the memory impact of this patch?

Context: we have been exploring the use of ovn in SoCs for edge
environments where memory is a strong constraint and performance is not our
biggest concern in all cases. One of the optimizations we had to make
(thanks +Zenghui Shi for this investigation) was to pin ovs-vswitchd to a
single core to reduce the memory impact (along with disabling mlock-all ),
we were able to reduce ovs-vswitchd footprint from 100MB+ down to 16MB

Is it possible to make this a configurable parameter defaulting to what you
explain here?



El El mar, 19 jul 2022 a las 9:02, Michael Santana <[email protected]>
escribió:

> Additional threads are required to service upcalls when we have CPU
> isolation (in per-cpu dispatch mode). The reason additional threads
> are required is because it creates a more fair distribution. With more
> threads we decrease the load of each thread as more threads would
> decrease the number of cores each threads is assigned.
>
> Adding additional threads also increases the chance OVS utilizes all
> cores available to use. Some RPS schemas might make some handler
> threads get all the workload while others get no workload. This tends
> to happen when the handler thread count is low.
>
> An example would be an RPS that sends traffic on all even cores on a
> system with only the lower half of the cores available for OVS to use.
> In this example we have as many handlers threads as there are
> available cores. In this case 50% of the handler threads get all the
> workload while the other 50% get no workload. Not only that, but OVS
> is only utilizing half of the cores that it can use. This is the worst
> case scenario.
>
> The ideal scenario is to have as many threads as there are cores - in
> this case we guarantee that all cores OVS can use are utilized
>
> But, adding as many threads are there are cores could have a performance
> hit when the number of active cores (which all threads have to share) is
> very low. For this reason we avoid creating as many threads as there
> are cores and instead meet somewhere in the middle.
>
> The formula used to calculate the number of handler threads to create
> is as follows:
>
> handlers_n = min(next_prime(active_cores+1), total_cores)
>
> Where next_prime is a function that returns the next numeric prime
> number that is strictly greater than active_cores
>
> Assume default behavior when total_cores <= 2, that is do not create
> additional threads when we have less than 2 total cores on the system
>
> Fixes: b1e517bd2f81 ("dpif-netlink: Introduce per-cpu upcall dispatch.")
> Signed-off-by: Michael Santana <[email protected]>
> ---
>  lib/dpif-netlink.c | 83 ++++++++++++++++++++++++++++++++++++++++++++--
>  lib/ovs-thread.c   | 17 ++++++++++
>  lib/ovs-thread.h   |  1 +
>  3 files changed, 99 insertions(+), 2 deletions(-)
>
> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
> index 06e1e8ca0..21f149867 100644
> --- a/lib/dpif-netlink.c
> +++ b/lib/dpif-netlink.c
> @@ -2506,6 +2506,85 @@ dpif_netlink_handler_uninit(struct dpif_handler
> *handler)
>  }
>  #endif
>
> +/*
> + * Returns true if num is a prime number,
> + * Otherwise return false
> + */
> +static bool
> +is_prime(uint32_t num)
> +{
> +    if (num == 2) {
> +        return true;
> +    }
> +
> +    if (num < 2) {
> +        return false;
> +    }
> +
> +    if (num % 2 == 0) {
> +        return false;
> +    }
> +
> +    for (uint64_t i = 3; i * i <= num; i += 2) {
> +        if (num % i == 0) {
> +            return false;
> +        }
> +    }
> +
> +    return true;
> +}
> +
> +/*
> + * Returns num if num is a prime number.  Otherwise returns the next
> + * prime greater than num.  Search is limited by UINT32_MAX.
> + *
> + * Returns 0 if no prime has been found between num and UINT32_MAX
> + */
> +static uint32_t
> +next_prime(uint32_t num)
> +{
> +    if (num <= 2) {
> +        return 2;
> +    }
> +
> +    for (uint32_t i = num; i < UINT32_MAX; i++) {
> +        if (is_prime(i)) {
> +            return i;
> +        }
> +    }
> +
> +    return 0;
> +}
> +
> +/*
> + * Calculates and returns the number of handler threads needed based
> + * the following formula:
> + *
> + * handlers_n = min(next_prime(active_cores+1), total_cores)
> + *
> + * Where next_prime is a function that returns the next numeric prime
> + * number that is strictly greater than active_cores
> + */
> +static uint32_t
> +dpif_netlink_calculate_n_handlers(void)
> +{
> +    uint32_t total_cores = count_total_cores();
> +    uint32_t n_handlers = count_cpu_cores();
> +    uint32_t next_prime_num;
> +
> +    /*
> +     * If we have isolated cores, add additional handler threads to
> +     * minimize the number of cores assigned to individual handler threads
> +     */
> +    if (n_handlers < total_cores && total_cores > 2) {
> +        next_prime_num = next_prime(n_handlers + 1);
> +        n_handlers = next_prime_num >= total_cores ?
> +                                            total_cores : next_prime_num;
> +    }
> +
> +    return n_handlers;
> +}
> +
>  static int
>  dpif_netlink_refresh_handlers_cpu_dispatch(struct dpif_netlink *dpif)
>      OVS_REQ_WRLOCK(dpif->upcall_lock)
> @@ -2515,7 +2594,7 @@ dpif_netlink_refresh_handlers_cpu_dispatch(struct
> dpif_netlink *dpif)
>      uint32_t n_handlers;
>      uint32_t *upcall_pids;
>
> -    n_handlers = count_cpu_cores();
> +    n_handlers = dpif_netlink_calculate_n_handlers();
>      if (dpif->n_handlers != n_handlers) {
>          VLOG_DBG("Dispatch mode(per-cpu): initializing %d handlers",
>                     n_handlers);
> @@ -2755,7 +2834,7 @@ dpif_netlink_number_handlers_required(struct dpif
> *dpif_, uint32_t *n_handlers)
>      struct dpif_netlink *dpif = dpif_netlink_cast(dpif_);
>
>      if (dpif_netlink_upcall_per_cpu(dpif)) {
> -        *n_handlers = count_cpu_cores();
> +        *n_handlers = dpif_netlink_calculate_n_handlers();
>          return true;
>      }
>
> diff --git a/lib/ovs-thread.c b/lib/ovs-thread.c
> index 805cba622..78ed3e970 100644
> --- a/lib/ovs-thread.c
> +++ b/lib/ovs-thread.c
> @@ -663,6 +663,23 @@ count_cpu_cores(void)
>      return n_cores > 0 ? n_cores : 0;
>  }
>
> +/* Returns the total number of cores on the system, or 0 if the
> + * number cannot be determined. */
> +int
> +count_total_cores(void)
> +{
> +    long int n_cores;
> +
> +#ifndef _WIN32
> +    n_cores = sysconf(_SC_NPROCESSORS_CONF);
> +#else
> +    n_cores = 0;
> +    errno = ENOTSUP;
> +#endif
> +
> +    return n_cores > 0 ? n_cores : 0;
> +}
> +
>  /* Returns 'true' if current thread is PMD thread. */
>  bool
>  thread_is_pmd(void)
> diff --git a/lib/ovs-thread.h b/lib/ovs-thread.h
> index 3b444ccdc..aac5e19c9 100644
> --- a/lib/ovs-thread.h
> +++ b/lib/ovs-thread.h
> @@ -522,6 +522,7 @@ bool may_fork(void);
>  /* Useful functions related to threading. */
>
>  int count_cpu_cores(void);
> +int count_total_cores(void);
>  bool thread_is_pmd(void);
>
>  #endif /* ovs-thread.h */
> --
> 2.33.1
>
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
> --
Miguel Ángel Ajo  @mangel_ajo <https://twitter.com/mangel_ajo>
Office of CTO EDGE team.
ex Submariner (ACM) and OpenStack networking
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to