On Mon, Dec 16, 2024 at 7:27 PM <[email protected]> wrote:

> From: Numan Siddique <[email protected]>
>
> When the ovn-controller main thread has locked the pinctrl mutex,
> pinctrl thread will not be able to process packet-ins (like DHCP,
> DNS and few others) since it tries to acquire this mutex for
> other purposes (besides packet-in processing) - like ip_mcast
> snoop run and svc monitor runs etc.  Its better to try
> acquiring this mutex so that we can continue processing
> the packet-ins which don't require this mutex.
>
> Signed-off-by: Numan Siddique <[email protected]>
> ---
>

Hi Numan,

thank you for the patch.

I was wondering if we could solve all this contention by using
"lockless" structures for all data that require some protection.
We could replace maps with cmaps, and use RCU if there is any
other storage instead of hmap. WDYT?

Since the scope of the change proposed above would be a more
long term solution I'm probably fine with having this patch as
intermittent solution. I have just one question down below.

 controller/pinctrl.c | 65 ++++++++++++++++++++++++++------------------
>  1 file changed, 39 insertions(+), 26 deletions(-)
>
> diff --git a/controller/pinctrl.c b/controller/pinctrl.c
> index 3fb7e2fd7..bdf1d151d 100644
> --- a/controller/pinctrl.c
> +++ b/controller/pinctrl.c
> @@ -3885,10 +3885,14 @@ pinctrl_handler(void *arg_)
>
>      while (!latch_is_set(&pctrl->pinctrl_thread_exit)) {
>          long long int bfd_time = LLONG_MAX;
> +        bool lock_failed = false;
>
> -        ovs_mutex_lock(&pinctrl_mutex);
> -        ip_mcast_snoop_run();
> -        ovs_mutex_unlock(&pinctrl_mutex);
> +        if (!ovs_mutex_trylock(&pinctrl_mutex)) {
> +            ip_mcast_snoop_run();
> +            ovs_mutex_unlock(&pinctrl_mutex);
> +        } else {
> +            lock_failed = true;
> +        }
>
>          rconn_run(swconn);
>          new_seq = seq_read(pinctrl_handler_seq);
> @@ -3913,35 +3917,44 @@ pinctrl_handler(void *arg_)
>              }
>
>              if (may_inject_pkts()) {
> -                ovs_mutex_lock(&pinctrl_mutex);
> -                send_garp_rarp_run(swconn, &send_garp_rarp_time);
> -                send_ipv6_ras(swconn, &send_ipv6_ra_time);
> -                send_ipv6_prefixd(swconn, &send_prefixd_time);
> -                send_mac_binding_buffered_pkts(swconn);
> -                bfd_monitor_send_msg(swconn, &bfd_time);
> -                ovs_mutex_unlock(&pinctrl_mutex);
> -
> +                if (!ovs_mutex_trylock(&pinctrl_mutex)) {
> +                    send_garp_rarp_run(swconn, &send_garp_rarp_time);
> +                    send_ipv6_ras(swconn, &send_ipv6_ra_time);
> +                    send_ipv6_prefixd(swconn, &send_prefixd_time);
> +                    send_mac_binding_buffered_pkts(swconn);
> +                    bfd_monitor_send_msg(swconn, &bfd_time);
> +                    ovs_mutex_unlock(&pinctrl_mutex);
> +                } else {
> +                    lock_failed = true;
> +                }
>                  ip_mcast_querier_run(swconn, &send_mcast_query_time);
>              }
>
> -            ovs_mutex_lock(&pinctrl_mutex);
> -            svc_monitors_run(swconn, &svc_monitors_next_run_time);
> -            ovs_mutex_unlock(&pinctrl_mutex);
> +            if (!ovs_mutex_trylock(&pinctrl_mutex)) {
> +                svc_monitors_run(swconn, &svc_monitors_next_run_time);
> +                ovs_mutex_unlock(&pinctrl_mutex);
> +            } else {
> +                lock_failed = true;
> +            }
>          }
>
> -        rconn_run_wait(swconn);
> -        rconn_recv_wait(swconn);
> -        if (rconn_is_connected(swconn)) {
> -            send_garp_rarp_wait(send_garp_rarp_time);
> -            ipv6_ra_wait(send_ipv6_ra_time);
> -            ip_mcast_querier_wait(send_mcast_query_time);
> -            svc_monitors_wait(svc_monitors_next_run_time);
> -            ipv6_prefixd_wait(send_prefixd_time);
> -            bfd_monitor_wait(bfd_time);
> -        }
> -        seq_wait(pinctrl_handler_seq, new_seq);
> +        if (lock_failed) {
> +            poll_immediate_wake();
>

This will effectively degrade the lock into spinlock.
Are we ok with that? Or how about scheduling the next wake
in a few ms e.g. 'poll_timer_wait_until(5)'?


> +        } else {
> +            rconn_run_wait(swconn);
> +            rconn_recv_wait(swconn);
> +            if (rconn_is_connected(swconn)) {
> +                send_garp_rarp_wait(send_garp_rarp_time);
> +                ipv6_ra_wait(send_ipv6_ra_time);
> +                ip_mcast_querier_wait(send_mcast_query_time);
> +                svc_monitors_wait(svc_monitors_next_run_time);
> +                ipv6_prefixd_wait(send_prefixd_time);
> +                bfd_monitor_wait(bfd_time);
> +            }
> +            seq_wait(pinctrl_handler_seq, new_seq);
>
> -        latch_wait(&pctrl->pinctrl_thread_exit);
> +            latch_wait(&pctrl->pinctrl_thread_exit);
> +        }
>          poll_block();
>      }
>
> --
> 2.39.3 (Apple Git-146)
>
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>

Thanks,
Ales
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to