On Fri, May 10, 2019 at 2:46 PM Ankur Sharma <[email protected]>
wrote:
>
> ISSUE:
> a. As soon as entries get added to put_mac_bindings in pinctrl.c,
>    ovn-controller CPU consumption reached 100%.
>
> b. This happens because in wait_put_mac_bindings,
>    if !hmap_is_empty(&put_mac_bindings) succeeds and
>    calls poll_immediat_wake().
>
>    ovn-controller.log:
>    "2019-05-10T19:43:28.035Z|00035|poll_loop|INFO|wakeup due to
>     0-ms timeout at ovn/controller/pinctrl.c:2520 (99% CPU usage)"
>
> ROOT_CAUSE:
> a. Earlier it used to work fine, because in run_put_mac_bindings
>    all the bindings in put_mac_bindings would be flushed.
>
> b. Looks like, as a part of adding a new thread in pinctrl, this
>    line got replaced with calling buffer_put_mac_bindings.
>
>     "
>     .
>     .
>     .
>        /* Move the mac bindings from 'put_mac_bindings' hmap to
>      * 'buffered_mac_bindings' and notify the pinctrl_handler.
>      * pinctrl_handler will reinject the buffered packets. */
>     if (!hmap_is_empty(&put_mac_bindings)) {
>         buffer_put_mac_bindings();
>         notify_pinctrl_handler();
>     }
>     "
>
> c. Because of which put_mac_bindings is never emptied and
wait_put_mac_bindings
>    ends up doing poll_immediate_wake.
>
> FIX:
> a. Added call to flush_put_mac_bindings back in
>    run_put_mac_bindings.
>
> b. Additioanlly, logic mentioned in ROOT_CAUSE.b has been changed
>    in 1c24b2f490ba002bbfeb23006965188a7c5b9747, hence updated
>    the documentation in pinctrl.c accordingly.
>
> Signed-off-by: Ankur Sharma <[email protected]>
> ---
>  ovn/controller/pinctrl.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c
> index 8ae1f9b..b7bb4c9 100644
> --- a/ovn/controller/pinctrl.c
> +++ b/ovn/controller/pinctrl.c
> @@ -91,11 +91,13 @@ VLOG_DEFINE_THIS_MODULE(pinctrl);
>   *
>   *   - arp/nd_ns      - These actions generate an ARP/IPv6 Neighbor
solicit
>   *                      requests. The original packets are buffered and
> - *                      injected back when put_arp/put_nd actions are
called.
> + *                      injected back when put_arp/put_nd resolves
> + *                      corresponding ARP/IPv6 Neighbor solicit requests.
>   *                      When pinctrl_run(), writes the mac bindings from
the
>   *                      'put_mac_bindings' hmap to the MAC_Binding table
in
> - *                      SB DB, it moves these mac bindings to another
hmap -
> - *                      'buffered_mac_bindings'.
> + *                      SB DB, run_buffered_binding will add the buffered
> + *                      packets to buffered_mac_bindings and notify
> + *                      pinctrl_handler.
>   *
>   *                      The pinctrl_handler thread calls the function -
>   *                      send_mac_binding_buffered_pkts(), which uses
> @@ -2468,6 +2470,7 @@ run_put_mac_bindings(struct ovsdb_idl_txn
*ovnsb_idl_txn,
>                              sbrec_mac_binding_by_lport_ip,
>                              pmb);
>      }
> +    flush_put_mac_bindings();
>  }
>
>  static void
> --
> 1.8.3.1
>
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Thanks for the fix. Does it mean the the mac-binding will not work after
some time since it is never flushed, and the check
"hmap_count(&put_mac_bindings) >= 1000" in pinctrl_handle_put_mac_binding()
will prevent any new mac-binding being handled?

The fix looks good to me.

Acked-by: Han Zhou <[email protected]>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to