> On 2 Oct 2024, at 2:34 PM, Ales Musil <[email protected]> wrote:
>
> CAUTION: External Email
>
>
> On Thu, Sep 19, 2024 at 7:43 PM Naveen Yerramneni via dev
> <[email protected]> wrote:
> Issue:
> =====
> OVN controller main() is caching the "sbrec_mac_binding/sbrec_fdb"
> pointers in "mac_cache_mac_binding->mac_bindings/fdbs" which are
> used to update timestamp in mac_cache_mb_stats_run()/
> mac_cache_fdb_stats_run(). There is a condition where cached
> "sbrec_mac_binding/sbrec_fdb" pointers gets freed without
> removing the refs from "mac_cache_mac_binding->mac_bindings/fdbs"
> which is leading to crash due to "use after free".
>
> This condition can occur due to following reason.
> - en_runtime_data is input to en_mac_cache.
> - en_runtime_data went for full recompute but engine_run()
> is called with recompute_allowed as false due to
> ofctrl_has_backlog() returned true. This caused to avoid
> the recompute in the current engine run.
> - If there are any mac_binding/fdb deletes in the same run
> then, entries gets deleted during ovsdb_idl_loop_run() and
> rows are freed in ovsdb_idl_track_clear() but references
> to them are present in mac_cache_data->mac_bindings/fdbs.
>
> Fix:
> ===
> Avoid statctrl_run() when engine recompute is not allowed.
>
> Signed-off-by: Naveen Yerramneni <[email protected]>
> ---
>
>
> Hello Neveen,
>
> thank you for the patch. I have one comment down below. This also
> makes me wonder, don't we have potentially the same issue with pinctrl?
> I'm pretty sure that there are some SB references too.
Hi Ales,
Sorry for the late reply.
In pinctrl, run_put_mac_bindings() is called from main thread which is always
doing the lookup into SB Mac_Binding table before add/modify.
Thanks,
Naveen
>
> controller/ovn-controller.c | 57 +++++++++++++++++++------------------
> 1 file changed, 30 insertions(+), 27 deletions(-)
>
> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> index c48667887..6bb8f07db 100644
> --- a/controller/ovn-controller.c
> +++ b/controller/ovn-controller.c
> @@ -5363,6 +5363,7 @@ main(int argc, char *argv[])
> /* Main loop. */
> bool sb_monitor_all = false;
> while (!exit_args.exiting) {
> + bool recompute_allowed = true;
>
> There is no need for the scope of this variable to be the
> whole main loop. It could actually be changed to:
>
> bool recompute_allowed =
> !ovnsb_idl_txn || ofctrl_has_backlog();
> engine_run(recompute_allowed);
>
> Keeping the combined comment of course.
>
>
> memory_run();
> if (memory_should_report()) {
> struct simap usage = SIMAP_INITIALIZER(&usage);
> @@ -5536,34 +5537,34 @@ main(int argc, char *argv[])
>
> stopwatch_start(CONTROLLER_LOOP_STOPWATCH_NAME,
> time_msec());
> - if (ovnsb_idl_txn) {
> - if (ofctrl_has_backlog()) {
> - /* When there are in-flight messages pending to
> - * ovs-vswitchd, we should hold on recomputing so
> - * that the previous flow installations won't be
> - * delayed. However, we still want to try if
> - * recompute is not needed and we can quickly
> - * incrementally process the new changes, to
> avoid
> - * unnecessarily forced recomputes later on.
> This
> - * is because the OVSDB change tracker cannot
> - * preserve tracked changes across iterations.
> If
> - * change tracking is improved, we can simply
> skip
> - * this round of engine_run and continue
> processing
> - * acculated changes incrementally later when
> - * ofctrl_has_backlog() returns false. */
> - engine_run(false);
> - } else {
> - engine_run(true);
> - }
> - } else {
> + if (!ovnsb_idl_txn || ofctrl_has_backlog()) {
> + /* No ovnsb_idl_txn */
> /* Even if there's no SB DB transaction available,
> - * try to run the engine so that we can handle any
> - * incremental changes that don't require a
> recompute.
> - * If a recompute is required, the engine will
> cancel,
> - * triggerring a full run in the next iteration.
> - */
> - engine_run(false);
> + * try to run the engine so that we can handle any
> + * incremental changes that don't require a recompute.
> + * If a recompute is required, the engine will cancel,
> + * triggerring a full run in the next iteration.
> + */
> +
> + /* OR */
> +
> + /* ofctrl_has_backlog */
> + /* When there are in-flight messages pending to
> + * ovs-vswitchd, we should hold on recomputing so
> + * that the previous flow installations won't be
> + * delayed. However, we still want to try if
> + * recompute is not needed and we can quickly
> + * incrementally process the new changes, to avoid
> + * unnecessarily forced recomputes later on. This
> + * is because the OVSDB change tracker cannot
> + * preserve tracked changes across iterations. If
> + * change tracking is improved, we can simply skip
> + * this round of engine_run and continue processing
> + * acculated changes incrementally later when
> + * ofctrl_has_backlog() returns false. */
> + recompute_allowed = false;
> }
> + engine_run(recompute_allowed);
> stopwatch_stop(CONTROLLER_LOOP_STOPWATCH_NAME,
> time_msec());
> if (engine_has_updated()) {
> @@ -5685,7 +5686,9 @@ main(int argc, char *argv[])
> }
> }
>
> - if (mac_cache_data) {
> + if (mac_cache_data && recompute_allowed) {
> + /* Run only when engine recompute is allowed
> + * since mac_binding/FDB rows are cached */
> statctrl_run(ovnsb_idl_txn, mac_cache_data);
> }
>
> --
> 2.36.6
>
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev [mail.openvswitch.org]
>
>
> Thanks,
> Ales
>
> --
> Ales Musil
> Senior Software Engineer - OVN Core
> Red Hat EMEA [redhat.com][email protected] [red.ht]
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev