On Thu, Sep 19, 2024 at 7:43 PM Naveen Yerramneni via dev < [email protected]> wrote:
> Issue: > ===== > OVN controller main() is caching the "sbrec_mac_binding/sbrec_fdb" > pointers in "mac_cache_mac_binding->mac_bindings/fdbs" which are > used to update timestamp in mac_cache_mb_stats_run()/ > mac_cache_fdb_stats_run(). There is a condition where cached > "sbrec_mac_binding/sbrec_fdb" pointers gets freed without > removing the refs from "mac_cache_mac_binding->mac_bindings/fdbs" > which is leading to crash due to "use after free". > > This condition can occur due to following reason. > - en_runtime_data is input to en_mac_cache. > - en_runtime_data went for full recompute but engine_run() > is called with recompute_allowed as false due to > ofctrl_has_backlog() returned true. This caused to avoid > the recompute in the current engine run. > - If there are any mac_binding/fdb deletes in the same run > then, entries gets deleted during ovsdb_idl_loop_run() and > rows are freed in ovsdb_idl_track_clear() but references > to them are present in mac_cache_data->mac_bindings/fdbs. > > Fix: > === > Avoid statctrl_run() when engine recompute is not allowed. > > Signed-off-by: Naveen Yerramneni <[email protected]> > --- > Hello Neveen, thank you for the patch. I have one comment down below. This also makes me wonder, don't we have potentially the same issue with pinctrl? I'm pretty sure that there are some SB references too. controller/ovn-controller.c | 57 +++++++++++++++++++------------------ > 1 file changed, 30 insertions(+), 27 deletions(-) > > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c > index c48667887..6bb8f07db 100644 > --- a/controller/ovn-controller.c > +++ b/controller/ovn-controller.c > @@ -5363,6 +5363,7 @@ main(int argc, char *argv[]) > /* Main loop. */ > bool sb_monitor_all = false; > while (!exit_args.exiting) { > + bool recompute_allowed = true; > There is no need for the scope of this variable to be the whole main loop. It could actually be changed to: bool recompute_allowed = !ovnsb_idl_txn || ofctrl_has_backlog(); engine_run(recompute_allowed); Keeping the combined comment of course. memory_run(); > if (memory_should_report()) { > struct simap usage = SIMAP_INITIALIZER(&usage); > @@ -5536,34 +5537,34 @@ main(int argc, char *argv[]) > > stopwatch_start(CONTROLLER_LOOP_STOPWATCH_NAME, > time_msec()); > - if (ovnsb_idl_txn) { > - if (ofctrl_has_backlog()) { > - /* When there are in-flight messages pending > to > - * ovs-vswitchd, we should hold on > recomputing so > - * that the previous flow installations won't > be > - * delayed. However, we still want to try if > - * recompute is not needed and we can quickly > - * incrementally process the new changes, to > avoid > - * unnecessarily forced recomputes later on. > This > - * is because the OVSDB change tracker cannot > - * preserve tracked changes across > iterations. If > - * change tracking is improved, we can simply > skip > - * this round of engine_run and continue > processing > - * acculated changes incrementally later when > - * ofctrl_has_backlog() returns false. */ > - engine_run(false); > - } else { > - engine_run(true); > - } > - } else { > + if (!ovnsb_idl_txn || ofctrl_has_backlog()) { > + /* No ovnsb_idl_txn */ > /* Even if there's no SB DB transaction available, > - * try to run the engine so that we can handle any > - * incremental changes that don't require a > recompute. > - * If a recompute is required, the engine will > cancel, > - * triggerring a full run in the next iteration. > - */ > - engine_run(false); > + * try to run the engine so that we can handle any > + * incremental changes that don't require a > recompute. > + * If a recompute is required, the engine will > cancel, > + * triggerring a full run in the next iteration. > + */ > + > + /* OR */ > + > + /* ofctrl_has_backlog */ > + /* When there are in-flight messages pending to > + * ovs-vswitchd, we should hold on recomputing so > + * that the previous flow installations won't be > + * delayed. However, we still want to try if > + * recompute is not needed and we can quickly > + * incrementally process the new changes, to avoid > + * unnecessarily forced recomputes later on. This > + * is because the OVSDB change tracker cannot > + * preserve tracked changes across iterations. If > + * change tracking is improved, we can simply skip > + * this round of engine_run and continue > processing > + * acculated changes incrementally later when > + * ofctrl_has_backlog() returns false. */ > + recompute_allowed = false; > } > + engine_run(recompute_allowed); > stopwatch_stop(CONTROLLER_LOOP_STOPWATCH_NAME, > time_msec()); > if (engine_has_updated()) { > @@ -5685,7 +5686,9 @@ main(int argc, char *argv[]) > } > } > > - if (mac_cache_data) { > + if (mac_cache_data && recompute_allowed) { > + /* Run only when engine recompute is allowed > + * since mac_binding/FDB rows are cached */ > statctrl_run(ovnsb_idl_txn, mac_cache_data); > } > > -- > 2.36.6 > > _______________________________________________ > dev mailing list > [email protected] > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > Thanks, Ales -- Ales Musil Senior Software Engineer - OVN Core Red Hat EMEA <https://www.redhat.com> [email protected] <https://red.ht/sig> _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
