On Mon, Oct 21, 2024 at 5:21 PM Naveen Yerramneni < [email protected]> wrote:
> Issue: > ===== > OVN controller main() is caching the "sbrec_mac_binding/sbrec_fdb" > pointers in "mac_cache_mac_binding->mac_bindings/fdbs" which are > used to update timestamp in mac_cache_mb_stats_run()/ > mac_cache_fdb_stats_run(). There is a condition where cached > "sbrec_mac_binding/sbrec_fdb" pointers gets freed without > removing the refs from "mac_cache_mac_binding->mac_bindings/fdbs" > which is leading to crash due to "use after free". > > This condition can occur due to following reason. > - en_runtime_data is input to en_mac_cache. > - en_runtime_data went for full recompute but engine_run() > is called with recompute_allowed as False due to > ofctrl_has_backlog() returned True. This caused to avoid > the recompute in the current engine run and engine_run_aborted > is set to True. If recompute_allowed is False and > engine_run_aborted is True then, inc processing engine is > skipped. > - If there are any mac_binding/fdb deletes in the same run > then, entries gets deleted during ovsdb_idl_loop_run() and > rows are freed in ovsdb_idl_track_clear() but references > to them are present in mac_cache_data->mac_bindings/fdbs. > > Fix: > === > Avoid statctrl_run() when engine recompute is not allowed. > > Signed-off-by: Naveen Yerramneni <[email protected]> > --- > v2: > - Addressed review comments from Ales. > - Updated the description. > --- > controller/ovn-controller.c | 59 +++++++++++++++++++------------------ > 1 file changed, 30 insertions(+), 29 deletions(-) > > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c > index c48667887..88430a111 100644 > --- a/controller/ovn-controller.c > +++ b/controller/ovn-controller.c > @@ -5536,34 +5536,33 @@ main(int argc, char *argv[]) > > stopwatch_start(CONTROLLER_LOOP_STOPWATCH_NAME, > time_msec()); > - if (ovnsb_idl_txn) { > - if (ofctrl_has_backlog()) { > - /* When there are in-flight messages pending > to > - * ovs-vswitchd, we should hold on > recomputing so > - * that the previous flow installations won't > be > - * delayed. However, we still want to try if > - * recompute is not needed and we can quickly > - * incrementally process the new changes, to > avoid > - * unnecessarily forced recomputes later on. > This > - * is because the OVSDB change tracker cannot > - * preserve tracked changes across > iterations. If > - * change tracking is improved, we can simply > skip > - * this round of engine_run and continue > processing > - * acculated changes incrementally later when > - * ofctrl_has_backlog() returns false. */ > - engine_run(false); > - } else { > - engine_run(true); > - } > - } else { > - /* Even if there's no SB DB transaction available, > - * try to run the engine so that we can handle any > - * incremental changes that don't require a > recompute. > - * If a recompute is required, the engine will > cancel, > - * triggerring a full run in the next iteration. > - */ > - engine_run(false); > - } > + > + /* Recompute is not allowed in following cases: */ > + /* 1. No ovnsb_idl_txn */ > + /* Even if there's no SB DB transaction available, > + * try to run the engine so that we can handle any > + * incremental changes that don't require a recompute. > + * If a recompute is required, the engine will cancel, > + * triggerring a full run in the next iteration. > + */ > + /* 2. ofctrl_has_backlog */ > + /* When there are in-flight messages pending to > + * ovs-vswitchd, we should hold on recomputing so > + * that the previous flow installations won't be > + * delayed. However, we still want to try if > + * recompute is not needed and we can quickly > + * incrementally process the new changes, to avoid > + * unnecessarily forced recomputes later on. This > + * is because the OVSDB change tracker cannot > + * preserve tracked changes across iterations. If > + * change tracking is improved, we can simply skip > + * this round of engine_run and continue processing > + * acculated changes incrementally later when > + * ofctrl_has_backlog() returns false. */ > + > + bool recompute_allowed = (ovnsb_idl_txn && > + !ofctrl_has_backlog()); > + engine_run(recompute_allowed); > stopwatch_stop(CONTROLLER_LOOP_STOPWATCH_NAME, > time_msec()); > if (engine_has_updated()) { > @@ -5685,7 +5684,9 @@ main(int argc, char *argv[]) > } > } > > - if (mac_cache_data) { > + if (mac_cache_data && recompute_allowed) { > + /* Run only when engine recompute is allowed > + * since mac_binding/FDB rows are cached */ > statctrl_run(ovnsb_idl_txn, mac_cache_data); > } > > -- > 2.36.6 > > _______________________________________________ > dev mailing list > [email protected] > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > Looks good to me, thanks. Acked-by: Ales Musil <[email protected]> -- Ales Musil Senior Software Engineer - OVN Core Red Hat EMEA <https://www.redhat.com> [email protected] <https://red.ht/sig> _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
