On 11/29/23 16:45, Xavier Simonart wrote: > The group_table and meter_table are initialized in ovn-controller, with n_ids > = 0. > Then they are re-initialized in ofctrl, with proper number of n_ids, in state > S_CLEAR_FLOWS. > However, nothing prevented to start adding flows (by adding logical ports) > using groups > before ofctrl reaches this state. This was causing some wrong flows (missing > group in the flow). > > With this patch, as soon as the feature rconn is available, i.e. before > adding flows, those table > are properly re-initialized. > > This issue is usually not visible in main and recent branches ci, since > "Wait for new ovn-controllers to connect to Southbound." as this was slowing > down the moment when > the test started to add ports. > This was causing the following test to fail: > "ECMP static routes -- ovn-northd -- parallelization=yes -- > ovn_monitor_all=yes". > > Fixes: 1d6d953bf883 ("controller: Don't artificially limit group and meter > IDs to 16bit.") > Signed-off-by: Xavier Simonart <xsimo...@redhat.com> > ---
Thanks for the patch, Xavier! And sorry for introducing the issue in the first place. > controller/ovn-controller.c | 8 ++++++++ > 1 file changed, 8 insertions(+) > > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c > index 44605eb4e..98d95b1f3 100644 > --- a/controller/ovn-controller.c > +++ b/controller/ovn-controller.c > @@ -5654,6 +5654,14 @@ main(int argc, char *argv[]) > br_int ? br_int->name : NULL)) { > VLOG_INFO("OVS feature set changed, force recompute."); > engine_set_force_recompute(true); > + if (ovs_feature_set_discovered()) { > + lflow_output_data = > + engine_get_internal_data(&en_lflow_output); > + ovn_extend_table_reinit(&lflow_output_data->group_table, > + ovs_feature_max_select_groups_get()); > + ovn_extend_table_reinit(&lflow_output_data->meter_table, > + ovs_feature_max_meters_get()); I don't think this is correct. What if the supported number of groups/meters actually changed, from X to Y, with X != 0? In that case ovs_feature_support_run() would also return true and we'll reinitialize the "extend_tables" (call ovn_extend_table_reinit()). That currently just recreates the id_pool but doesn't touch the "ovn_extend_table->existing" map. Doesn't that mean that we'd potentially end up in an inconsistent state? Funny thing, in the v3 of the original patch that introduced the problem I was calling ovn_extend_table_clear() from ovn_extend_table_reinit() but we decided to remove it as it was already done by the caller. :) https://patchwork.ozlabs.org/project/ovn/patch/20231026113713.1718954-1-dce...@redhat.com/#3209437 Thinking more about it, I wonder, isn't it enough to change the way this works and just call ovn_extend_table_reinit() here instead of in run_S_CLEAR_FLOWS()? We should also clear the "existing" map inside ovn_extend_table_reinit() if we're changing the size of the id_pool. Wpuldn't that properly cover all scenarios? > + } > } > > if (br_int && ovs_feature_set_discovered()) { Regards, Dumitru _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev