Hi Numan, Tiago,

Numan, your suggestion about investigating the Neutron agent makes sense for the case of new flows not being created. However, my main concern is that *even the existing flows are being affected* during the upgrade process.
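For what it's worth, a simple way to see the existing flows disappearing (and to time how long they are gone) is to watch the flow count on the integration bridge while the upgrade runs. A minimal sketch, assuming the default Kolla container name and br-int as the integration bridge:

    # Print a timestamped flow count for br-int once per second. The count
    # dropping towards zero (or docker exec failing while the container is
    # being replaced) marks the window in which the existing flows are gone.
    while true; do
        echo "$(date +%T) $(docker exec openvswitch_vswitchd ovs-ofctl dump-flows br-int 2>/dev/null | grep -c 'cookie=')"
        sleep 1
    done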
Following Tiago’s suggestion, I adapted the process slightly and was able to reduce the impact of flow loss to *under 5 seconds*, which is a great improvement. Thank you for the helpful tip, Tiago!

Here is how I implemented the flow preservation using ovs-save:

- I generate the sh script with ovs-save and copy it, along with the flow dump files, into a volume that is mounted from the OVS container to the host.
- Then, I restart the openvswitch-vswitchd service (i.e., the container).
- After that, I run the saved script to restore the flows.
- Finally, I restart the openvswitch-db service to complete the upgrade.
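In concrete terms the sequence looks roughly like this. It is only a sketch: the container names, the bridge (br-int), and /var/lib/openvswitch as the volume shared with the host are just what my Kolla-Ansible defaults look like, not anything authoritative.

    # 1) Generate the restore script inside the running vswitchd container.
    #    ovs-save dumps the bridge's flows to files in a temporary directory
    #    and prints a script that re-adds them with ovs-ofctl. Both the script
    #    and the dump files it references (by absolute path) have to live
    #    somewhere that survives the container restart, hence the shared volume.
    docker exec openvswitch_vswitchd sh -c \
        '/usr/share/openvswitch/scripts/ovs-save save-flows br-int > /var/lib/openvswitch/offlows_restore.sh'
    # (copy the flow dump files referenced inside offlows_restore.sh onto the
    #  same shared volume and fix up the paths in the script if needed)

    # 2) Restart vswitchd; the upgraded container comes up with empty flow tables.
    docker restart openvswitch_vswitchd

    # 3) Re-install the saved flows.
    docker exec openvswitch_vswitchd sh /var/lib/openvswitch/offlows_restore.sh

    # 4) Restart ovsdb-server to complete the upgrade.
    docker restart openvswitch_db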
Throughout this entire process, I *did not need to interact with the Neutron agent at all*; even new flow creation worked fine without restarting or reconfiguring anything on the Neutron side.

That said, I’m still curious about *why* the existing flows are being lost in the first place during the OVS upgrade. Understanding the root cause would help a lot in improving the reliability of upgrades in production. Any insights on this would be much appreciated.

Best regards,
Ali Akyürek

Numan Siddique <num...@ovn.org> wrote on Wed, 2 Jul 2025 at 19:34:
> On Tue, Jul 1, 2025 at 8:15 AM Tiago Pires via discuss <ovs-discuss@openvswitch.org> wrote:
> >
> > Hi Ali,
> >
> > I believe you can explore the ovs-save tool to save the OF flows before upgrading the chassis and restore them after upgrading the OVS package. The number of OF flows you have on the chassis will determine the dataplane downtime you will see.
> >
> > Example:
> > # save the OF flows
> > /usr/share/openvswitch/scripts/ovs-save save-flows <bridge> > /tmp/offlows_saved.sh
> > # restore the OF flows
> > sh /tmp/offlows_saved.sh
> >
> > I recommend trying it in a lab or some other controlled environment first to validate the steps.
> >
> > Regards,
> >
> > Tiago Pires
> >
> > On Tue, Jul 1, 2025 at 4:17 AM Ali AKYÜREK via discuss <ovs-discuss@openvswitch.org> wrote:
> > >
> > > Hi Team,
> > >
> > > I’m running an OpenStack cluster deployed using Kolla-Ansible, and I’m using Open vSwitch (OVS) as the neutron_plugin_agent. I want to upgrade OpenStack with zero downtime.
> > >
> > > During an upgrade of the OVS components (openvswitch-db and openvswitch-vswitchd), I observe that the flows are lost. These flows do not get recreated until the neutron-openvswitch-agent service is manually restarted, which causes a noticeable disruption in network connectivity.
> > >
> > > As a workaround, I’ve tried the following sequence:
> > >
> > > docker exec openvswitch_vswitchd ovs-appctl -T 5 -t ovs-vswitchd exit
> > > docker restart openvswitch_db
> > > docker start openvswitch_vswitchd
> > >
> > > With this approach, the downtime is reduced to approximately 10 seconds, and the flows are restored without restarting the Neutron agent.
> > >
> > > However, I’m looking for a way to perform the upgrade with zero downtime, or at least without having to restart the neutron-openvswitch-agent service.
> > >
> > > During the issue, I noticed the following recurring log messages in neutron-openvswitch-agent:
> > >
> > > 2025-06-25 09:12:34.977 7 ERROR neutron.agent.common.ovsdb_monitor [...] Interface monitor is not active
> > > ...
> > > 2025-06-25 09:12:44.980 7 ERROR neutron.agent.common.ovsdb_monitor [...] Interface monitor is not active
> > >
> > > These messages appear every few seconds until the agent is restarted. Full log snippet:
> > >
> > > 2025-06-25 09:12:34.976 INFO ovs_neutron_agent [...] Agent rpc_loop - iteration:17887 started
> > > 2025-06-25 09:12:34.977 ERROR ovsdb_monitor [...] Interface monitor is not active
> > > 2025-06-25 09:12:34.977 INFO ovs_neutron_agent [...] Agent rpc_loop - iteration:17887 completed. Processed ports statistics: {'regular': {'added': 0, 'updated': 0, 'removed': 0}}. Elapsed:0.001
> > > ...
> > > 2025-06-25 09:12:46.981 INFO ovs_neutron_agent [...] Agent rpc_loop - iteration:17893 - starting polling. Elapsed:0.001
> > > 2025-06-25 09:12:46.982 INFO ovs_neutron_agent [...] Agent rpc_loop - iteration:17893 - port information retrieved. Elapsed:0.002
> > >
> > > Has anyone encountered a similar issue or found a reliable strategy for upgrading OVS in a containerized Kolla environment without flow loss?
> > >
> > > Thanks in advance for your support and suggestions.
>
> To me it looks like there is some issue with the neutron agent. I'd suggest asking this question on the OpenStack mailing list.
>
> Thanks
> Numan
>
> > > Best regards,
> > >
> > > Ali
_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss