On Thu, Jul 3, 2025 at 9:17 AM Ali AKYÜREK <aa.akyur...@gmail.com> wrote:
>
> Hi Numan, Tiago,
>
> Numan, your suggestion about investigating the Neutron agent makes sense
> for the issue of new flows not being created. However, my main concern is
> that even the existing flows are affected during the upgrade process.
>
> Following Tiago's suggestion, I adapted the process slightly and was able
> to reduce the impact of flow loss to under 5 seconds, which is a great
> improvement. Thank you for the helpful tip, Tiago!
>
> Here is how I implemented the flow preservation using ovs-save:
>
> 1. I generate the shell script with ovs-save and copy it, along with the
>    flow dump files, into a volume that is mounted from the OVS container
>    to the host.
> 2. Then I restart the openvswitch-vswitchd service (i.e., the container).
> 3. After that, I run the saved script to restore the flows.
> 4. Finally, I restart the openvswitch-db service to complete the upgrade.
>
> Throughout this entire process, I did not need to interact with the
> Neutron agent at all; even new flow creation worked fine without
> restarting or reconfiguring anything on the Neutron side.
>
> That said, I am still curious why the existing flows are lost in the
> first place during the OVS upgrade. Understanding the root cause would
> help a lot in improving the reliability of upgrades in production.
>
> Any insights on this would be much appreciated.
>
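For reference, a rough sketch of the sequence Ali describes above. It
assumes the Kolla container names openvswitch_vswitchd and openvswitch_db
used elsewhere in this thread, br-int as the bridge, and /opt/ovs-upgrade
as a host directory that is volume-mounted into the OVS container at the
same path; the bridge name and paths are illustrative, not taken from
Ali's setup:

# 1. Inside the running container, generate the restore script. ovs-save
#    prints the script to stdout and dumps the flows into a temporary
#    directory; both the script and the dump files it references must
#    survive the container replacement, which is why they go on the
#    mounted volume (TMPDIR is used here to steer the dump files there;
#    verify that your version of ovs-save honors it).
docker exec -e TMPDIR=/opt/ovs-upgrade openvswitch_vswitchd \
    /usr/share/openvswitch/scripts/ovs-save save-flows br-int \
    > /opt/ovs-upgrade/offlows_restore.sh

# 2. Restart (or upgrade and recreate) the vswitchd container; the flows
#    held in ovs-vswitchd's memory are lost at this point.
docker restart openvswitch_vswitchd

# 3. Replay the saved flows from the mounted volume.
docker exec openvswitch_vswitchd sh /opt/ovs-upgrade/offlows_restore.sh

# 4. Finally, restart the database container to complete the upgrade.
docker restart openvswitch_db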
Whenever ovs-vswitchd is stopped and then started again, if you don't
save/restore the flows as mentioned by Tiago, all the flows are gone.
ovs-vswitchd stores the OpenFlow flows in its memory, so an external
entity is expected to reprogram them. In the case of OVN, ovn-controller
does this, and I'd expect the neutron agent to do the same. It seems like
a bug in the neutron agent to me; that's why I suggested checking the
OpenStack mailing list.

@dalva...@redhat.com @jlibo...@redhat.com if you have any suggestions here.

Thanks
Numan

>
> Best regards,
>
> Ali Akyürek
>
>
> On Wed, Jul 2, 2025 at 7:34 PM Numan Siddique <num...@ovn.org> wrote:
>>
>> On Tue, Jul 1, 2025 at 8:15 AM Tiago Pires via discuss
>> <ovs-discuss@openvswitch.org> wrote:
>> >
>> > Hi Ali,
>> >
>> > I believe you can explore the ovs-save tool to save the OF flows
>> > before upgrading the chassis and restore them after upgrading the
>> > OVS package. The dataplane downtime you will see depends on the
>> > number of OF flows you have on the chassis.
>> >
>> > Example:
>> > # save the OF flows
>> > /usr/share/openvswitch/scripts/ovs-save save-flows <bridge> > /tmp/offlows_saved.sh
>> > # restore the OF flows
>> > sh /tmp/offlows_saved.sh
>> >
>> > I recommend trying it first in a lab or some controlled environment
>> > to validate the steps.
>> >
>> > Regards,
>> >
>> > Tiago Pires
>> >
>> > On Tue, Jul 1, 2025 at 4:17 AM Ali AKYÜREK via discuss
>> > <ovs-discuss@openvswitch.org> wrote:
>> > >
>> > > Hi Team,
>> > >
>> > > I'm running an OpenStack cluster deployed using Kolla-Ansible, and I'm
>> > > using Open vSwitch (OVS) as the neutron_plugin_agent. I want to
>> > > upgrade OpenStack with zero downtime.
>> > >
>> > > During an upgrade of the OVS components (openvswitch-db and
>> > > openvswitch-vswitchd), I observe that the flows are lost. These flows
>> > > do not get recreated until the neutron-openvswitch-agent service is
>> > > manually restarted, which causes a noticeable disruption in network
>> > > connectivity.
>> > >
>> > > As a workaround, I've tried the following sequence:
>> > >
>> > > docker exec openvswitch_vswitchd ovs-appctl -T 5 -t ovs-vswitchd exit
>> > > docker restart openvswitch_db
>> > > docker start openvswitch_vswitchd
>> > >
>> > > With this approach, the downtime is reduced to approximately 10
>> > > seconds, and the flows are restored without restarting the Neutron
>> > > agent.
>> > >
>> > > However, I'm looking for a way to perform the upgrade with zero
>> > > downtime, or at least without having to restart the
>> > > neutron-openvswitch-agent service.
>> > >
>> > > During the issue, I noticed the following recurring log messages in
>> > > neutron-openvswitch-agent:
>> > >
>> > > 2025-06-25 09:12:34.977 7 ERROR neutron.agent.common.ovsdb_monitor
>> > > [...] Interface monitor is not active
>> > > ...
>> > > 2025-06-25 09:12:44.980 7 ERROR neutron.agent.common.ovsdb_monitor
>> > > [...] Interface monitor is not active
>> > >
>> > > These messages appear every few seconds until the agent is restarted.
>> > > Full log snippet:
>> > >
>> > > 2025-06-25 09:12:34.976 INFO ovs_neutron_agent [...] Agent rpc_loop
>> > > - iteration:17887 started
>> > > 2025-06-25 09:12:34.977 ERROR ovsdb_monitor [...] Interface monitor
>> > > is not active
>> > > 2025-06-25 09:12:34.977 INFO ovs_neutron_agent [...] Agent rpc_loop
>> > > - iteration:17887 completed. Processed ports statistics: {'regular':
>> > > {'added': 0, 'updated': 0, 'removed': 0}}. Elapsed:0.001
>> > > ...
>> > > 2025-06-25 09:12:46.981 INFO ovs_neutron_agent [...] Agent rpc_loop
>> > > - iteration:17893 - starting polling. Elapsed:0.001
>> > > 2025-06-25 09:12:46.982 INFO ovs_neutron_agent [...] Agent rpc_loop
>> > > - iteration:17893 - port information retrieved. Elapsed:0.002
>> > >
>> > > Has anyone encountered a similar issue or found a reliable strategy
>> > > for upgrading OVS in a containerized Kolla environment without flow
>> > > loss?
>> > >
>> > > Thanks in advance for your support and suggestions.
>>
>>
>> To me it looks like there is some issue with the neutron agent. I'd
>> suggest asking this question on the OpenStack mailing list.
>>
>> Thanks
>> Numan
>>
>> > >
>> > > Best regards,
>> > >
>> > > Ali
>> > >

_______________________________________________
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
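On the root-cause question: since ovs-vswitchd keeps the OpenFlow tables
only in memory, the loss is easy to confirm by counting the flows around
a restart. A minimal check, again assuming the openvswitch_vswitchd
container name from this thread, br-int as the bridge, and that ovs-ofctl
is available inside the Kolla OVS image (all of these are assumptions
about the local setup):

# Flow count while everything is healthy.
docker exec openvswitch_vswitchd ovs-ofctl dump-flows br-int | wc -l

# Right after the vswitchd container restarts, the same command returns
# almost nothing: with fail-mode 'secure' (the Neutron default for br-int)
# the bridge has no flows at all; with 'standalone' only the default
# NORMAL flow remains. Nothing re-adds the Neutron/OVN flows until an
# external controller reprograms them or a saved ovs-save script is
# replayed.
docker restart openvswitch_vswitchd
docker exec openvswitch_vswitchd ovs-ofctl dump-flows br-int | wc -l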