Re: [openstack-dev] [neutron] Backup port info to restore the flow rules
On Mon, Feb 22, 2016 at 7:03 PM, Ihar Hrachyshka wrote:

> Agent could probably try to restore the state from its internal state. If
> that’s the missing bit you want to have, I think that could stand for a
> proper RFE.

Good point. Thanks.

--
Best,
Jian

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [neutron] Backup port info to restore the flow rules
Jian Wen wrote:

> I don't think it's enough for a large scale cloud. When the neutron
> server is not available and the flow rules are gone, we need the
> backup to restore the flow rules.

Flows should not be reset when neutron-server is down. If that’s the case, it’s a bug to fix (and we fixed one in stable/liberty+ lately).

> We have more than a thousand physical servers in our production
> environment. Rare events will occur where combined failures or
> unanticipated failures require human interaction. For example, a cron
> job accidentally killed the OvS service (flows will be gone) when one
> of RabbitMQ, MySQL and neutron server is down/unavailable.

Well, one could argue that’s an issue in the cron job itself.

Agent could probably try to restore the state from its internal state. If that’s the missing bit you want to have, I think that could stand for a proper RFE.

Ihar
Re: [openstack-dev] [neutron] Backup port info to restore the flow rules
I don't think it's enough for a large scale cloud. When the neutron server is not available and the flow rules are gone, we need the backup to restore the flow rules.

We have more than a thousand physical servers in our production environment. Rare events will occur where combined failures or unanticipated failures require human interaction. For example, a cron job accidentally killed the OvS service (flows will be gone) when one of RabbitMQ, MySQL and neutron server is down/unavailable.

On Mon, Feb 22, 2016 at 5:44 PM, Ihar Hrachyshka wrote:

> Jian Wen wrote:
>
>> Hello,
>>
>> If we restart OvS/ovs-agent when one or more of Neutron, MySQL and
>> RabbitMQ is not available, the flow rules in OvS will be gone. If
>> Neutron/MySQL/RabbitMQ doesn't become available in time, the VMs
>> will lose their network connections. It's not easy for an
>> operations engineer to manually restore the flow rules. An
>> operations engineer working under pressure at 2 a.m. will make
>> mistakes.
>>
>> We can back up the port info to a local file. In case of emergency
>> the ovs-agent can use it to restore the flow rules. What do you
>> think of this feature?
>>
>> Related bugs:
>> Restarting neutron openvswitch agent causes network hiccup by
>> throwing away all flows
>> https://bugs.launchpad.net/neutron/+bug/1383674
>>
>> Restarting OVS agent drops VMs traffic when using VLAN provider
>> bridges
>> https://bugs.launchpad.net/neutron/+bug/1514056
>>
>> After restarting an ovs agent, it still drops useful flows if the
>> neutron server is busy/down
>> https://bugs.launchpad.net/neutron/+bug/1515075
>>
>> Ovs agent loses OpenFlow rules if OVS gets restarted while Neutron is
>> disconnected from SQL
>> https://bugs.launchpad.net/neutron/+bug/1531210
>
> Most of those bugs are fixed (at least for stable/liberty+). Isn’t it
> enough to avoid data plane reset when the agent fails to fetch new port
> data from its controller? Why do we need another mechanism here?
>
> Ihar

--
Best,
Jian
Re: [openstack-dev] [neutron] Backup port info to restore the flow rules
Jian Wen wrote:

> Hello,
>
> If we restart OvS/ovs-agent when one or more of Neutron, MySQL and
> RabbitMQ is not available, the flow rules in OvS will be gone. If
> Neutron/MySQL/RabbitMQ doesn't become available in time, the VMs
> will lose their network connections. It's not easy for an
> operations engineer to manually restore the flow rules. An
> operations engineer working under pressure at 2 a.m. will make
> mistakes.
>
> We can back up the port info to a local file. In case of emergency
> the ovs-agent can use it to restore the flow rules. What do you
> think of this feature?
>
> Related bugs:
> Restarting neutron openvswitch agent causes network hiccup by
> throwing away all flows
> https://bugs.launchpad.net/neutron/+bug/1383674
>
> Restarting OVS agent drops VMs traffic when using VLAN provider
> bridges
> https://bugs.launchpad.net/neutron/+bug/1514056
>
> After restarting an ovs agent, it still drops useful flows if the
> neutron server is busy/down
> https://bugs.launchpad.net/neutron/+bug/1515075
>
> Ovs agent loses OpenFlow rules if OVS gets restarted while Neutron is
> disconnected from SQL
> https://bugs.launchpad.net/neutron/+bug/1531210

Most of those bugs are fixed (at least for stable/liberty+). Isn’t it enough to avoid data plane reset when the agent fails to fetch new port data from its controller? Why do we need another mechanism here?

Ihar
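
The behaviour described in this reply, leaving the data plane alone when the agent cannot fetch port data from its controller, might look roughly like the Python sketch below. This is not neutron's agent code: `resync_flows`, `fetch_ports`, and `build_flows` are hypothetical names used only for illustration, and the flow table is modelled as a plain list.

```python
def resync_flows(fetch_ports, build_flows, current_flows):
    """Return the flow table that should stay installed after a resync.

    fetch_ports   -- hypothetical callable that asks the controller for ports
    build_flows   -- hypothetical callable that turns ports into flow rules
    current_flows -- the rules currently installed on the bridge
    """
    try:
        ports = fetch_ports()
    except Exception:
        # Controller unreachable: do not reset the data plane,
        # keep whatever is already installed.
        return current_flows
    # Fresh data arrived, so the flows can be rebuilt from it.
    return build_flows(ports)
```

The sketch only helps while the existing flows are still present in OvS. If OvS itself was restarted and the flows are already gone, there is nothing left to preserve, which is the case the backup proposal targets.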
[openstack-dev] [neutron] Backup port info to restore the flow rules
Hello,

If we restart OvS/ovs-agent when one or more of Neutron, MySQL and RabbitMQ is not available, the flow rules in OvS will be gone. If Neutron/MySQL/RabbitMQ doesn't become available in time, the VMs will lose their network connections. It's not easy for an operations engineer to manually restore the flow rules. An operations engineer working under pressure at 2 a.m. will make mistakes.

We can back up the port info to a local file. In case of emergency the ovs-agent can use it to restore the flow rules. What do you think of this feature?

Related bugs:

Restarting neutron openvswitch agent causes network hiccup by throwing away all flows
https://bugs.launchpad.net/neutron/+bug/1383674

Restarting OVS agent drops VMs traffic when using VLAN provider bridges
https://bugs.launchpad.net/neutron/+bug/1514056

After restarting an ovs agent, it still drops useful flows if the neutron server is busy/down
https://bugs.launchpad.net/neutron/+bug/1515075

Ovs agent loses OpenFlow rules if OVS gets restarted while Neutron is disconnected from SQL
https://bugs.launchpad.net/neutron/+bug/1531210

--
Best,
Jian
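
A minimal Python sketch of the local backup proposed in this message. It is an illustration under stated assumptions, not ovs-agent code: `backup_ports`, `load_ports`, the JSON file format, and the port fields shown in the usage note are all hypothetical.

```python
import json
import os
import tempfile


def backup_ports(ports, path):
    """Atomically write port info (a list of dicts) to a local JSON file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory first, so that a
    # crash in the middle of the write never corrupts the previous backup.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".ports-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(ports, f)
            f.flush()
            os.fsync(f.fileno())
        # Rename over the old backup; atomic on POSIX filesystems.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def load_ports(path):
    """Return the backed-up ports, or an empty list if no usable backup exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return []
```

An agent could call something like `backup_ports([{"port_id": "p1", "vlan": 101}], "ports.json")` after each successful sync and fall back to `load_ports("ports.json")` only when the controller cannot be reached. How the restored port info is turned back into flow rules is outside this sketch.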