Re: [openstack-dev] [neutron] Backup port info to restore the flow rules

2016-02-22 Thread Jian Wen
On Mon, Feb 22, 2016 at 7:03 PM, Ihar Hrachyshka wrote:

> Agent could probably try to restore the state from its internal state. If
> that’s the missing bit you want to have, I think that could stand for a
> proper RFE.
>
Good point. Thanks.

-- 
Best,

Jian
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [neutron] Backup port info to restore the flow rules

2016-02-22 Thread Ihar Hrachyshka

Jian Wen wrote:


   I don't think it's enough for a large scale cloud.

   When the neutron server is not available and the flow rules are gone,
   we need the backup to restore the flow rules.


Flows should not be reset when neutron-server is down. If that’s the case,  
it’s a bug to fix (and we fixed one in stable/liberty+ lately).




   We have more than a thousand physical servers in our production environment.
   Rare events will occur where combined failures or unanticipated failures
   require human interaction. For example, a cron job accidentally killed the
   OvS service (flows will be gone) when one of RabbitMQ, MySQL and neutron
   server is down/unavailable.


Well, one could argue that’s an issue in the cron job itself.

Agent could probably try to restore the state from its internal state. If  
that’s the missing bit you want to have, I think that could stand for a  
proper RFE.


Ihar



Re: [openstack-dev] [neutron] Backup port info to restore the flow rules

2016-02-22 Thread Jian Wen
   I don't think it's enough for a large scale cloud.

   When the neutron server is not available and the flow rules are gone,
   we need the backup to restore the flow rules.

   We have more than a thousand physical servers in our production environment.
   Rare events will occur where combined failures or unanticipated failures
   require human interaction. For example, a cron job accidentally killed the
   OvS service (flows will be gone) when one of RabbitMQ, MySQL and neutron
   server is down/unavailable.


On Mon, Feb 22, 2016 at 5:44 PM, Ihar Hrachyshka wrote:

> Jian Wen wrote:
>
>> Hello,
>>
>> If we restart OvS/ovs-agent when one or more of Neutron, MySQL and
>> RabbitMQ is not available, the flow rules in OvS will be gone. If
>> Neutron/MySQL/RabbitMQ doesn't become available in time, the VMs
>> will lose their network connections. It's not easy for an
>> operations engineer to manually restore the flow rules. An
>> operations engineer working under pressure at 2 a.m. will make
>> mistakes.
>>
>> We can back up the port info to a local file. In case of emergency
>> the ovs-agent can use it to restore the flow rules. What do you
>> think of this feature?
>>
>> Related bugs:
>> Restarting neutron openvswitch agent causes network hiccup by
>> throwing away all flows
>> https://bugs.launchpad.net/neutron/+bug/1383674
>>
>> Restarting OVS agent drops VMs traffic when using VLAN provider
>> bridges
>> https://bugs.launchpad.net/neutron/+bug/1514056
>>
>> After restarting an ovs agent, it still drops useful flows if the
>> neutron server is busy/down
>> https://bugs.launchpad.net/neutron/+bug/1515075
>>
>> Ovs agent loses OpenFlow rules if OVS gets restarted while Neutron is
>> disconnected from SQL
>> https://bugs.launchpad.net/neutron/+bug/1531210
>>
>>
> Most of those bugs are fixed (at least for stable/liberty+). Isn’t it
> enough to avoid data plane reset when the agent fails to fetch new port
> data from its controller? Why do we need another mechanism here?
>
> Ihar
>



-- 
Best,

Jian


Re: [openstack-dev] [neutron] Backup port info to restore the flow rules

2016-02-22 Thread Ihar Hrachyshka

Jian Wen wrote:


Hello,

If we restart OvS/ovs-agent when one or more of Neutron, MySQL and
RabbitMQ is not available, the flow rules in OvS will be gone. If
Neutron/MySQL/RabbitMQ doesn't become available in time, the VMs
will lose their network connections. It's not easy for an
operations engineer to manually restore the flow rules. An
operations engineer working under pressure at 2 a.m. will make
mistakes.

We can back up the port info to a local file. In case of emergency
the ovs-agent can use it to restore the flow rules. What do you
think of this feature?

Related bugs:
Restarting neutron openvswitch agent causes network hiccup by throwing away 
all flows
https://bugs.launchpad.net/neutron/+bug/1383674

Restarting OVS agent drops VMs traffic when using VLAN provider bridges
https://bugs.launchpad.net/neutron/+bug/1514056

After restarting an ovs agent, it still drops useful flows if the neutron 
server is busy/down
https://bugs.launchpad.net/neutron/+bug/1515075

Ovs agent loses OpenFlow rules if OVS gets restarted while Neutron is 
disconnected from SQL
https://bugs.launchpad.net/neutron/+bug/1531210



Most of those bugs are fixed (at least for stable/liberty+). Isn’t it  
enough to avoid data plane reset when the agent fails to fetch new port  
data from its controller? Why do we need another mechanism here?
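
To illustrate the behaviour asked about above, here is a minimal Python
sketch. It is not the actual ovs-agent code. It assumes a hypothetical
fetch_ports callable that raises a hypothetical ControllerUnavailable
exception when neutron-server cannot be reached, and a hypothetical
build_flows helper. The only point is that a failed fetch leaves the
installed flows untouched instead of resetting the data plane.

```python
class ControllerUnavailable(Exception):
    """Hypothetical error: the agent could not fetch port data from the server."""


def resync_flows(fetch_ports, installed_flows, build_flows):
    """Return the flow set that should be installed after a resync attempt.

    fetch_ports and build_flows are illustrative callables, not Neutron APIs.
    If fetching port data fails, the currently installed flows are returned
    unchanged, so the data plane is not reset.
    """
    try:
        ports = fetch_ports()
    except ControllerUnavailable:
        # Controller is down or busy: keep what is already programmed.
        return installed_flows
    # Fresh port data is available: compute the new flow set from it.
    return build_flows(ports)
```

Note that this only helps while the existing flows are still present in
OvS. If OvS itself was restarted and the flows are already gone, there is
nothing left to keep, which is the case the original proposal is about.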


Ihar



[openstack-dev] [neutron] Backup port info to restore the flow rules

2016-02-16 Thread Jian Wen
Hello,

If we restart OvS/ovs-agent when one or more of Neutron, MySQL and
RabbitMQ is not available, the flow rules in OvS will be gone. If
Neutron/MySQL/RabbitMQ doesn't become available in time, the VMs
will lose their network connections. It's not easy for an
operations engineer to manually restore the flow rules. An
operations engineer working under pressure at 2 a.m. will make
mistakes.

We can back up the port info to a local file. In case of emergency
the ovs-agent can use it to restore the flow rules. What do you
think of this feature?
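
A rough Python sketch of what the local backup could look like, under
stated assumptions: the record fields (port_id, mac_address, local_vlan,
ofport) and the function names are hypothetical and are not taken from
the ovs-agent. The file is written to a temporary name and then renamed,
so a crash in the middle of a write does not leave a truncated backup.

```python
import json
import os
import tempfile


def save_port_backup(ports, path):
    """Atomically write a list of port records (plain dicts) to a JSON file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory so the rename is atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(ports, f, sort_keys=True)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the partial temporary file, then re-raise the error.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def load_port_backup(path):
    """Return the saved port records, or an empty list if no usable backup exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except (OSError, ValueError):
        # Missing, unreadable, or corrupt file: treat as "no backup".
        return []
```

The agent would call something like save_port_backup() whenever its view
of the local ports changes, and load_port_backup() on startup when the
server cannot be reached. Turning the loaded records back into flow rules
is left out here because it depends on the agent's internals.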

Related bugs:
Restarting neutron openvswitch agent causes network hiccup by throwing
away all flows
https://bugs.launchpad.net/neutron/+bug/1383674

Restarting OVS agent drops VMs traffic when using VLAN provider bridges
https://bugs.launchpad.net/neutron/+bug/1514056

After restarting an ovs agent, it still drops useful flows if the
neutron server is busy/down
https://bugs.launchpad.net/neutron/+bug/1515075

Ovs agent loses OpenFlow rules if OVS gets restarted while Neutron is
disconnected from SQL
https://bugs.launchpad.net/neutron/+bug/1531210


-- 
Best,

Jian