Sorry about the long delay.

Can you dump the OVS flows before and after the outage? This will let us
know whether the flows Neutron set up are getting wiped out.
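
Something along these lines on each bridge, captured before and after the
outage, should be enough (a minimal sketch; the output file names are just
examples, and you may need -O OpenFlow13 if plain dump-flows complains about
OpenFlow version negotiation on a bridge):

    # capture the flow tables currently programmed on each bridge
    ovs-ofctl dump-flows br-int > flows-br-int.txt
    ovs-ofctl dump-flows br-tun > flows-br-tun.txt
    ovs-ofctl dump-flows br-ex  > flows-br-ex.txt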

On Tue, May 2, 2017 at 12:26 PM, Gustavo Randich <gustavo.rand...@gmail.com>
wrote:

> Hi Kevin, here is some information about this issue:
>
> - if the network outage lasts less than ~1 minute, connectivity to the
> host and the instances is restored automatically without problems
>
> - otherwise:
>
>     - upon outage, "ovs-vsctl show" reports "is_connected: true" on all
>     bridges (br-ex / br-int / br-tun)
>
>     - after about ~1 minute, "ovs-vsctl show" stops reporting
>     "is_connected: true" on every bridge
>
>     - upon restoring the physical interface (fixing the outage):
>
>         - "ovs-vsctl show" again reports "is_connected: true" on all
>         bridges (br-ex / br-int / br-tun)
>
>         - access to the host and the VMs is NOT restored, although the
>         host sporadically answers some pings (~1 out of 20)
>
> - to restore connectivity, we (see the commands summarised after this list):
>
>     - execute "ifdown br-ex; ifup br-ex" -> access to the host is
>     restored, but not to the VMs
>
>     - restart neutron-openvswitch-agent -> access to the VMs is restored
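>
> For reference, the recovery sequence on an affected host boils down to
> roughly the following (service restart shown via systemctl, as on our
> Ubuntu 16.04 hosts; the exact invocation may differ):
>
>     # bounce the external bridge interface to recover access to the host
>     ifdown br-ex; ifup br-ex
>
>     # restart the OVS agent to recover access to the VMs
>     systemctl restart neutron-openvswitch-agent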
>
> Thank you!
>
>
> On Fri, Apr 28, 2017 at 5:07 PM, Kevin Benton <ke...@benton.pub> wrote:
>
>> With the network down, does "ovs-vsctl show" report that the bridges are
>> still connected to the controller?
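>>
>> A quick way to check is something like:
>>
>>     # one row per configured bridge controller; is_connected shows the
>>     # state of the OpenFlow session
>>     ovs-vsctl --columns=target,is_connected list Controller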
>>
>> On Fri, Apr 28, 2017 at 2:21 PM, Gustavo Randich <
>> gustavo.rand...@gmail.com> wrote:
>>
>>> Exactly, we access the host via a tagged internal interface, which is
>>> part of br-ex:
>>>
>>> # ip a show vlan171
>>> 16: vlan171: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue
>>> state UNKNOWN group default qlen 1
>>>     link/ether 8e:14:8d:c1:1a:5f brd ff:ff:ff:ff:ff:ff
>>>     inet 10.171.1.240/20 brd 10.171.15.255 scope global vlan171
>>>        valid_lft forever preferred_lft forever
>>>     inet6 fe80::8c14:8dff:fec1:1a5f/64 scope link
>>>        valid_lft forever preferred_lft forever
>>>
>>> # ovs-vsctl show
>>>     ...
>>>     Bridge br-ex
>>>         Controller "tcp:127.0.0.1:6633"
>>>             is_connected: true
>>>         Port "vlan171"
>>>             tag: 171
>>>             Interface "vlan171"
>>>                 type: internal
>>>     ...
>>>
>>>
>>> On Fri, Apr 28, 2017 at 3:03 PM, Kevin Benton <ke...@benton.pub> wrote:
>>>
>>>> Ok, that's likely not the issue then. I assume the way you access each
>>>> host is via an IP assigned to an OVS bridge or an interface that somehow
>>>> depends on OVS?
>>>>
>>>> On Apr 28, 2017 12:04, "Gustavo Randich" <gustavo.rand...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Kevin, we are using the default listen address, i.e. the loopback
>>>>> interface:
>>>>>
>>>>> # grep -r of_listen_address /etc/neutron
>>>>> /etc/neutron/plugins/ml2/openvswitch_agent.ini:#of_listen_address = 127.0.0.1
>>>>>
>>>>>
>>>>>         tcp/127.0.0.1:6640 -> ovsdb-server /etc/openvswitch/conf.db
>>>>> -vconsole:emer -vsyslog:err -vfile:info 
>>>>> --remote=punix:/var/run/openvswitch/db.sock
>>>>> --private-key=db:Open_vSwitch,SSL,private_key
>>>>> --certificate=db:Open_vSwitch,SSL,certificate
>>>>> --bootstrap-ca-cert=db:Open_vSwitch,SSL,ca_cert --no-chdir
>>>>> --log-file=/var/log/openvswitch/ovsdb-server.log
>>>>> --pidfile=/var/run/openvswitch/ovsdb-server.pid --detach --monitor
>>>>>
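>>>>> (For completeness, the native OpenFlow interface options live in the
>>>>> [ovs] section of openvswitch_agent.ini; the values below are simply the
>>>>> defaults we appear to be running with, written out explicitly:)
>>>>>
>>>>>     [ovs]
>>>>>     of_listen_address = 127.0.0.1
>>>>>     of_listen_port = 6633
>>>>>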
>>>>> Thanks
>>>>>
>>>>> On Fri, Apr 28, 2017 at 5:00 AM, Kevin Benton <ke...@benton.pub>
>>>>> wrote:
>>>>>
>>>>>> Is your of_listen_address set to the address of an interface that is
>>>>>> being brought down?
>>>>>>
>>>>>> On Apr 25, 2017 17:34, "Gustavo Randich" <gustavo.rand...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> (using Mitaka / Ubuntu 16 / Neutron DVR / OVS / VXLAN /
>>>>>>> l2_population)
>>>>>>>
>>>>>>> This sounds very strange (to me): recently, after a switch outage, we
>>>>>>> lost connectivity to all our Mitaka hosts. We had to log in via iLO,
>>>>>>> host by host, and restart the networking service to regain access to
>>>>>>> each host, and then restart neutron-openvswitch-agent to regain access
>>>>>>> to the VMs.
>>>>>>>
>>>>>>> At first glance we thought it was a problem with the hosts' Linux NIC
>>>>>>> driver not detecting link state correctly.
>>>>>>>
>>>>>>> Then we reproduced the issue by simply bringing the physical
>>>>>>> interfaces down for around 5 minutes and then up again. Same issue.
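>>>>>>>
>>>>>>> (The reproduction is essentially just the following, where "eth0"
>>>>>>> stands in for whichever physical NIC the bridges hang off on the host:)
>>>>>>>
>>>>>>>     ip link set eth0 down   # "eth0" is a placeholder for the real NIC
>>>>>>>     sleep 300               # keep the outage well past the ~1 minute mark
>>>>>>>     ip link set eth0 up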
>>>>>>>
>>>>>>> And then... we found that if, instead of the native (ryu) OpenFlow
>>>>>>> interface in the Neutron Open vSwitch agent, we use the ovs-ofctl
>>>>>>> interface, the problem disappears.
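>>>>>>>
>>>>>>> (i.e. roughly this change in the [ovs] section of
>>>>>>> /etc/neutron/plugins/ml2/openvswitch_agent.ini on each host:)
>>>>>>>
>>>>>>>     [ovs]
>>>>>>>     # previously using the native (ryu) driver
>>>>>>>     of_interface = ovs-ofctl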
>>>>>>>
>>>>>>> Any clue?
>>>>>>>
>>>>>>> Thanks in advance.
>>>>>>>
>>>>>>>
>>>>>
>>>
>>
>
_______________________________________________
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
