Hi Kevin,
Sorry for the late reply. We tried doing that and were still seeing
the same issues. I don't think the bug was quite the same as what we
were seeing.
Unfortunately we have had to roll back to Mitaka, as we had a tight
deadline and not being able to create networks or have HA was fairly
critical. Interestingly, now that we are back on Mitaka, everything is
working fine.
I will try and get a testing environment set up to see if I get the same
results as we were seeing when we upgraded to Newton from Mitaka. I am
not sure if it is something to do with our specific setup, but we have
followed the OSA guidelines and as everything was working on Liberty and
Mitaka I assume we have it all set up correctly.
I will keep you posted on our findings, as we may be onto another bug.
Regards,
On 06/12/16 14:07, Kevin Benton wrote:
There was a bug, with fixes merged only recently, where removing a
router on the L3 agent was done in the wrong order, which caused
problems cleaning up the interfaces with Linux Bridge + L3HA.
https://bugs.launchpad.net/neutron/+bug/1629159
It could be the case that there is an orphaned veth pair in a deleted
namespace from the same router when it was removed from the L3 agent.
For each L3 agent, can you shut down the L3 agent, run the netns
cleanup script, ensure all keepalived processes are dead, and then
start the agent again?
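
The "ensure all keepalived processes are dead" step above can be checked with a small script. This is only a rough sketch and not part of neutron: the function names are made up for illustration, and it simply parses `ps -eo pid,args` output and reports any process whose executable is named keepalived.

```python
import subprocess


def keepalived_pids(ps_output):
    """Return PIDs whose executable is named keepalived in `ps -eo pid,args` output."""
    pids = []
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        # Skip the header row and any line without a numeric PID and a command.
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        executable = parts[1].split()[0]
        if executable.rsplit("/", 1)[-1] == "keepalived":
            pids.append(int(parts[0]))
    return pids


def running_keepalived():
    """Run ps on the local host and return the PIDs of live keepalived processes."""
    result = subprocess.run(
        ["ps", "-eo", "pid,args"], capture_output=True, text=True, check=True
    )
    return keepalived_pids(result.stdout)


if __name__ == "__main__":
    leftover = running_keepalived()
    print("keepalived still running:", leftover if leftover else "none")
```

Run it on each L3 agent host after stopping the agent and running the cleanup; an empty result suggests it is safe to start the agent again.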
On Tue, Dec 6, 2016 at 4:59 AM, Grant Morley <[email protected]> wrote:
They both appear to be "ACTIVE" which is what I would expect:
root@management-1-utility-container-f1222d05:~# neutron port-show 8cd027f1-9f8c-4077-9c8a-92abc62fadd4
+-----------------------+--------------------------------------------------------------------------------------+
| Field | Value |
+-----------------------+--------------------------------------------------------------------------------------+
| admin_state_up | True |
| allowed_address_pairs | |
| binding:host_id | network-1-neutron-agents-container-11d47568 |
| binding:profile | {} |
| binding:vif_details | {"port_filter": true} |
| binding:vif_type | bridge |
| binding:vnic_type | normal |
| created_at | 2016-12-05T10:58:01Z |
| description | |
| device_id | a8a10308-d62f-420f-99cf-f3727ef2b784 |
| device_owner | network:router_ha_interface |
| extra_dhcp_opts | |
| fixed_ips | {"subnet_id": "6495d542-4b78-40df-84af-31500aaa0bf8", "ip_address": "169.254.192.5"} |
| id | 8cd027f1-9f8c-4077-9c8a-92abc62fadd4 |
| mac_address | fa:16:3e:58:a1:a4 |
| name | HA port tenant e0ffdeb1e910469d9e625b95f2fa6c54 |
| network_id | 2b04fc3a-5c0d-4f55-996f-8888d8bd1e1d |
| port_security_enabled | False |
| project_id | |
| revision_number | 23 |
| security_groups | |
| status | ACTIVE |
| tenant_id | |
| updated_at | 2016-12-06T10:18:00Z |
+-----------------------+--------------------------------------------------------------------------------------+
root@management-1-utility-container-f1222d05:~# neutron port-show bda1f324-3178-46e5-8638-0f454ba09cab
+-----------------------+--------------------------------------------------------------------------------------+
| Field | Value |
+-----------------------+--------------------------------------------------------------------------------------+
| admin_state_up | True |
| allowed_address_pairs | |
| binding:host_id | network-2-neutron-agents-container-40906bfc |
| binding:profile | {} |
| binding:vif_details | {"port_filter": true} |
| binding:vif_type | bridge |
| binding:vnic_type | normal |
| created_at | 2016-12-05T10:58:01Z |
| description | |
| device_id | a8a10308-d62f-420f-99cf-f3727ef2b784 |
| device_owner | network:router_ha_interface |
| extra_dhcp_opts | |
| fixed_ips | {"subnet_id": "6495d542-4b78-40df-84af-31500aaa0bf8", "ip_address": "169.254.192.1"} |
| id | bda1f324-3178-46e5-8638-0f454ba09cab |
| mac_address | fa:16:3e:c3:8a:14 |
| name | HA port tenant e0ffdeb1e910469d9e625b95f2fa6c54 |
| network_id | 2b04fc3a-5c0d-4f55-996f-8888d8bd1e1d |
| port_security_enabled | False |
| project_id | |
| revision_number | 15 |
| security_groups | |
| status | ACTIVE |
| tenant_id | |
| updated_at | 2016-12-05T14:35:16Z |
+-----------------------+--------------------------------------------------------------------------------------+
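
For anyone repeating this check across many HA ports, the status field can be pulled out of the table output programmatically. The following is a minimal sketch with made-up helper names; it assumes each table row sits on a single line (rows wrapped by a mail client are skipped), and an ACTIVE status only reflects what neutron reports, not whether traffic actually passes.

```python
def parse_port_show(text):
    """Parse `neutron port-show` table output into a {field: value} dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # border rows start with "+"
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Skip the header row and anything that is not a two-column row.
        if len(cells) != 2 or cells[0] in ("", "Field"):
            continue
        fields[cells[0]] = cells[1]
    return fields


def all_active(outputs):
    """Return True if every port-show output reports status ACTIVE."""
    return all(parse_port_show(text).get("status") == "ACTIVE" for text in outputs)
```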
On 06/12/16 12:53, Kevin Benton wrote:
Can you do a 'neutron port-show' for both of those HA ports to
check their status field?
On Tue, Dec 6, 2016 at 2:29 AM, Grant Morley <[email protected]> wrote:
Hi Kevin & Neil,
Many thanks for the reply. I have attached a screenshot
showing that we cannot ping between the L3 HA nodes in the
router namespaces. This was previously working fine with
Mitaka, and has only stopped working since the upgrade to Newton.
From the packet captures and TCP dumps, the traffic doesn't
even seem to be leaving the namespace.
In the attachment, the left-hand side shows the state of
keepalived, with both HA agents as master, and the right-hand
side is the ping attempt.
Regards,
On 06/12/16 10:14, Kevin Benton wrote:
Yes, that is a misleading warning. What is happening is that
it first tries to load the interface driver as an alias,
which produces the stevedore warning that you see, and then
it falls back to loading it by the class path, which is what
you have configured. We will need to see if there is a way
to suppress that warning when the attempt to load by alias
fails.
If you switch your interface driver setting to just
'linuxbridge', that should get rid of the warning.
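
The alias-then-class-path fallback described above can be illustrated with a small stand-alone sketch. This is not neutron's or stevedore's actual code: the alias table, the function names and the use of `collections.OrderedDict` as a stand-in class are all assumptions made so the example runs anywhere. It only shows why a full class path still loads correctly while producing a warning, and why an alias does not.

```python
import importlib

# Hypothetical alias table for this example only; the real aliases are
# registered by neutron as entry points and looked up through stevedore.
EXAMPLE_ALIASES = {"ordered": "collections.OrderedDict"}


def import_class(class_path):
    """Import a class from a dotted path such as 'package.module.ClassName'."""
    module_name, _, class_name = class_path.rpartition(".")
    return getattr(importlib.import_module(module_name), class_name)


def load_driver(name, aliases=EXAMPLE_ALIASES):
    """Try `name` as an alias first, then fall back to treating it as a class path."""
    warnings = []
    target = aliases.get(name)
    if target is None:
        # The alias lookup failed: this is where the misleading warning comes from.
        warnings.append("Could not load %s" % name)
        target = name
    return import_class(target), warnings
```

Calling `load_driver("collections.OrderedDict")` returns the class together with one warning, while `load_driver("ordered")` returns the same class with no warning, which mirrors the effect of switching the setting to the short alias.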
For both L3 HA nodes becoming master, we need a little more
info to figure out the root cause. Can you try switching
into the router namespace on one of the L3 HA nodes and see
if you can ping the other router instance across the L3 HA
network for that router?
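
As a rough sketch of the connectivity test requested above, the ping can be run inside the router namespace with `ip netns exec`. The helper name is made up; the `qrouter-<router id>` namespace naming is the usual neutron convention, and the router ID and HA address in the usage section are taken from the port-show output elsewhere in this thread.

```python
import subprocess


def ha_ping_command(router_id, peer_ip, count=3):
    """Build the command that pings a peer HA address from a router namespace."""
    namespace = "qrouter-" + router_id
    return ["ip", "netns", "exec", namespace, "ping", "-c", str(count), peer_ip]


if __name__ == "__main__":
    # Needs root on the L3 agent host that holds the namespace.
    command = ha_ping_command("a8a10308-d62f-420f-99cf-f3727ef2b784", "169.254.192.1")
    result = subprocess.run(command)
    print("peer reachable" if result.returncode == 0 else "peer NOT reachable")
```

If the peer is not reachable in either direction, both keepalived instances will stop seeing each other's VRRP advertisements and each will promote itself to master.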
On Mon, Dec 5, 2016 at 7:54 AM, Neil Jerram <[email protected]> wrote:
I have also recently been seeing 'Could not load
<whatever>InterfaceDriver' warnings from the DHCP agent,
and haven't yet understood them, although I'm pretty
sure that my interface driver really is being loaded;
otherwise none of my networking would work at all.
So it's possible that that part of your report is
benign, and just a misleading warning. That said, I am
still worried about it too, and would like to understand
it properly.
I'm not aware of seeing the other symptoms you mentioned.
Neil
On Mon, Dec 5, 2016 at 3:14 PM Grant Morley <[email protected]> wrote:
Hi All,
We have just upgraded from Mitaka to Newton. We are
running OSA and we seem to have come across some
weird networking issues since the upgrade. Basically
network access to instances is very intermittent and
seems to randomly stop working.
We are running neutron in HA and it appears that
both of the neutron nodes are now trying to be
master and are both trying to bring up the gateway
IP addresses which would be causing conflicts.
We are also seeing a lot of the following in the
"neutron-dhcp-agent" log files:
2016-12-05 14:42:24.837 2020 WARNING stevedore.named
[req-1955d0a1-1453-4c65-a93a-54e8ea39b230
1ac995c0729142289f7237222f335806
3cc95dbe91c84e3e8ebbb9893ee54d20 - - -] Could not
load neutron.agent.linux.interface.BridgeInterfaceDriver
2016-12-05 14:42:42.803 2020 INFO
neutron.agent.dhcp.agent
[req-fad7d2bb-9d3c-4192-868a-0164b382aecf
1ac995c0729142289f7237222f335806
3cc95dbe91c84e3e8ebbb9893ee54d20 - - -] Trigger
reload_allocations for port admin_state_up=True,
allowed_address_pairs=[], binding:host_id=,
binding:profile=, binding:vif_details=,
binding:vif_type=unbound, binding:vnic_type=normal,
created_at=2016-12-05T14:42:42Z, description=,
device_id=8752effa-2ff2-4ce1-be70-e9f2243612cb,
device_owner=network:floatingip, extra_dhcp_opts=[],
fixed_ips=[{u'subnet_id':
u'4ca7db2d-544a-4a97-b5a4-3cbf2467a4b7',
u'ip_address': u'XXX.XXX.XXX.XXX'}],
id=b3cf223d-8e76-484a-a649-d8a7dd435124,
mac_address=fa:16:3e:ff:0d:50, name=,
network_id=af5db886-0178-4f8d-9189-f55f773b37fa,
port_security_enabled=False, project_id=,
revision_number=4, security_groups=[], status=N/A,
tenant_id=, updated_at=2016-12-05T14:42:42Z
I am a bit concerned about neutron not being able to
load the Bridge interface driver.
Has anyone else come across this at all or have any
pointers? This was working fine in Mitaka; it just
seems that since the upgrade to Newton we have these issues.
I am able to provide more logs if they are needed.
Regards,
--
Grant Morley
Cloud Lead
Absolute DevOps Ltd
Units H, J & K, Gateway 1000, Whittle Way,
Stevenage, Herts, SG1 2FP
www.absolutedevops.io
[email protected] 0845 874 0580
_______________________________________________
OpenStack-operators mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators