[Yahoo-eng-team] [Bug 1716401] Re: FWaaS: Ip tables rules do not get updated in case of distributed virtual routers (DVR)
*** This bug is a duplicate of bug 1845557 ***
    https://bugs.launchpad.net/bugs/1845557

** This bug is no longer a duplicate of bug 1715395
   FWaaS: Firewall creation fails in case of distributed routers (Pike)
** This bug has been marked a duplicate of bug 1845364
   [fullstack] Race condition when updating the router port information and updating the network MTU

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1716401

Title: FWaaS: Ip tables rules do not get updated in case of distributed virtual routers (DVR)
Status in neutron: New

Bug description:

I have set up an HA/DVR deployment of OpenStack Pike on Ubuntu 16.04 and enabled FWaaS v1. After applying the fix from Bug #1715395, firewall rules get created in case of HA/DVR, but updates do not have any effect, e.g. when you disassociate a firewall from a distributed router.

Use Case:

1. Set up an HA/DVR deployment of OpenStack Pike.

2. Create a firewall rule.

$ neutron firewall-rule-create --name test-rule --protocol icmp --action reject
Created a new firewall_rule:
+------------------------+--------------------------------------+
| Field                  | Value                                |
+------------------------+--------------------------------------+
| action                 | reject                               |
| description            |                                      |
| destination_ip_address |                                      |
| destination_port       |                                      |
| enabled                | True                                 |
| firewall_policy_id     |                                      |
| id                     | 6c2516cb-b69d-46b6-958e-e47c1cf1709e |
| ip_version             | 4                                    |
| name                   | test-rule                            |
| position               |                                      |
| project_id             | ed2d2efd86dd40e7a45491d8502318d3     |
| protocol               | icmp                                 |
| shared                 | False                                |
| source_ip_address      |                                      |
| source_port            |                                      |
| tenant_id              | ed2d2efd86dd40e7a45491d8502318d3     |
+------------------------+--------------------------------------+

3. Create a firewall policy.
$ neutron firewall-policy-create --firewall-rules test-rule test-policy
Created a new firewall_policy:
+----------------+--------------------------------------+
| Field          | Value                                |
+----------------+--------------------------------------+
| audited        | False                                |
| description    |                                      |
| firewall_rules | 6c2516cb-b69d-46b6-958e-e47c1cf1709e |
| id             | 53a8d733-e81c-4113-9354-d40b5b426e00 |
| name           | test-policy                          |
| project_id     | ed2d2efd86dd40e7a45491d8502318d3     |
| shared         | False                                |
| tenant_id      | ed2d2efd86dd40e7a45491d8502318d3     |
+----------------+--------------------------------------+

4. Create a firewall.

$ neutron firewall-create --name test-firewall test-policy
Created a new firewall:
+--------------------+--------------------------------------+
| Field              | Value                                |
+--------------------+--------------------------------------+
| admin_state_up     | True                                 |
| description        |                                      |
| firewall_policy_id | 53a8d733-e81c-4113-9354-d40b5b426e00 |
| id                 | a468caca-c555-4f89-adbc-bcdbb06a3fca |
| name               | test-firewall                        |
| project_id         | ed2d2efd86dd40e7a45491d8502318d3     |
| router_ids         |                                      |
| status             | INACTIVE                             |
| tenant_id          | ed2d2efd86dd40e7a45491d8502318d3     |
+--------------------+--------------------------------------+

5. Assign the firewall to a distributed router.

$ neutron firewall-update --router demo-router test-firewall
Updated firewall: test-firewall

6. Spawn a virtual machine and assign a floating ip.

7. Check namespaces on the compute node hosting the virtual machine.

$ ip netns
fip-4a3959c3-b011-4bd0-8f4f-f405be92d9ac
qrouter-09a379b5-907f-4e3e-b29a-8701b82f2641

8. Check ip tables rules in the router's namespace.

$ ip netns exec qrouter-09a379b5-907f-4e3e-b29a-8701b82f2641 iptables -n -L -v
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts
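The stale-rule symptom described above can be demonstrated by changing a rule and re-inspecting the namespace. This is only a sketch reusing the IDs from the transcript above; the `firewall-rule-update` invocation is illustrative and not taken from this report:

```shell
# After any firewall update, the iptables rules in the qrouter
# namespace should change; per this bug report they do not.
neutron firewall-rule-update test-rule --action allow
sudo ip netns exec qrouter-09a379b5-907f-4e3e-b29a-8701b82f2641 \
    iptables -S | grep -i icmp
# If the old REJECT rule is still listed, the update was not applied.
```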
[Yahoo-eng-team] [Bug 1845557] [NEW] DVR: FWaaS rules created for a router after the FIP and VM created, not applied to routers rfp port on router-update
Public bug reported:

This was seen in Rocky. When the network, subnet, router, and a VM instance with a FloatingIP are created before attaching Firewall rules to the router, the Firewall rules are not applied to the 'rfp' port for north-south routing when using Firewall-as-a-Service in legacy 'iptables' mode. After applying the Firewall rules to the Router, it is expected that the router-update would trigger adding the Firewall rules to the existing routers, but the logic is not right. Any new VM added to the subnet on a new compute host gets the Firewall rules applied to the 'rfp' interface. So the only way to get around this problem is to restart the 'l3-agent'. Once the 'l3-agent' is restarted, the Firewall rules are applied again. This is also true when Firewall rules are removed after the VM and routers are in place: since the update is not handled properly, the firewall rules may stay there until we restart the l3-agent.

How to reproduce this problem (this is FWaaS v2 with legacy 'iptables'):
1. Create a Network.
2. Create a Subnet.
3. Create a Router (DVR).
4. Attach the Subnet to the router.
5. Assign the gateway to the router.
6. Create a VM on the given private network.
7. Create a FloatingIP and associate it with the VM's private IP.
8. Now the VM, router, and fip namespace are all in place.
9. Now create Firewall rules:
   neutron firewall-rule-create --protocol icmp --action allow --name allow-icmp
   neutron firewall-rule-create --protocol tcp --destination-port 80 --action deny --name deny-http
   neutron firewall-rule-create --protocol tcp --destination-port 22 --action allow --name allow-ssh
10. Then create a firewall policy:
   neutron firewall-policy-create --firewall-rules "allow-icmp deny-http allow-ssh" policy-fw
11. Create a firewall:
   neutron firewall-create policy-fw --name user-fw
12. Check if the firewall was created:
   neutron firewall-show user-fw
13.
If the firewall was created after the router has been created, based on the documentation you need to manually update the router:
   $ neutron firewall-update --router --router
14. After the update we would expect all existing routers, router-1 and router-2, to have the firewall rules. But we don't see it configured for router-1, which was created before the firewall was created, and so the VM is not protected by the Firewall rules.

** Affects: neutron
   Importance: Undecided
     Assignee: Swaminathan Vasudevan (swaminathan-vasudevan)
       Status: Confirmed

** Tags: fwaas l3-dvr-backlog

** Changed in: neutron
     Assignee: (unassigned) => Swaminathan Vasudevan (swaminathan-vasudevan)

** Changed in: neutron
       Status: New => Confirmed

** Summary changed:
- DVR: FWaaS rules created for a router after the FIP and VM created not applied to routers rfp port
+ DVR: FWaaS rules created for a router after the FIP and VM created, not applied to routers rfp port on router-update

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1845557

Title: DVR: FWaaS rules created for a router after the FIP and VM created, not applied to routers rfp port on router-update
Status in neutron: Confirmed
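The workaround the report describes can be sketched as follows; the systemd unit name and the router UUID are placeholders that vary per deployment:

```shell
# Workaround from this report: restart the l3-agent so the FWaaS rules
# are re-applied to the router namespace. Service name may differ per
# distribution.
sudo systemctl restart neutron-l3-agent
# Then confirm the rules landed on the rfp interface of the router
# namespace (<router-uuid> is a placeholder):
sudo ip netns exec qrouter-<router-uuid> iptables -S | grep rfp
```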
[Yahoo-eng-team] [Bug 1840979] Re: [L2] [opinion] update the port DB status directly in agent-side
** Changed in: neutron
       Status: New => Opinion

** Changed in: neutron
   Importance: Undecided => Wishlist

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1840979

Title: [L2] [opinion] update the port DB status directly in agent-side
Status in neutron: Opinion

Bug description:

When the ovs-agent is done processing a port, it calls neutron-server to make some DB updates. Especially when the ovs-agent is restarted, all ports on the agent do such RPC and DB updates again to make the port status consistent. When a large number of concurrent agent restarts happen, neutron-server may not work fine. So how about making the following DB updates locally on the neutron agent side directly? There may be some mechanism driver notifications; IMO, this can also be done on the agent side.

    def update_device_down(self, context, device, agent_id, host=None):
        cctxt = self.client.prepare()
        return cctxt.call(context, 'update_device_down', device=device,
                          agent_id=agent_id, host=host)

    def update_device_up(self, context, device, agent_id, host=None):
        cctxt = self.client.prepare()
        return cctxt.call(context, 'update_device_up', device=device,
                          agent_id=agent_id, host=host)

    def update_device_list(self, context, devices_up, devices_down,
                           agent_id, host):
        cctxt = self.client.prepare()
        ret = cctxt.call(context, 'update_device_list',
                         devices_up=devices_up, devices_down=devices_down,
                         agent_id=agent_id, host=host)
        return ret

To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1840979/+subscriptions

-- Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1824571] Re: l3agent can't create router if there are multiple external networks
** Changed in: neutron
       Status: Confirmed => Fix Released

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1824571

Title: l3agent can't create router if there are multiple external networks
Status in neutron: Fix Released

Bug description:

In case there is more than one external network, the l3 agent is unable to create routers, failing with the following error:

2019-04-12 17:33:18.844 103 ERROR neutron.agent.l3.agent Traceback (most recent call last):
2019-04-12 17:33:18.844 103 ERROR neutron.agent.l3.agent   File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 701, in _process_routers_if_compatible
2019-04-12 17:33:18.844 103 ERROR neutron.agent.l3.agent     self._process_router_if_compatible(router)
2019-04-12 17:33:18.844 103 ERROR neutron.agent.l3.agent   File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 548, in _process_router_if_compatible
2019-04-12 17:33:18.844 103 ERROR neutron.agent.l3.agent     target_ex_net_id = self._fetch_external_net_id()
2019-04-12 17:33:18.844 103 ERROR neutron.agent.l3.agent   File "/var/lib/kolla/venv/local/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 376, in _fetch_external_net_id
2019-04-12 17:33:18.844 103 ERROR neutron.agent.l3.agent     raise Exception(msg)
2019-04-12 17:33:18.844 103 ERROR neutron.agent.l3.agent Exception: The 'gateway_external_network_id' option must be configured for this agent as Neutron has more than one external network.

It happens in a DVR scenario on both dvr and dvr_snat agents, and it started after upgrading from Rocky to Stein; before the upgrade it worked fine. The gateway_external_network_id is not set in my config, because I want the l3 agent to be able to use multiple external networks.
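As the exception message indicates, pinning the agent to a single external network silences the error, at the cost of exactly the multi-network setup the reporter wants. A sketch of the relevant l3_agent.ini fragment, with a placeholder UUID:

```ini
[DEFAULT]
# Workaround only: binds this l3 agent to one external network.
# <external-net-uuid> is a placeholder, not from this bug report.
gateway_external_network_id = <external-net-uuid>
```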
To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1824571/+subscriptions
[Yahoo-eng-team] [Bug 1824566] Re: DVR-Nexthop static routes are not getting configured in FIP Namespace when disassociating and reassociating a FloatingIP in Ocata
** Changed in: neutron
       Status: Fix Committed => Fix Released

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1824566

Title: DVR-Nexthop static routes are not getting configured in FIP Namespace when disassociating and reassociating a FloatingIP in Ocata
Status in neutron: Fix Released

Bug description:

Nexthop static routes for an external network are not getting configured in the FIP Namespace table after disassociating and re-associating a FloatingIP. This is seen in Ocata and Newton; not seen in Pike and later branches.

Steps to reproduce this problem:
1. Deploy the devstack cloud with DVR routers.
2. Create a VM.
3. Assign a FloatingIP to the VM.
4. Now configure a Nexthop static route for the external Network.
5. Make sure the Nexthop routes are seen in the SNAT Namespace and in the FIP Namespace under the router-specific lookup table-id.
6. Now disassociate the FloatingIP.
7. Make sure that the Nexthop routes are cleared from the FIP Namespace (if this is the only FloatingIP) under the router-specific lookup table-id.
8. Now re-associate the FloatingIP.
9. Now you will see the 'Nexthop static routes' missing in the FIP Namespace's router-specific lookup table-id.

To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1824566/+subscriptions
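The checks in the reproduction steps can be sketched as namespace commands; the external network UUID and table id below are placeholders:

```shell
# Inspect the router-specific lookup table inside the FIP namespace;
# the next-hop static route should appear here (per this bug it goes
# missing after FIP re-association). IDs are placeholders.
sudo ip netns exec fip-<ext-net-uuid> ip rule list
sudo ip netns exec fip-<ext-net-uuid> ip route show table <table-id>
```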
[Yahoo-eng-team] [Bug 1823314] Re: ha router sometime goes in standby mode in all controllers
** Changed in: neutron
       Status: Confirmed => Fix Released

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1823314

Title: ha router sometime goes in standby mode in all controllers
Status in neutron: Fix Released

Bug description:

Sometimes, when 2 HA routers are created for the same tenant in a very short time, it may happen that both routers get the same vr_id assigned; keepalived then treats them as the same VRRP instance, and only one of those routers will be active on some hosts. When I spotted it, it looked like:

[stack@undercloud-0 ~]$ neutron l3-agent-list-hosting-router router-2
+--------------------------------------+--------------------------+----------------+-------+----------+
| id                                   | host                     | admin_state_up | alive | ha_state |
+--------------------------------------+--------------------------+----------------+-------+----------+
| 0d654b7c-da42-4847-a24f-6d1df804ca3b | controller-1.localdomain | True           | :-)   | standby  |
| 242e1e81-7e4e-466e-8354-a9c46982ff88 | controller-0.localdomain | True           | :-)   | active   |
| 3d241b02-031a-4623-a179-88e1953b3889 | controller-2.localdomain | True           | :-)   | standby  |
+--------------------------------------+--------------------------+----------------+-------+----------+

[stack@undercloud-0 ~]$ neutron l3-agent-list-hosting-router router-1
+--------------------------------------+--------------------------+----------------+-------+----------+
| id                                   | host                     | admin_state_up | alive | ha_state |
+--------------------------------------+--------------------------+----------------+-------+----------+
| 3d241b02-031a-4623-a179-88e1953b3889 | controller-2.localdomain | True           | :-)   | standby  |
| 0d654b7c-da42-4847-a24f-6d1df804ca3b | controller-1.localdomain | True           | :-)   | standby  |
| 242e1e81-7e4e-466e-8354-a9c46982ff88 | controller-0.localdomain | True           | :-)   | standby  |
+--------------------------------------+--------------------------+----------------+-------+----------+

And in the db it looks like:

MariaDB [ovs_neutron]> select * from router_extra_attributes;
+--------------------------------------+-------------+----------------+----+----------+-------------------------+
| router_id                            | distributed | service_router | ha | ha_vr_id | availability_zone_hints |
+--------------------------------------+-------------+----------------+----+----------+-------------------------+
| 6ba430d7-2f9d-4e8e-a59f-4d4fb5644a8e |           0 |              0 |  1 |        1 | []                      |
| ace64e85-5f3b-4815-aeae-3b54c75ef5eb |           0 |              0 |  1 |        1 | []                      |
| cd6b61e1-60c9-47da-8866-169ca29ece20 |           1 |              0 |  0 |        0 | []                      |
+--------------------------------------+-------------+----------------+----+----------+-------------------------+
3 rows in set (0.01 sec)

MariaDB [ovs_neutron]> select * from ha_router_vrid_allocations;
+--------------------------------------+-------+
| network_id                           | vr_id |
+--------------------------------------+-------+
| 45aaae94-ce16-412d-bd74-b3812b16ff6f |     1 |
+--------------------------------------+-------+
1 row in set (0.01 sec)

So indeed there is a possible race during such creation of 2 different routers in a very short time. But when I then created another router, it was created properly with a new vr_id and all worked fine for it:

[stack@undercloud-0 ~]$ neutron l3-agent-list-hosting-router router-3
+--------------------------------------+--------------------------+----------------+-------+----------+
| id                                   | host                     | admin_state_up | alive | ha_state |
+--------------------------------------+--------------------------+----------------+-------+----------+
| 0d654b7c-da42-4847-a24f-6d1df804ca3b | controller-1.localdomain | True           | :-)   | standby  |
| 242e1e81-7e4e-466e-8354-a9c46982ff88 | controller-0.localdomain | True           | :-)   | active   |
| 3d241b02-031a-4623-a179-88e1953b3889 | controller-2.localdomain | True           | :-)   | standby  |
+--------------------------------------+--------------------------+----------------+-------+----------+

MariaDB [ovs_neutron]> select * from ha_router_vrid_allocations;
+--------------------------------------+-------+
| network_id                           | vr_id |
+--------------------------------------+-------+
| 45aaae94-ce16-412d-bd74-b3812b16ff6f |     1 |
| 45aaae94-ce16-412d-bd74-b3812b16ff6f |     2 |
+--------------------------------------+-------+

I
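Given the listing above, the vr_id collision can be spotted directly in the database; a sketch against the same schema:

```sql
-- Two HA routers sharing one ha_vr_id indicates the race was hit:
SELECT ha_vr_id, COUNT(*) AS routers
FROM router_extra_attributes
WHERE ha = 1
GROUP BY ha_vr_id
HAVING COUNT(*) > 1;
```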
[Yahoo-eng-team] [Bug 1815676] Re: DVR: External process monitor for keepalived should be removed when external gateway is removed for DVR HA routers
** Changed in: neutron
       Status: In Progress => Invalid

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1815676

Title: DVR: External process monitor for keepalived should be removed when external gateway is removed for DVR HA routers
Status in neutron: Invalid

Bug description:

The external process monitor for the keepalived state change should be removed when the External Gateway is removed for DVR HA routers. We have seen, under certain conditions when the SNAT namespace is missing, that the external process monitor tries to respawn the keepalived state-change monitor process within the namespace. But the external process monitor does not check for the SNAT namespace; that is up to the process that calls it. The 'delete' ha-router path takes care of cleaning up the external process monitor subscription for the keepalived state change, but the external gateway remove function does not call this function.

This is how I was able to reproduce the problem:
- Create HA/DVR routers.
- Delete the SNAT namespace of the routers.
- Also delete the PID files for the ip_monitor under /opt/stack/data/neutron/external/pids/ip_monitor pid

Once deleted, I was able to see the following log messages in the neutron-l3.service logs:

Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: ERROR neutron.agent.linux.external_process [-] ip_monitor for router with uuid 04fabe76-9316-4270-a99f-4f0ccffb8feb not found.
The process should not have died
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: WARNING neutron.agent.linux.external_process [-] Respawning ip_monitor for uuid 04fabe76-9316-4270-a99f-4f0ccffb8feb
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: DEBUG neutron.agent.linux.utils [-] Unable to access /opt/stack/data/neutron/external/pids/04fabe76-9316-4270-a99f-4f0ccffb8feb.monitor.pid {{(pid=12153) get_value_from_file /opt/stack/neutron/neutron/agent/linux/utils.py:250}}
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: DEBUG neutron.agent.linux.utils [-] Running command (rootwrap daemon): ['ip', 'netns', 'exec', 'snat-04fabe76-9316-4270-a99f-4f0ccffb8feb', 'neutron-keepalived-state-change', '--router_id=04fabe76-9316-4270-a99f-4f0ccffb8feb', '--namespace=snat-04fabe76-9316-4270-a99f-4f0ccffb8feb', '--conf_dir=/opt/stack/data/neutron/ha_confs/04fabe76-9316-4270-a99f-4f0ccffb8feb', '--monitor_interface=ha-4af17105-bd', '--monitor_cidr=169.254.0.1/24', '--pid_file=/opt/stack/data/neutron/external/pids/04fabe76-9316-4270-a99f-4f0ccffb8feb.monitor.pid', '--state_path=/opt/stack/data/neutron', '--user=1000', '--group=1004'] {{(pid=12153) execute_rootwrap_daemon /opt/stack/neutron/neutron/agent/linux/utils.py:103}}
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: ERROR neutron.agent.linux.utils [-] Exit code: 1; Stdin: ; Stdout: ; Stderr: Cannot open network namespace "snat-04fabe76-9316-4270-a99f-4f0ccffb8feb": No such file or directory
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]:
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: DEBUG oslo_concurrency.lockutils [-] Lock "_check_child_processes" released by "neutron.agent.linux.external_process._check_child_processes" :: held 0.007s {{(pid=12153) inner /usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:285}}
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: Traceback (most recent call last):
Oct 04
23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]:   File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 460, in fire_timers

To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1815676/+subscriptions
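The guard this bug says is missing can be sketched in shell: check that the SNAT namespace still exists before respawning the keepalived state-change monitor inside it. The namespace name is the one from the log above; the stubbed namespace list is hypothetical so the sketch is self-contained (on a real node it would come from `ip netns list`):

```shell
# Decide whether a respawn inside the namespace is safe.
NS=snat-04fabe76-9316-4270-a99f-4f0ccffb8feb
# Stubbed output of `ip netns list` for this sketch:
existing_namespaces="qrouter-09a379b5-907f-4e3e-b29a-8701b82f2641
fip-4a3959c3-b011-4bd0-8f4f-f405be92d9ac"
if printf '%s\n' "$existing_namespaces" | grep -qx "$NS"; then
    action=respawn   # namespace present: safe to respawn the monitor
else
    action=skip      # namespace gone: drop the monitor entry instead
fi
echo "$action"
```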
[Yahoo-eng-team] [Bug 1824566] [NEW] DVR-Nexthop static routes are not getting configured in FIP Namespace when disassociating and reassociating a FloatingIP in Ocata
Public bug reported:

Nexthop static routes for an external network are not getting configured in the FIP Namespace table after disassociating and re-associating a FloatingIP. This is seen in Ocata and Newton; not seen in Pike and later branches.

Steps to reproduce this problem:
1. Deploy the devstack cloud with DVR routers.
2. Create a VM.
3. Assign a FloatingIP to the VM.
4. Now configure a Nexthop static route for the external Network.
5. Make sure the Nexthop routes are seen in the SNAT Namespace and in the FIP Namespace under the router-specific lookup table-id.
6. Now disassociate the FloatingIP.
7. Make sure that the Nexthop routes are cleared from the FIP Namespace (if this is the only FloatingIP) under the router-specific lookup table-id.
8. Now re-associate the FloatingIP.
9. Now you will see the 'Nexthop static routes' missing in the FIP Namespace's router-specific lookup table-id.

** Affects: neutron
   Importance: Undecided
       Status: New

** Tags: ocata-backport-potential

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1824566

Title: DVR-Nexthop static routes are not getting configured in FIP Namespace when disassociating and reassociating a FloatingIP in Ocata
Status in neutron: New

To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1824566/+subscriptions
[Yahoo-eng-team] [Bug 1821815] [NEW] Gate jobs are failing for stable/ocata
Public bug reported:

Some gate jobs are failing for stable/ocata; are there any known issues with the stable/ocata branch? See the patches for details:
https://review.openstack.org/#/c/640176/
https://review.openstack.org/#/c/642363/

** Affects: neutron
   Importance: Undecided
       Status: New

** Tags: gate-failure

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1821815

Title: Gate jobs are failing for stable/ocata
Status in neutron: New

To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1821815/+subscriptions
[Yahoo-eng-team] [Bug 1816698] [NEW] DVR-HA: Removing a router from an agent, does not clear the namespaces on the agent
Public bug reported:

Removing an active or a standby ha-router from an agent does not clear the router namespace and the SNAT namespace. This sometimes leads to two Active HA routers and two 'ha-interface's in the snat namespace for DVR routers.

This can be very easily reproduced:
1. Create an HA-DVR router (minimum two-node setup with 'dvr_snat' agent mode).
2. Attach an interface to the router.
3. Attach a gateway to the router.
4. Now check l3-agent-list-hosting-router for the router.
5. Then remove the router from one of the agents (l3-agent-router-remove).
6. The expected result is that the router namespace and snat namespace are removed. (But they are not removed.)
7. At the minimum we should clear the HA interfaces for that agent so that the HA router does not get into Active mode again.

This bug might have been introduced by this patch: https://review.openstack.org/#/c/522362/7

This bug is seen since Ocata/Pike and probably also in the master branch.

** Affects: neutron
   Importance: High
       Status: Confirmed

** Tags: l3-dvr-backlog l3-ha

** Changed in: neutron
       Status: New => Confirmed

** Changed in: neutron
   Importance: Undecided => High

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1816698

Title: DVR-HA: Removing a router from an agent, does not clear the namespaces on the agent
Status in neutron: Confirmed

To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1816698/+subscriptions
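The key check can be sketched as follows; the agent UUID, router name/UUID, and node hostname are placeholders:

```shell
# After removing the router from an agent, its namespaces should be
# gone on that agent's node; per this bug they linger.
neutron l3-agent-router-remove <l3-agent-uuid> <router-name>
ssh <agent-node> ip netns list | grep -E "(qrouter|snat)-<router-uuid>"
# Any output from the grep is a leftover namespace.
```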
[Yahoo-eng-team] [Bug 1815676] [NEW] DVR: External process monitor for keepalived should be removed when external gateway is removed for DVR HA routers
Public bug reported:

The external process monitor for the keepalived state change should be removed when the External Gateway is removed for DVR HA routers. We have seen, under certain conditions when the SNAT namespace is missing, that the external process monitor tries to respawn the keepalived state-change monitor process within the namespace. But the external process monitor does not check for the SNAT namespace; that is up to the process that calls it. The 'delete' ha-router path takes care of cleaning up the external process monitor subscription for the keepalived state change, but the external gateway remove function does not call this function.

This is how I was able to reproduce the problem:
- Create HA/DVR routers.
- Delete the SNAT namespace of the routers.
- Also delete the PID files for the ip_monitor under /opt/stack/data/neutron/external/pids/ip_monitor pid

Once deleted, I was able to see the following log messages in the neutron-l3.service logs:

Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: ERROR neutron.agent.linux.external_process [-] ip_monitor for router with uuid 04fabe76-9316-4270-a99f-4f0ccffb8feb not found.
The process should not have died
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: WARNING neutron.agent.linux.external_process [-] Respawning ip_monitor for uuid 04fabe76-9316-4270-a99f-4f0ccffb8feb
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: DEBUG neutron.agent.linux.utils [-] Unable to access /opt/stack/data/neutron/external/pids/04fabe76-9316-4270-a99f-4f0ccffb8feb.monitor.pid {{(pid=12153) get_value_from_file /opt/stack/neutron/neutron/agent/linux/utils.py:250}}
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: DEBUG neutron.agent.linux.utils [-] Running command (rootwrap daemon): ['ip', 'netns', 'exec', 'snat-04fabe76-9316-4270-a99f-4f0ccffb8feb', 'neutron-keepalived-state-change', '--router_id=04fabe76-9316-4270-a99f-4f0ccffb8feb', '--namespace=snat-04fabe76-9316-4270-a99f-4f0ccffb8feb', '--conf_dir=/opt/stack/data/neutron/ha_confs/04fabe76-9316-4270-a99f-4f0ccffb8feb', '--monitor_interface=ha-4af17105-bd', '--monitor_cidr=169.254.0.1/24', '--pid_file=/opt/stack/data/neutron/external/pids/04fabe76-9316-4270-a99f-4f0ccffb8feb.monitor.pid', '--state_path=/opt/stack/data/neutron', '--user=1000', '--group=1004'] {{(pid=12153) execute_rootwrap_daemon /opt/stack/neutron/neutron/agent/linux/utils.py:103}}
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: ERROR neutron.agent.linux.utils [-] Exit code: 1; Stdin: ; Stdout: ; Stderr: Cannot open network namespace "snat-04fabe76-9316-4270-a99f-4f0ccffb8feb": No such file or directory
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]:
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: DEBUG oslo_concurrency.lockutils [-] Lock "_check_child_processes" released by "neutron.agent.linux.external_process._check_child_processes" :: held 0.007s {{(pid=12153) inner /usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:285}}
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: Traceback (most recent call last):
Oct 04
23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]:   File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 460, in fire_timers

** Affects: neutron
   Importance: Undecided
       Status: New

** Tags: l3-dvr-backlog

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1815676

Title: DVR: External process monitor for keepalived should be removed when external gateway is removed for DVR HA routers
Status in neutron: New
[Yahoo-eng-team] [Bug 1814002] [NEW] Packets getting lost during SNAT with too many connections using the same source and destination on Network Node
Public bug reported:

We probably have a problem with SNAT on the network nodes when too many connections use the same source/destination. We have reproduced the bug with DNS requests, but we assume that it affects other packets as well. When we send a lot of DNS requests, we see that sometimes a packet does not pass through the NAT and simply "gets lost". In addition, we can see in the conntrack statistics that the "insert_failed" counter increases.

ip netns exec snat-848819dc-efa2-45d9-9bc3-d96f093fa87a conntrack -S | grep insert_failed | grep -v insert_failed=0
cpu=0 searched=1166140 found=5587918 new=6659 invalid=5 ignore=0 delete=27726 delete_list=27712 insert=6645 insert_failed=14 drop=0 early_drop=0 error=0 search_restart=0
cpu=2 searched=12015 found=64626 new=2467 invalid=0 ignore=0 delete=15205 delete_list=15204 insert=2466 insert_failed=1 drop=0 early_drop=0 error=0 search_restart=0
cpu=3 searched=1348502 found=6097345 new=4093 invalid=0 ignore=0 delete=23200 delete_list=23173 insert=4066 insert_failed=27 drop=0 early_drop=0 error=0 search_restart=0
cpu=4 searched=1068516 found=5398514 new=3299 invalid=0 ignore=0 delete=14144 delete_list=14126 insert=3281 insert_failed=18 drop=0 early_drop=0 error=0 search_restart=0
cpu=5 searched=2280948 found=9908854 new=6770 invalid=0 ignore=0 delete=17224 delete_list=17185 insert=6731 insert_failed=39 drop=0 early_drop=0 error=0 search_restart=0
cpu=6 searched=1123341 found=5264368 new=9749 invalid=0 ignore=0 delete=17272 delete_list=17247 insert=9724 insert_failed=25 drop=0 early_drop=0 error=0 search_restart=0
cpu=7 searched=1553934 found=7234262 new=8734 invalid=0 ignore=0 delete=15658 delete_list=15634 insert=8710 insert_failed=24 drop=0 early_drop=0 error=0 search_restart=0

This might be a generic problem with conntrack and Linux.
We suspect that we are hitting the following "limitation/bug" in the kernel: https://github.com/torvalds/linux/blob/24de3d377539e384621c5b8f8f8d8d01852dddc8/net/netfilter/nf_nat_core.c#L290-L291

There seems to be a workaround that alleviates this behavior: setting the --random-fully flag in iptables. Unfortunately, this flag is only available since iptables 1.6.2, and it is not currently supported in neutron for the SNAT rules, which just use --to-source.

** Affects: neutron
     Importance: Undecided
         Status: New

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1814002

Title:
  Packets getting lost during SNAT with too many connections using the same source and destination on Network Node

Status in neutron:
  New

Bug description:
  We probably have a problem with SNAT on the network nodes when too many connections use the same source/destination. We have reproduced the bug with DNS requests, but we assume that it affects other packets as well. When we send a lot of DNS requests, we see that sometimes a packet does not pass through the NAT and simply "gets lost". In addition, we can see in the conntrack statistics that the "insert_failed" counter increases.
  ip netns exec snat-848819dc-efa2-45d9-9bc3-d96f093fa87a conntrack -S | grep insert_failed | grep -v insert_failed=0
  cpu=0 searched=1166140 found=5587918 new=6659 invalid=5 ignore=0 delete=27726 delete_list=27712 insert=6645 insert_failed=14 drop=0 early_drop=0 error=0 search_restart=0
  cpu=2 searched=12015 found=64626 new=2467 invalid=0 ignore=0 delete=15205 delete_list=15204 insert=2466 insert_failed=1 drop=0 early_drop=0 error=0 search_restart=0
  cpu=3 searched=1348502 found=6097345 new=4093 invalid=0 ignore=0 delete=23200 delete_list=23173 insert=4066 insert_failed=27 drop=0 early_drop=0 error=0 search_restart=0
  cpu=4 searched=1068516 found=5398514 new=3299 invalid=0 ignore=0 delete=14144 delete_list=14126 insert=3281 insert_failed=18 drop=0 early_drop=0 error=0 search_restart=0
  cpu=5 searched=2280948 found=9908854 new=6770 invalid=0 ignore=0 delete=17224 delete_list=17185 insert=6731 insert_failed=39 drop=0 early_drop=0 error=0 search_restart=0
  cpu=6 searched=1123341 found=5264368 new=9749 invalid=0 ignore=0 delete=17272 delete_list=17247 insert=9724 insert_failed=25 drop=0 early_drop=0 error=0 search_restart=0
  cpu=7 searched=1553934 found=7234262 new=8734 invalid=0 ignore=0 delete=15658 delete_list=15634 insert=8710 insert_failed=24 drop=0 early_drop=0 error=0 search_restart=0

  This might be a generic problem with conntrack and Linux. We suspect that we are hitting the following "limitation/bug" in the kernel: https://github.com/torvalds/linux/blob/24de3d377539e384621c5b8f8f8d8d01852dddc8/net/netfilter/nf_nat_core.c#L290-L291 There seems to be a workaround that alleviates this behavior: setting the --random-fully flag in iptables. Unfortunately, this is only available since iptables 1.6.2. Also this is not currently
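A quick way to quantify the drops described above is to sum the per-CPU `insert_failed` counters. The sample below embeds abbreviated output from this report so the pipeline can be shown end to end; on a real network node you would pipe `ip netns exec snat-<uuid> conntrack -S` into the same awk instead:

```shell
# Sum insert_failed across CPUs from `conntrack -S`-style output.
sample='cpu=0 insert=6645 insert_failed=14 drop=0
cpu=2 insert=2466 insert_failed=1 drop=0
cpu=3 insert=4066 insert_failed=27 drop=0
cpu=4 insert=3281 insert_failed=18 drop=0
cpu=5 insert=6731 insert_failed=39 drop=0
cpu=6 insert=9724 insert_failed=25 drop=0
cpu=7 insert=8710 insert_failed=24 drop=0'

total=$(printf '%s\n' "$sample" | awk '{
    for (i = 1; i <= NF; i++)
        if ($i ~ /^insert_failed=/) { split($i, kv, "="); sum += kv[2] }
} END { print sum }')
echo "total insert_failed: $total"   # 148 for this sample
```

A steadily growing total under load is the signature of the source-port collision behavior the kernel comment describes.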
[Yahoo-eng-team] [Bug 1804136] Re: Industry Standard approach for DVR E/W routing issue of port/mac movement by vlan based mac learning
So if I understand your recommendation, are you suggesting that we completely ignore the HOST MAC change that we make today and just use VLAN+MAC learning, so that packets leave the host with their own MAC? What will happen to the switch learning in the intermediate physical switches that connect the hosts?

** Changed in: neutron
       Status: New => Opinion

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1804136

Title:
  Industry Standard approach for DVR E/W routing issue of port/mac movement by vlan based mac learning

Status in neutron:
  Opinion

Bug description:
  Problem statement: In the current implementation of DVR E/W routing, a DVR instance with the same MAC running on multiple compute nodes causes MAC movement in the br-int bridge. The way we addressed this issue doesn't follow any L2/L3 standard. I am proposing a simpler solution.

  Proposal: Keep br-int as a vlan+mac learning switch, and set the DVR port connected to br-int as tagged.

  Scenario: Please refer to https://assafmuller.com/2015/04/ for a diagrammatic view. Say a blue host running on the left compute node tries to reach an orange host running on the right compute node. Both compute nodes run DVR and do E/W routing. The blue host's subnet vlan is 10, and the orange host's subnet vlan is 20.

  Packet Forwarding:
  1. When vlan-based MAC learning happens in both br-int bridges, there will be two entries with the same DVR MAC, one with vlan 10 and the other with vlan 20. Thus the mac-movement issue will not arise.
  2. When packets sent by the blue host with vlan 10 reach the left DVR, it routes them and sends them out with vlan 20 to the orange host.
  3. The br-int on the right side will also have two entries for the same MAC, one for vlan 10 and another for vlan 20.
  4. Since the DVR is connected to both vlans, packets between the blue and orange hosts only have to hop through the DVR on their own compute node.

  Please review this proposal: will it work and simplify the DVR E/W routing?

  Thanks
  Subbu
  iimksu...@gmail.com

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1804136/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1606741] Re: Metadata service for instances is unavailable when the l3-agent on the compute host is dvr_snat mode
** Changed in: neutron
       Status: In Progress => Confirmed

** Changed in: neutron
       Status: Confirmed => Won't Fix

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1606741

Title:
  Metadata service for instances is unavailable when the l3-agent on the compute host is dvr_snat mode

Status in neutron:
  Won't Fix

Bug description:
  In my mitaka environment there are five nodes: controller, network1, network2, computer1, and computer2. I start l3-agents in dvr_snat mode on all network and compute nodes and set enable_metadata_proxy to true in l3-agent.ini. Everything works well for most neutron services except the metadata proxy service. When I run the command "curl http://169.254.169.254" in an instance booted from cirros, it returns "curl: couldn't connect to host", and the instance can't fetch metadata on its first boot.

  * Pre-conditions: start the l3-agent in dvr_snat mode on all compute and network nodes and set enable_metadata_proxy to true in l3-agent.ini.

  * Step-by-step reproduction steps:
    1. Create a network and a subnet under this network.
    2. Create a router.
    3. Add the subnet to the router.
    4. Create an instance with cirros (or another image) on this subnet.
    5. Open the console for this instance, run 'curl http://169.254.169.254' in the shell, and wait for the result.

  * Expected output: the command should return the real metadata info.

  * Actual output: the command returns "curl: couldn't connect to host".

  * Version:
    ** Mitaka
    ** All hosts are centos7

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1606741/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1797037] [NEW] Extra routes configured on routers are not set in the router namespace and snat namespace with DVR-HA routers
Public bug reported:

When DVR routers are configured for HA and we try to add an extra route to them, the extra route is not set in the router namespace or in the snat namespace.

With a router configured for HA and DVR:
1. Create a router.
2. Attach an interface.
3. Try to add an extra route with a destination and nexthop.
4. You can see the route in the router dict, but it is missing from the router namespace on the 'dvr-snat' node.

The routes are handled properly on the compute nodes that run in 'dvr' or 'dvr_no_external' agent modes.

** Affects: neutron
     Importance: Medium
         Status: Confirmed

** Tags: l3-dvr-backlog l3-ha

** Changed in: neutron
       Status: New => Confirmed

** Changed in: neutron
   Importance: Undecided => Medium

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1797037

Title:
  Extra routes configured on routers are not set in the router namespace and snat namespace with DVR-HA routers

Status in neutron:
  Confirmed

Bug description:
  When DVR routers are configured for HA and we try to add an extra route to them, the extra route is not set in the router namespace or in the snat namespace.

  With a router configured for HA and DVR:
  1. Create a router.
  2. Attach an interface.
  3. Try to add an extra route with a destination and nexthop.
  4. You can see the route in the router dict, but it is missing from the router namespace on the 'dvr-snat' node.

  The routes are handled properly on the compute nodes that run in 'dvr' or 'dvr_no_external' agent modes.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1797037/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1716782] Re: DVR multinode job has linuxbridge agent mech driver defined
** Changed in: neutron
       Status: Fix Committed => Fix Released

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1716782

Title:
  DVR multinode job has linuxbridge agent mech driver defined

Status in neutron:
  Fix Released

Bug description:
  There's an ML2 port binding error being generated in the DVR multinode job.

  http://logs.openstack.org/50/502850/2/check/gate-grenade-dsvm-neutron-dvr-multinode-ubuntu-xenial-nv/9d7ab88/logs/screen-q-svc.txt.gz#_Sep_12_16_36_00_230807

  DEBUG neutron.plugins.ml2.drivers.mech_agent [None req-a68ce697-14fd-497a-9bb6-b55a899a8d54 None None] Port 1f89c2a7-de09-49f7-a298-37b22d37192c on network 3608f5e9-c01b-4791-9a18-90461a614fa8 not bound, no agent of type Linux bridge agent registered on host ubuntu-xenial-2-node-rax-dfw-10898269 {{(pid=26134) bind_port /opt/stack/new/neutron/neutron/plugins/ml2/drivers/mech_agent.py:102}}

  ERROR neutron.plugins.ml2.managers [None req-a68ce697-14fd-497a-9bb6-b55a899a8d54 None None] Failed to bind port 1f89c2a7-de09-49f7-a298-37b22d37192c on host ubuntu-xenial-2-node-rax-dfw-10898269 for vnic_type normal using segments []

  This is happening because by default, devstack enables the linuxbridge mechanism driver when DVR mode != legacy, even though the linuxbridge agent doesn't work with DVR. Devstack needs to change to not enable the driver in this case.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1716782/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1779194] [NEW] neutron-lbaas haproxy agent, when configured with allow_automatic_lbaas_agent_failover = True, after failover, when the failed agent restarts or reconnects to Rabb
Public bug reported:

When we configure two or more lbaas haproxy agents for high availability by setting allow_automatic_lbaas_agent_failover to True, the LBaaS fails over to an available active agent when the original agent is unresponsive or has lost its connection to RabbitMQ. This works exactly as expected.

But when the dead agent comes back up and tries to re-sync its state with the server, it finds the LBaaS configured on it to be an 'orphan' and tries to clean it up. In the process of cleaning it up, it unplugs the VIF port, which affects the other agent that is now hosting the LBaaS: when the VIF port is unplugged, the port's device_owner changes, which causes further issues. So before the VIF port is removed there should be a check for whether an active agent is still using the port; if so, the VIF port should not be unplugged.

** Affects: neutron
     Importance: Undecided
         Status: New

** Tags: lbaas

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1779194

Title:
  neutron-lbaas haproxy agent, when configured with allow_automatic_lbaas_agent_failover = True, after failover, when the failed agent restarts or reconnects to RabbitMQ, it tries to unplug the vif port without checking if it is used by other agent

Status in neutron:
  New

Bug description:
  When we configure two or more lbaas haproxy agents for high availability by setting allow_automatic_lbaas_agent_failover to True, the LBaaS fails over to an available active agent when the original agent is unresponsive or has lost its connection to RabbitMQ. This works exactly as expected.

  But when the dead agent comes back up and tries to re-sync its state with the server, it finds the LBaaS configured on it to be an 'orphan' and tries to clean it up. In the process of cleaning it up, it unplugs the VIF port, which affects the other agent that is now hosting the LBaaS: when the VIF port is unplugged, the port's device_owner changes, which causes further issues. So before the VIF port is removed there should be a check for whether an active agent is still using the port; if so, the VIF port should not be unplugged.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1779194/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1778643] [NEW] DVR: Fip gateway port is tagged as DEAD port by OVS when external_bridge is configured
Public bug reported:

When an external bridge is configured in Neutron, the FIP agent gateway port 'fg-' is tagged as a DEAD port with vlan id 4095. This issue is seen in Pike.

There is a fix that recently merged in neutron, shown below: https://review.openstack.org/#/c/564825/10

With that patch, the 4095 vlan tag is removed from the 'qg-' port when an external bridge is configured, but this has not been handled for the DVR FIP agent gateway port. So the port is always seen as DEAD when an external bridge such as 'br-vlan1087' is configured.

    Bridge "br-vlan1087"
        Port "br-vlan1087"
            Interface "br-vlan1087"
                type: internal
        Port "vlan1087"
            Interface "vlan1087"
        Port "fg-0a4a425d-d5"
            tag: 4095
            Interface "fg-0a4a425d-d5"
                type: internal
    ovs_version: "2.7.0"

** Affects: neutron
     Importance: High
         Status: Confirmed

** Tags: l3-dvr-backlog

** Changed in: neutron
       Status: New => Confirmed

** Changed in: neutron
   Importance: Undecided => High

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1778643

Title:
  DVR: Fip gateway port is tagged as DEAD port by OVS when external_bridge is configured

Status in neutron:
  Confirmed

Bug description:
  When an external bridge is configured in Neutron, the FIP agent gateway port 'fg-' is tagged as a DEAD port with vlan id 4095. This issue is seen in Pike.

  There is a fix that recently merged in neutron, shown below: https://review.openstack.org/#/c/564825/10

  With that patch, the 4095 vlan tag is removed from the 'qg-' port when an external bridge is configured, but this has not been handled for the DVR FIP agent gateway port. So the port is always seen as DEAD when an external bridge such as 'br-vlan1087' is configured.
  Bridge "br-vlan1087"
      Port "br-vlan1087"
          Interface "br-vlan1087"
              type: internal
      Port "vlan1087"
          Interface "vlan1087"
      Port "fg-0a4a425d-d5"
          tag: 4095
          Interface "fg-0a4a425d-d5"
              type: internal
  ovs_version: "2.7.0"

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1778643/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
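Ports stuck on the OVS dead VLAN can be spotted mechanically. The sample below mirrors the bridge dump from this report so the pipeline is runnable as shown; on a live node you would feed `ovs-vsctl show` into the same awk:

```shell
# Flag any port tagged with the OVS dead VLAN (4095) in `ovs-vsctl show` output.
sample='Bridge "br-vlan1087"
    Port "br-vlan1087"
        Interface "br-vlan1087"
            type: internal
    Port "vlan1087"
        Interface "vlan1087"
    Port "fg-0a4a425d-d5"
        tag: 4095
        Interface "fg-0a4a425d-d5"
            type: internal'

# Remember the most recent Port name; report it when a dead-VLAN tag follows.
dead=$(printf '%s\n' "$sample" | awk '
    /Port /     { port = $2 }
    /tag: 4095/ { gsub(/"/, "", port); print port }')
echo "dead ports: $dead"
```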
[Yahoo-eng-team] [Bug 1776984] [NEW] DVR: Self recover from the loss of 'fg' ports in FIP Namespace
Public bug reported:

Sometimes we have seen the 'fg' ports within the fip-namespace either go down, not get created in time, or get deleted due to race conditions. When this happens, the code tries to recover after a couple of exceptions once a router_update message arrives. But after recovery we can see that the fip-namespace is recreated and the 'fg-' port is plugged in and active, while the 'fpr' and 'rfp' ports are missing, which leads to FloatingIP failure.

So we need to fix this: when it happens, the agent should check all the ports within the fip-namespace and recreate the necessary plumbing.

Here is the error log we have been seeing when the 'fg' port was missing: http://paste.openstack.org/show/723505/

** Affects: neutron
     Importance: Undecided
         Status: New

** Tags: l3-dvr-backlog pike-backport-potential queens-backport-potential

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1776984

Title:
  DVR: Self recover from the loss of 'fg' ports in FIP Namespace

Status in neutron:
  New

Bug description:
  Sometimes we have seen the 'fg' ports within the fip-namespace either go down, not get created in time, or get deleted due to race conditions. When this happens, the code tries to recover after a couple of exceptions once a router_update message arrives. But after recovery we can see that the fip-namespace is recreated and the 'fg-' port is plugged in and active, while the 'fpr' and 'rfp' ports are missing, which leads to FloatingIP failure.

  So we need to fix this: when it happens, the agent should check all the ports within the fip-namespace and recreate the necessary plumbing.

  Here is the error log we have been seeing when the 'fg' port was missing:
http://paste.openstack.org/show/723505/ To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1776984/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1776566] [NEW] DVR: FloatingIP create throws an error if the L3 agent is not running in the given host
Public bug reported:

FloatingIP create throws an error if the L3 agent is not running on the given host for DVR routers.

This can be reproduced by:
1. Configure the global router setting to 'Legacy' CVR routers.
2. Then configure a DVR router by manually setting '--distributed = True' from the CLI.
3. Create a network.
4. Create a subnet.
5. Attach the subnet to the DVR router.
6. Configure the gateway for the router.
7. Then create a VM on the created subnet.
8. Now create a FloatingIP and associate it with the VM port.
9. You will see an 'Internal Server Error' while creating the FloatingIP.

~/devstack$ neutron floatingip-associate 1cafc567-c6fc-4424-9c44-ab7d90bc6ce0 5c95fa16-a8cc-4d93-8f31-988f692e01ae
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
Request Failed: internal server error while processing your request.

The reason is that, before creating the 'FloatingIP Agent Gateway Port', it checks the agent type on the given host and raises an exception because the agent is not running on the compute host. This is basically a test error, but we should still handle the error condition and not throw an Internal Server Error.

** Affects: neutron
     Importance: Low
         Status: Confirmed

** Tags: l3-dvr-backlog

** Changed in: neutron
       Status: New => Confirmed

** Changed in: neutron
   Importance: Undecided => Critical

** Changed in: neutron
   Importance: Critical => High

** Changed in: neutron
   Importance: High => Low

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1776566

Title:
  DVR: FloatingIP create throws an error if the L3 agent is not running in the given host

Status in neutron:
  Confirmed

Bug description:
  FloatingIP create throws an error if the L3 agent is not running on the given host for DVR routers.

  This can be reproduced by:
  1. Configure the global router setting to 'Legacy' CVR routers.
  2. Then configure a DVR router by manually setting '--distributed = True' from the CLI.
  3. Create a network.
  4. Create a subnet.
  5. Attach the subnet to the DVR router.
  6. Configure the gateway for the router.
  7. Then create a VM on the created subnet.
  8. Now create a FloatingIP and associate it with the VM port.
  9. You will see an 'Internal Server Error' while creating the FloatingIP.

  ~/devstack$ neutron floatingip-associate 1cafc567-c6fc-4424-9c44-ab7d90bc6ce0 5c95fa16-a8cc-4d93-8f31-988f692e01ae
  neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
  Request Failed: internal server error while processing your request.

  The reason is that, before creating the 'FloatingIP Agent Gateway Port', it checks the agent type on the given host and raises an exception because the agent is not running on the compute host. This is basically a test error, but we should still handle the error condition and not throw an Internal Server Error.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1776566/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1774463] [NEW] RFE: Add support for IPv6 on DVR Routers for the Fast-path exit
Public bug reported:

This RFE is to add support for IPv6 on DVR routers for the Fast-Path-Exit. Today DVR supports Fast-Path-Exit through the FIP namespace, but the FIP namespace does not support IPv6 link-local addresses, and no RA proxy is enabled in the FIP namespace.

This RFE should address those issues:
1. Update the link-local addresses for the 'rfp' and 'fpr' ports to support both IPv4 and IPv6.
2. Enable an RA proxy in the FIP namespace and also assign an IPv6 address to the FIP gateway port.

** Affects: neutron
     Importance: Undecided
         Status: New

** Tags: l3-dvr-backlog

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1774463

Title:
  RFE: Add support for IPv6 on DVR Routers for the Fast-path exit

Status in neutron:
  New

Bug description:
  This RFE is to add support for IPv6 on DVR routers for the Fast-Path-Exit. Today DVR supports Fast-Path-Exit through the FIP namespace, but the FIP namespace does not support IPv6 link-local addresses, and no RA proxy is enabled in the FIP namespace.

  This RFE should address those issues:
  1. Update the link-local addresses for the 'rfp' and 'fpr' ports to support both IPv4 and IPv6.
  2. Enable an RA proxy in the FIP namespace and also assign an IPv6 address to the FIP gateway port.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1774463/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1774459] [NEW] RFE: Update permanent ARP entries for allowed_address_pair IPs in DVR Routers
Public bug reported:

We have a long-standing issue with allowed_address_pairs IPs associated with unbound ports on DVR routers: the ARP entry for the allowed_address_pair IP does not change based on the GARP issued by a keepalived instance. Since DVR does the ARP table update through the control plane, and does not allow any ARP packet to leave the node (to prevent the router IP/MAC from polluting the network), there has always been an issue with this.

A recent patch in master, https://review.openstack.org/#/c/550676/, that tried to address this issue was not successful. The patch helped in updating the ARP entry dynamically from the GARP message, but the entry has to be temporary (NUD reachable). Only if it is set to 'reachable' were we able to update it on the fly from the GARP message without using any external tools.

The problem is that when we have VMs in two different subnets (subnet A and subnet B), and a VM from subnet B on a different, isolated node pings the VRRP IP in subnet A, the packet from the VM reaches the router namespace, where the ARP entry for the VRRP IP is present as reachable. While the entry is reachable, the VM is able to send a couple of pings, but within 15 seconds the pings time out. The reason is that the router in turn tries to verify whether the IP/MAC combination for the VRRP IP is still valid, since the entry in the ARP table is REACHABLE and not PERMANENT. When it tries to re-ARP for the IP, the ARP requests are blocked by the DVR flow rules in br-tun, so the ARP times out and the ARP entry in the router namespace becomes incomplete.

Option A: One way to address this is to use a GARP sniffer tool/utility running in the router namespace, filtering on a specific IP. If that IP is seen in a GARP message, the tool would reset the ARP entry for the VRRP IP to permanent. This is very performance-intensive, so it is not clear it would be helpful; we should probably make it configurable so that people can use it if required.

Option B: Instead of running it on all nodes and in every router namespace, we could run it only in the network node's router namespace (or on the network node host), send a message to neutron that the IP/MAC changed, and have neutron tell all the hosts to update their ARP entries for the given IP/MAC. (Just an idea; not sure how simple it is compared to the former.)

Any ideas or thoughts would be helpful.

** Affects: neutron
     Importance: Undecided
         Status: New

** Tags: l3-dvr-backlog

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1774459

Title:
  RFE: Update permanent ARP entries for allowed_address_pair IPs in DVR Routers

Status in neutron:
  New

Bug description:
  We have a long-standing issue with allowed_address_pairs IPs associated with unbound ports on DVR routers: the ARP entry for the allowed_address_pair IP does not change based on the GARP issued by a keepalived instance. Since DVR does the ARP table update through the control plane, and does not allow any ARP packet to leave the node (to prevent the router IP/MAC from polluting the network), there has always been an issue with this.

  A recent patch in master, https://review.openstack.org/#/c/550676/, that tried to address this issue was not successful. The patch helped in updating the ARP entry dynamically from the GARP message, but the entry has to be temporary (NUD reachable). Only if it is set to 'reachable' were we able to update it on the fly from the GARP message without using any external tools.

  The problem is that when we have VMs in two different subnets (subnet A and subnet B), and a VM from subnet B on a different, isolated node pings the VRRP IP in subnet A, the packet from the VM reaches the router namespace, where the ARP entry for the VRRP IP is present as reachable. While the entry is reachable, the VM is able to send a couple of pings, but within 15 seconds the pings time out. The reason is that the router in turn tries to verify whether the IP/MAC combination for the VRRP IP is still valid, since the entry in the ARP table is REACHABLE and not PERMANENT. When it tries to re-ARP for the IP, the ARP requests are blocked by the DVR flow rules in br-tun, so the ARP times out and the ARP entry in the router namespace becomes incomplete.

  Option A: One way to address this is to use a GARP sniffer tool/utility running in the router namespace, filtering on a specific IP. If that IP is seen in the GARP
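The REACHABLE-vs-PERMANENT distinction driving the failure above can be seen directly in `ip neigh` output. The sample entries below are hypothetical, and the pinning command in the trailing comment is the manual remediation that the report is effectively asking neutron to automate:

```shell
# Entries whose NUD state is not PERMANENT will be re-ARPed by the kernel;
# in a DVR router namespace that re-ARP is dropped by the br-tun flows.
sample='10.0.1.50 dev qr-aaa lladdr fa:16:3e:11:22:33 REACHABLE
10.0.1.60 dev qr-aaa lladdr fa:16:3e:44:55:66 PERMANENT'

at_risk=$(printf '%s\n' "$sample" | awk '$NF != "PERMANENT" {print $1}')
echo "entries that will expire: $at_risk"

# Manual fix inside the router namespace (requires root), e.g.:
#   ip netns exec qrouter-<uuid> ip neigh replace 10.0.1.50 \
#       lladdr fa:16:3e:11:22:33 dev qr-aaa nud permanent
```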
[Yahoo-eng-team] [Bug 1768919] [NEW] PCI-Passthrough fails when we have Flavor configured and provide a port with vnic_type=direct-physical
Public bug reported:

PCI passthrough of a NIC device to a VM fails when we have both the flavor configured with an alias and also provide a network port with 'vnic_type=direct-physical'.

The comment in the source code, shown below, explains why:
https://github.com/openstack/nova/blob/644ac5ec37903b0a08891cc403c8b3b63fc2a91c/nova/compute/api.py#L812

    # PCI requests come from two sources: instance flavor and
    # requested_networks. The first call in below returns an
    # InstancePCIRequests object which is a list of InstancePCIRequest
    # objects. The second call in below creates an InstancePCIRequest
    # object for each SR-IOV port, and append it to the list in the
    # InstancePCIRequests object

In this case there are two PCI requests for the same device, and _test_pci fails when the compute tries to check the claims.

088d81f6653242318245b137b1ef91c7] _test_pci /opt/stack/venv/nova-20180424T164716Z/lib/python2.7/site-packages/nova/compute/claims.py:201
2018-04-30 22:17:06.058 13396 DEBUG nova.compute.claims [req-c7689c16-227a-462e-aad5-4c462036051c df7bd0a08ee64da981574d7a7d76970a 088d81f6653242318245b137b1ef91c7] pci requests: [InstancePCIRequest(alias_name='intel10fb',count=1,is_new=False,request_id=None,spec=[{dev_type='type-PF',product_id='10fb',vendor_id='8086'}]), InstancePCIRequest(alias_name=None,count=1,is_new=False,request_id=13befe5f-478f-4f4c-aa72-78cce84d942d,spec=[{dev_type='type-PF',physical_network='physnet2'}])] _test_pci /opt/stack/venv/nova-20180424T164716Z/lib/python2.7/site-packages/nova/compute/claims.py:202
2018-04-30 22:17:06.059 13396 DEBUG nova.compute.claims [req-c7689c16-227a-462e-aad5-4c462036051c df7bd0a08ee64da981574d7a7d76970a 088d81f6653242318245b137b1ef91c7] PCI request stats failed _test_pci /opt/stack/venv/nova-20180424T164716Z/lib/python2.7/site-packages/nova/compute/claims.py:206
2018-04-30 22:17:06.059 13396 DEBUG oslo_concurrency.lockutils [req-c7689c16-227a-462e-aad5-4c462036051c df7bd0a08ee64da981574d7a7d76970a 088d81f6653242318245b137b1ef91c7] Lock "compute_resources" released by "nova.compute.resource_tracker.instance_claim" :: held 0.059s inner /opt/stack/venv/nova-20180424T164716Z/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:282
2018-04-30 22:17:06.060 13396 DEBUG nova.compute.manager [req-c7689c16-227a-462e-aad5-4c462036051c df7bd0a08ee64da981574d7a7d76970a 088d81f6653242318245b137b1ef91c7] [instance: 39ad3a47-66dc-4114-9653-fee5ee0c87dc] Insufficient compute resources: Claim pci failed.

It is not clear why the PCI claim failed for the same device entry twice. Probably, if the device id is the same on both the flavor and the network port, only one PCI request should be composed, since the two are identical.

** Affects: nova
     Importance: Undecided
         Status: New

** Tags: pci

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1768919

Title:
  PCI-Passthrough fails when we have Flavor configured and provide a port with vnic_type=direct-physical

Status in OpenStack Compute (nova):
  New

Bug description:
  PCI passthrough of a NIC device to a VM fails when we have both the flavor configured with an alias and also provide a network port with 'vnic_type=direct-physical'.

  The comment in the source code, shown below, explains why:
  https://github.com/openstack/nova/blob/644ac5ec37903b0a08891cc403c8b3b63fc2a91c/nova/compute/api.py#L812

      # PCI requests come from two sources: instance flavor and
      # requested_networks. The first call in below returns an
      # InstancePCIRequests object which is a list of InstancePCIRequest
      # objects. The second call in below creates an InstancePCIRequest
      # object for each SR-IOV port, and append it to the list in the
      # InstancePCIRequests object

  In this case there are two PCI requests for the same device, and _test_pci fails when the compute tries to check the claims.
088d81f6653242318245b137b1ef91c7] _test_pci /opt/stack/venv/nova-20180424T164716Z/lib/python2.7/site-packages/nova/compute/claims.py:201 2018-04-30 22:17:06.058 13396 DEBUG nova.compute.claims [req-c7689c16-227a-462e-aad5-4c462036051c df7bd0a08ee64da981574d7a7d76970a 088d81f6653242318245b137b1ef91c7] pci requests: [InstancePCIRequest(alias_name='intel10fb',count=1,is_new=False,request_id=None,spec=[{dev_type='type-PF',product_id='10fb',vendor_id='8086'}]), InstancePCIRequest(alias_name=None,count=1,is_new=False,request_id=13befe5f-478f-4f4c-aa72-78cce84d942d,spec=[{dev_type='type-PF',physical_network='physnet2'}])] _test_pci /opt/stack/venv/nova-20180424T164716Z/lib/python2.7/site-packages/nova/compute/claims.py:202 2018-04-30 22:17:06.059 13396 DEBUG nova.compute.claims [req-c7689c16-227a-462e-aad5-4c462036051c df7bd0a08ee64da981574d7a7d76970a 088d81f6653242318245b137b1ef91c7] PCI request stats
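The double-counting described above can be illustrated with a small standalone sketch. This is not nova code; the class and field names only mirror the log output, and the counting logic is a simplified assumption about how claim testing sums requests:

```python
from dataclasses import dataclass, field


@dataclass
class PCIRequest:
    # Illustrative stand-in for nova's InstancePCIRequest.
    count: int
    spec: list = field(default_factory=list)


def devices_demanded(requests):
    # Claim testing sums the counts of all requests; it does not notice
    # that a flavor-alias request and an SR-IOV port request may target
    # the very same physical device.
    return sum(r.count for r in requests)


flavor_req = PCIRequest(count=1, spec=[{"product_id": "10fb", "vendor_id": "8086"}])
port_req = PCIRequest(count=1, spec=[{"physical_network": "physnet2"}])

available_pfs = 1  # only one matching PF on the compute host
demanded = devices_demanded([flavor_req, port_req])
print(demanded > available_pfs)  # True: demand of 2 exceeds 1, so the claim fails
```

This matches the reporter's suggestion: if both requests resolve to the identical device, collapsing them into one entry would bring the demand back to 1.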
[Yahoo-eng-team] [Bug 1768917] [NEW] PCI-Passthrough documentation is incorrect while trying to pass through a NIC
Public bug reported: As per the documentation shown below, https://docs.openstack.org/nova/pike/admin/pci-passthrough.html states that in order to achieve PCI passthrough of a network device, we should create a 'flavor' based on the alias and then pass that flavor to the server create call.

Steps to follow:

Create an alias:
[pci] alias = { "vendor_id":"8086", "product_id":"154d", "device_type":"type-PF", "name":"a1" }

Create a flavor:
[pci] alias = { "vendor_id":"8086", "product_id":"154d", "device_type":"type-PF", "name":"a1" }

Add a whitelist:
[pci] passthrough_whitelist = { "address": ":41:00.0" }

Create a server with the flavor:
# openstack server create --flavor m1.large --image cirros-0.3.5-x86_64-uec --wait test-pci

With the above command, VM creation errors out with a PortBindingFailure: the 'vif_type' is always set to 'BINDING_FAILED'. The reason is that the flavor says nothing about 'vnic_type'='direct-physical', and without this information the sriov mechanism driver is not able to bind the port. Not sure if there is any way to specify this information in the flavor.

** Affects: nova Importance: Undecided Status: New ** Tags: pci

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1768917

Title: PCI-Passthrough documentation is incorrect while trying to pass through a NIC

Status in OpenStack Compute (nova): New

Bug description: As per the documentation shown below, https://docs.openstack.org/nova/pike/admin/pci-passthrough.html states that in order to achieve PCI passthrough of a network device, we should create a 'flavor' based on the alias and then pass that flavor to the server create call.

Steps to follow:

Create an alias:
[pci] alias = { "vendor_id":"8086", "product_id":"154d", "device_type":"type-PF", "name":"a1" }

Create a flavor:
[pci] alias = { "vendor_id":"8086", "product_id":"154d", "device_type":"type-PF", "name":"a1" }

Add a whitelist:
[pci] passthrough_whitelist = { "address": ":41:00.0" }

Create a server with the flavor:
# openstack server create --flavor m1.large --image cirros-0.3.5-x86_64-uec --wait test-pci

With the above command, VM creation errors out with a PortBindingFailure: the 'vif_type' is always set to 'BINDING_FAILED'. The reason is that the flavor says nothing about 'vnic_type'='direct-physical', and without this information the sriov mechanism driver is not able to bind the port. Not sure if there is any way to specify this information in the flavor.

To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1768917/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
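Since the flavor alias carries no vnic_type, the binding information for a PF NIC has to come from the port itself. A hedged variant of the failing step would be to pre-create a port with vnic_type=direct-physical and boot the server with it; the network and port names below are illustrative placeholders, not from the report:

```shell
# Create a port on the relevant provider network with the
# direct-physical VNIC type, then boot the server with that port:
openstack port create --network provider-net --vnic-type direct-physical pf-port
openstack server create --flavor m1.large --image cirros-0.3.5-x86_64-uec \
  --port pf-port --wait test-pci
```

Whether the documented flavor-alias-only path can ever bind such a port is exactly the question the bug raises.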
[Yahoo-eng-team] [Bug 1761260] [NEW] DVR: Add a check for the item_allocator IP before trying to release it, since we see a KeyError sometimes, when the item is not there anymore.
Public bug reported: We have seen this Traceback in Pike based installation, while trying to cleanup a gateway with DVR routers. 2018-04-03 20:30:10.081 9672 DEBUG neutron.agent.l3.dvr_fip_ns [-] Delete FIP link interfaces for router: e415276a-4f37-4ee0-ba48-12d3909153c7 delete_rtr_2_fip_link /opt/stack/venv/neutron-20180328T152147Z/lib/python2.7/site-pac kages/neutron/agent/l3/dvr_fip_ns.py:364 2018-04-03 20:30:10.082 9672 DEBUG neutron.agent.linux.utils [-] Running command (rootwrap daemon): ['ip', 'netns', 'exec', 'qrouter-e415276a-4f37-4ee0-ba48-12d3909153c7', 'ip', '-o', 'link', 'show', 'rfp-e415276a-4'] execute_ro otwrap_daemon /opt/stack/venv/neutron-20180328T152147Z/lib/python2.7/site-packages/neutron/agent/linux/utils.py:108 2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info [-] u'e415276a-4f37-4ee0-ba48-12d3909153c7': KeyError: u'e415276a-4f37-4ee0-ba48-12d3909153c7' 2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info Traceback (most recent call last): 2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info File "/opt/stack/venv/neutron-20180328T152147Z/lib/python2.7/site-packages/neutron/common/utils.py", line 186, in call 2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info return func(*args, **kwargs) 2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info File "/opt/stack/venv/neutron-20180328T152147Z/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 1118, in process_delete 2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info self._process_external_on_delete() 2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info File "/opt/stack/venv/neutron-20180328T152147Z/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 890, in _process_external_on_delete 2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info self._process_external_gateway(ex_gw_port) 2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info File 
"/opt/stack/venv/neutron-20180328T152147Z/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 799, in _process_external_gateway 2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info self.external_gateway_removed(self.ex_gw_port, interface_name) 2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info File "/opt/stack/venv/neutron-20180328T152147Z/lib/python2.7/site-packages/neutron/agent/l3/dvr_local_router.py", line 513, in external_gateway_removed 2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info self.fip_ns.delete_rtr_2_fip_link(self) 2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info File "/opt/stack/venv/neutron-20180328T152147Z/lib/python2.7/site-packages/neutron/agent/l3/dvr_fip_ns.py", line 402, in delete_rtr_2_fip_link 2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info self.local_subnets.release(ri.router_id) 2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info File "/opt/stack/venv/neutron-20180328T152147Z/lib/python2.7/site-packages/neutron/agent/l3/item_allocator.py", line 116, in release 2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info self.pool.add(self.allocations.pop(key)) 2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info KeyError: u'e415276a-4f37-4ee0-ba48-12d3909153c7' 2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info Probably a check to make sure if the Key exists before release would be a good idea. We might also see if we can reproduce this in the master branch. ** Affects: neutron Importance: Low Status: New ** Tags: l3-dvr-backlog ** Changed in: neutron Importance: Undecided => Critical ** Changed in: neutron Importance: Critical => Low -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. 
https://bugs.launchpad.net/bugs/1761260 Title: DVR: Add a check for the item_allocator IP before trying to release it, since we see a KeyError sometimes, when the item is not there anymore. Status in neutron: New Bug description: We have seen this Traceback in Pike based installation, while trying to cleanup a gateway with DVR routers. 2018-04-03 20:30:10.081 9672 DEBUG neutron.agent.l3.dvr_fip_ns [-] Delete FIP link interfaces for router: e415276a-4f37-4ee0-ba48-12d3909153c7 delete_rtr_2_fip_link /opt/stack/venv/neutron-20180328T152147Z/lib/python2.7/site-pac kages/neutron/agent/l3/dvr_fip_ns.py:364 2018-04-03 20:30:10.082 9672 DEBUG neutron.agent.linux.utils [-] Running command (rootwrap daemon): ['ip', 'netns', 'exec', 'qrouter-e415276a-4f37-4ee0-ba48-12d3909153c7', 'ip', '-o', 'link', 'show', 'rfp-e415276a-4'] execute_ro otwrap_daemon /opt/stack/venv/neutron-20180328T152147Z/lib/python2.7/site-packages/neutron/agent/linux/utils.py:108 2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info [-]
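A minimal sketch of the suggested fix (not the actual neutron ItemAllocator code; names only mirror the traceback): release() pops an allocation keyed by router id, and dict.pop(key) raises KeyError when the key was never allocated or was already released. Using pop(key, None) with a guard makes release idempotent:

```python
class ItemAllocator:
    """Toy allocator illustrating the defensive-release idea."""

    def __init__(self):
        self.pool = set()        # items available for reuse
        self.allocations = {}    # key (e.g. router id) -> item

    def allocate(self, key, item):
        self.allocations[key] = item

    def release(self, key):
        # pop with a default avoids the KeyError seen in the traceback;
        # a missing key is silently ignored.
        item = self.allocations.pop(key, None)
        if item is not None:
            self.pool.add(item)


alloc = ItemAllocator()
alloc.allocate("router-1", "169.254.109.46/31")
alloc.release("router-1")
alloc.release("router-1")  # second release no longer raises
```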
[Yahoo-eng-team] [Bug 1759694] Re: DHCP agent doesn't respawn metadata when enable_isolated_metadata and gateway removed
*** This bug is a duplicate of bug 1753540 *** https://bugs.launchpad.net/bugs/1753540 Cherry-picked to stable/pike https://review.openstack.org/#/c/557536/ ** This bug has been marked a duplicate of bug 1753540 When isolated metadata is enabled, metadata proxy doesn't get automatically started/stopped when needed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1759694 Title: DHCP agent doesn't respawn metadata when enable_isolated_metadata and gateway removed Status in neutron: New Bug description: Hi, We are running Neutron Pike with OVS and DVR. When enable_isolated_metadata is True and we remove the gateway port for a network from a router, a metadata process is not respawned to start serving metadata. How to replicate : [root@5c1fced0888e /]# openstack network create test_nw +---+--+ | Field | Value| +---+--+ | admin_state_up| UP | | availability_zone_hints | | | availability_zones| | | created_at| 2018-03-28T21:18:29Z | | description | | | dns_domain| | | id| d19dabb2-f8c8-4608-8387-1f356a9f0f14 | | ipv4_address_scope| None | | ipv6_address_scope| None | | is_default| False| | is_vlan_transparent | None | | mtu | 1500 | | name | test_nw | | port_security_enabled | True | | project_id| c053ae2460e741008fa0ea908ae7da8c | | provider:network_type | vxlan| | provider:physical_network | None | | provider:segmentation_id | 65035| | qos_policy_id | None | | revision_number | 2| | router:external | Internal | | segments | None | | shared| False| | status| ACTIVE | | subnets | | | tags | | | updated_at| 2018-03-28T21:18:30Z | +---+--+ [root@5c1fced0888e /]# openstack subnet create --network d19dabb2-f8c8-4608-8387-1f356a9f0f14 --subnet-range 10.10.10.0/24 --gateway 10.10.10.254 test_sn +-+--+ | Field | Value| +-+--+ | allocation_pools| 10.10.10.1-10.10.10.253 | | cidr| 10.10.10.0/24| | created_at | 2018-03-28T21:20:03Z | | description | | | dns_nameservers | | | 
enable_dhcp | True | | gateway_ip | 10.10.10.254 | | host_routes | | | id | 1cd9d1f4-8c43-411b-85db-9514fe7b5e06 | | ip_version | 4| | ipv6_address_mode | None | | ipv6_ra_mode| None | | name| test_sn | | network_id | d19dabb2-f8c8-4608-8387-1f356a9f0f14 | | project_id | c053ae2460e741008fa0ea908ae7da8c | | revision_number | 0| | segment_id | None | | service_types | | | subnetpool_id | None | | tags| | | updated_at | 2018-03-28T21:20:03Z | | use_default_subnet_pool | None
[Yahoo-eng-team] [Bug 1758093] [NEW] DVR: RPC error handling missing for get_network_info_for_id
Public bug reported: To avoid exceptions from the l2 agent while trying to call 'get_network_info_for_id' when the server is not yet updated, we need to handle the error case where oslo_messaging reports that the API was not found. ** Affects: neutron Importance: Undecided Assignee: Swaminathan Vasudevan (swaminathan-vasudevan) Status: In Progress ** Tags: l3-dvr-backlog -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1758093 Title: DVR: RPC error handling missing for get_network_info_for_id Status in neutron: In Progress Bug description: To avoid exceptions from the l2 agent while trying to call 'get_network_info_for_id' when the server is not yet updated, we need to handle the error case where oslo_messaging reports that the API was not found. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1758093/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
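The proposed handling can be sketched in isolation. UnsupportedRPCError below is an illustrative stand-in for the oslo_messaging exception raised when the server side does not expose the method; the function names are hypothetical, not the actual agent code:

```python
class UnsupportedRPCError(Exception):
    """Stand-in for the oslo_messaging error raised when the server
    does not implement the requested RPC method."""


def get_network_info_for_id(client, network_id):
    # Simulates an RPC call against a not-yet-upgraded server.
    raise UnsupportedRPCError("get_network_info_for_id not supported")


def safe_get_network_info(client, network_id):
    try:
        return get_network_info_for_id(client, network_id)
    except UnsupportedRPCError:
        # Server not yet upgraded: degrade gracefully (log and return
        # nothing) instead of letting the exception kill the l2 agent.
        return None


print(safe_get_network_info(object(), "net-1"))  # None
```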
[Yahoo-eng-team] [Bug 1757188] Re: some L3 HA routers does not work
** Changed in: neutron Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1757188 Title: some L3 HA routers does not work Status in neutron: Invalid Bug description: Pike DVR + L3_HA L2population enabled Some of our L3 HA routers are not working correctly. They are not reachable from instances. After deep investigation, I've found that "HA port tenant " ports are in state DOWN. They are DOWN because they don't have binding information. They don't have binding information because 'HA network tenant ' network is corrupted. I mean it does not have provider:network_type and provider:segmentation_id parameters set. The weird thing is that this network was OK and worked but in some point in time has been corrupted. I don't have any logs from this point in time. For comparison working HA tenant network: +---++ | Field | Value | +---++ | admin_state_up| True | | availability_zone_hints | | | availability_zones| nova | | created_at| 2018-02-16T16:52:31Z | | description | | | id| fa2fea5c-ccaa-4116-bb0c-ff59bbd8229a | | ipv4_address_scope| | | ipv6_address_scope| | | mtu | 9000 | | name | HA network tenant afeeb372d7934795b63868330eca0dfe | | port_security_enabled | True | | project_id| | | provider:network_type | vxlan | | provider:physical_network | | | provider:segmentation_id | 35 | | revision_number | 3 | | router:external | False | | shared| False | | status| ACTIVE | | subnets | 5cbc612d-13cf-4889-88fb-02d1debe5f8d | | tags | | | tenant_id | | | updated_at| 2018-02-16T16:52:31Z | +---++ and not working HA tenant network: +---++ | Field | Value | +---++ | admin_state_up| True | | availability_zone_hints | | | availability_zones| | | created_at| 2018-01-26T12:24:15Z | | description | | | id| 6390c381-871e-4945-bfa0-00828bb519bc | | ipv4_address_scope| | | ipv6_address_scope| | | mtu | 9000 | | name | HA network tenant 
3e88cffb9dbb4e1fba96ee72a02e012e | | port_security_enabled | True | | project_id| | | provider:network_type | | | provider:physical_network | | | provider:segmentation_id | | | revision_number | 5 | |
[Yahoo-eng-team] [Bug 1757495] Re: Using dvr and centralized routers in same network fails
** Changed in: neutron Status: Incomplete => Invalid ** Changed in: neutron Status: Invalid => Opinion -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1757495 Title: Using dvr and centralized routers in same network fails Status in neutron: Opinion Bug description: Brief overview and reproducing steps: 1. Create tenant network, let's say 10.3.2.0/24. 2. Create centralized HA router. Attach it at 10.3.2.1 3. Boot VM and ping 10.3.2.1 - works. 4. Create distributed, no-snat router. Attach it at any free IP, e.g. 10.3.2.5 5. Try to ping 10.3.2.1 from VM - fails. Ping 10.3.2.5 - works. I can reproduce this consistently. The setup might be a bit of a corner case: - deployment with openstack kolla. Openvswitch. - openstack pike. neutron 11.0.2 - tenant provider networks are vlan - there are 2 neutron nodes to host HA routers - all compute nodes configured for DVR No errors in logs. On the compute node hosting the VM, I can see dropped packages on integration bridge br-int. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1757495/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1756406] [NEW] DVR: Fix dvr mac address format to be backward compatible with non native openflow interface
Public bug reported: DVR MAC address is configured on the server for every node that is configured to run in one of the dvr agent modes (dvr, dvr_snat and dvr_no_external). The DVR MAC addresses are stored in the 'AA-BB-CC-DD-EE-FF' format. When the agent tries to configure the DVR MAC addresses into the openflow rules using the native interface drivers, they are ok. But when used with the non native interface drivers this throws an error as shown below. Unable to execute ['ovs-ofctl', 'add-flows', 'br-vlan1078', '-']. Exception: Exit code: 1; Stdin: hard_timeout=0,idle_timeout=0,priority=2,table=3,cookie=12002607947458125225,dl_src=FA-16-3F-AA-78-20,actions=output:2; Stdout: ; Stderr: ovs-ofctl: -:1: FA-16-3F-AA-78-20: invalid Ethernet address. This is also seen in the Master branch. So to provide backward compatibility, we need to add a patch to change the format of the MAC before it is handed over to the openflow interface driver. ** Affects: neutron Importance: Undecided Assignee: Swaminathan Vasudevan (swaminathan-vasudevan) Status: In Progress ** Tags: l3-dvr-backlog -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1756406 Title: DVR: Fix dvr mac address format to be backward compatible with non native openflow interface Status in neutron: In Progress Bug description: DVR MAC address is configured on the server for every node that is configured to run in one of the dvr agent modes (dvr, dvr_snat and dvr_no_external). The DVR MAC addresses are stored in the 'AA-BB-CC-DD-EE-FF' format. When the agent tries to configure the DVR MAC addresses into the openflow rules using the native interface drivers, they are ok. But when used with the non native interface drivers this throws an error as shown below. Unable to execute ['ovs-ofctl', 'add-flows', 'br-vlan1078', '-']. Exception: Exit code: 1; Stdin: hard_timeout=0,idle_timeout=0,priority=2,table=3,cookie=12002607947458125225,dl_src=FA-16-3F-AA-78-20,actions=output:2; Stdout: ; Stderr: ovs-ofctl: -:1: FA-16-3F-AA-78-20: invalid Ethernet address. This is also seen in the Master branch. So to provide backward compatibility, we need to add a patch to change the format of the MAC before it is handed over to the openflow interface driver. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1756406/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
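The backward-compat conversion the report asks for amounts to rewriting the stored dash-separated MAC into the colon-separated form that ovs-ofctl accepts before building the flow. A minimal sketch (the helper name is illustrative, not the neutron patch itself):

```python
def to_ofctl_mac(mac):
    """Convert a DVR MAC stored as 'FA-16-3F-AA-78-20' into the
    colon-separated lowercase form 'fa:16:3f:aa:78:20' that the
    ovs-ofctl CLI expects in dl_src/dl_dst matches."""
    return mac.replace("-", ":").lower()


print(to_ofctl_mac("FA-16-3F-AA-78-20"))  # fa:16:3f:aa:78:20
```

With this applied before the flow string is assembled, the 'invalid Ethernet address' error above would not occur on the non-native (CLI) path.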
[Yahoo-eng-team] [Bug 1657981] Re: FloatingIPs not reachable after restart of compute node (DVR)
** Changed in: neutron Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1657981 Title: FloatingIPs not reachable after restart of compute node (DVR) Status in neutron: Invalid Bug description: I am running OpenStack Newton on Ubuntu 16.04 using DVR. When I restart a compute node, the FloatingIPs of those vms running on this node are unreachable. A manual restart of the service "neutron-l3-agent" or "neutron-vpn-agent" running in on node solves the issue. I think there must be a race condition at startup. I get the following error in the neutron-vpn-agent.log: 2017-01-20 07:04:52.379 2541 INFO neutron.common.config [-] Logging enabled! 2017-01-20 07:04:52.379 2541 INFO neutron.common.config [-] /usr/bin/neutron-vpn-agent version 9.0.0 2017-01-20 07:04:52.380 2541 WARNING stevedore.named [-] Could not load neutron.agent.linux.interface.OVSInterfaceDriver 2017-01-20 07:04:53.112 2541 WARNING stevedore.named [req-a9e10331-51ab-4c67-bfdd-0f6296510594 - - - - -] Could not load neutron_fwaas.services.firewall.drivers.linux.iptables_fwaas.IptablesFwaasDriver 2017-01-20 07:04:53.127 2541 INFO neutron.agent.agent_extensions_manager [req-a9e10331-51ab-4c67-bfdd-0f6296510594 - - - - -] Loaded agent extensions: ['fwaas'] 2017-01-20 07:04:53.128 2541 INFO neutron.agent.agent_extensions_manager [req-a9e10331-51ab-4c67-bfdd-0f6296510594 - - - - -] Initializing agent extension 'fwaas' 2017-01-20 07:04:53.163 2541 WARNING oslo_config.cfg [req-bdd95fb9-bcd7-473e-a350-3bd8d6be8758 - - - - -] Option "external_network_bridge" from group "DEFAULT" is deprecated for removal. Its value may be silently ignored in the future. 
2017-01-20 07:04:53.165 2541 WARNING stevedore.named [req-bdd95fb9-bcd7-473e-a350-3bd8d6be8758 - - - - -] Could not load neutron_vpnaas.services.vpn.device_drivers.strongswan_ipsec.StrongSwanDriver 2017-01-20 07:04:53.236 2541 INFO eventlet.wsgi.server [-] (2541) wsgi starting up on http:/var/lib/neutron/keepalived-state-change 2017-01-20 07:04:53.261 2541 INFO neutron.agent.l3.agent [-] Agent has just been revived. Doing a full sync. 2017-01-20 07:04:53.373 2541 INFO neutron.agent.l3.agent [-] L3 agent started 2017-01-20 07:05:22.832 2541 ERROR neutron.agent.linux.utils [-] Exit code: 1; Stdin: ; Stdout: ; Stderr: Cannot find device "fg-67afaa06-bb" 2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info [-] Exit code: 1; Stdin: ; Stdout: ; Stderr: Cannot find device "fg-67afaa06-bb" 2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info Traceback (most recent call last): 2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info File "/usr/lib/python2.7/dist-packages/neutron/common/utils.py", line 239, in call 2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info return func(*args, **kwargs) 2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info File "/usr/lib/python2.7/dist-packages/neutron/agent/l3/router_info.py", line 1062, in process 2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info self.process_external(agent) 2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info File "/usr/lib/python2.7/dist-packages/neutron/agent/l3/dvr_local_router.py", line 515, in process_external 2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info self.create_dvr_fip_interfaces(ex_gw_port) 2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info File "/usr/lib/python2.7/dist-packages/neutron/agent/l3/dvr_local_router.py", line 546, in create_dvr_fip_interfaces 2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info self.fip_ns.update_gateway_port(fip_agent_port) 2017-01-20 07:05:22.833 2541 
ERROR neutron.agent.l3.router_info File "/usr/lib/python2.7/dist-packages/neutron/agent/l3/dvr_fip_ns.py", line 239, in update_gateway_port 2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info ipd.route.add_gateway(gw_ip) 2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_lib.py", line 702, in add_gateway 2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info self._as_root([ip_version], tuple(args)) 2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_lib.py", line 373, in _as_root 2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info use_root_namespace=use_root_namespace) 2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info File "/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_lib.py", line 95, in _as_root 2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info log_fail_as_error=self.log_fail_as_error) 2017-01-20 07:05:22.833 2541 ERROR
[Yahoo-eng-team] [Bug 1716194] Re: IPTables rules are not updated if there is a change in the FWaaS rules when FWaaS is deployed in DVR mode
*** This bug is a duplicate of bug 1715395 *** https://bugs.launchpad.net/bugs/1715395 ** This bug is no longer a duplicate of bug 1716401 FWaaS: Ip tables rules do not get updated in case of distributed virtual routers (DVR) ** This bug has been marked a duplicate of bug 1715395 FWaaS: Firewall creation fails in case of distributed routers (Pike) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1716194 Title: IPTables rules are not updated if there is a change in the FWaaS rules when FWaaS is deployed in DVR mode Status in neutron: New Bug description: Please see https://bugs.launchpad.net/neutron/+bug/1715395/comments/4 and https://bugs.launchpad.net/neutron/+bug/1716401 for more information about this issue To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1716194/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1751396] [NEW] DVR: Inter Tenant Traffic between two networks and connected through a shared network not reachable with DVR routers
Public bug reported: Inter Tenant Traffic between Two Tenants on two different private networks connected through a common shared network (created by Admin) is not routable through DVR routers.

Steps to reproduce it: (NOTE: No external, just shared network) This is only reproducible in a Multinode scenario (1 Controller - 2 compute). Make sure that the two VMs are isolated on two different computes.

openstack network create --share shared_net
openstack subnet create shared_net_sn --network shared_net --subnet-range 172.168.10.0/24
openstack network create net_A
openstack subnet create net_A_sn --network net_A --subnet-range 10.1.0.0/24
openstack network create net_B
openstack subnet create net_B_sn --network net_B --subnet-range 10.2.0.0/24
openstack router create router_A
openstack port create --network=shared_net --fixed-ip subnet=shared_net_sn,ip-address=172.168.10.20 port_router_A_shared_net
openstack router add port router_A port_router_A_shared_net
openstack router add subnet router_A net_A_sn
openstack router create router_B
openstack port create --network=shared_net --fixed-ip subnet=shared_net_sn,ip-address=172.168.10.30 port_router_B_shared_net
openstack router add port router_B port_router_B_shared_net
openstack router add subnet router_B net_B_sn
openstack server create server_A --flavor m1.tiny --image cirros --nic net-id=net_A
openstack server create server_B --flavor m1.tiny --image cirros --nic net-id=net_B

Add static routes to the routers.

openstack router set router_A --route destination=10.1.0.0/24,gateway=172.168.10.20
openstack router set router_B --route destination=10.2.0.0/24,gateway=172.168.10.30

Ping from one instance to the other times out.

** Affects: neutron Importance: Undecided Status: Confirmed ** Tags: l3-dvr-backlog ** Changed in: neutron Status: New => Confirmed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1751396

Title: DVR: Inter Tenant Traffic between two networks and connected through a shared network not reachable with DVR routers

Status in neutron: Confirmed

Bug description: Inter Tenant Traffic between Two Tenants on two different private networks connected through a common shared network (created by Admin) is not routable through DVR routers.

Steps to reproduce it: (NOTE: No external, just shared network) This is only reproducible in a Multinode scenario (1 Controller - 2 compute). Make sure that the two VMs are isolated on two different computes.

openstack network create --share shared_net
openstack subnet create shared_net_sn --network shared_net --subnet-range 172.168.10.0/24
openstack network create net_A
openstack subnet create net_A_sn --network net_A --subnet-range 10.1.0.0/24
openstack network create net_B
openstack subnet create net_B_sn --network net_B --subnet-range 10.2.0.0/24
openstack router create router_A
openstack port create --network=shared_net --fixed-ip subnet=shared_net_sn,ip-address=172.168.10.20 port_router_A_shared_net
openstack router add port router_A port_router_A_shared_net
openstack router add subnet router_A net_A_sn
openstack router create router_B
openstack port create --network=shared_net --fixed-ip subnet=shared_net_sn,ip-address=172.168.10.30 port_router_B_shared_net
openstack router add port router_B port_router_B_shared_net
openstack router add subnet router_B net_B_sn
openstack server create server_A --flavor m1.tiny --image cirros --nic net-id=net_A
openstack server create server_B --flavor m1.tiny --image cirros --nic net-id=net_B

Add static routes to the routers.

openstack router set router_A --route destination=10.1.0.0/24,gateway=172.168.10.20
openstack router set router_B --route destination=10.2.0.0/24,gateway=172.168.10.30

Ping from one instance to the other times out.

To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1751396/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1749577] Re: DVR: Static routes are not configured in snat-namespace for DVR Routers
User error.

** Changed in: neutron
   Status: New => Invalid

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1749577

Title: DVR: Static routes are not configured in snat-namespace for DVR Routers

Status in neutron: Invalid

Bug description: Static routes are not configured in snat-namespace for DVR routers.

Steps to reproduce:
1. Create Network
2. Create Subnet
3. Create Router
4. Add interface to Router
5. Set gateway for the Router
6. Add a static route (next hop to the Router)
7. Check whether the static routes are configured in the 'snat-namespace'.

stack@ubuntu-ctlr:~/devstack$ neutron router-update router2-alt-demo --route destination=10.3.0.0/24,nexthop=192.168.100.20
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
Updated router: router2-alt-demo
stack@ubuntu-ctlr:~/devstack$ sudo ip netns exec snat-152504be-c68e-4918-bf8f-d8c4d8c27d4d bash
root@ubuntu-ctlr:~/devstack# ip route
default via 192.168.100.9 dev qg-c5919234-7c
10.2.0.0/24 via 192.168.100.12 dev qg-c5919234-7c
192.168.100.0/24 dev qg-c5919234-7c proto kernel scope link src 192.168.100.20
root@ubuntu-ctlr:~/devstack# ifconfig
lo        Link encap:Local Loopback
          inet addr:127.0.0.1 Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING MTU:65536 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1
          RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
qg-c5919234-7c Link encap:Ethernet HWaddr fa:16:3e:b7:2c:72
          inet addr:192.168.100.20 Bcast:192.168.100.255 Mask:255.255.255.0
          inet6 addr: fe80::f816:3eff:feb7:2c72/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:81 errors:0 dropped:3 overruns:0 frame:0
          TX packets:74 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1
          RX bytes:5334 (5.3 KB) TX bytes:5801 (5.8 KB)
sg-23b90333-cc Link encap:Ethernet HWaddr fa:16:3e:87:bb:ac
          inet addr:10.2.0.8 Bcast:10.2.0.255 Mask:255.255.255.0
          inet6 addr: fe80::f816:3eff:fe87:bbac/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1450 Metric:1
          RX packets:2770 errors:0 dropped:0 overruns:0 frame:0
          TX packets:45 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1
          RX bytes:219841 (219.8 KB) TX bytes:4028 (4.0 KB)
root@ubuntu-ctlr:~/devstack# ip route
default via 192.168.100.9 dev qg-c5919234-7c
10.2.0.0/24 via 192.168.100.12 dev qg-c5919234-7c
192.168.100.0/24 dev qg-c5919234-7c proto kernel scope link src 192.168.100.20
root@ubuntu-ctlr:~/devstack# exit
exit
stack@ubuntu-ctlr:~/devstack$ sudo ip netns
snat-6a6fdb6e-8284-4439-b5aa-f574d91948ae
qrouter-6a6fdb6e-8284-4439-b5aa-f574d91948ae
snat-152504be-c68e-4918-bf8f-d8c4d8c27d4d
qrouter-152504be-c68e-4918-bf8f-d8c4d8c27d4d
qdhcp-a0e5b756-7b19-42b0-8e53-4f23f1adcc72
snat-9e989be2-b0af-4f0b-8532-a6f307fb40b4
fip-205f29cd-359c-4f7c-b29e-d276d199640e
qrouter-9e989be2-b0af-4f0b-8532-a6f307fb40b4
qdhcp-03a725ab-04b5-4071-884f-00d7a7549e12
stack@ubuntu-ctlr:~/devstack$ sudo ip netns exec qrouter-152504be-c68e-4918-bf8f-d8c4d8c27d4d bash
root@ubuntu-ctlr:~/devstack# ip route
10.2.0.0/24 dev qr-d26ef7c2-18 proto kernel scope link src 10.2.0.1
169.254.109.46/31 dev rfp-152504be-c proto kernel scope link src 169.254.109.46
root@ubuntu-ctlr:~/devstack# exit
exit
stack@ubuntu-ctlr:~/devstack$ sudo ip netns exec fip-205f29cd-359c-4f7c-b29e-d276d199640e bash
root@ubuntu-ctlr:~/devstack# ip route
169.254.93.94/31 dev fpr-6a6fdb6e-8 proto kernel scope link src 169.254.93.95
169.254.106.114/31 dev fpr-9e989be2-b proto kernel scope link src 169.254.106.115
169.254.109.46/31 dev fpr-152504be-c proto kernel scope link src 169.254.109.47
192.168.100.0/24 dev fg-a6777b4d-f7 proto kernel scope link src 192.168.100.11
root@ubuntu-ctlr:~/devstack# ip rule
0:      from all lookup local
[Yahoo-eng-team] [Bug 1749577] [NEW] DVR: Static routes are not configured in snat-namespace for DVR Routers
Public bug reported: Static routes are not configured in snat-namespace for DVR routers.

Steps to reproduce:
1. Create Network
2. Create Subnet
3. Create Router
4. Add interface to Router
5. Set gateway for the Router
6. Add a static route (next hop to the Router)
7. Check whether the static routes are configured in the 'snat-namespace'.

stack@ubuntu-ctlr:~/devstack$ neutron router-update router2-alt-demo --route destination=10.3.0.0/24,nexthop=192.168.100.20
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
Updated router: router2-alt-demo
stack@ubuntu-ctlr:~/devstack$ sudo ip netns exec snat-152504be-c68e-4918-bf8f-d8c4d8c27d4d bash
root@ubuntu-ctlr:~/devstack# ip route
default via 192.168.100.9 dev qg-c5919234-7c
10.2.0.0/24 via 192.168.100.12 dev qg-c5919234-7c
192.168.100.0/24 dev qg-c5919234-7c proto kernel scope link src 192.168.100.20
root@ubuntu-ctlr:~/devstack# ifconfig
lo        Link encap:Local Loopback
          inet addr:127.0.0.1 Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING MTU:65536 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1
          RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
qg-c5919234-7c Link encap:Ethernet HWaddr fa:16:3e:b7:2c:72
          inet addr:192.168.100.20 Bcast:192.168.100.255 Mask:255.255.255.0
          inet6 addr: fe80::f816:3eff:feb7:2c72/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:81 errors:0 dropped:3 overruns:0 frame:0
          TX packets:74 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1
          RX bytes:5334 (5.3 KB) TX bytes:5801 (5.8 KB)
sg-23b90333-cc Link encap:Ethernet HWaddr fa:16:3e:87:bb:ac
          inet addr:10.2.0.8 Bcast:10.2.0.255 Mask:255.255.255.0
          inet6 addr: fe80::f816:3eff:fe87:bbac/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1450 Metric:1
          RX packets:2770 errors:0 dropped:0 overruns:0 frame:0
          TX packets:45 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1
          RX bytes:219841 (219.8 KB) TX bytes:4028 (4.0 KB)
root@ubuntu-ctlr:~/devstack# ip route
default via 192.168.100.9 dev qg-c5919234-7c
10.2.0.0/24 via 192.168.100.12 dev qg-c5919234-7c
192.168.100.0/24 dev qg-c5919234-7c proto kernel scope link src 192.168.100.20
root@ubuntu-ctlr:~/devstack# exit
exit
stack@ubuntu-ctlr:~/devstack$ sudo ip netns
snat-6a6fdb6e-8284-4439-b5aa-f574d91948ae
qrouter-6a6fdb6e-8284-4439-b5aa-f574d91948ae
snat-152504be-c68e-4918-bf8f-d8c4d8c27d4d
qrouter-152504be-c68e-4918-bf8f-d8c4d8c27d4d
qdhcp-a0e5b756-7b19-42b0-8e53-4f23f1adcc72
snat-9e989be2-b0af-4f0b-8532-a6f307fb40b4
fip-205f29cd-359c-4f7c-b29e-d276d199640e
qrouter-9e989be2-b0af-4f0b-8532-a6f307fb40b4
qdhcp-03a725ab-04b5-4071-884f-00d7a7549e12
stack@ubuntu-ctlr:~/devstack$ sudo ip netns exec qrouter-152504be-c68e-4918-bf8f-d8c4d8c27d4d bash
root@ubuntu-ctlr:~/devstack# ip route
10.2.0.0/24 dev qr-d26ef7c2-18 proto kernel scope link src 10.2.0.1
169.254.109.46/31 dev rfp-152504be-c proto kernel scope link src 169.254.109.46
root@ubuntu-ctlr:~/devstack# exit
exit
stack@ubuntu-ctlr:~/devstack$ sudo ip netns exec fip-205f29cd-359c-4f7c-b29e-d276d199640e bash
root@ubuntu-ctlr:~/devstack# ip route
169.254.93.94/31 dev fpr-6a6fdb6e-8 proto kernel scope link src 169.254.93.95
169.254.106.114/31 dev fpr-9e989be2-b proto kernel scope link src 169.254.106.115
169.254.109.46/31 dev fpr-152504be-c proto kernel scope link src 169.254.109.47
192.168.100.0/24 dev fg-a6777b4d-f7 proto kernel scope link src 192.168.100.11
root@ubuntu-ctlr:~/devstack# ip rule
0:      from all lookup local
32766:  from all lookup main
32767:  from all lookup default
2852019551:     from all iif fpr-6a6fdb6e-8 lookup 2852019551
2852022899:     from all iif fpr-9e989be2-b lookup 2852022899
2852023599:     from all iif fpr-152504be-c lookup 2852023599
root@ubuntu-ctlr:~/devstack# ip route s t 2852019551
default via 192.168.100.9 dev fg-a6777b4d-f7
10.3.0.0/24 via 192.168.100.20 dev fg-a6777b4d-f7
root@ubuntu-ctlr:~/devstack#

** Affects: neutron
   Importance: Undecided
   Status: New
** Tags:
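The symptom in the transcript above is easy to state as a diff: the route was accepted by the API (it even shows up in the fip namespace's per-router table) but never appears in the snat namespace. A small, self-contained sketch in plain Python — the function name and data shapes are illustrative, not neutron API — shows the check an agent-side sync would need to perform:

```python
def missing_extra_routes(extra_routes, namespace_routes):
    """Return the router's configured extra routes absent from a namespace.

    extra_routes: list of {'destination': ..., 'nexthop': ...} dicts, the
    shape routers store them in; namespace_routes: set of
    (destination, nexthop) pairs parsed from 'ip route' inside the namespace.
    """
    present = set(namespace_routes)
    return [r for r in extra_routes
            if (r['destination'], r['nexthop']) not in present]

# Values taken from the transcript: the static route 10.3.0.0/24 via
# 192.168.100.20 was added to the router but never shows up in snat-*.
router_routes = [{'destination': '10.3.0.0/24', 'nexthop': '192.168.100.20'}]
snat_routes = {('default', '192.168.100.9'),
               ('10.2.0.0/24', '192.168.100.12')}
```

Running `missing_extra_routes(router_routes, snat_routes)` reports exactly the route the bug complains about, which is the condition the l3-agent's snat-namespace handling fails to reconcile.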
[Yahoo-eng-team] [Bug 1667877] Re: [RFE] Allow DVR for E/W while leaving N/S centralized
** Changed in: neutron
   Status: Fix Committed => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1667877

Title: [RFE] Allow DVR for E/W while leaving N/S centralized

Status in neutron: Fix Released

Bug description:

Use Case
========
OpenStack is deployed in an L3 fabric, so the external network cannot be extended to all compute nodes. Even though this means SNAT and floating IP traffic (north/south) will run through a network node with external network access, the operator still wants the east/west routing offload offered by DVR. So even though the topology does not allow for N/S DVR direct routing, we want a way to still take advantage of E/W direct routing.

Potential Solution
==================
Provide a configurable option that places floating IPs for DVR-based routers either on the compute node or on the network node. Also proactively check the status of the agent on the destination node; if the agent is down, configure the floating IP on the network node. Provide a configuration option in neutron.conf such as DVR_FLOATINGIP_CENTRALIZED = 'enforced/circumstantial'. If DVR_FLOATINGIP_CENTRALIZED is configured as 'enforced', all floating IPs will be configured on the network node. If it is configured as 'circumstantial', the floating IP will be configured either on the compute node or on the network node, depending on agent health. If this option is not configured, floating IPs will be distributed for all bound ports, and only for unbound ports will the floating IP be implemented on the network node.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1667877/+subscriptions
--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help : https://help.launchpad.net/ListHelp
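The proposed placement policy in the RFE above reduces to a small decision function. This is a sketch of that logic only — the option name, host names, and the bare-function shape are illustrative; the real feature landed with different configuration plumbing:

```python
def floatingip_host(mode, port_host, agent_alive, network_node='net-node-1'):
    """Pick the node that should host a DVR floating IP.

    mode follows the RFE's DVR_FLOATINGIP_CENTRALIZED proposal:
      'enforced'       -> always the network node;
      'circumstantial' -> the port's compute host when its L3 agent is
                          alive, otherwise fall back to the network node;
      None (unset)     -> distribute bound ports, centralize unbound ones.
    """
    if mode == 'enforced':
        return network_node
    if mode == 'circumstantial':
        return port_host if (port_host and agent_alive) else network_node
    return port_host if port_host else network_node
```

The 'circumstantial' branch is what encodes the "proactively check the agent health" part of the proposal: a dead agent on the destination node silently demotes the floating IP to the centralized path.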
[Yahoo-eng-team] [Bug 1635554] Re: Delete Router / race condition
In that case we should close this bug.

** Changed in: neutron
   Status: Incomplete => Invalid

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1635554

Title: Delete Router / race condition

Status in neutron: Invalid

Bug description: When deleting a router, the log file fills up with the following errors. CentOS 7, Newton (RDO).

2016-10-21 09:45:02.526 16200 DEBUG neutron.agent.linux.utils [-] Exit code: 0 execute /usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:140
2016-10-21 09:45:02.526 16200 WARNING neutron.agent.l3.namespaces [-] Namespace qrouter-8cf5-5c5c-461c-84f3-c8abeca8f79a does not exist. Skipping delete
2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent [-] Error while deleting router 8cf5-5c5c-461c-84f3-c8abeca8f79a
2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent Traceback (most recent call last):
2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 357, in _safe_router_removed
2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent     self._router_removed(router_id)
2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 376, in _router_removed
2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent     ri.delete(self)
2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 381, in delete
2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent     self.destroy_state_change_monitor(self.process_monitor)
2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 325, in destroy_state_change_monitor
2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent     pm = self._get_state_change_monitor_process_manager()
2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 296, in _get_state_change_monitor_process_manager
2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent     default_cmd_callback=self._get_state_change_monitor_callback())
2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 299, in _get_state_change_monitor_callback
2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent     ha_device = self.get_ha_device_name()
2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 137, in get_ha_device_name
2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent     return (HA_DEV_PREFIX + self.ha_port['id'])[:self.driver.DEV_NAME_LEN]
2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent TypeError: 'NoneType' object has no attribute '__getitem__'
2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent
2016-10-21 09:45:02.528 16200 DEBUG neutron.agent.l3.agent [-] Finished a router update for 8cf5-5c5c-461c-84f3-c8abeca8f79a _process_router_update /usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py:504

See full log: http://paste.openstack.org/show/586656/

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1635554/+subscriptions
--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help : https://help.launchpad.net/ListHelp
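The TypeError in the trace above comes from get_ha_device_name() indexing self.ha_port['id'] after the race has already cleared ha_port to None during router deletion. A minimal defensive sketch — the constants mirror the real ones in neutron's ha_router.py and interface driver, but the simplified function here is illustrative, not the actual fix that landed:

```python
HA_DEV_PREFIX = 'ha-'   # same prefix ha_router.py uses for the HA device
DEV_NAME_LEN = 14       # Linux interface-name budget used by the driver

def get_ha_device_name(ha_port):
    """Build the HA device name, tolerating a vanished ha_port.

    Returning None (instead of raising TypeError on ha_port['id']) lets
    the caller skip monitor teardown for an already-dismantled router.
    """
    if ha_port is None:
        return None
    return (HA_DEV_PREFIX + ha_port['id'])[:DEV_NAME_LEN]
```

With this guard the delete path above would log a skip rather than abort with "Error while deleting router".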
[Yahoo-eng-team] [Bug 1712795] Re: Fail to startup neutron-l3-agent
Right now there are no bug fixes supported for the Mitaka branch. Since this bug is not seen in current master or the stable branches, I will close this bug.

** Changed in: neutron
   Status: Incomplete => Invalid

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1712795

Title: Fail to startup neutron-l3-agent

Status in neutron: Invalid

Bug description: When trying to start neutron-l3-agent, its log shows:

2017-08-17 08:00:09.601 2381 ERROR oslo.messaging._drivers.impl_rabbit [req-aa7132bf-38d3-4e1f-9158-8743e2c5d163 - - - - -] AMQP server on 192.168.25.1:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 1 seconds.
2017-08-17 08:00:10.610 2381 ERROR oslo.messaging._drivers.impl_rabbit [req-aa7132bf-38d3-4e1f-9158-8743e2c5d163 - - - - -] AMQP server on 192.168.25.2:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 20 seconds.
2017-08-17 08:00:30.640 2381 INFO oslo.messaging._drivers.impl_rabbit [req-aa7132bf-38d3-4e1f-9158-8743e2c5d163 - - - - -] Reconnected to AMQP server on 192.168.25.1:5672 via [amqp] client
2017-08-17 08:00:30.724 2381 INFO eventlet.wsgi.server [-] (2381) wsgi starting up on http:/var/lib/neutron/keepalived-state-change
2017-08-17 08:00:30.766 2381 INFO neutron.agent.l3.agent [-] L3 agent started
2017-08-17 08:00:30.770 2381 INFO neutron.agent.l3.agent [-] Agent has just been revived. Doing a full sync.
2017-08-17 08:00:35.352 2381 INFO oslo_rootwrap.client [-] Spawned new rootwrap daemon process with pid=25789
2017-08-17 08:00:35.486 2381 ERROR oslo_service.periodic_task [req-104a5ce9-3d9d-4367-8bb1-edb0880ef96f - - - - -] Error during L3NATAgentWithStateReport.periodic_sync_routers_task
2017-08-17 08:00:35.486 2381 ERROR oslo_service.periodic_task Traceback (most recent call last):
2017-08-17 08:00:35.486 2381 ERROR oslo_service.periodic_task   File "/usr/lib/python2.7/site-packages/oslo_service/periodic_task.py", line 220, in run_periodic_tasks
2017-08-17 08:00:35.486 2381 ERROR oslo_service.periodic_task     task(self, context)
2017-08-17 08:00:35.486 2381 ERROR oslo_service.periodic_task   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 545, in periodic_sync_routers_task
2017-08-17 08:00:35.486 2381 ERROR oslo_service.periodic_task     self.fetch_and_sync_all_routers(context, ns_manager)
2017-08-17 08:00:35.486 2381 ERROR oslo_service.periodic_task   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 579, in fetch_and_sync_all_routers
2017-08-17 08:00:35.486 2381 ERROR oslo_service.periodic_task     r['id'], r.get(l3_constants.HA_ROUTER_STATE_KEY))
2017-08-17 08:00:35.486 2381 ERROR oslo_service.periodic_task   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha.py", line 120, in check_ha_state_for_router
2017-08-17 08:00:35.486 2381 ERROR oslo_service.periodic_task     if ri and current_state != TRANSLATION_MAP[ri.ha_state]:
2017-08-17 08:00:35.486 2381 ERROR oslo_service.periodic_task   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 76, in ha_state
2017-08-17 08:00:35.486 2381 ERROR oslo_service.periodic_task     ha_state_path = self.keepalived_manager.get_full_config_file_path(
2017-08-17 08:00:35.486 2381 ERROR oslo_service.periodic_task AttributeError: 'NoneType' object has no attribute 'get_full_config_file_path'
2017-08-17 08:00:35.486 2381 ERROR oslo_service.periodic_task
2017-08-17 08:00:35.495 2381 ERROR neutron.agent.linux.utils [-] Exit code: 1; Stdin: ; Stdout: ; Stderr: Cannot open network namespace "snat-9ed03dce-1c07-4b65-abe1-ca4f0e8f5d04": No such file or directory
2017-08-17 08:00:35.495 2381 ERROR neutron.agent.l3.agent [-] Failed to process compatible router '9ed03dce-1c07-4b65-abe1-ca4f0e8f5d04'
2017-08-17 08:00:35.495 2381 ERROR neutron.agent.l3.agent Traceback (most recent call last):
2017-08-17 08:00:35.495 2381 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 501, in _process_router_update
2017-08-17 08:00:35.495 2381 ERROR neutron.agent.l3.agent     self._process_router_if_compatible(router)
2017-08-17 08:00:35.495 2381 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 438, in _process_router_if_compatible
2017-08-17 08:00:35.495 2381 ERROR neutron.agent.l3.agent     self._process_added_router(router)
2017-08-17 08:00:35.495 2381 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 443, in _process_added_router
2017-08-17 08:00:35.495 2381 ERROR neutron.agent.l3.agent     self._router_added(router['id'], router)
2017-08-17 08:00:35.495 2381 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 350, in _router_added
2017-08-17
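The AttributeError in that startup trace is the same shape of race as the delete-path bug above: the periodic sync runs before the HA router has finished initializing its keepalived manager, so self.keepalived_manager is still None. A hedged sketch of a guard (the stub class and function names here are illustrative; `get_full_config_file_path` is the method the traceback shows the real manager exposing):

```python
class KeepalivedManagerStub:
    """Stand-in for the real keepalived manager, for illustration only."""
    def get_full_config_file_path(self, filename):
        # The real manager builds a per-router path under the HA conf dir.
        return '/var/lib/neutron/ha_confs/<router-id>/' + filename

def safe_ha_state_path(keepalived_manager):
    """Return the ha_state file path, or None while the manager is absent.

    The periodic task can then treat None as 'state unknown, retry later'
    instead of blowing up the whole sync with AttributeError.
    """
    if keepalived_manager is None:
        return None
    return keepalived_manager.get_full_config_file_path('ha_state')
```

A caller that receives None would simply skip the HA-state comparison for that router on this pass and pick it up on the next periodic run.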
[Yahoo-eng-team] [Bug 1718788] [NEW] DVR: Migrate centralized unbound floatingip to the respective host when the port is bound
Public bug reported: When unbound ports are associated with a floatingIP in DVR, it implements the floatingIP in the dvr_snat node under the snat_namespace. When the private ports are bound to a specific host, the floatingIPs are not moved or migrated to their respective hosts.

This can be reproduced by:
1. Create a network
2. Create a subnet
3. Create a router and associate the subnet to the router
4. Assign a gateway to the router
5. Then create a port on the given network with a specific IP
6. Now create a FloatingIP on the external network
7. Associate the FloatingIP to the created port
8. At this point the port is not bound, so the floatingIP gets implemented in the snat_namespace on the dvr_snat node
9. Then, within a few seconds, create a VM with the given port-id instead of network-id
10. When the VM is built, the port gets bound
11. The floatingIP is now not seen on the host where the VM resides

Theoretically the FloatingIP should be migrated to the host where it is currently bound.

** Affects: neutron
   Importance: Undecided
   Assignee: Swaminathan Vasudevan (swaminathan-vasudevan)
   Status: Confirmed
** Tags: l3-dvr-backlog
** Changed in: neutron
   Status: New => Confirmed
** Changed in: neutron
   Assignee: (unassigned) => Swaminathan Vasudevan (swaminathan-vasudevan)

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1718788

Title: DVR: Migrate centralized unbound floatingip to the respective host when the port is bound

Status in neutron: Confirmed

Bug description: When unbound ports are associated with a floatingIP in DVR, it implements the floatingIP in the dvr_snat node under the snat_namespace. When the private ports are bound to a specific host, the floatingIPs are not moved or migrated to their respective hosts.

This can be reproduced by:
1. Create a network
2. Create a subnet
3. Create a router and associate the subnet to the router
4. Assign a gateway to the router
5. Then create a port on the given network with a specific IP
6. Now create a FloatingIP on the external network
7. Associate the FloatingIP to the created port
8. At this point the port is not bound, so the floatingIP gets implemented in the snat_namespace on the dvr_snat node
9. Then, within a few seconds, create a VM with the given port-id instead of network-id
10. When the VM is built, the port gets bound
11. The floatingIP is now not seen on the host where the VM resides

Theoretically the FloatingIP should be migrated to the host where it is currently bound.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1718788/+subscriptions
--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help : https://help.launchpad.net/ListHelp
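The expected lifecycle in that report can be sketched as two tiny pure functions (host names and function names are illustrative, not neutron code): where the floating IP should live for a given binding state, and whether a binding change demands a migration.

```python
DVR_SNAT_NODE = 'dvr-snat-node'  # hypothetical name for the centralized node

def floatingip_destination(port_host):
    """Centralize in the snat namespace while the port is unbound;
    follow the port to its compute host once it binds."""
    return port_host if port_host else DVR_SNAT_NODE

def needs_migration(current_fip_host, new_port_host):
    """True when a port-binding change leaves the floating IP stranded
    somewhere other than where it now belongs (the reported bug)."""
    return floatingip_destination(new_port_host) != current_fip_host

# Steps 7-10 above: fip associated while unbound, then the VM binds the port.
host_before = floatingip_destination(None)         # centralized
host_after = floatingip_destination('compute-1')   # should follow the port
```

The bug is precisely that `needs_migration('dvr-snat-node', 'compute-1')` is true but nothing acts on it, so the DNAT setup stays on the dvr_snat node.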
[Yahoo-eng-team] [Bug 1718585] Re: set floatingip status to DOWN during creation
** Changed in: neutron
   Status: New => Opinion

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1718585

Title: set floatingip status to DOWN during creation

Status in neutron: Opinion

Bug description: The floatingip status is not reliable, as it is set to ACTIVE during creation itself [1] rather than waiting for the agent [2] to update it once the agent finishes adding SNAT/DNAT rules.

[1] https://github.com/openstack/neutron/blob/master/neutron/db/l3_db.py#L1234
[2] https://github.com/openstack/neutron/blob/master/neutron/agent/l3/agent.py#L131

A user can check the floatingip status after creation and initiate data traffic before the agent finishes processing the floatingip, resulting in connection failures. Fixing this would also let tempest tests initiate connections only after the agent has finished floatingip processing, avoiding failures. The floatingip status also has to be properly updated during router migration.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1718585/+subscriptions
--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help : https://help.launchpad.net/ListHelp
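The status lifecycle the report asks for is small enough to state directly. A sketch of the proposed behaviour (the function names are illustrative; the status strings are the ones neutron actually uses for floating IPs):

```python
DOWN, ACTIVE, ERROR = 'DOWN', 'ACTIVE', 'ERROR'

def initial_floatingip_status():
    """Proposed: the API reports DOWN at creation time, instead of
    optimistically reporting ACTIVE before any rules exist."""
    return DOWN

def status_after_agent(rules_installed):
    """Only the L3 agent's callback flips the status, once the SNAT/DNAT
    rules are (or fail to be) installed, so ACTIVE becomes a truthful
    signal that traffic can flow."""
    return ACTIVE if rules_installed else ERROR
```

A client (or a tempest test) would then poll for ACTIVE before sending traffic, which is exactly the race the current create-time ACTIVE makes impossible to avoid.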
[Yahoo-eng-team] [Bug 1717302] [NEW] Tempest floatingip scenario tests failing on DVR Multinode setup with HA
Public bug reported: neutron.tests.tempest.scenario.test_floatingip.FloatingIpSameNetwork and neutron.tests.tempest.scenario.test_floatingip.FloatingIpSeparateNetwork are failing on every patch. This trace is seen in the node-2 l3-agent:

Sep 13 07:16:43.404250 ubuntu-xenial-2-node-rax-dfw-10909819-895688 neutron-keepalived-state-change[5461]: ERROR neutron.agent.linux.ip_lib [-] Failed sending gratuitous ARP to 172.24.5.3 on qg-bf79c157-e2 in namespace qrouter-796b8715-ca01-43ad-bc08-f81a0b4db8cc: Exit code: 2; Stdin: ; Stdout: ; Stderr: bind: Cannot assign requested address : ProcessExecutionError: Exit code: 2; Stdin: ; Stdout: ; Stderr: bind: Cannot assign requested address
ERROR neutron.agent.linux.ip_lib Traceback (most recent call last):
ERROR neutron.agent.linux.ip_lib   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 1082, in _arping
ERROR neutron.agent.linux.ip_lib     ip_wrapper.netns.execute(arping_cmd, extra_ok_codes=[1])
ERROR neutron.agent.linux.ip_lib   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 901, in execute
ERROR neutron.agent.linux.ip_lib     log_fail_as_error=log_fail_as_error, **kwargs)
ERROR neutron.agent.linux.ip_lib   File "/opt/stack/new/neutron/neutron/agent/linux/utils.py", line 151, in execute
ERROR neutron.agent.linux.ip_lib     raise ProcessExecutionError(msg, returncode=returncode)
ERROR neutron.agent.linux.ip_lib ProcessExecutionError: Exit code: 2; Stdin: ; Stdout: ; Stderr: bind: Cannot assign requested address
ERROR neutron.agent.linux.ip_lib

If this is a DVR router, then the GARP should not go through the qg interface for the floatingIP. More information can be seen here:
http://logs.openstack.org/43/500143/5/check/gate-tempest-dsvm-neutron-dvr-multinode-scenario-ubuntu-xenial-nv/0a58fce/logs/subnode-2/screen-q-l3.txt.gz?level=TRACE#_Sep_13_07_16_47_864052

** Affects: neutron
   Importance: Undecided
   Status: New
** Tags: l3-dvr-backlog l3-ha

** Summary changed:
- Tempest floatingip scenario tests failing on DVR Multinode setup
+ Tempest floatingip scenario tests failing on DVR Multinode setup with HA

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1717302

Title: Tempest floatingip scenario tests failing on DVR Multinode setup with HA

Status in neutron: New

Bug description: neutron.tests.tempest.scenario.test_floatingip.FloatingIpSameNetwork and neutron.tests.tempest.scenario.test_floatingip.FloatingIpSeparateNetwork are failing on every patch. This trace is seen in the node-2 l3-agent:

Sep 13 07:16:43.404250 ubuntu-xenial-2-node-rax-dfw-10909819-895688 neutron-keepalived-state-change[5461]: ERROR neutron.agent.linux.ip_lib [-] Failed sending gratuitous ARP to 172.24.5.3 on qg-bf79c157-e2 in namespace qrouter-796b8715-ca01-43ad-bc08-f81a0b4db8cc: Exit code: 2; Stdin: ; Stdout: ; Stderr: bind: Cannot assign requested address : ProcessExecutionError: Exit code: 2; Stdin: ; Stdout: ; Stderr: bind: Cannot assign requested address
ERROR neutron.agent.linux.ip_lib Traceback (most recent call last):
ERROR neutron.agent.linux.ip_lib   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 1082, in _arping
ERROR
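The analysis in the report — "the GARP should not go through the qg interface for the floatingIP" on a DVR router — is a one-line predicate. A sketch of that filter in plain Python (the function is illustrative; the `qg-`/`fg-` device-name prefixes are the real neutron conventions for the router gateway and fip-namespace gateway devices):

```python
def garp_devices(router_is_distributed, candidate_devices):
    """Filter which devices a gratuitous ARP for a floating IP may use.

    On a DVR router the floating IP address lives behind the fip
    namespace's fg- device, so announcing it out of the router's qg-
    gateway device fails with the 'Cannot assign requested address'
    bind error seen in the trace.
    """
    if not router_is_distributed:
        return list(candidate_devices)
    return [d for d in candidate_devices if not d.startswith('qg-')]
```

With the devices from the trace, a DVR router would keep only the fg- device as a legal GARP source.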
[Yahoo-eng-team] [Bug 1716829] [NEW] Centralized floatingips not configured right with DVR and HA
Public bug reported: Centralized floatingips are not configured correctly with DVR and HA. add_centralized_floatingip and remove_centralized_floatingip should be overridden in 'dvr_edge_ha_router.py' to configure the 'vips'.

** Affects: neutron
   Importance: Undecided
   Assignee: Swaminathan Vasudevan (swaminathan-vasudevan)
   Status: In Progress
** Tags: l3-dvr-backlog
** Changed in: neutron
   Status: New => Confirmed

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1716829

Title: Centralized floatingips not configured right with DVR and HA

Status in neutron: In Progress

Bug description: Centralized floatingips are not configured correctly with DVR and HA. add_centralized_floatingip and remove_centralized_floatingip should be overridden in 'dvr_edge_ha_router.py' to configure the 'vips'.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1716829/+subscriptions
--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help : https://help.launchpad.net/ListHelp
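The shape of the fix the report describes — subclass overrides that also register the address as a keepalived VIP — can be sketched with toy classes. These stand in for neutron's dvr_edge_router/dvr_edge_ha_router classes but share none of their real plumbing:

```python
class DvrEdgeRouter:
    """Toy stand-in: the non-HA edge router tracks centralized fips
    directly (in reality via DNAT rules in the snat namespace)."""
    def __init__(self):
        self.dnat_cidrs = set()

    def add_centralized_floatingip(self, fip_cidr):
        self.dnat_cidrs.add(fip_cidr)

    def remove_centralized_floatingip(self, fip_cidr):
        self.dnat_cidrs.discard(fip_cidr)


class DvrEdgeHaRouter(DvrEdgeRouter):
    """Under HA, keepalived owns the centralized address as a VIP, so the
    overrides the bug asks for must also add/remove it from the VIP set."""
    def __init__(self):
        super().__init__()
        self.vips = set()

    def add_centralized_floatingip(self, fip_cidr):
        super().add_centralized_floatingip(fip_cidr)
        self.vips.add(fip_cidr)  # hand the address to keepalived

    def remove_centralized_floatingip(self, fip_cidr):
        super().remove_centralized_floatingip(fip_cidr)
        self.vips.discard(fip_cidr)
```

Without the overrides, the HA router inherits the base behaviour and the VIP set is never touched, which is the misconfiguration being reported.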
[Yahoo-eng-team] [Bug 1712728] [NEW] DVR: get_router_cidrs in dvr_edge_router not returning the centralized_floating_ip cidr
Public bug reported: get_router_cidrs, overridden in dvr_edge_router, is not returning the centralized_floating_ip CIDRs. As a consequence, DNAT rules are left over in the snat namespace when the centralized_floating_ips are removed.

** Affects: neutron
   Importance: Undecided
   Status: New
** Tags: l3-dvr-backlog

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1712728

Title: DVR: get_router_cidrs in dvr_edge_router not returning the centralized_floating_ip cidr

Status in neutron: New

Bug description: get_router_cidrs, overridden in dvr_edge_router, is not returning the centralized_floating_ip CIDRs. As a consequence, DNAT rules are left over in the snat namespace when the centralized_floating_ips are removed.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1712728/+subscriptions
--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help : https://help.launchpad.net/ListHelp
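The cleanup logic depends on get_router_cidrs reporting everything the router claims to own; anything missing from that set is invisible to the stale-rule sweep. A minimal sketch of the corrected shape (a bare function here, where neutron uses a method; the data is illustrative):

```python
def get_router_cidrs(device_cidrs, centralized_fip_cidrs):
    """Return every CIDR the router owns, including centralized fips.

    Returning the union is the point of the fix: the iptables cleanup
    compares this set against installed DNAT rules, so omitting the
    centralized floating IPs (the reported bug) means their rules are
    never identified as stale and linger in the snat namespace.
    """
    return set(device_cidrs) | set(centralized_fip_cidrs)
```

With the buggy version (device CIDRs only), a removed centralized fip's CIDR disappears from both inputs to the comparison at once, so its DNAT rule survives.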
[Yahoo-eng-team] [Bug 1702790] [NEW] DVR Router update task fails when agent restarts
Public bug reported: When there is a DVR router with the gateway enabled, and the agent restarts, the router_update task fails and error logs appear in l3_agent.log.

** Affects: neutron
   Importance: Undecided
   Assignee: Swaminathan Vasudevan (swaminathan-vasudevan)
   Status: In Progress
** Tags: l3-dvr-backlog

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1702790

Title: DVR Router update task fails when agent restarts

Status in neutron: In Progress

Bug description: When there is a DVR router with the gateway enabled, and the agent restarts, the router_update task fails and error logs appear in l3_agent.log.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1702790/+subscriptions
--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1702769] [NEW] Binding info for DVR port not found error seen when notify_l2pop_port_wiring is called with DVR routers
Public bug reported: A recent upstream change, Icd4cd4e3f735e88299e86468380c5f786e7628fe, might have introduced this problem. Here 'get_bound_port_context' is being called for non-HA ports, and with the given context it is not able to retrieve the port binding, so it throws the error.

** Affects: neutron
   Importance: Undecided
   Assignee: Swaminathan Vasudevan (swaminathan-vasudevan)
   Status: Confirmed
** Tags: l3-dvr-backlog
** Changed in: neutron
   Assignee: (unassigned) => Swaminathan Vasudevan (swaminathan-vasudevan)
** Changed in: neutron
   Status: New => Confirmed

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1702769

Title: Binding info for DVR port not found error seen when notify_l2pop_port_wiring is called with DVR routers

Status in neutron: Confirmed

Bug description: A recent upstream change, Icd4cd4e3f735e88299e86468380c5f786e7628fe, might have introduced this problem. Here 'get_bound_port_context' is being called for non-HA ports, and with the given context it is not able to retrieve the port binding, so it throws the error.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1702769/+subscriptions
--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help : https://help.launchpad.net/ListHelp
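The diagnosis above suggests a dispatch bug: the HA-specific lookup is being applied to every port. A heavily hedged sketch of the suspected shape of the fix — the helper name is invented, the lookups are passed in as plain callables, and only the `device_owner` value is the real neutron constant for HA router interfaces:

```python
HA_INTERFACE = 'network:router_ha_interface'  # neutron's HA port device_owner

def port_context_for_l2pop(port, bound_lookup, plain_lookup):
    """Dispatch to the right context lookup per the bug's analysis.

    Only HA router ports need the bound-port context; calling
    bound_lookup for other DVR ports is what produces the
    'binding info for DVR port not found' error in the report.
    """
    if port.get('device_owner') == HA_INTERFACE:
        return bound_lookup(port)
    return plain_lookup(port)
```

This is a sketch of the dispatch only; the actual change under review touched the l2pop notification path rather than introducing a helper like this.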
[Yahoo-eng-team] [Bug 1701288] [NEW] In scale testing RPC timeout error seen in the ovs_neutron_agent when update_device_list is called with DVR routers
Public bug reported: In large-scale testing, when trying to deploy around 8000 VMs with DVR routers, we are seeing an RPC timeout error in ovs_neutron_agent. This RPC timeout occurs when the ovs_neutron_agent tries to bind the vif port. On further analysis it seems that update_port_status takes much longer to return at large scale, so the ovs_neutron_agent times out waiting for the reply. Looking into the update_port_status code path: after the port status update occurs, it calls update_port_postcommit. Since L2pop is enabled by default with DVR, update_port_postcommit calls _create_agent_fdb to build the fdb entries for the agent, if this is the first port associated with the agent. In _create_agent_fdb it tries to retrieve all the PortInfo associated with the network, and this DB call is very expensive; we have seen it take up to 3900s in some instances.

2017-06-15 17:48:30.651 9320 DEBUG neutron.agent.linux.utils [req-51df1df5-8a51-4679-938b-895545a225c2 - - - - -] Exit code: 0 execute /opt/stack/venv/neutron-20170421T033908Z/lib/python2.7/site-packages/neutron/agent/linux/utils.py:146
2017-06-15 17:48:38.420 9320 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-ece15133-1294-46c0-b0b5-cab785d4314b - - - - -] Error while processing VIF ports
2017-06-15 17:48:38.420 9320 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent Traceback (most recent call last):
2017-06-15 17:48:38.420 9320 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/opt/stack/venv/neutron-20170421T033908Z/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py", line 2044, in rpc_loop
2017-06-15 17:48:38.420 9320 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     port_info, ovs_restarted)
2017-06-15 17:48:38.420 9320 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/opt/stack/venv/neutron-20170421T033908Z/lib/python2.7/site-packages/osprofiler/profiler.py", line 154, in wrapper
2017-06-15 17:48:38.420 9320 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     return f(*args, **kwargs)
2017-06-15 17:48:38.420 9320 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/opt/stack/venv/neutron-20170421T033908Z/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py", line 1648, in process_network_ports
2017-06-15 17:48:38.420 9320 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     failed_devices['added'] |= self._bind_devices(need_binding_devices)
2017-06-15 17:48:38.420 9320 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/opt/stack/venv/neutron-20170421T033908Z/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py", line 888, in _bind_devices
2017-06-15 17:48:38.420 9320 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     self.conf.host)
2017-06-15 17:48:38.420 9320 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/opt/stack/venv/neutron-20170421T033908Z/lib/python2.7/site-packages/neutron/agent/rpc.py", line 181, in update_device_list
2017-06-15 17:48:38.420 9320 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     agent_id=agent_id, host=host)
2017-06-15 17:48:38.420 9320 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/opt/stack/venv/neutron-20170421T033908Z/lib/python2.7/site-packages/neutron/common/rpc.py", line 185, in call
2017-06-15 17:48:38.420 9320 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     time.sleep(wait)
2017-06-15 17:48:38.420 9320 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/opt/stack/venv/neutron-20170421T033908Z/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-06-15 17:48:38.420 9320 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     self.force_reraise()
2017-06-15 17:48:38.420 9320 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/opt/stack/venv/neutron-20170421T033908Z/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-06-15 17:48:38.420 9320 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     six.reraise(self.type_, self.value, self.tb)
2017-06-15 17:48:38.420 9320 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File "/opt/stack/venv/neutron-20170421T033908Z/lib/python2.7/site-packages/neutron/common/rpc.py", line 162, in call
2017-06-15 17:48:38.420 9320 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent     return self._original_context.call(ctxt, method, **kwargs)
2017-06-15 17:48:38.420 9320 ERROR
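One general mitigation pattern for this kind of hot spot is to memoize the expensive per-network port lookup so that repeated FDB builds for the same network reuse a recent result instead of re-running the costly DB query. This is only a sketch of the idea, not the actual Neutron fix; `NetworkPortCache` and `slow_query` are hypothetical names.

```python
import time

# Hypothetical sketch: cache the expensive per-network port query that
# _create_agent_fdb performs, with a short TTL so results stay fresh.
class NetworkPortCache:
    def __init__(self, fetch_fn, ttl=30.0):
        self._fetch = fetch_fn   # the expensive query (stand-in for the DB call)
        self._ttl = ttl
        self._cache = {}         # network_id -> (timestamp, result)

    def get_ports(self, network_id):
        now = time.monotonic()
        hit = self._cache.get(network_id)
        if hit and now - hit[0] < self._ttl:
            return hit[1]        # served from cache, no DB round trip
        result = self._fetch(network_id)
        self._cache[network_id] = (now, result)
        return result

calls = []
def slow_query(net_id):
    calls.append(net_id)         # stands in for the multi-second DB call
    return ['port-a', 'port-b']

cache = NetworkPortCache(slow_query)
cache.get_ports('net-1')
cache.get_ports('net-1')         # second call hits the cache
```

The trade-off is staleness: a TTL cache can serve an out-of-date port list for up to `ttl` seconds, which may or may not be acceptable for FDB population.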
[Yahoo-eng-team] [Bug 1695101] [NEW] DVR Router ports and gateway ports are not bound to any host and no snat namespace created
Public bug reported: In the Pike cycle there was some refactoring of the DVR db classes and the resource handler mixin. This led to a regression where the SNAT namespace was not created for DVR routers that have a gateway configured. The only namespace seen was the fip namespace. This is the patch set that caused the regression: https://review.openstack.org/#/c/457592/5 On further debugging it was found that the snat ports and the distributed router ports were not host bound; Neutron was trying to bind them to a 'null' host. The '_build_routers_list' function in l3_dvr_db.py was not called, and hence the host binding was missing. We have seen a similar issue a while back, bug #1369012 (Fix KeyError on missing gw_port_host for L3 agent in DVR mode). The issue here is the order of inheritance of the classes: if the inheritance order is wrong, the overridden functions are called in the wrong order or skipped entirely. That is what happens here: the '_build_routers_list' in l3_db_gwmode.py was called instead of the one in l3_dvr_db.py. This is the current order of inheritance:

class L3_NAT_with_dvr_db_mixin(l3_db.L3_NAT_db_mixin, l3_attrs_db.ExtraAttributesMixin, DVRResourceOperationHandler, _DVRAgentInterfaceMixin):

If the order is shuffled, it works fine; here is the shuffled order:

class L3_NAT_with_dvr_db_mixin(DVRResourceOperationHandler, _DVRAgentInterfaceMixin, l3_attrs_db.ExtraAttributesMixin, l3_db.L3_NAT_db_mixin):

This seems to fix the problem. ** Affects: neutron Importance: Undecided Assignee: Swaminathan Vasudevan (swaminathan-vasudevan) Status: Confirmed ** Tags: l3-dvr-backlog ** Tags added: l3-dvr-backlog ** Changed in: neutron Assignee: (unassigned) => Swaminathan Vasudevan (swaminathan-vasudevan) ** Changed in: neutron Status: New => Confirmed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. 
https://bugs.launchpad.net/bugs/1695101 Title: DVR Router ports and gateway ports are not bound to any host and no snat namespace created Status in neutron: Confirmed Bug description: In the Pike cycle there was some refactoring of the DVR db classes and the resource handler mixin. This led to a regression where the SNAT namespace was not created for DVR routers that have a gateway configured. The only namespace seen was the fip namespace. This is the patch set that caused the regression: https://review.openstack.org/#/c/457592/5 On further debugging it was found that the snat ports and the distributed router ports were not host bound; Neutron was trying to bind them to a 'null' host. The '_build_routers_list' function in l3_dvr_db.py was not called, and hence the host binding was missing. We have seen a similar issue a while back, bug #1369012 (Fix KeyError on missing gw_port_host for L3 agent in DVR mode). The issue here is the order of inheritance of the classes: if the inheritance order is wrong, the overridden functions are called in the wrong order or skipped entirely. That is what happens here: the '_build_routers_list' in l3_db_gwmode.py was called instead of the one in l3_dvr_db.py. This is the current order of inheritance:

class L3_NAT_with_dvr_db_mixin(l3_db.L3_NAT_db_mixin, l3_attrs_db.ExtraAttributesMixin, DVRResourceOperationHandler, _DVRAgentInterfaceMixin):

If the order is shuffled, it works fine; here is the shuffled order:

class L3_NAT_with_dvr_db_mixin(DVRResourceOperationHandler, _DVRAgentInterfaceMixin, l3_attrs_db.ExtraAttributesMixin, l3_db.L3_NAT_db_mixin):

This seems to fix the problem. 
To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1695101/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
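The inheritance-order failure described above is a standard Python method resolution order (MRO) effect and can be reproduced with a minimal sketch. The class and host names below are illustrative stand-ins, not the real Neutron classes:

```python
# Minimal illustration of how Python's MRO decides which overridden
# method runs; names mirror the bug report but are not Neutron code.

class L3NATDbMixin:
    def _build_routers_list(self, routers):
        # Base implementation: no host binding information attached.
        return [dict(r, gw_port_host=None) for r in routers]

class DVRMixin:
    def _build_routers_list(self, routers):
        # DVR override: attaches the SNAT host binding on top of the base.
        routers = super()._build_routers_list(routers)
        return [dict(r, gw_port_host='network-node-1') for r in routers]

# Broken order: the base class comes first in the MRO, its method wins,
# and the DVR override is never called (the reported symptom).
class BrokenPlugin(L3NATDbMixin, DVRMixin):
    pass

# Fixed order: the DVR mixin comes first; its override runs and super()
# still reaches the base implementation.
class FixedPlugin(DVRMixin, L3NATDbMixin):
    pass

broken = BrokenPlugin()._build_routers_list([{'id': 'r1'}])
fixed = FixedPlugin()._build_routers_list([{'id': 'r1'}])
```

With the base mixin listed first, Python finds its `_build_routers_list` before the DVR override, so the host binding is never attached, which is exactly the 'null' host symptom reported.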
[Yahoo-eng-team] [Bug 1667877] [NEW] [RFE] DVR support for Configuring Floatingips in Network Node or in the Compute Node based on Config option.
Public bug reported: Provide a configurable option to make floating IPs for DVR-based routers reside on the compute node or on the network node. Also proactively check the status of the agent on the destination node; if the agent health is down, configure the floating IP on the network node instead. Provide a configuration option in neutron.conf such as DVR_FLOATINGIP_CENTRALIZED = 'enforced/circumstantial'. If DVR_FLOATINGIP_CENTRALIZED is configured as 'enforced', all floating IPs will be configured on the network node. If it is configured as 'circumstantial', the floating IP will be configured either on the compute node or on the network node, based on agent health. If this option is not configured, floating IPs will be distributed for all bound ports, and only for unbound ports will the floating IP be implemented on the network node. ** Affects: neutron Importance: Undecided Status: New ** Tags: l3-dvr-backlog ** Summary changed: - [RFE] DVR support for Configurable Floatingips in Network Node or in the Compute Node. + [RFE] DVR support for Configuring Floatingips in Network Node or in the Compute Node based on Config option. -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1667877 Title: [RFE] DVR support for Configuring Floatingips in Network Node or in the Compute Node based on Config option. Status in neutron: New Bug description: Provide a configurable option to make floating IPs for DVR-based routers reside on the compute node or on the network node. Also proactively check the status of the agent on the destination node; if the agent health is down, configure the floating IP on the network node instead. Provide a configuration option in neutron.conf such as DVR_FLOATINGIP_CENTRALIZED = 'enforced/circumstantial'. If DVR_FLOATINGIP_CENTRALIZED is configured as 'enforced', all floating IPs will be configured on the network node. 
If it is configured as 'circumstantial', the floating IP will be configured either on the compute node or on the network node, based on agent health. If this option is not configured, floating IPs will be distributed for all bound ports, and only for unbound ports will the floating IP be implemented on the network node. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1667877/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
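The proposed behavior can be summarized as a small decision function. This is illustrative pseudologic only: the option values come from the RFE text, but the helper and its parameters are assumptions, not shipped Neutron code.

```python
# Sketch of the placement logic the RFE proposes. The 'enforced' and
# 'circumstantial' values come from the report; everything else here
# (function name, parameters) is a hypothetical illustration.

def fip_host(mode, compute_agent_alive, compute_host, network_host,
             port_is_bound=True):
    """Pick the host that should configure a DVR floating IP."""
    if mode == 'enforced':
        # All floating IPs are centralized on the network node.
        return network_host
    if mode == 'circumstantial':
        # Distribute only when the compute-side L3 agent is healthy.
        return compute_host if compute_agent_alive else network_host
    # Option unset: distributed for bound ports, centralized otherwise.
    return compute_host if port_is_bound else network_host
```

The key property is the fallback in 'circumstantial' mode: an unhealthy compute-side agent never strands the floating IP, because placement degrades to the network node.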
[Yahoo-eng-team] [Bug 1524020] Re: DVRImpact: dvr_vmarp_table_update and dvr_update_router_add_vm is called for every port update instead of only when host binding or mac-address changes occur
** Changed in: neutron Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1524020 Title: DVRImpact: dvr_vmarp_table_update and dvr_update_router_add_vm is called for every port update instead of only when host binding or mac-address changes occur Status in neutron: Fix Released Status in neutron kilo series: Fix Released Bug description: DVR arp update (dvr_vmarp_table_update) and dvr_update_router_add_vm are called for every update_port when the mac_address changes or when update_device_up is true. These functions should be called from _notify_l3_agent_port_update only when the host binding for a service port changes or when the mac_address of the service port changes. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1524020/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
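The intended guard can be sketched as a simple change check on the port dict. The field names follow Neutron's port dict convention (`binding:host_id`, `mac_address`), but the helper itself is a hypothetical illustration, not the merged fix:

```python
# Illustrative guard: notify the L3 agent only when the port's host
# binding or MAC address actually changed, not on every port update.

def needs_dvr_arp_update(original_port, updated_port):
    host_changed = (original_port.get('binding:host_id') !=
                    updated_port.get('binding:host_id'))
    mac_changed = (original_port.get('mac_address') !=
                   updated_port.get('mac_address'))
    return host_changed or mac_changed

old = {'binding:host_id': 'compute-1', 'mac_address': 'fa:16:3e:00:00:01'}
same = dict(old, name='renamed')                       # unrelated field change
moved = dict(old, **{'binding:host_id': 'compute-2'})  # host binding change
```

Skipping the ARP/router notifications for unrelated updates (name changes, admin state flips, and so on) is what removes the per-update overhead the bug describes.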
[Yahoo-eng-team] [Bug 1554876] Re: router not found warning logs in the L3 agent
** Changed in: neutron Status: Fix Committed => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1554876 Title: router not found warning logs in the L3 agent Status in neutron: Fix Released Bug description: The L3 agent during a normal tempest run will be filled with warnings like the following: 2016-03-08 10:10:30.465 18962 WARNING neutron.agent.l3.agent [-] Info for router 3688a110-8cfe-41c6-84e3-bfd965238304 was not found. Performing router cleanup 2016-03-08 10:10:34.197 18962 WARNING neutron.agent.l3.agent [-] Info for router 3688a110-8cfe-41c6-84e3-bfd965238304 was not found. Performing router cleanup 2016-03-08 10:10:35.535 18962 WARNING neutron.agent.l3.agent [-] Info for router 3688a110-8cfe-41c6-84e3-bfd965238304 was not found. Performing router cleanup 2016-03-08 10:10:43.025 18962 WARNING neutron.agent.l3.agent [-] Info for router 3688a110-8cfe-41c6-84e3-bfd965238304 was not found. Performing router cleanup 2016-03-08 10:10:47.029 18962 WARNING neutron.agent.l3.agent [-] Info for router 3688a110-8cfe-41c6-84e3-bfd965238304 was not found. Performing router cleanup This is completely normal as routers are deleted from the server during the data retrieval process of the L3 agent and should not be at the warning level. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1554876/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1631513] [NEW] DVR: Fix race conditions when trying to add default gateway for fip gateway port.
Public bug reported: There seems to be a race condition when trying to add the default gateway route in the fip namespace for the fip agent gateway port. It shows up in high-scale testing: while a router update is in progress for Router-A, which has a floating IP, a fip namespace is being created and gateway ports are being plugged into the external bridge in the context of that fip namespace. If another router update arrives for the same Router-A while this is still in progress, it calls 'update_gateway_port', tries to set the default gateway, and fails. We find a 'Failed to process compatible router' message in the l3-agent log, along with the following trace:

Traceback (most recent call last):
  File "/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 501, in _process_router_update
    self._process_router_if_compatible(router)
  File "/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 440, in _process_router_if_compatible
    self._process_updated_router(router)
  File "/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 454, in _process_updated_router
    ri.process(self)
  File "/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/agent/l3/dvr_local_router.py", line 538, in process
    super(DvrLocalRouter, self).process(agent)
  File "/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/agent/l3/dvr_router_base.py", line 31, in process
    super(DvrRouterBase, self).process(agent)
  File "/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/common/utils.py", line 396, in call
    self.logger(e)
  File "/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/common/utils.py", line 393, in call
    return func(*args, **kwargs)
  File "/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 989, in process
    self.process_external(agent)
  File "/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/agent/l3/dvr_local_router.py", line 491, in process_external
    self.create_dvr_fip_interfaces(ex_gw_port)
  File "/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/agent/l3/dvr_local_router.py", line 522, in create_dvr_fip_interfaces
    self.fip_ns.update_gateway_port(fip_agent_port)
  File "/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/agent/l3/dvr_fip_ns.py", line 243, in update_gateway_port
    ipd.route.add_gateway(gw_ip)
  File "/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 690, in add_gateway
    self._as_root([ip_version], tuple(args))
  File "/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 361, in _as_root
    use_root_namespace=use_root_namespace)
  File "/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 94, in _as_root
    log_fail_as_error=self.log_fail_as_error)
  File "/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 103, in _execute
    log_fail_as_error=log_fail_as_error)
  File "/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/agent/linux/utils.py", line 140, in execute
    raise RuntimeError(msg)

** Affects: neutron Importance: Undecided Status: New ** Tags: l3-dvr-backlog mitaka-backport-potential newton-backport-potential ** Summary changed: - Fix race conditions when trying to add default gateway for fip gateway port. 
+ DVR: Fix race conditions when trying to add default gateway for fip gateway port. -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1631513 Title: DVR: Fix race conditions when trying to add default gateway for fip gateway port. Status in neutron: New Bug description: There seems to be a race condition when trying to add default gateway route in fip namespace for the fip agent gateway port. The way it happens is at high scale testing, when there is a router update that is currently happening for the Router-A which has a floatingip, a fip namespace is getting created and gateway ports plugged to the external bridge in the context of the fip namespace. While it is getting
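One way to tolerate the race described above is to make the gateway-route addition idempotent, so a concurrent update that already installed the route does not surface as a failure. This is only a sketch under assumptions: `run_cmd`, the helper name, and the matched error string are illustrative, not the actual Neutron fix.

```python
# Hypothetical sketch: treat "route already exists" from `ip route add`
# as success, since a concurrent router update for the same router may
# have installed the default route first.

def add_gateway_idempotent(run_cmd, gw_ip):
    try:
        run_cmd(['ip', 'route', 'add', 'default', 'via', gw_ip])
        return True
    except RuntimeError as err:
        # RTNETLINK reports an existing route as "File exists"; a
        # concurrent update winning the race is not an error here.
        if 'File exists' in str(err):
            return True
        raise

# Simulate the losing side of the race with a fake command runner.
def fake_run_existing(cmd):
    raise RuntimeError('RTNETLINK answers: File exists')

ok = add_gateway_idempotent(fake_run_existing, '10.0.0.1')
```

Any other RuntimeError still propagates, so genuine failures are not silently swallowed.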
[Yahoo-eng-team] [Bug 1593354] Re: SNAT HA failed because of missing nat rule in snat namespace iptable
I did verify it in Mitaka and I don't see any issues with the 'sg' port and related rules with respect to failover. So we can close this issue as we discussed last week. ** Changed in: neutron Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1593354 Title: SNAT HA failed because of missing nat rule in snat namespace iptable Status in neutron: Invalid Bug description: I have a Mitaka OpenStack deployment with Neutron DVR enabled. When testing snat HA failover I found that even though the snat namespace was created on the other backup node, it doesn't have any nat rules in the snat namespace iptables. And if you run "ip a" in the snat namespace you will find that the sg port is missing. Here is what I found on the second neutron network node:
sandy-pistachio:/opt/openstack # ip netns qrouter-e25b81f9-8810-4654-9be0-ebac09c700fb qdhcp-abe36e89-f7a5-4cbd-a7e4-852d80ed92d6 snat-e25b81f9-8810-4654-9be0-ebac09c700fb
sandy-pistachio:/opt/openstack # ip netns exec snat-e25b81f9-8810-4654-9be0-ebac09c700fb ip a 1: lo:mtu 65536 qdisc noqueue state UNKNOWN group default link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 70: qg-cc3b2f8c-b7: mtu 1500 qdisc noqueue state UNKNOWN group default link/ether fa:16:3e:cb:27:cd brd ff:ff:ff:ff:ff:ff inet 10.240.117.98/28 brd 10.240.117.111 scope global qg-cc3b2f8c-b7 valid_lft forever preferred_lft forever inet6 fe80::f816:3eff:fecb:27cd/64 scope link valid_lft forever preferred_lft forever
sandy-pistachio:/opt/openstack # ip netns exec snat-e25b81f9-8810-4654-9be0-ebac09c700fb iptables -L -n -v -t nat Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes) pkts bytes target prot opt in out source destination Chain INPUT (policy ACCEPT 0 packets, 0 bytes) pkts bytes target prot opt in 
out source destination Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes) pkts bytes target prot opt in out source destination Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes) pkts bytes target prot opt in out source destination Here are the package information: provo-pistachio:/opt/openstack # zypper info openstack-neutron Loading repository data... Reading installed packages... Information for package openstack-neutron: -- Repository: Mitaka Name: openstack-neutron Version: 8.1.1~a0~dev32-2.1 Arch: noarch Vendor: obs://build.opensuse.org/Cloud:OpenStack Installed: Yes Status: up-to-date Installed Size: 235.1 KiB Summary: OpenStack Network Description: Neutron is a virtual network service for Openstack. Just like OpenStack Nova provides an API to dynamically request and configure virtual servers, Neutron provides an API to dynamically request and configure virtual networks. These networks connect "interfaces" from other OpenStack services (e.g., vNICs from Nova VMs). The Neutron API supports extensions to provide advanced network capabilities (e.g., QoS, ACLs, network monitoring, etc) provo-pistachio:/opt/openstack # zypper info openstack-neutron-openvswitch-agent Loading repository data... Reading installed packages... Information for package openstack-neutron-openvswitch-agent: Repository: Mitaka Name: openstack-neutron-openvswitch-agent Version: 8.1.1~a0~dev32-2.1 Arch: noarch Vendor: obs://build.opensuse.org/Cloud:OpenStack Installed: Yes Status: up-to-date Installed Size: 14.9 KiB Summary: OpenStack Network - Open vSwitch Description: This package provides the OpenVSwitch Agent. provo-pistachio:/opt/openstack # zypper info openstack-neutron-l3-agent Loading repository data... Reading installed packages... 
Information for package openstack-neutron-l3-agent: --- Repository: Mitaka Name: openstack-neutron-l3-agent Version: 8.1.1~a0~dev32-2.1 Arch: noarch Vendor: obs://build.opensuse.org/Cloud:OpenStack Installed: Yes Status: up-to-date Installed Size: 24.7 KiB Summary: OpenStack Network Service (Neutron) - L3 Agent Description: This package provides the L3 Agent. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1593354/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More
[Yahoo-eng-team] [Bug 1476469] Re: with DVR, a VM can't use floatingIP and VPN at the same time
VPN is a centralized service, not a distributed one. The VPN service runs only in the SNAT namespace, not in the router or fip namespace. So fip traffic flowing through the fip namespace or router namespace does not go through the IPsec driver running in the SNAT namespace. This is working as designed. To make VPN for DVR routers work with FIPs, we would first need to implement a distributed VPN service; until then I would not recommend it. ** Changed in: neutron Status: Confirmed => Opinion -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1476469 Title: with DVR, a VM can't use floatingIP and VPN at the same time Status in neutron: Opinion Bug description: Now VPN Service is available for Distributed Routers by patch https://review.openstack.org/#/c/143203/, but there is another problem: with DVR, a VM can't use a floating IP and VPN at the same time. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1476469/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1609217] Re: DVR: dvr router should not exist in not-binded network node
** Changed in: neutron Status: In Progress => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1609217 Title: DVR: dvr router should not exist in not-binded network node Status in neutron: Invalid Bug description: ENV: stable/mitaka hosts: compute1 (nova-compute, l3-agent (dvr), metadata-agent) compute2 (nova-compute, l3-agent (dvr), metadata-agent) network1 (l3-agent (dvr_snat), metadata-agent, dhcp-agent) network2 (l3-agent (dvr_snat), metadata-agent, dhcp-agent) How to reproduce? (scenario 1) set: dhcp_agents_per_network = 2 1. Create a DVR router: neutron router-create --ha False --distributed True test1 2. Create a network & subnet with dhcp enabled. neutron net-create test1 neutron subnet-create --enable-dhcp test1 --name test1 192.168.190.0/24 3. Attach the router and subnet neutron router-interface-add test1 subnet=test1 Then the router test1 will exist on both network1 and network2. But in the DB table routerl3agentbindings, there is only one record binding the DVR router to one l3 agent. http://paste.openstack.org/show/547695/ And for another scenario 2: change the network2 node deployment to run only metadata-agent and dhcp-agent. Both the qdhcp namespace and the VM could ping each other. So the qrouter namespace on the unbound network node is not used, and should not exist. Code: The function at the following position should not return the DVR router id in scenario 1. https://github.com/openstack/neutron/blob/master/neutron/db/l3_dvrscheduler_db.py#L263 To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1609217/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1614337] Re: L3 agent fails on FIP when DVR and HA both enabled in router
** Changed in: neutron Status: Confirmed => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1614337 Title: L3 agent fails on FIP when DVR and HA both enabled in router Status in neutron: Invalid Bug description: I have a vlan-based Neutron configuration. My tenant networks are vlans, and my shared external network (br-ex) is a flat network. Neutron is configured for DVR+SNAT mode. In testing floating IPs, I've run into issues with my neutron router, and I've traced it back to a single scenario: when the router is both distributed AND ha. To be clear, I've tested all four possibilities: "--distributed False --ha False" == works "--distributed True --ha False" == works "--distributed False --ha True" == works "--distributed True --ha True" == fails * I can reproduce this again and again, just by deleting the router I have (which implies first clearing its gateway, and removing any associated ports), then re-creating the router in any of the four configurations above. Then I boot some VMs, associate a FIP to any one of them, and attempt to reach the FIP. Results are the same whether I create the router on the CLI or from within Horizon. * Expected result is that I should be able to associate a floating IP to a running VM and then ping that floating IP (and ultimately other kinds of activity, such as SSH access to the VM). * Actual result is that the floating IP is completely unreachable from other valid IPs within same L2 space. 
Additionally, in /var/log/neutron/l3-agent.log on the compute node hosting the VM whose associated FIP I can't reach, I find this:

2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent [-] Failed to process compatible router '13356ddb-8e36-4f54-b8b2-6a62a5aecf86'
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent Traceback (most recent call last):
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 501, in _process_router_update
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent     self._process_router_if_compatible(router)
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 440, in _process_router_if_compatible
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent     self._process_updated_router(router)
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 454, in _process_updated_router
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent     ri.process(self)
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/dvr_edge_ha_router.py", line 92, in process
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent     super(DvrEdgeHaRouter, self).process(agent)
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/dvr_local_router.py", line 488, in process
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent     super(DvrLocalRouter, self).process(agent)
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/dvr_router_base.py", line 30, in process
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent     super(DvrRouterBase, self).process(agent)
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 386, in process
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent     super(HaRouter, self).process(agent)
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/common/utils.py", line 385, in call
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent     self.logger(e)
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent     self.force_reraise()
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent     six.reraise(self.type_, self.value, self.tb)
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/common/utils.py", line 382, in call
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent     return func(*args, **kwargs)
2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent   File
[Yahoo-eng-team] [Bug 1611964] [NEW] SNAT redirect rules should be removed only on Gateway clear.
Public bug reported: SNAT redirect rules should be removed only on a gateway clear, not on a gateway move or gateway reschedule. Removing them would make the snat_node unreachable by the dvr service ports on the originating node. How to reproduce it: 1. Create a two network node setup (dvr_snat). 2. Create a network. 3. Create a subnet. 4. Create a router and attach the subnet to the router. 5. Set the gateway on the router. 6. Now try to reschedule the router to the secondary node, or do a manual move to a second node. 7. In this case 'external_gateway_removed' is called through the 'external_gateway_updated' function and tries to call snat_redirect_remove. 8. After you move the snat, the router namespace will not have the routing rule for the 'csnat' port. 9. It clears up and you only see the base rules. Expected: root@ubuntu-ctlr:~/devstack# ip rule 0: from all lookup local 32766: from all lookup main 32767: from all lookup default 167772161: from 10.0.0.1/24 lookup 167772161 root@ubuntu-ctlr:~/devstack# ip route s t 167772161 default via 10.0.0.9 dev qr-18deeb39-3b But Actual: root@ubuntu-ctlr:~/devstack# ip rule 0: from all lookup local 32766: from all lookup main 32767: from all lookup default ** Affects: neutron Importance: Undecided Status: New ** Tags: l3-dvr-backlog -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1611964 Title: SNAT redirect rules should be removed only on Gateway clear. Status in neutron: New Bug description: SNAT redirect rules should be removed only on a gateway clear, not on a gateway move or gateway reschedule. Removing them would make the snat_node unreachable by the dvr service ports on the originating node. How to reproduce it: 1. Create a two network node setup (dvr_snat). 2. Create a network. 3. Create a subnet. 4. Create a router and attach the subnet to the router. 5. Set the gateway on the router. 6. 
Now try to reschedule the router to the secondary node or do a manual move to a second node. 7. In this case 'external_gateway_removed' is called through the 'external_gateway_updated' function and tries to call snat_redirect_remove. 8. After you move the snat, the router namespace will not have the routing rule for the 'csnat' port. 9. It clears up and you only see the base rules. Expected: root@ubuntu-ctlr:~/devstack# ip rule 0: from all lookup local 32766: from all lookup main 32767: from all lookup default 167772161: from 10.0.0.1/24 lookup 167772161 root@ubuntu-ctlr:~/devstack# ip route s t 167772161 default via 10.0.0.9 dev qr-18deeb39-3b But Actual: root@ubuntu-ctlr:~/devstack# ip rule 0: from all lookup local 32766: from all lookup main 32767: from all lookup default To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1611964/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
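The broken state above shows up as a missing `from <cidr> lookup <table>` entry in `ip rule`. A small standalone sketch (not neutron code) that scans `ip rule` output for the redirect rule, using the table number from the report:

```python
def has_snat_redirect_rule(ip_rule_output, cidr, table):
    """Return True if a 'from <cidr> lookup <table>' rule is present."""
    for line in ip_rule_output.splitlines():
        # 'ip rule' prints '<priority>:\tfrom <src> lookup <table>'.
        parts = line.replace(':', ' ').split()
        if 'from' in parts and 'lookup' in parts:
            if cidr in parts and str(table) in parts:
                return True
    return False

# Sample taken from the 'Expected' output in the report above.
expected = """0: from all lookup local
32766: from all lookup main
32767: from all lookup default
167772161: from 10.0.0.1/24 lookup 167772161"""
```

Running the same check against the 'Actual' output (which lacks the last line) returns False, which is exactly the regression described.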
[Yahoo-eng-team] [Bug 1611513] [NEW] ip_lib: Add support for 'Flush' command in iproute
Public bug reported: This would be an enhancement to the ip_lib iproute library to provide additional support for the 'Flush' command that is not available right now. This is a dependency for a fix in DVR to clean up the gateway rules. Ref: Bug: #1599287 ** Affects: neutron Importance: Undecided Assignee: Swaminathan Vasudevan (swaminathan-vasudevan) Status: In Progress ** Tags: l3-dvr-backlog ** Summary changed: - ip_lib: Add support for 'Flush' command for iproute + ip_lib: Add support for 'Flush' command in iproute -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1611513 Title: ip_lib: Add support for 'Flush' command in iproute Status in neutron: In Progress Bug description: This would be an enhancement to the ip_lib iproute library to provide additional support for the 'Flush' command that is not available right now. This is a dependency for a fix in DVR to clean up the gateway rules. Ref: Bug: #1599287 To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1611513/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
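For illustration, the missing capability boils down to issuing `ip route flush`; a minimal sketch of building that command line (a hypothetical helper, not neutron's actual ip_lib API, which also handles namespaces and root execution):

```python
def route_flush_args(ip_version, table=None):
    # Assemble the 'ip route flush' argument list; the caller would run it
    # inside the router namespace as root.
    args = ['ip', '-%d' % ip_version, 'route', 'flush']
    if table is not None:
        args += ['table', str(table)]
    return args
```

The real ip_lib change would hang such a method off its route command wrapper so the DVR gateway cleanup fix can call it.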
[Yahoo-eng-team] [Bug 1599287] [NEW] Cleanup snat redirect rules when agent restarts after stale snat namespace is cleaned.
Public bug reported: When the L3 agent is dead, if the gateway is removed, the snat namespace and its rules are not properly cleaned when the agent restarts. Even though the patch https://review.openstack.org/#/c/326729/ addresses the cleanup of the snat namespace, it does not remove the redirect rules and the gateway device from the router namespace when the gateway is disabled. When the agent restarts, it does not get the gateway data from the server, and so it is not possible for the agent to clean it properly. In order to clean the snat redirect rules, the gateway data should be cached to the local file system and reused later when necessary. ** Affects: neutron Importance: Undecided Assignee: Swaminathan Vasudevan (swaminathan-vasudevan) Status: In Progress ** Tags: l3-dvr-backlog ** Tags added: l3-dvr-backlog -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1599287 Title: Cleanup snat redirect rules when agent restarts after stale snat namespace is cleaned. Status in neutron: In Progress Bug description: When the L3 agent is dead, if the gateway is removed, the snat namespace and its rules are not properly cleaned when the agent restarts. Even though the patch https://review.openstack.org/#/c/326729/ addresses the cleanup of the snat namespace, it does not remove the redirect rules and the gateway device from the router namespace when the gateway is disabled. When the agent restarts, it does not get the gateway data from the server, and so it is not possible for the agent to clean it properly. In order to clean the snat redirect rules, the gateway data should be cached to the local file system and reused later when necessary.
To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1599287/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1583694] [NEW] [RFE] DVR support for Allowed_address_pair ports that are bound to multiple ACTIVE VM ports used by Octavia
Public bug reported: DVR support for Allowed_address_pair ports with FloatingIP that are unbound and assigned to multiple VMs that are active. Problem Statement: When a FloatingIP is assigned to an Allowed_address_pair port that is in turn assigned to multiple ACTIVE VMs connected to DVR (Distributed Virtual Router) routers, the FloatingIP is not functional. The use case here is to provide redundancy to the VMs that are serviced by the DVR routers. This feature works well for Legacy (Centralized) Routers. Theory: Distributed Virtual Routers were designed for scalability and performance and to reduce the load on the single network node. Distributed Virtual Routers are created on each compute node dynamically on demand and removed when not required. Distributed Virtual Routers heavily depend on the port binding to identify the requirement of a DVR service on a particular node. Today we only create/update/delete a floatingip based on the router and the host in which the floatingip service is required. So the 'host' part is very critical for the operation of the DVR. In the above mentioned use case, we are dealing with an Allowed_address_pair port, which is unbound to any specific host and is also assigned to multiple VMs that are ACTIVE at the same time. We have a workaround today to inherit the parent VM's port binding properties for the allowed_address_pair port if the parent VM's port is ACTIVE. This has a limitation: we assume that there would be only one "ACTIVE" VM port with the allowed_address_pair port for this to work. The reason for this is, if we have multiple "ACTIVE" VM ports associated with the same allowed_address_pair port, and if the allowed_address_pair port has a FloatingIP associated with it, we can't provide the FloatingIP service on all the nodes where the VM's port is ACTIVE.
This would create an issue because we will be seeing the same FloatingIP being advertised (GARP) from all nodes, and so the users on the external network will be confused about where the actual "ACTIVE" port is. Why is it working with Legacy Routers: In the case of legacy routers, the routers are always located at the network node and the DNAT is also done at the router namespace in the network node. They don't depend on the host binding, since all the traffic has to flow through the centralized router in the network node. Also in the case of centralized routers, there is no issue of FloatingIP GARP, since it is always going to be coming in through a single node. So in the background, the allowed_address_pair port MAC is being dynamically switched from one VM to another VM by the keepalived that runs in the VM. So neutron does not need to know about any of those and it works as expected. Why it is not working with DVR Routers: 1. Allowed_address_pair ports do not have host binding. 2. If we were to inherit from the VMs' host binding, there are multiple VMs that are ACTIVE, so we can't have a single host binding for these allowed_address_pair ports. 3. Even if we ignore the port binding on the allowed_address_pair port and try to provide the plumbing for the FloatingIP on multiple nodes based on the VMs it is associated with, there are issues with the same FloatingIP being GARPed from different compute nodes, which would confuse external peers. How we can make it work with DVR: Option 1: Neutron should have some visibility into the state of the VM port, when the switch between ACTIVE and STANDBY happens. Today it is done by the keepalived on the VM and so it is not being logged anywhere. If keepalived can log the event in the neutron port, then it can be used by neutron to determine when to allow FloatingIP traffic and block FloatingIP traffic for a particular node, and then send the GARP from the respective node. There is some delay introduced in this as well.
(Desired) Low-hanging fruit. Option 2: Option 2 basically negates the distributed nature of DVR and makes it centralized for North-South. The other option is to have the FloatingIP functionality centralized for such features. But this would be more complex, since we need to introduce config options for agents and floatingip. Also in this case, we can't have both the local floatingip and centralized floatingip support on the same node. A compute node can only have either localized floatingip or centralized floatingip. Complex (Negates the purpose of DVR) References: Some references to the patches that we already have to support a single use case for the Allowed_address_pair with FloatingIP in DVR. https://review.openstack.org/#/c/254439/ https://review.openstack.org/#/c/301410/ https://review.openstack.org/#/c/304905/ ** Affects: neutron Importance: Undecided Status: New ** Tags: l3-dvr-backlog lbaas neutron -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1583694 Title: [RFE] DVR support for
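Option 1's bookkeeping can be sketched as a pure function: plumb the FloatingIP (and send the GARP) only on the host where exactly one port is ACTIVE. This is purely illustrative; neutron keeps no such per-host VRRP state today, and the state feed from keepalived is the hypothetical part of the RFE:

```python
def floatingip_host(port_states):
    # port_states: {host: 'ACTIVE' | 'STANDBY'} for the VM ports sharing
    # the allowed_address_pair IP (hypothetical state fed by keepalived).
    active = [host for host, state in sorted(port_states.items())
              if state == 'ACTIVE']
    # With more than one ACTIVE host, the same FIP would be GARPed from
    # several nodes, so refuse to pick one.
    return active[0] if len(active) == 1 else None
```

The ambiguity this function refuses to resolve (more than one ACTIVE host) is exactly the failure mode described in item 3 above.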
[Yahoo-eng-team] [Bug 1578866] Re: test_user_update_own_password failing intermittently
** Also affects: neutron Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1578866 Title: test_user_update_own_password failing intermittently Status in OpenStack Identity (keystone): Confirmed Status in neutron: New Bug description: test_user_update_own_password is failing intermittently on a variety of jobs stack trace: Traceback (most recent call last): File "tempest/api/identity/v2/test_users.py", line 71, in test_user_update_own_password self.non_admin_users_client.token) File "/opt/stack/new/tempest/.tox/tempest/local/lib/python2.7/site-packages/testtools/testcase.py", line 480, in assertRaises self.assertThat(our_callable, matcher) File "/opt/stack/new/tempest/.tox/tempest/local/lib/python2.7/site-packages/testtools/testcase.py", line 493, in assertThat raise mismatch_error testtools.matchers._impl.MismatchError: > returned {u'token': {u'expires': u'2016-05-06T00:13:53Z', u'issued_at': u'2016-05-05T23:13:54.00Z', u'audit_ids': [u'mbdiQZcNT5GxEUebXZqKOA', u'BAlcCwKLS9Co8C3jg2vfAw'], u'id': u'gABXK9Oyhuw7yBJJehrIIGlzIB8VTbgnM_M5Cve9q0BEHeZ2xNohJ_lkVqp7kicVbNgZ93p2dcLHfUfXWCcPvO4BWkTIry1mAGSvhzeLI7SYxSS6CBpeGK0FH3Uf_5vhHTCWFvcDvKOSzajGImeN7GaYts91H1zsXV7B1HRs0xN-4LADokI'}, u'metadata': {u'roles': [], u'is_admin': 0}, u'serviceCatalog': [], u'user': {u'roles_links': [], u'username': u'tempest-IdentityUsersTest-972219078', u'name': u'tempest-IdentityUsersTest-972219078', u'roles': [], u'id': u'97a1836c5a2c40c99575e46aa37b8b50'}} example failures: http://logs.openstack.org/17/311617/1/gate/gate-tempest-dsvm-neutron-linuxbridge/084f25d/logs/testr_results.html.gz and http://logs.openstack.org/91/312791/2/check/gate-tempest-dsvm-full/88d9fff/logs/testr_results.html.gz To manage notifications about this bug go to: https://bugs.launchpad.net/keystone/+bug/1578866/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1569918] [NEW] Allowed_address_pair fixed_ip configured with FloatingIP after getting associated with a VM port does not work with DVR routers
Public bug reported: An Allowed_address_pair fixed_ip that is configured with a FloatingIP after the port is associated with the VM port is not reachable through the DVR router. The current code only adds the proper ARP update and port host binding inheritance for the Allowed_address_pair port if the port has a FloatingIP configured before it is associated with a VM port. When the FloatingIP is added later, it fails. How to reproduce. 1. Create networks 2. Create vrrp-net. 3. Create vrrp-subnet. 4. Create a DVR router. 5. Attach the vrrp-subnet to the router. 6. Create a VM on the vrrp-subnet 7. Create a VRRP port. 8. Attach the VRRP port to the VM. 9. Now assign a FloatingIP to the VRRP port. 10. Now check the ARP table entry in the router_namespace and also the VRRP port details. The VRRP port is still unbound and so the DVR cannot handle unbound ports. ** Affects: neutron Importance: Undecided Status: New ** Tags: l3-dvr-backlog -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1569918 Title: Allowed_address_pair fixed_ip configured with FloatingIP after getting associated with a VM port does not work with DVR routers Status in neutron: New Bug description: An Allowed_address_pair fixed_ip that is configured with a FloatingIP after the port is associated with the VM port is not reachable through the DVR router. The current code only adds the proper ARP update and port host binding inheritance for the Allowed_address_pair port if the port has a FloatingIP configured before it is associated with a VM port. When the FloatingIP is added later, it fails. How to reproduce. 1. Create networks 2. Create vrrp-net. 3. Create vrrp-subnet. 4. Create a DVR router. 5. Attach the vrrp-subnet to the router. 6. Create a VM on the vrrp-subnet 7. Create a VRRP port. 8. Attach the VRRP port to the VM. 9. Now assign a FloatingIP to the VRRP port. 10.
Now check the ARP table entry in the router_namespace and also the VRRP port details. The VRRP port is still unbound and so the DVR cannot handle unbound ports. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1569918/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1566046] [NEW] Fix TypeError when trying to update an arp entry for ports with allowed_address_pairs on DVR router
Public bug reported: A TypeError is seen when trying to update an arp entry for ports with allowed_address_pairs on a DVR router. This was seen in the master branch while I was testing the allowed_address_pair with floatingips on a DVR router. plugin.update_arp_entry_for_dvr_service_port(context, port) 2016-03-30 12:06:00.910 TRACE neutron.callbacks.manager File "/opt/stack/neutron/neutron/db/l3_dvr_db.py", line 775, in update_arp_entry_for_dvr_service_port 2016-03-30 12:06:00.910 TRACE neutron.callbacks.manager self.l3_rpc_notifier.add_arp_entry) 2016-03-30 12:06:00.910 TRACE neutron.callbacks.manager File "/opt/stack/neutron/neutron/db/l3_dvr_db.py", line 729, in _generate_arp_table_and_notify_agent 2016-03-30 12:06:00.910 TRACE neutron.callbacks.manager ip_address = fixed_ip['ip_address'] 2016-03-30 12:06:00.910 TRACE neutron.callbacks.manager TypeError: string indices must be integers How to reproduce it. 1. Create a vrrp-network 2. Create a vrrp-subnet 3. Create a dvr router 4. Attach the vrrp-subnet to the router 5. Create security group rules for the vrrp-net and add rules to it. 6. Now create a VM on the vrrp-subnet 7. Now create a vrrp-port (allowed_address_pair) on the vrrp-subnet 8. Associate a floatingip to the vrrp-port. 9. Now update the VM port with the allowed_address_pair IP. You should see this in the neutron-server logs. ** Affects: neutron Importance: Undecided Assignee: Swaminathan Vasudevan (swaminathan-vasudevan) Status: New ** Tags: l3-dvr-backlog ** Changed in: neutron Assignee: (unassigned) => Swaminathan Vasudevan (swaminathan-vasudevan) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1566046 Title: Fix TypeError when trying to update an arp entry for ports with allowed_address_pairs on DVR router Status in neutron: New Bug description: A TypeError is seen when trying to update an arp entry for ports with allowed_address_pairs on a DVR router. This was seen in the master branch while I was testing the allowed_address_pair with floatingips on a DVR router. plugin.update_arp_entry_for_dvr_service_port(context, port) 2016-03-30 12:06:00.910 TRACE neutron.callbacks.manager File "/opt/stack/neutron/neutron/db/l3_dvr_db.py", line 775, in update_arp_entry_for_dvr_service_port 2016-03-30 12:06:00.910 TRACE neutron.callbacks.manager self.l3_rpc_notifier.add_arp_entry) 2016-03-30 12:06:00.910 TRACE neutron.callbacks.manager File "/opt/stack/neutron/neutron/db/l3_dvr_db.py", line 729, in _generate_arp_table_and_notify_agent 2016-03-30 12:06:00.910 TRACE neutron.callbacks.manager ip_address = fixed_ip['ip_address'] 2016-03-30 12:06:00.910 TRACE neutron.callbacks.manager TypeError: string indices must be integers How to reproduce it. 1. Create a vrrp-network 2. Create a vrrp-subnet 3. Create a dvr router 4. Attach the vrrp-subnet to the router 5. Create security group rules for the vrrp-net and add rules to it. 6. Now create a VM on the vrrp-subnet 7. Now create a vrrp-port (allowed_address_pair) on the vrrp-subnet 8. Associate a floatingip to the vrrp-port. 9. Now update the VM port with the allowed_address_pair IP. You should see this in the neutron-server logs. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1566046/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
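The traceback shows `fixed_ip['ip_address']` failing because the iterable carries plain IP strings (from the allowed_address_pairs update) rather than the usual fixed_ip dicts. A standalone sketch of a defensive fix, not the actual l3_dvr_db code:

```python
def extract_ip_addresses(fixed_ips):
    # Accept both {'ip_address': ...} dicts (regular fixed_ips) and bare
    # IP strings, which is what the TypeError above indicates was passed
    # in for allowed_address_pair entries.
    ips = []
    for fixed_ip in fixed_ips:
        if isinstance(fixed_ip, dict):
            ips.append(fixed_ip['ip_address'])
        else:
            ips.append(fixed_ip)
    return ips
```

With the original code, indexing a string like `'10.0.10.201'` with `['ip_address']` raises exactly the "string indices must be integers" error quoted in the report.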
[Yahoo-eng-team] [Bug 1564776] [NEW] DVR l3 agent should check for snat namespace existence before adding or deleting anything from the namespace
Public bug reported: Check for snat_namespace existence on the node before any operation in the namespace. Today we check self.snatnamespace, which may or may not reflect the exact state of the system. If the snat_namespace is accidentally deleted and we try to remove the gateway from the router, the agent throws a bunch of error messages and goes into a loop, constantly spewing errors. Here is the link to the error message. http://paste.openstack.org/show/492700/ This can be easily reproduced. 1. Create a network 2. Create a subnet 3. Create a router (dvr) 4. Attach the subnet to the router. 5. Configure a default gateway on the router. 6. Now verify the namespaces on the 'dvr_snat' node. 7. You should see a. snat_namespace b. router_namespace c. dhcp namespace. 8. Now delete the snat_namespace. 9. Try to remove the gateway from the router. 10. Watch the L3 agent logs ** Affects: neutron Importance: Undecided Status: New ** Tags: l3-dvr-backlog -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1564776 Title: DVR l3 agent should check for snat namespace existence before adding or deleting anything from the namespace Status in neutron: New Bug description: Check for snat_namespace existence on the node before any operation in the namespace. Today we check self.snatnamespace, which may or may not reflect the exact state of the system. If the snat_namespace is accidentally deleted and we try to remove the gateway from the router, the agent throws a bunch of error messages and goes into a loop, constantly spewing errors. Here is the link to the error message. http://paste.openstack.org/show/492700/ This can be easily reproduced. 1. Create a network 2. Create a subnet 3. Create a router (dvr) 4. Attach the subnet to the router. 5. Configure a default gateway on the router. 6. Now verify the namespaces on the 'dvr_snat' node.
7. You should see a. snat_namespace b. router_namespace c. dhcp namespace. 8. Now delete the snat_namespace. 9. Try to remove the gateway from the router. 10. Watch the L3 agent logs To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1564776/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1564575] [NEW] DVR router namespaces are deleted when we manually move a DVR router from one SNAT_node to another SNAT_node even though there are active VMs in the node
Public bug reported: DVR router namespaces are deleted when we manually move the router from one dvr_snat node to another dvr_snat node. It should delete only the snat_namespace and not the router_namespace, since there are 'dhcp' ports and 'vm' ports still serviced by DVR. How to reproduce: Configure a two node setup: 1. I have one node with Controller, compute and networking node with dhcp running in dvr_snat mode. 2. I have another node with compute and networking node without dhcp running in dvr_snat mode. 3. Now create network 4. Create a subnet 5. Create a router and attach the subnet to the router. 6. Also set a gateway to the router. 7. Now you should see that there are three namespaces in the first node. a. snat_namespace b. qrouter_namespace c. dhcp_namespace 8. Now create a VM on the first node. 9. Now try to remove the router from the first agent and assign it to the second agent in the second node. neutron l3-agent-router-remove agent-id router-id This currently removes both the snat_namespace and the router_namespace when there is still a valid vm and dhcp port. Suspect that checking for available DVR service ports might be causing an issue here. Will try to find out the root cause. ** Affects: neutron Importance: Undecided Status: New ** Tags: l3-dvr-backlog mitaka-rc-potential -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1564575 Title: DVR router namespaces are deleted when we manually move a DVR router from one SNAT_node to another SNAT_node even though there are active VMs in the node Status in neutron: New Bug description: DVR router namespaces are deleted when we manually move the router from one dvr_snat node to another dvr_snat node. It should delete only the snat_namespace and not the router_namespace, since there are 'dhcp' ports and 'vm' ports still serviced by DVR. How to reproduce: Configure a two node setup: 1.
I have one node with Controller, compute and networking node with dhcp running in dvr_snat mode. 2. I have another node with compute and networking node without dhcp running in dvr_snat mode. 3. Now create network 4. Create a subnet 5. Create a router and attach the subnet to the router. 6. Also set a gateway to the router. 7. Now you should see that there are three namespaces in the first node. a. snat_namespace b. qrouter_namespace c. dhcp_namespace 8. Now create a VM on the first node. 9. Now try to remove the router from the first agent and assign it to the second agent in the second node. neutron l3-agent-router-remove agent-id router-id This currently removes both the snat_namespace and the router_namespace when there is still a valid vm and dhcp port. Suspect that checking for available DVR service ports might be causing an issue here. Will try to find out the root cause. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1564575/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
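The intended behavior can be written as a small predicate: rescheduling the router away from a dvr_snat node should remove only the snat namespace while DVR-serviceable (vm/dhcp) ports remain. This is a sketch of the desired decision logic, not the agent code:

```python
def namespaces_to_delete(snat_moved_away, has_dvr_service_ports):
    # On a manual router move, the snat namespace leaves with the router,
    # but the qrouter namespace must survive while local vm/dhcp ports
    # still need DVR routing on this node.
    doomed = []
    if snat_moved_away:
        doomed.append('snat')
        if not has_dvr_service_ports:
            doomed.append('qrouter')
    return doomed
```

The bug is that the current code behaves as if `has_dvr_service_ports` were False, tearing down the qrouter namespace too.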
[Yahoo-eng-team] [Bug 1563879] [NEW] [RFE] DVR should route packets to Instances behind the L2 Gateway
Public bug reported: L2 Gateway bridges the neutron network with the hardware-based VxLAN gateways. The DVR routers in neutron cannot forward traffic to an instance that is behind the VxLAN gateways since they cannot 'ARP' for those instances. DVR currently has prepopulated ARP entries for all instances created with a DVR serviceable port. But somehow we should be able to populate the ARP entries of instances behind the VxLAN gateway on all DVR nodes so that traffic can flow between them. ** Affects: neutron Importance: Undecided Status: New ** Tags: l3-dvr-backlog ** Tags added: l3-dvr-backlog ** Summary changed: - [RFE] DVR should route packets to Instances on the L2 Gateway + [RFE] DVR should route packets to Instances behind the L2 Gateway -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1563879 Title: [RFE] DVR should route packets to Instances behind the L2 Gateway Status in neutron: New Bug description: L2 Gateway bridges the neutron network with the hardware-based VxLAN gateways. The DVR routers in neutron cannot forward traffic to an instance that is behind the VxLAN gateways since they cannot 'ARP' for those instances. DVR currently has prepopulated ARP entries for all instances created with a DVR serviceable port. But somehow we should be able to populate the ARP entries of instances behind the VxLAN gateway on all DVR nodes so that traffic can flow between them. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1563879/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1562110] [NEW] link-local-address allocator for DVR has a limit of 256 address pairs per node
Public bug reported: The current link-local-address allocator for DVR routers has a limit of 256 routers per node. This should be configurable and not just limited to 256 routers per node. ** Affects: neutron Importance: Undecided Status: Confirmed ** Tags: l3-dvr-backlog ** Changed in: neutron Status: New => Confirmed -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1562110 Title: link-local-address allocator for DVR has a limit of 256 address pairs per node Status in neutron: Confirmed Bug description: The current link-local-address allocator for DVR routers has a limit of 256 routers per node. This should be configurable and not just limited to 256 routers per node. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1562110/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
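The 256-pair ceiling follows directly from the size of the allocation range: each router consumes one /31 address pair carved out of a fixed-width block. A sketch with the stdlib ipaddress module (the /23 CIDR here is purely illustrative, chosen to reproduce the 256 limit; a configurable CIDR, as the report requests, would raise it):

```python
import ipaddress

def link_local_pairs(cidr='169.254.192.0/23'):
    # A /23 holds 512 addresses, i.e. 256 /31 pairs, matching the
    # reported per-node limit; widening the cidr raises the limit.
    net = ipaddress.ip_network(cidr)
    return list(net.subnets(new_prefix=31))
```

Making the range a config option keeps the allocator logic unchanged while letting large deployments host more than 256 routers per node.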
[Yahoo-eng-team] [Bug 1499045] Re: get_snat_port_for_internal_port called twice when an interface is added or removed by the l3 agent in the case of DVR routers.
** Changed in: neutron Status: In Progress => Opinion -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1499045 Title: get_snat_port_for_internal_port called twice when an interface is added or removed by the l3 agent in the case of DVR routers. Status in neutron: Opinion Bug description: get_snat_port_for_internal_port retrieves the internal snat port created for each router interface added to a DVR router. But this function is called twice in the L3 agent code: for every interface add or delete on the router, it is called by 'dvr_local_router.py' and again by 'dvr_edge_router.py'. This can be reduced to a single call to improve control plane performance. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1499045/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
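One way to collapse the duplicate lookups without restructuring the two router classes is to memoize per (router, interface); a standalone sketch with a stub lookup that counts invocations (the stub and names are hypothetical, not the neutron code):

```python
import functools

CALLS = {'n': 0}

def _lookup_snat_port(router_id, interface_id):
    # Stand-in for the real lookup; counts how often it actually runs.
    CALLS['n'] += 1
    return ('snat-port-for', router_id, interface_id)

@functools.lru_cache(maxsize=None)
def get_snat_port_for_internal_port(router_id, interface_id):
    # Both the local- and edge-router code paths can call this; only the
    # first call per (router, interface) hits the underlying lookup.
    return _lookup_snat_port(router_id, interface_id)
```

In a real agent the cache would need invalidation on interface removal; the sketch only shows that the second caller gets the memoized result.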
[Yahoo-eng-team] [Bug 1538369] Re: refactor add_router_interface in l3_dvr_db.py
** Changed in: neutron Status: New => Opinion -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1538369 Title: refactor add_router_interface in l3_dvr_db.py Status in neutron: Opinion Bug description: A lot of code is repeated in add_router_interface in l3_db.py and l3_dvr_db.py. It would be nice to refactor the code and have one common function, _add_router_interface, which would be called by add_router_interface in both l3_db.py and l3_dvr_db.py. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1538369/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1558097] [NEW] DVR SNAT HA - Documentation for Networking guide
Public bug reported: DVR SNAT HA - Documentation for Networking guide for Mitaka. ** Affects: neutron Importance: Undecided Status: New ** Tags: l3-dvr-backlog -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1558097 Title: DVR SNAT HA - Documentation for Networking guide Status in neutron: New Bug description: DVR SNAT HA - Documentation for Networking guide for Mitaka. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1558097/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1554392] Re: Set extra route for DVR might cause error
This is a known issue: the router does not have an external network interface in the router namespace, so configuring an extra route whose next hop has no corresponding interface in that namespace fails. This was a decision we made to avoid complicating things too much: we do not add the external routes in the router namespace and only add them in the snat_namespace. ** Changed in: neutron Status: New => Opinion -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1554392 Title: Set extra route for DVR might cause error Status in neutron: Opinion Bug description: With a DVR router. I have external network: 172.24.4.0/24 internal network: 10.0.0.0/24 I want to set an extra route for it, so I execute the following command: neutron router-update router1 --route destination=20.0.0.0/24,nexthop=172.24.4.6 But I get this error in the output of neutron-l3-agent. ERROR neutron.agent.linux.utils [-] Exit code: 2; Stdin: ; Stdout: ; Stderr: RTNETLINK answers: Network is unreachable The reason for it is that the DVR router will set the extra route in both the snat and qrouter namespaces. However, the qrouter namespace does not have the route to the external network, so an error is reported when the l3-agent tries to add a route with a nexthop on the external network to the qrouter namespace. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1554392/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
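The RTNETLINK failure can be predicted before calling `ip route`: a next hop is only installable in a namespace whose interfaces cover it. A sketch with the stdlib ipaddress module, using the subnets from the report above:

```python
import ipaddress

def nexthop_reachable(nexthop, interface_cidrs):
    # The qrouter namespace only has the internal (qr-) interface, so a
    # next hop on the external network is unreachable there; the snat
    # namespace, which holds the qg- interface, can install the route.
    ip = ipaddress.ip_address(nexthop)
    return any(ip in ipaddress.ip_network(cidr, strict=False)
               for cidr in interface_cidrs)
```

Filtering extra routes with such a check per namespace would let the agent skip the qrouter namespace silently instead of surfacing "Network is unreachable".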
[Yahoo-eng-team] [Bug 1549511] [NEW] "test_volume_backed_live_migration" test failures seen in the gate
Public bug reported: Recently we have seen the "test_volume_backed_live_migration" test fail in the multinode gate setup. This test failure is seen in nova, neutron, etc.: http://logs.openstack.org/17/258417/6/check/gate-tempest-dsvm-multinode-full/0d516d3/console.html#_2016-02-24_17_43_48_123 ** Affects: nova Importance: Undecided Status: New -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1549511 Title: "test_volume_backed_live_migration" test failures seen in the gate Status in OpenStack Compute (nova): New Bug description: Recently we have seen the "test_volume_backed_live_migration" test fail in the multinode gate setup. This test failure is seen in nova, neutron, etc.: http://logs.openstack.org/17/258417/6/check/gate-tempest-dsvm-multinode-full/0d516d3/console.html#_2016-02-24_17_43_48_123 To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1549511/+subscriptions
[Yahoo-eng-team] [Bug 1541714] Re: DVR routers are not created on a compute node that runs agent in 'dvr' mode
It was an invalid user configuration. The "dvr" node was not configured with the right agent mode, which is why this issue was seen. Please ignore this bug. ** Changed in: neutron Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1541714 Title: DVR routers are not created on a compute node that runs agent in 'dvr' mode Status in neutron: Invalid Bug description: DVR routers are not created on a compute node that is running the L3 agent in "dvr" mode. This might have been introduced by the latest patch that changed the scheduling behavior: https://review.openstack.org/#/c/254837/ Steps to reproduce: 1. Stack up two nodes (a dvr_snat node and a dvr node). 2. Create a network. 3. Create a subnet. 4. Create a router. 5. Add the subnet to the router. 6. Create a VM on the "dvr_snat" node. Everything works fine here: we can see the router namespace, snat namespace and dhcp namespace. 7. Now create a VM and force it to be created on the second node (the dvr node): nova boot --flavor xyz --image abc --net net-id yyy-id --availability-zone nova:dvr-node myinstance2 Observe that the instance is created on the second node, but the router namespace is missing there. The router is scheduled to the dvr_snat node, but not to the compute node. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1541714/+subscriptions
[Yahoo-eng-team] [Bug 1541714] [NEW] DVR routers are not created on a compute node that runs agent in 'dvr' mode
Public bug reported: DVR routers are not created on a compute node that is running the L3 agent in "dvr" mode. This might have been introduced by the latest patch that changed the scheduling behavior: https://review.openstack.org/#/c/254837/ Steps to reproduce: 1. Stack up two nodes (a dvr_snat node and a dvr node). 2. Create a network. 3. Create a subnet. 4. Create a router. 5. Add the subnet to the router. 6. Create a VM on the "dvr_snat" node. Everything works fine here: we can see the router namespace, snat namespace and dhcp namespace. 7. Now create a VM and force it to be created on the second node (the dvr node): nova boot --flavor xyz --image abc --net net-id yyy-id --availability-zone nova:dvr-node myinstance2 Observe that the instance is created on the second node, but the router namespace is missing there. The router is scheduled to the dvr_snat node, but not to the compute node. ** Affects: neutron Importance: Undecided Status: New ** Tags: l3-dvr-backlog -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1541714 Title: DVR routers are not created on a compute node that runs agent in 'dvr' mode Status in neutron: New Bug description: DVR routers are not created on a compute node that is running the L3 agent in "dvr" mode. This might have been introduced by the latest patch that changed the scheduling behavior: https://review.openstack.org/#/c/254837/ Steps to reproduce: 1. Stack up two nodes (a dvr_snat node and a dvr node). 2. Create a network. 3. Create a subnet. 4. Create a router. 5. Add the subnet to the router. 6. Create a VM on the "dvr_snat" node. Everything works fine here: we can see the router namespace, snat namespace and dhcp namespace. 7. Now create a VM and force it to be created on the second node (the dvr node): nova boot --flavor xyz --image abc --net net-id yyy-id --availability-zone nova:dvr-node myinstance2 Observe that the instance is created on the second node, but the router namespace is missing there. The router is scheduled to the dvr_snat node, but not to the compute node. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1541714/+subscriptions
[Yahoo-eng-team] [Bug 1535928] [NEW] Duplicate IPtables rule detected warning message seen in L3 agent
Public bug reported: In recent L3 agent logs in the gate we have been seeing this warning message associated with the DVR router jobs. Right now none of the jobs are failing, but we need to see why this warning message is showing up in the logs, as it might be due to some hidden issue. http://logs.openstack.org/89/255989/11/check/gate-tempest-dsvm-neutron-dvr/e3464a5/logs/screen-q-l3.txt.gz?level=WARNING#_2016-01-18_13_34_52_764 ** Affects: neutron Importance: Undecided Status: New ** Tags: l3-dvr-backlog ** Summary changed: - Duplicate IPtables rule detected warning message seen in L3 agent for DVR Routers + Duplicate IPtables rule detected warning message seen in L3 agent -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1535928 Title: Duplicate IPtables rule detected warning message seen in L3 agent Status in neutron: New Bug description: In recent L3 agent logs in the gate we have been seeing this warning message associated with the DVR router jobs. Right now none of the jobs are failing, but we need to see why this warning message is showing up in the logs, as it might be due to some hidden issue. http://logs.openstack.org/89/255989/11/check/gate-tempest-dsvm-neutron-dvr/e3464a5/logs/screen-q-l3.txt.gz?level=WARNING#_2016-01-18_13_34_52_764 To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1535928/+subscriptions
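One plausible source of the warning is the same iptables rule being generated twice before the agent applies the rule set. A hedged sketch of order-preserving deduplication (this is illustrative, not neutron's actual iptables_manager logic):

```python
def dedupe_rules(rules):
    """Drop exact duplicate iptables rules while preserving order --
    the kind of normalization that would silence a 'duplicate rule
    detected' warning. Whitespace differences are normalized away."""
    seen = set()
    out = []
    for rule in rules:
        key = " ".join(rule.split())  # collapse whitespace for comparison
        if key not in seen:
            seen.add(key)
            out.append(rule)
    return out

rules = [
    "-A INPUT -j ACCEPT",
    "-A INPUT  -j ACCEPT",   # same rule, different spacing
    "-A FORWARD -j DROP",
]
assert dedupe_rules(rules) == ["-A INPUT -j ACCEPT", "-A FORWARD -j DROP"]
```

Whether duplicates actually reach the apply step in the DVR jobs is exactly what the bug asks to be investigated.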
[Yahoo-eng-team] [Bug 1524020] [NEW] DVRImpact: dvr_vmarp_table_update and dvr_update_router_add_vm is called for every port update instead of only when host binding or mac-address changes occur
Public bug reported: DVR arp update (dvr_vmarp_table_update) and dvr_update_router_add_vm are called for every update_port if the mac_address changes or when update_device_up is true. These functions should be called from _notify_l3_agent_port_update only when a host binding for a service port changes or when a mac_address for the service port changes. ** Affects: neutron Importance: Undecided Status: New ** Tags: l3-dvr-backlog ** Summary changed: - DVR Arp update and dvr_update_router_add_vm is called for every port update instead of only when host binding or mac-address changes occur + DVRImpact: dvr_vmarp_table_update and dvr_update_router_add_vm is called for every port update instead of only when host binding or mac-address changes occur -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1524020 Title: DVRImpact: dvr_vmarp_table_update and dvr_update_router_add_vm is called for every port update instead of only when host binding or mac-address changes occur Status in neutron: New Bug description: DVR arp update (dvr_vmarp_table_update) and dvr_update_router_add_vm are called for every update_port if the mac_address changes or when update_device_up is true. These functions should be called from _notify_l3_agent_port_update only when a host binding for a service port changes or when a mac_address for the service port changes. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1524020/+subscriptions
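The fix the report suggests amounts to a change-detection predicate: compare the old and new port and only notify the L3 agent when the host binding or MAC actually changed. A minimal sketch, assuming port dicts with standard ML2-style keys (the function name is illustrative, not neutron code):

```python
def needs_l3_notification(original_port, updated_port):
    """Hypothetical predicate: notify the L3 agent (ARP/router updates)
    only when the host binding or MAC address of a service port changed."""
    return (
        original_port.get("binding:host_id") != updated_port.get("binding:host_id")
        or original_port.get("mac_address") != updated_port.get("mac_address")
    )

old = {"binding:host_id": "compute-1",
       "mac_address": "fa:16:3e:00:00:01",
       "admin_state_up": True}

# A plain admin-state flip should not trigger dvr_vmarp_table_update:
assert not needs_l3_notification(old, dict(old, admin_state_up=False))
# A host-binding change (e.g. migration) should:
assert needs_l3_notification(old, dict(old, **{"binding:host_id": "compute-2"}))
```

This is the gating the bug wants in `_notify_l3_agent_port_update`, rather than firing on every `update_port`.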
[Yahoo-eng-team] [Bug 1515360] [NEW] Add more verbose to Tempest Test Errors that causes "SSHTimeout" seen in CVR and DVR
Public bug reported: Today "SSHTimeout" Errors are seen both in CVR ( Centralized Virtual Routers) and DVR ( Distributed Virtual Routers). The frequency of occurence is more on DVR than the CVR. But the problem here, is the error statement that is returned and the data that is dumped. SSHTimeout may have occured due to several reasons, since in all our tempest test we are trying to ssh to the VM using the public IP ( FloatingIP) 1. VM did not come up 2. VM does not have a private IP address 3. Security rules in the VM was not applied properly 4. Setting up of Floating IP 5. DNAT rules in the Router Namespace. 6. Scheduling. 7. Namespace Errors etc., We need a way to identify through the tempest test exactly were and what went wrong. ** Affects: neutron Importance: Undecided Status: New ** Tags: gate-failure l3-dvr-backlog -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1515360 Title: Add more verbose to Tempest Test Errors that causes "SSHTimeout" seen in CVR and DVR Status in neutron: New Bug description: Today "SSHTimeout" Errors are seen both in CVR ( Centralized Virtual Routers) and DVR ( Distributed Virtual Routers). The frequency of occurence is more on DVR than the CVR. But the problem here, is the error statement that is returned and the data that is dumped. SSHTimeout may have occured due to several reasons, since in all our tempest test we are trying to ssh to the VM using the public IP ( FloatingIP) 1. VM did not come up 2. VM does not have a private IP address 3. Security rules in the VM was not applied properly 4. Setting up of Floating IP 5. DNAT rules in the Router Namespace. 6. Scheduling. 7. Namespace Errors etc., We need a way to identify through the tempest test exactly were and what went wrong. 
To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1515360/+subscriptions
[Yahoo-eng-team] [Bug 1513678] [NEW] At scale router scheduling takes a long time with DVR routers with multiple compute nodes hosting thousands of VMs
Public bug reported: At scale, when we have 100s of compute nodes and 1000s of VMs in networks that are routed by a Distributed Virtual Router, we are seeing a control plane performance issue. It takes a while for all the routers to be scheduled on the nodes. _schedule_router calls _get_candidates, which internally calls get_l3_agent_candidates. In the case of DVR routers, all the active agents are passed to get_l3_agent_candidates, which iterates through the agents and, for each agent, tries to find out if there are any dvr service ports available in the routed subnet. This might be taking a lot more time. So we need to figure out the issue and reduce the time taken for the scheduling. ** Affects: neutron Importance: Undecided Assignee: Swaminathan Vasudevan (swaminathan-vasudevan) Status: In Progress ** Tags: l3-dvr-backlog -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1513678 Title: At scale router scheduling takes a long time with DVR routers with multiple compute nodes hosting thousands of VMs Status in neutron: In Progress Bug description: At scale, when we have 100s of compute nodes and 1000s of VMs in networks that are routed by a Distributed Virtual Router, we are seeing a control plane performance issue. It takes a while for all the routers to be scheduled on the nodes. _schedule_router calls _get_candidates, which internally calls get_l3_agent_candidates. In the case of DVR routers, all the active agents are passed to get_l3_agent_candidates, which iterates through the agents and, for each agent, tries to find out if there are any dvr service ports available in the routed subnet. This might be taking a lot more time. So we need to figure out the issue and reduce the time taken for the scheduling.
To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1513678/+subscriptions
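The per-agent scan described above can be avoided by indexing the dvr service ports by host once per scheduling pass, turning an agents-times-ports scan into a dictionary lookup. An illustrative sketch under that assumption (not the actual neutron scheduler code; all names are hypothetical):

```python
from collections import defaultdict

def hosts_with_dvr_service_ports(ports):
    """Build a host -> {subnet_id} index in one pass over the ports,
    instead of re-scanning all ports for every candidate agent."""
    index = defaultdict(set)
    for port in ports:
        host = port.get("binding:host_id")
        if host:
            for fixed_ip in port.get("fixed_ips", []):
                index[host].add(fixed_ip["subnet_id"])
    return index

def candidates(agents, router_subnets, index):
    """Keep only agents whose host has a service port on one of the
    router's subnets -- an O(1) lookup per agent."""
    router_subnets = set(router_subnets)
    return [a for a in agents if index.get(a["host"], set()) & router_subnets]

ports = [{"binding:host_id": "cn-1", "fixed_ips": [{"subnet_id": "sub-a"}]}]
agents = [{"host": "cn-1"}, {"host": "cn-2"}]
index = hosts_with_dvr_service_ports(ports)
assert candidates(agents, ["sub-a"], index) == [{"host": "cn-1"}]
```

The design point is that the index is built once per pass, so adding compute nodes no longer multiplies the port-scanning cost.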
[Yahoo-eng-team] [Bug 1512199] Re: change vm fixed ips will cause unable to communicate to vm in other network
Not able to reproduce. I could see the arp table update on the router namespaces on both nodes. I tried to modify the ports on both the subnet 10.2.0.X and 10.0.0.X. In this example I have changed 10.2.0.4 to 10.2.0.25 and 10.0.0.8 to 10.0.0.20. In both cases I saw that the arp entry was updated. One thing that is true in both our tests is that the VM is not able to get the new IP until I reboot the VM. (This might be filed as a different bug in nova.)

ARP output from Node 2:

root@ubuntu-new-compute:~/devstack# arp -a
? (10.2.0.4) at fa:16:3e:7e:0b:48 [ether] PERM on qr-b25bad4f-5f
? (10.0.0.6) at fa:16:3e:7a:78:fe [ether] PERM on qr-66c29926-29
? (10.2.0.3) at fa:16:3e:b6:19:da [ether] PERM on qr-b25bad4f-5f
? (10.0.0.2) at fa:16:3e:91:1a:d2 [ether] PERM on qr-b2b8c9a4-68
? (10.0.0.2) at fa:16:3e:91:1a:d2 [ether] PERM on qr-66c29926-29
? (10.0.0.6) at fa:16:3e:7a:78:fe [ether] PERM on qr-b2b8c9a4-68
? (10.2.0.25) at fa:16:3e:7e:0b:48 [ether] PERM on qr-b25bad4f-5f (changed arp info)
? (10.0.0.7) at fa:16:3e:5d:12:fd [ether] PERM on qr-66c29926-29
? (10.2.0.2) at fa:16:3e:b6:84:91 [ether] PERM on qr-b25bad4f-5f
? (10.0.0.8) at fa:16:3e:a1:cc:87 [ether] PERM on qr-66c29926-29
? (10.0.0.8) at fa:16:3e:a1:cc:87 [ether] PERM on qr-b2b8c9a4-68
? (10.0.0.20) at fa:16:3e:a1:cc:87 [ether] PERM on qr-66c29926-29 (changed arp info)
? (10.0.0.7) at fa:16:3e:5d:12:fd [ether] PERM on qr-b2b8c9a4-68
? (10.0.0.3) at fa:16:3e:fd:a1:d6 [ether] PERM on qr-66c29926-29
root@ubuntu-new-compute:~/devstack#

ARP info from Node 1:

root@ubuntu-ctlr:~/devstack# arp -a
? (10.0.0.3) at fa:16:3e:fd:a1:d6 [ether] PERM on qr-66c29926-29
? (10.2.0.3) at fa:16:3e:b6:19:da [ether] PERM on qr-b25bad4f-5f
? (10.0.0.7) at fa:16:3e:5d:12:fd [ether] PERM on qr-66c29926-29
? (10.0.0.2) at fa:16:3e:91:1a:d2 [ether] PERM on qr-b2b8c9a4-68
? (10.0.0.2) at fa:16:3e:91:1a:d2 [ether] PERM on qr-66c29926-29
? (10.2.0.4) at fa:16:3e:7e:0b:48 [ether] PERM on qr-b25bad4f-5f
? (10.0.0.6) at fa:16:3e:7a:78:fe [ether] PERM on qr-66c29926-29
? (10.2.0.25) at fa:16:3e:7e:0b:48 [ether] PERM on qr-b25bad4f-5f
? (10.2.0.5) at <incomplete> on qr-b25bad4f-5f
? (10.0.0.5) at <incomplete> on qr-66c29926-29
? (10.0.0.20) at fa:16:3e:a1:cc:87 [ether] PERM on qr-66c29926-29
? (10.0.0.8) at fa:16:3e:a1:cc:87 [ether] PERM on qr-66c29926-29
? (10.2.0.2) at fa:16:3e:b6:84:91 [ether] PERM on qr-b25bad4f-5f
? (10.0.0.4) at <incomplete> on qr-66c29926-29
root@ubuntu-ctlr:~/devstack#

** Changed in: neutron Status: New => Invalid

-- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1512199

Title: change vm fixed ips will cause unable to communicate to vm in other network

Status in neutron: Invalid

Bug description: I use dvr+kilo, vxlan. The environment is like: vm2-2<- compute1 --vxlan- compute2 ->vm2-1 vm3-1<- vm2-1<- net2 -router1- net3 ->vm3-1 vm2-2<-

vm2-1 (192.168.2.3) and vm2-2 (192.168.2.4) are in the same net (net2, 192.168.2.0/24) but are not assigned to the same compute node. vm3-1 is in net3 (192.168.3.0/24). net2 and net3 are connected by router1. The three vms are in the default security group. No firewall is used.

1. Use the command below to change the ip of vm2-1: neutron port-update portID --fixed-ip subnet_id=subnetID,ip_address=192.168.2.10 --fixed-ip subnet_id=subnetID,ip_address=192.168.2.20 In vm2-1, using "sudo udhcpc" (cirros) to get an ip, the dhcp message is correct but the ip did not change. Then reboot vm2-1. The ip of vm2-1 turned out to be 192.168.2.20.

2. vm2-2 could ping 192.168.2.20 successfully, but vm3-1 could not ping 192.168.2.20 successfully. By capturing packets and looking at related information, the reason may be: 1. The new IP (192.168.2.20) and MAC of vm2-1 were not written to the arp cache in the namespace of router1 on the compute1 node. 2. In dvr mode, the arp request from the gw port (192.168.2.1) from compute1 to vm2-1 was dropped by the flow table in compute2, so the arp request (192.168.2.1->192.168.2.20) could not arrive at vm2-1. 3. For vm2-2, the arp request (192.168.2.4->192.168.2.20) was not dropped and could reach vm2-1.

In my opinion, if both new fixed IPs of vm2-1 (192.168.2.10 and 192.168.2.20) and the MAC were written to the arp cache in the namespace of router1 on the compute1 node, the problem would be resolved. But only one ip (192.168.2.10) and the MAC is written. BTW, if only one fixed ip is set for vm2-1, it works fine. But if two fixed ips are set for vm2-1, the problem above most probably happens.

To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1512199/+subscriptions
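The reporter's diagnosis is that only one of the two fixed IPs ends up with a permanent ARP entry in the remote router namespace. The expected behavior can be sketched as emitting one `ip neigh` command per fixed IP of the port (command lists only; `arp_commands` is a hypothetical helper, not neutron code):

```python
def arp_commands(device, port):
    """Build one permanent neighbour entry per fixed IP of the port.
    Writing only the first fixed IP is what would leave 192.168.2.20
    unreachable from the other network in the scenario above."""
    mac = port["mac_address"]
    return [
        ["ip", "neigh", "replace", ip["ip_address"],
         "lladdr", mac, "nud", "permanent", "dev", device]
        for ip in port["fixed_ips"]
    ]

port = {"mac_address": "fa:16:3e:7e:0b:48",
        "fixed_ips": [{"ip_address": "192.168.2.10"},
                      {"ip_address": "192.168.2.20"}]}
cmds = arp_commands("qr-b25bad4f-5f", port)
# Both fixed IPs get an entry, matching the PERM rows in the arp -a output:
assert [c[3] for c in cmds] == ["192.168.2.10", "192.168.2.20"]
```

In a real agent these command lists would be executed inside the router namespace; here they are only constructed so the per-IP fan-out is visible.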
[Yahoo-eng-team] [Bug 1509004] [NEW] "test_dualnet_dhcp6_stateless_from_os" failures seen in the gate
Public bug reported: "test_dualnet_dhcp6_stateless_from_os" - This test fails in the gate randomly both with DVR and non-DVR routers. http://logs.openstack.org/79/230079/27/check/gate-tempest-dsvm-neutron- full/1caed8b/logs/testr_results.html.gz http://logs.openstack.org/85/238485/1/check/gate-tempest-dsvm-neutron- dvr/1059e22/logs/testr_results.html.gz ** Affects: neutron Importance: Undecided Status: New ** Tags: ipv6 -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1509004 Title: "test_dualnet_dhcp6_stateless_from_os" failures seen in the gate Status in neutron: New Bug description: "test_dualnet_dhcp6_stateless_from_os" - This test fails in the gate randomly both with DVR and non-DVR routers. http://logs.openstack.org/79/230079/27/check/gate-tempest-dsvm- neutron-full/1caed8b/logs/testr_results.html.gz http://logs.openstack.org/85/238485/1/check/gate-tempest-dsvm-neutron- dvr/1059e22/logs/testr_results.html.gz To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1509004/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1503847] [NEW] Python34 test failures in gate - Logging Error
Public bug reported: I am seeing "gate-neutron-python34" test failures again in neutron. http://logs.openstack.org/82/228582/13/check/gate-neutron-python34/5b36c34/console.html http://logs.openstack.org/82/228582/13/check/gate-neutron-python34/5b36c34/console.html#_2015-10-07_17_36_06_987 ** Affects: neutron Importance: Undecided Status: New ** Tags: py34 -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1503847 Title: Python34 test failures in gate - Logging Error Status in neutron: New Bug description: I am seeing "gate-neutron-python34" test failures again in neutron. http://logs.openstack.org/82/228582/13/check/gate-neutron-python34/5b36c34/console.html http://logs.openstack.org/82/228582/13/check/gate-neutron-python34/5b36c34/console.html#_2015-10-07_17_36_06_987 To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1503847/+subscriptions
[Yahoo-eng-team] [Bug 1501873] [NEW] FIP Namespace add/delete race condition seen in DVR router log
e None None] Command: ['ip', 'netns', 'exec', 'fip-31689320-95d7-44f9-932a-cc82c1bca2b4', 'sysctl', '-w', 'net.ipv4.ip_forward=1'] Exit code: 1 Stdin: Stdout: Stderr: seting the network namespace "fip-31689320-95d7-44f9-932a-cc82c1bca2b4" failed: Invalid argument This leads to a series of failures. This failure is seen only in the gate. This can be reproduced by constantly adding and deleting a floatingip to a private IP, with multiple API worker threads. For more information you can also look at the "logstash" output below. http://logs.openstack.org/82/228582/8/check/gate-tempest-dsvm-neutron-dvr/9053337/logs/screen-q-l3.txt.gz?level=TRACE#_2015-09-29_21_10_34_084 ** Affects: neutron Importance: Undecided Assignee: Swaminathan Vasudevan (swaminathan-vasudevan) Status: In Progress ** Tags: l3-dvr-backlog -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1501873 Title: FIP Namespace add/delete race condition seen in DVR router log Status in neutron: In Progress Bug description: FIP Namespace add/delete race condition seen in DVR router log. This might cause the FIP functionality to fail. From the trace log it seems that when this happens, a bunch of tests related to FIP functionality fail with SSH Timeout waiting for reply. Here is the output of the trace that kind of shows the race condition.
Exit code: 0 execute /opt/stack/new/neutron/neutron/agent/linux/utils.py:156 2015-09-29 21:10:33.433 7884 DEBUG neutron.agent.l3.dvr_local_router [-] Removed last floatingip, so requesting the server to delete Floatingip Agent Gateway port:{u'allowed_address_pairs': [], u'extra_dhcp_opts': [], u'device_owner': u'network:floatingip_agent_gateway', u'port_security_enabled': False, u'binding:profile': {}, u'fixed_ips': [{u'subnet_id': u'362e9033-db93-4193-9413-1073215ab326', u'prefixlen': 24, u'ip_address': u'172.24.5.9'}, {u'subnet_id': u'feb3aa76-53b1-4d4e-b136-412c747ffd30', u'prefixlen': 64, u'ip_address': u'2001:db8::a'}], u'id': u'044a8e2f-00eb-4231-b526-13cb46dcc42f', u'security_groups': [], u'binding:vif_details': {u'port_filter': True, u'ovs_hybrid_plug': True}, u'binding:vif_type': u'ovs', u'mac_address': u'fa:16:3e:7a:a6:85', u'status': u'DOWN', u'subnets': [{u'ipv6_ra_mode': None, u'cidr': u'2001:db8::/64', u'gateway_ip': u'2001:db8::2', u'id': u'feb3aa76-53b1-4d4e-b136-412c747ffd30', u'subnetpool_id': None}, {u'ipv6_ra_mode': None, u'cidr': u'172. 
24.5.0/24', u'gateway_ip': u'172.24.5.1', u'id': u'362e9033-db93-4193-9413-1073215ab326', u'subnetpool_id': None}], u'binding:host_id': u'devstack-trusty-hpcloud-b5-5153724', u'dns_assignment': [{u'hostname': u'host-172-24-5-9', u'ip_address': u'172.24.5.9', u'fqdn': u'host-172-24-5-9.openstacklocal.'}, {u'hostname': u'host-2001-db8--a', u'ip_address': u'2001:db8::a', u'fqdn': u'host-2001-db8--a.openstacklocal.'}], u'device_id': u'646bb18b-da52-4ead-a635-012c72c1ccf1', u'name': u'', u'admin_state_up': True, u'network_id': u'31689320-95d7-44f9-932a-cc82c1bca2b4', u'dns_name': u'', u'binding:vnic_type': u'normal', u'tenant_id': u'', u'extra_subnets': []} floating_ip_removed_dist /opt/stack/new/neutron/neutron/agent/l3/dvr_local_router.py:148 2015-09-29 21:10:34.031 7884 DEBUG neutron.agent.linux.utils [-] Running command (rootwrap daemon): ['ip', 'netns', 'delete', 'fip-31689320-95d7-44f9-932a-cc82c1bca2b4'] execute_rootwrap_daemon /opt/stack/new/neutron/neutron/agent/linux/utils.py:101 2015-09-29 21:10:34.043 DEBUG neutron.agent.l3.dvr_local_router [req-33413b07-784c-469e-8a35-0e20312a157e None None] FloatingIP agent gateway port received from the plugin: {u'allowed_address_pairs': [], u'extra_dhcp_opts': [], u'device_owner': u'network:floatingip_agent_gateway', u'port_security_enabled': False, u'binding:profile': {}, u'fixed_ips': [{u'subnet_id': u'362e9033-db93-4193-9413-1073215ab326', u'prefixlen': 24, u'ip_address': u'172.24.5.9'}, {u'subnet_id': u'feb3aa76-53b1-4d4e-b136-412c747ffd30', u'prefixlen': 64, u'ip_address': u'2001:db8::a'}], u'id': u'044a8e2f-00eb-4231-b526-13cb46dcc42f', u'security_groups': [], u'binding:vif_details': {u'port_filter': True, u'ovs_hybrid_plug': True}, u'binding:vif_type': u'ovs', u'mac_address': u'fa:16:3e:7a:a6:85', u'status': u'ACTIVE', u'subnets': [{u'ipv6_ra_mode': None, u'cidr': u'172.24.5.0/24', u'gateway_ip': u'172.24.5.1', u'id': u'362e9033-db93-4193-9413-1073215ab326', u'subnetpool_id': None}, {u'ipv6_ra_mode': None, u'ci 
dr': u'2001:db8::/64', u'gateway_ip': u'2001:db8::2', u'id': u'feb3aa76-53b1-4d4e-b136-412c747ffd30', u'subnetpool_id': None}], u'binding:host_id': u'devstack-trusty-hpcloud-b5-5153724', u'dns_assignment': [{u'hostname': u'host-172-24-5-9', u'ip_address': u'172.24.5.9', u'fqdn': u'host-172-24-5-9.openstacklocal.'}, {u'hostname': u'host-2001-db8--a', u'ip_address': u'200
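A common way to avoid this kind of add/delete race is to serialize operations on the same FIP namespace behind a per-name lock, so a delete cannot interleave with a concurrent create from another worker. A minimal sketch using a per-namespace `threading.Lock` (illustrative only, not neutron's actual fix; neutron has its own synchronization utilities):

```python
import threading
from collections import defaultdict

# One lock per namespace name; defaultdict creates it on first use.
_ns_locks = defaultdict(threading.Lock)

def with_namespace_lock(name, fn):
    """Run fn while holding the lock for namespace `name`, so create and
    delete operations on the same FIP namespace cannot interleave."""
    with _ns_locks[name]:
        return fn()

events = []
with_namespace_lock("fip-31689320", lambda: events.append("create"))
with_namespace_lock("fip-31689320", lambda: events.append("delete"))
assert events == ["create", "delete"]
```

Without such serialization, the `ip netns delete` from one request can remove the namespace between another request's create and its `sysctl` call, which matches the "Invalid argument" failure in the trace.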
[Yahoo-eng-team] [Bug 1501086] [NEW] ARP entries dropped by DVR routers when the qr device is not ready or present
Public bug reported: The ARP entries are dropped by DVR routers when the 'qr' device does not exist in the namespace. There are two ways the ARP entries are updated in the L3 agent. One is when an internal csnat port is created: arp entries are then added from the 'dvr_local_router' by calling "set_subnet_arp_info", which in turn calls "_update_arp_entry". The other is when an arp update "rpc" message comes from the server to the agent as "add_arp_entry" or "delete_arp_entry", which in turn calls "_update_arp_entry". We have seen log traces showing that the arp update message comes before the "qr" device is ready, so those arp messages get dropped. We need to cache those arp messages and update the router namespace when the "qr" device is ready. As the message below shows, we check for the device and throw a warning that the device is not ready, but the arp entries are not saved anywhere. They are dropped. 2015-09-24 18:45:30.150 WARNING neutron.agent.l3.dvr_local_router [req-0565ce3a-905d-43fa-a6f3-1a07df6c6c2b None None] Arp operation add failed for device qr-b672ffde-cd, since the device does not exist anymore. The device might have been concurrently deleted or not created yet. As you can see here, the internal network 'qr' device is added later. 2015-09-24 18:45:30.367 DEBUG neutron.agent.l3.router_info [req-7e5722e4-5fef-4889-9372-8cf1218522a2 None None] adding internal network: prefix(qr-), port(b672ffde-cd80-49eb-9817-58436fa8e8fd) _internal_network_added /opt/stack/new/neutron/neutron/agent/l3/router_info.py:300 ** Affects: neutron Importance: Undecided Assignee: Swaminathan Vasudevan (swaminathan-vasudevan) Status: In Progress ** Tags: l3-dvr-backlog ** Changed in: neutron Status: New => Confirmed ** Changed in: neutron Assignee: (unassigned) => Swaminathan Vasudevan (swaminathan-vasudevan) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. 
https://bugs.launchpad.net/bugs/1501086 Title: ARP entries dropped by DVR routers when the qr device is not ready or present Status in neutron: In Progress Bug description: The ARP entries are dropped by DVR routers when the 'qr' device does not exist in the namespace. There are two ways the ARP entries are updated in the L3 agent. One is when an internal csnat port is created: arp entries are then added from the 'dvr_local_router' by calling "set_subnet_arp_info", which in turn calls "_update_arp_entry". The other is when an arp update "rpc" message comes from the server to the agent as "add_arp_entry" or "delete_arp_entry", which in turn calls "_update_arp_entry". We have seen log traces showing that the arp update message comes before the "qr" device is ready, so those arp messages get dropped. We need to cache those arp messages and update the router namespace when the "qr" device is ready. As the message below shows, we check for the device and throw a warning that the device is not ready, but the arp entries are not saved anywhere. They are dropped. 2015-09-24 18:45:30.150 WARNING neutron.agent.l3.dvr_local_router [req-0565ce3a-905d-43fa-a6f3-1a07df6c6c2b None None] Arp operation add failed for device qr-b672ffde-cd, since the device does not exist anymore. The device might have been concurrently deleted or not created yet. As you can see here, the internal network 'qr' device is added later.
2015-09-24 18:45:30.367 DEBUG neutron.agent.l3.router_info [req-7e5722e4-5fef-4889-9372-8cf1218522a2 None None] adding internal network: prefix(qr-), port(b672ffde-cd80-49eb-9817-58436fa8e8fd) _internal_network_added /opt/stack/new/neutron/neutron/agent/l3/router_info.py:300 To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1501086/+subscriptions
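The caching the report asks for can be sketched as a small pending-entry store keyed by device name: ARP updates that arrive before the qr- device exists are held, then replayed once the device shows up. Class and method names here are hypothetical, not the actual neutron fix:

```python
class PendingArpCache:
    """Hold ARP updates that arrive before the qr- device exists in the
    namespace, and replay them once the device is ready."""

    def __init__(self):
        self._pending = {}

    def add(self, device, ip, mac):
        """Record an ARP entry that could not be applied yet."""
        self._pending.setdefault(device, []).append((ip, mac))

    def flush(self, device):
        """Return and clear the entries queued for a now-ready device."""
        return self._pending.pop(device, [])

cache = PendingArpCache()
# RPC arp update arrives while qr-b672ffde-cd does not exist yet:
cache.add("qr-b672ffde-cd", "10.0.0.5", "fa:16:3e:aa:bb:cc")
# Later, when _internal_network_added brings the device up, replay:
assert cache.flush("qr-b672ffde-cd") == [("10.0.0.5", "fa:16:3e:aa:bb:cc")]
assert cache.flush("qr-b672ffde-cd") == []
```

With this shape, the warning path in `_update_arp_entry` would call `add()` instead of dropping the entry, and the device-add path would call `flush()` and apply each entry.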
[Yahoo-eng-team] [Bug 1499787] [NEW] Static routes are attempted to add to SNAT Namespace of DVR routers without checking for Router Gateway.
Public bug reported: In DVR routers, static routes are now only added to the snat namespace. But before adding them to the snat namespace, the routers are not checked for the existence of a gateway. ** Affects: neutron Importance: Undecided Status: New ** Tags: l3-dvr-backlog -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1499787 Title: Static routes are attempted to add to SNAT Namespace of DVR routers without checking for Router Gateway. Status in neutron: New Bug description: In DVR routers, static routes are now only added to the snat namespace. But before adding them to the snat namespace, the routers are not checked for the existence of a gateway. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1499787/+subscriptions
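The missing check amounts to a gateway guard before deriving snat-namespace routes: a router with no external gateway has no snat namespace to program. A hedged sketch over a router dict shaped like the neutron API response (the helper name is illustrative):

```python
def routes_for_snat_namespace(router):
    """Only return routes destined for the snat namespace when the
    router actually has an external gateway -- the guard this bug says
    is missing. Without a gateway there is no snat namespace to program."""
    if not router.get("external_gateway_info"):
        return []
    return router.get("routes", [])

router_no_gw = {"routes": [{"destination": "20.0.0.0/24",
                            "nexthop": "172.24.4.6"}]}
# No gateway: nothing should be pushed to a (nonexistent) snat namespace.
assert routes_for_snat_namespace(router_no_gw) == []

router_gw = dict(router_no_gw, external_gateway_info={"network_id": "ext-net"})
assert len(routes_for_snat_namespace(router_gw)) == 1
```

In the agent, this guard would sit in front of whatever code applies `routes` inside the snat namespace.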
[Yahoo-eng-team] [Bug 1499785] [NEW] Static routes are not added to the qrouter namespace for DVR routers
Public bug reported: Static routes are not added to the qrouter namespace when routers are added. Initially the routes used to be configured in the qrouter namespace but not in the SNAT namespace. A recent patch caused this regression by moving the routes from the qrouter namespace to the SNAT namespace: 2bb48eb58ad28a629dd12c434b83680aa3f240a4 ** Affects: neutron Importance: Undecided Status: New ** Tags: l3-dvr-backlog -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1499785 Title: Static routes are not added to the qrouter namespace for DVR routers Status in neutron: New Bug description: Static routes are not added to the qrouter namespace when routers are added. Initially the routes used to be configured in the qrouter namespace but not in the SNAT namespace. A recent patch caused this regression by moving the routes from the qrouter namespace to the SNAT namespace: 2bb48eb58ad28a629dd12c434b83680aa3f240a4 To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1499785/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1499045] [NEW] get_snat_port_for_internal_port called twice when an interface is added or removed by the l3 agent in the case of DVR routers.
Public bug reported: get_snat_port_for_internal_port retrieves the internal snat port created for each router interface added to a DVR router. But this function is called twice in the L3 agent code: for every interface add or delete on the router, it is called by 'dvr_local_router.py' and again by 'dvr_edge_router.py'. This can be reduced to a single call to improve control plane performance. ** Affects: neutron Importance: Undecided Assignee: Swaminathan Vasudevan (swaminathan-vasudevan) Status: In Progress ** Tags: l3-dvr-backlog -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1499045 Title: get_snat_port_for_internal_port called twice when an interface is added or removed by the l3 agent in the case of DVR routers. Status in neutron: In Progress Bug description: get_snat_port_for_internal_port retrieves the internal snat port created for each router interface added to a DVR router. But this function is called twice in the L3 agent code: for every interface add or delete on the router, it is called by 'dvr_local_router.py' and again by 'dvr_edge_router.py'. This can be reduced to a single call to improve control plane performance. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1499045/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
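The suggested reduction to a single call can be illustrated with a small memoizing wrapper: compute the snat port once per interface event and let both callers reuse it. The SnatPortCache class is a hypothetical sketch, not neutron's actual fix; the lookup counter just makes the saving observable.

```python
# Sketch: memoize the snat-port lookup so that dvr_local_router and
# dvr_edge_router asking for the same interface costs one lookup, not two.
class SnatPortCache(object):
    def __init__(self, snat_ports):
        self._snat_ports = snat_ports  # stand-in for the real port store
        self._cache = {}
        self.lookups = 0

    def get_snat_port_for_internal_port(self, int_port_id):
        if int_port_id not in self._cache:
            self.lookups += 1  # the expensive walk happens only once
            self._cache[int_port_id] = self._snat_ports.get(int_port_id)
        return self._cache[int_port_id]

cache = SnatPortCache({'int-1': 'snat-1'})
# Both router classes ask for the same internal port during one event:
a = cache.get_snat_port_for_internal_port('int-1')
b = cache.get_snat_port_for_internal_port('int-1')
```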
[Yahoo-eng-team] [Bug 1419175] Re: Cannot find device "qr-" error message found in logtrace with DVR routers while trying to update arp entry
** Summary changed: - DVR qrouter created without OVS qr device + Cannot find device "qr-" error message found in logtrace with DVR routers while trying to update arp entry ** Changed in: neutron Status: Expired => Confirmed ** Changed in: neutron Assignee: (unassigned) => Swaminathan Vasudevan (swaminathan-vasudevan) ** Tags added: l3-dvr-backlog -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1419175 Title: Cannot find device "qr-" error message found in logtrace with DVR routers while trying to update arp entry Status in neutron: In Progress Bug description: We are running stable/juno with DVR enabled. During tests, we created a router, gateway and instance. One qrouter namespace on a compute node was created with a RuntimeError: Command: ['sudo', '/usr/bin/neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'exec', 'qrouter-086cf9e6-4c43-4b65-b623-fbd5d593f687', 'ip', '-4', 'neigh', 'replace', '10.100.100.13', 'lladdr', 'fa:16:3e:84:fe:e4', 'nud', 'permanent', 'dev', 'qr-00d7d90b-01'] Exit code: 1 Stdout: '' Stderr: 'Cannot find device "qr-00d7d90b-01"\n' 2015-02-05 20:48:11.834 27031 ERROR neutron.agent.l3_agent [req-2c71f61b-c036-4d90-bcfd-75ffdd5340ff None] DVR: Failed updating arp entry 2015-02-05 20:48:11.834 27031 TRACE neutron.agent.l3_agent Traceback (most recent call last): 2015-02-05 20:48:11.834 27031 TRACE neutron.agent.l3_agent File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/agent/l3_agent.py", line 1719, in _update_arp_entry 2015-02-05 20:48:11.834 27031 TRACE neutron.agent.l3_agent device.neigh.add(net.version, ip, mac) 2015-02-05 20:48:11.834 27031 TRACE neutron.agent.l3_agent File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 515, in add 2015-02-05 20:48:11.834 27031 TRACE neutron.agent.l3_agent options=[ip_version]) 2015-02-05 20:48:11.834 27031 
TRACE neutron.agent.l3_agent File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 247, in _as_root 2015-02-05 20:48:11.834 27031 TRACE neutron.agent.l3_agent kwargs.get('use_root_namespace', False)) 2015-02-05 20:48:11.834 27031 TRACE neutron.agent.l3_agent File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 79, in _as_root 2015-02-05 20:48:11.834 27031 TRACE neutron.agent.l3_agent log_fail_as_error=self.log_fail_as_error) 2015-02-05 20:48:11.834 27031 TRACE neutron.agent.l3_agent File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 91, in _execute 2015-02-05 20:48:11.834 27031 TRACE neutron.agent.l3_agent log_fail_as_error=log_fail_as_error) 2015-02-05 20:48:11.834 27031 TRACE neutron.agent.l3_agent File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/agent/linux/utils.py", line 82, in execute 2015-02-05 20:48:11.834 27031 TRACE neutron.agent.l3_agent raise RuntimeError(m) 2015-02-05 20:48:11.834 27031 TRACE neutron.agent.l3_agent RuntimeError: As a result, all subsequent router updates failed as well. When the router was removed, the qrouter namespace was left on the compute node as well because of the error: 2015-02-05 20:48:11.834 27031 TRACE neutron.agent.l3_agent Stderr: 'Cannot find device "qr-00d7d90b-01"\n' Logs can also be read at: http://paste.openstack.org/show/168348/ To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1419175/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1496578] [NEW] SNAT port not found for the given internal port error message seen when gateway is removed for DVR routers.
Public bug reported: Recently the logstash logs showed traces about "SNAT port not found for the given internal port". http://logs.openstack.org/22/219422/13/check/gate-tempest-dsvm-neutron-dvr/e5243b2/logs/screen-q-l3.txt.gz?level=TRACE#_2015-09-15_12_28_08_880 By analyzing the failure it seems that when a gateway is removed, "get_snat_port_for_internal_port" is called without the cache value. This bug was introduced by the patch shown below. Icc099c1a97e3e68eeaf4690bc83167ba30d8099a ** Affects: neutron Importance: Undecided Assignee: Swaminathan Vasudevan (swaminathan-vasudevan) Status: In Progress ** Tags: l3-dvr-backlog ** Changed in: neutron Status: New => Confirmed ** Changed in: neutron Assignee: (unassigned) => Swaminathan Vasudevan (swaminathan-vasudevan) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1496578 Title: SNAT port not found for the given internal port error message seen when gateway is removed for DVR routers. Status in neutron: In Progress Bug description: Recently the logstash logs showed traces about "SNAT port not found for the given internal port". http://logs.openstack.org/22/219422/13/check/gate-tempest-dsvm-neutron-dvr/e5243b2/logs/screen-q-l3.txt.gz?level=TRACE#_2015-09-15_12_28_08_880 By analyzing the failure it seems that when a gateway is removed, "get_snat_port_for_internal_port" is called without the cache value. This bug was introduced by the patch shown below. Icc099c1a97e3e68eeaf4690bc83167ba30d8099a To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1496578/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1493524] [NEW] IPv6 support for DVR routers
Public bug reported: This bug would capture all the IPv6 related work on DVR routers going forward. ** Affects: neutron Importance: Undecided Assignee: Swaminathan Vasudevan (swaminathan-vasudevan) Status: In Progress ** Tags: l3-dvr-backlog -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1493524 Title: IPv6 support for DVR routers Status in neutron: In Progress Bug description: This bug would capture all the IPv6 related work on DVR routers going forward. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1493524/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1475011] [NEW] FloatingIPsTestJson tests fail with DVR routers
Public bug reported: FloatingIPsTestJSON tests fail with DVR routers. In this test suite, test_associate_already_associated_floating_ip and test_associate_disassociate_floating_ip are the tests that fail with an Internal Server Error when trying to delete the floatingip_agent_gateway_port. Deleting the floatingip_agent_gateway_port calls _delete_port, and with recent changes the ML2 plugin throws an attribute-not-found error. ** Affects: neutron Importance: Undecided Assignee: Swaminathan Vasudevan (swaminathan-vasudevan) Status: New ** Tags: l3-dvr-backlog ** Changed in: neutron Assignee: (unassigned) => Swaminathan Vasudevan (swaminathan-vasudevan) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1475011 Title: FloatingIPsTestJson tests fail with DVR routers Status in neutron: New Bug description: FloatingIPsTestJSON tests fail with DVR routers. In this test suite, test_associate_already_associated_floating_ip and test_associate_disassociate_floating_ip are the tests that fail with an Internal Server Error when trying to delete the floatingip_agent_gateway_port. Deleting the floatingip_agent_gateway_port calls _delete_port, and with recent changes the ML2 plugin throws an attribute-not-found error. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1475011/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1456755] Re: Could not retrieve gateway port for subnet
*** This bug is a duplicate of bug 1404823 *** https://bugs.launchpad.net/bugs/1404823 ** This bug is no longer a duplicate of bug 1456756 Could not retrieve gateway port for subnet ** This bug has been marked a duplicate of bug 1404823 router-interface-add port succeed but does not add corresponding flows -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1456755 Title: Could not retrieve gateway port for subnet Status in OpenStack Neutron (virtual network service): New Bug description: There is this trace at error level in the server logs when DVR is enabled by default: [req-179a109e-456a-4743-8395-58b2f322afe2 None None] Could not retrieve gateway port for subnet {'ipv6_ra_mode': None, 'allocation_pools': [{'start': u'10.100.0.2', 'end': u'10.100.0.14'}], 'host_routes': [], 'ipv6_address_mode': None, 'cidr': u'10.100.0.0/28', 'id': u'8a47789b-452d-4ac7-a85b-9e57838456f0', 'subnetpool_id': None, 'name': u'', 'enable_dhcp': True, 'network_id': u'85389a7f-8f50-405c-a19c-c4ad7b35e9ff', 'tenant_id': u'ec2ad8998456415ea6e8f9a217b5c1d8', 'dns_nameservers': [], 'gateway_ip': u'10.100.0.1', 'ip_version': 4L, 'shared': False} This is the logstash query: http://logstash.openstack.org/#eyJzZWFyY2giOiJtZXNzYWdlOlwiQ291bGQgbm90IHJldHJpZXZlIGdhdGV3YXkgcG9ydCBmb3Igc3VibmV0XCIgQU5EIGJ1aWxkX25hbWU6XCJjaGVjay10ZW1wZXN0LWRzdm0tbmV1dHJvbi1kdnJcIiIsImZpZWxkcyI6W10sIm9mZnNldCI6MCwidGltZWZyYW1lIjoiNjA0ODAwIiwiZ3JhcGhtb2RlIjoiY291bnQiLCJ0aW1lIjp7InVzZXJfaW50ZXJ2YWwiOjB9LCJzdGFtcCI6MTQzMjA2MDU2NzU0MSwibW9kZSI6IiIsImFuYWx5emVfZmllbGQiOiIifQ== To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1456755/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1465434] [NEW] DVR issues with supporting multiple subnets per network on DVR routers
Public bug reported: DVR today has issues with supporting multiple subnets per network on its routers. ** Affects: neutron Importance: Undecided Assignee: Swaminathan Vasudevan (swaminathan-vasudevan) Status: New ** Tags: l3-dvr-backlog -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1465434 Title: DVR issues with supporting multiple subnets per network on DVR routers Status in OpenStack Neutron (virtual network service): New Bug description: DVR today has issues with supporting multiple subnets per network on its routers. To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1465434/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1426165] Re: DVR: Device or resource busy error seen when fip namespace is being deleted
Let us go ahead and close this bug. ** Changed in: neutron Status: New => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1426165 Title: DVR: Device or resource busy error seen when fip namespace is being deleted Status in OpenStack Neutron (virtual network service): Invalid Bug description: How to reproduce - 1. Assign 2 routers with network/subnet/etc sharing the same external network for FIPs to a single agent/host. 2. Disassociate all FIPs. 3. The FIP namespace should be deleted, but the following trace is seen instead: 2015-02-26 15:38:34.457 DEBUG neutron.agent.l3.dvr_fip_ns [-] DVR: destroy fip ns: fip-6473ee45-f14f-4b86-a7da-678845a10c08 from (pid=6207) destroy /opt/stack/neutron/neutron/agent/l3/dvr_fip_ns.py:153 2015-02-26 15:38:34.457 DEBUG neutron.agent.linux.utils [-] Running command: ['sudo', '/usr/local/bin/neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'delete', 'fip-6473ee45-f14f-4b86-a7da-678845a10c08'] from (pid=6207) create_process /opt/stack/neutron/neutron/agent/linux/utils.py:50 2015-02-26 15:38:34.651 ERROR neutron.agent.linux.utils [-] Command: ['sudo', '/usr/local/bin/neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'delete', 'fip-6473ee45-f14f-4b86-a7da-678845a10c08'] Exit code: 1 Stdout: Stderr: Cannot remove /var/run/netns/fip-6473ee45-f14f-4b86-a7da-678845a10c08: Device or resource busy 2015-02-26 15:38:34.652 ERROR neutron.agent.l3.dvr_fip_ns [-] Failed trying to delete namespace: fip-6473ee45-f14f-4b86-a7da-678845a10c08 2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns Traceback (most recent call last): 2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns File "/opt/stack/neutron/neutron/agent/l3/dvr_fip_ns.py", line 157, in destroy 2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns ip_wrapper.netns.delete(ns) 2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns File "/opt/stack/neutron/neutron/agent/linux/ip_lib.py", line 541, in delete 2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns self._as_root('delete', name, use_root_namespace=True) 2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns File "/opt/stack/neutron/neutron/agent/linux/ip_lib.py", line 250, in _as_root 2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns kwargs.get('use_root_namespace', False)) 2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns File "/opt/stack/neutron/neutron/agent/linux/ip_lib.py", line 72, in _as_root 2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns log_fail_as_error=self.log_fail_as_error) 2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns File "/opt/stack/neutron/neutron/agent/linux/ip_lib.py", line 84, in _execute 2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns log_fail_as_error=log_fail_as_error) 2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns File "/opt/stack/neutron/neutron/agent/linux/utils.py", line 86, in execute 2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns raise RuntimeError(m) 2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns RuntimeError: 2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns Command: ['sudo', '/usr/local/bin/neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'delete', 'fip-6473ee45-f14f-4b86-a7da-678845a10c08'] 2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns Exit code: 1 2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns Stdout: 2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns Stderr: Cannot remove /var/run/netns/fip-6473ee45-f14f-4b86-a7da-678845a10c08: Device or resource busy To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1426165/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
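Since a namespace mount is typically only briefly busy while its processes exit, one mitigation for the EBUSY above is a short retry loop around the delete. This is a sketch under assumptions, not neutron code: `delete_netns_with_retry` and the injected `delete_fn` are hypothetical stand-ins for `ip_wrapper.netns.delete`.

```python
# Sketch: retry a namespace delete a few times when it fails with EBUSY,
# since "Device or resource busy" is often transient. Names are hypothetical.
import errno

def delete_netns_with_retry(delete_fn, attempts=3):
    """Call delete_fn until it succeeds or attempts are exhausted."""
    last_err = None
    for _ in range(attempts):
        try:
            delete_fn()
            return True
        except OSError as e:
            if e.errno != errno.EBUSY:
                raise  # only retry the transient busy case
            last_err = e
    raise last_err

calls = []
def flaky_delete():
    calls.append(1)
    if len(calls) < 3:  # simulate: busy on the first two tries
        raise OSError(errno.EBUSY, 'Device or resource busy')

ok = delete_netns_with_retry(flaky_delete)
```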
[Yahoo-eng-team] [Bug 1398446] Re: Nova compute failed to delete VM port with DVR
** Changed in: neutron Status: In Progress => Invalid -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1398446 Title: Nova compute failed to delete VM port with DVR Status in OpenStack Neutron (virtual network service): Invalid Bug description: This defect is hard to reproduce; it only happens when I have more than 3 compute nodes with DVR enabled. With the following script, run several times, I can see one VM in ERROR state. neutron net-create demo-net netdemoid=$(neutron net-list | awk '{if($4=="demo-net"){print $2;}}') neutron subnet-create demo-net 10.100.100.0/24 --name demo-subnet subnetdemoid=$(neutron subnet-list | awk '{if($4=="demo-subnet"){print $2;}}') neutron router-create demo-router routerdemoid=$(neutron router-list | awk '{if($4=="demo-router"){print $2;}}') exnetid=$(neutron net-list | awk '{if($4=="ext-net"){print $2;}}') for i in `seq 1 10`; do # boot vm, and create floating ip nova boot --image cirros --flavor m1.tiny --nic net-id=$netdemoid cirrosdemo${i} cirrosdemoid[i]=$(nova list | awk '{if($4=="cirrosdemo'${i}'"){print $2;}}') output=$(neutron floatingip-create $exnetid) echo $output floatipid[i]=$(echo $output | awk '{if($2=="id"){print $4;}}') floatip[i]=$(echo $output | awk '{if($2=="floating_ip_address"){print $4;}}') done # Setup router neutron router-gateway-set $routerdemoid $exnetid neutron router-interface-add demo-router $subnetdemoid # wait for VM to be running sleep 30 for i in `seq 1 10`; do cirrosfix=$(nova list | awk '{if($4=="cirrosdemo'${i}'"){print $12;}}') cirrosfixip=${cirrosfix#*=} output=$(neutron port-list | grep ${cirrosfixip}) echo $output portid=$(echo $output | awk '{print $2;}') neutron floatingip-associate --fixed-ip-address $cirrosfixip ${floatipid[i]} $portid neutron floatingip-delete ${floatipid[i]} nova delete ${cirrosdemoid[i]} done neutron router-interface-delete demo-router $subnetdemoid neutron router-gateway-clear demo-router $netdemoid neutron router-delete demo-router neutron subnet-delete $subnetdemoid neutron net-delete $netdemoid Looking at the log file: 2014-11-20 17:25:56.258 31042 DEBUG neutron.openstack.common.lockutils [req-6eabf07e-2fe4-4960-89ca-f0ac3f04f7a5 None] Got semaphore "db-access" lock /opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/openstack/common/lockutils.py:168 2014-11-20 17:25:56.424 31042 ERROR neutron.api.v2.resource [req-6eabf07e-2fe4-4960-89ca-f0ac3f04f7a5 None] delete failed 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource Traceback (most recent call last): 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/api/v2/resource.py", line 87, in resource 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource result = method(request=request, **args) 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/api/v2/base.py", line 476, in delete 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource obj_deleter(request.context, id, **kwargs) 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/plugins/ml2/plugin.py", line 1036, in delete_port 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource router_info = l3plugin.dvr_deletens_if_no_vm(context, id) 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/db/l3_dvrscheduler_db.py", line 195, in dvr_deletens_if_no_vm 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource port_host) 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/db/agents_db.py", line 136, in _get_agent_by_type_and_host 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource Agent.host == host).one() 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2369, in one 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource ret = list(self) 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py", line 2411, in __iter__ 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource self.session._autoflush() 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File "/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1198, in _autoflush 2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource
[Yahoo-eng-team] [Bug 1431077] [NEW] TRACE: attribute error when trying to fetch the router.snat_namespace.name
Public bug reported: TRACE seen in the vpn-agent log when configured with a DVR router. A recent refactoring of the agent has introduced this problem. http://logs.openstack.org/71/130471/6/check/check-tempest-dsvm-neutron-dvr/10208dc/logs/screen-q-vpn.txt.gz?level=TRACE 2015-03-11 14:09:03.570 ERROR neutron.agent.l3.agent [req-1c27f913-7f3c-40ff-8b86-f915fdde4be9 None None] Failed to process compatible router '25160ab1-c55e-424a-b209-a98f6b2bf769' 2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent Traceback (most recent call last): 2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent File "/opt/stack/new/neutron/neutron/agent/l3/agent.py", line 895, in _process_router_update 2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent self._process_router_if_compatible(router) 2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent File "/opt/stack/new/neutron/neutron/agent/l3/agent.py", line 843, in _process_router_if_compatible 2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent self._process_added_router(router) 2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent File "/opt/stack/new/neutron/neutron/agent/l3/agent.py", line 854, in _process_added_router 2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent adv_svc.AdvancedService.after_router_added, ri) 2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent File "/opt/stack/new/neutron/neutron/agent/l3/event_observers.py", line 40, in notify 2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent getattr(observer, method_name)(*args, **kwargs) 2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent File "/opt/stack/new/neutron-vpnaas/neutron_vpnaas/services/vpn/vpn_service.py", line 61, in after_router_added 2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent device.sync(self.context, [ri.router]) 2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 431, in inner 2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent return f(*args, **kwargs) 2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent File "/opt/stack/new/neutron-vpnaas/neutron_vpnaas/services/vpn/device_drivers/ipsec.py", line 773, in sync 2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent self._delete_vpn_processes(sync_router_ids, router_ids) 2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent File "/opt/stack/new/neutron-vpnaas/neutron_vpnaas/services/vpn/device_drivers/ipsec.py", line 795, in _delete_vpn_processes 2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent self.ensure_process(process_id) 2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent File "/opt/stack/new/neutron-vpnaas/neutron_vpnaas/services/vpn/device_drivers/ipsec.py", line 643, in ensure_process 2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent namespace = self.get_namespace(process_id) 2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent File "/opt/stack/new/neutron-vpnaas/neutron_vpnaas/services/vpn/device_drivers/ipsec.py", line 535, in get_namespace 2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent return router.snat_namespace.name 2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent AttributeError: 'NoneType' object has no attribute 'name' ** Affects: neutron Importance: Undecided Status: New ** Tags: vpnaas ** Tags removed: neutron-vpnaas ** Tags added: vpnaas -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1431077 Title: TRACE: attribute error when trying to fetch the router.snat_namespace.name Status in OpenStack Neutron (virtual network service): New Bug description: TRACE seen in the vpn-agent log when configured with a DVR router. A recent refactoring of the agent has introduced this problem. http://logs.openstack.org/71/130471/6/check/check-tempest-dsvm-neutron-dvr/10208dc/logs/screen-q-vpn.txt.gz?level=TRACE 2015-03-11 14:09:03.570 ERROR neutron.agent.l3.agent [req-1c27f913-7f3c-40ff-8b86-f915fdde4be9 None None] Failed to process compatible router '25160ab1-c55e-424a-b209-a98f6b2bf769' 2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent Traceback (most recent call last): 2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent File "/opt/stack/new/neutron/neutron/agent/l3/agent.py", line 895, in _process_router_update 2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent self._process_router_if_compatible(router) 2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent File "/opt/stack/new/neutron/neutron/agent/l3/agent.py", line 843, in _process_router_if_compatible 2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent self._process_added_router(router) 2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent File
[Yahoo-eng-team] [Bug 1423422] [NEW] FloatingIP Agent Gateway Port is created for Non-DVR Routers
Public bug reported: The FloatingIP Agent Gateway Port is only required for DVR routers. A recent patch to remove the RPC dependency for creating the FloatingIP Agent Gateway Port has introduced a bug that creates a FloatingIP Agent Gateway Port for non-DVR routers. Change-Id: Ieaa79c8bf2b1e03bc352f9252ce22286703e3715 This might generate an error when trying to get the L3 agent information from a Compute Node that is not running an L3 agent in DVR mode in a multi-node scenario. This issue may not be visible in a single-node deployment; there we might just see FloatingIP Agent Gateway Ports for legacy routers which are never utilized. Only DVR routers require an L3 agent to be present on the Compute Node. ** Affects: neutron Importance: Undecided Assignee: Swaminathan Vasudevan (swaminathan-vasudevan) Status: New ** Tags: l3-dvr-backlog ** Changed in: neutron Assignee: (unassigned) => Swaminathan Vasudevan (swaminathan-vasudevan) -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron. https://bugs.launchpad.net/bugs/1423422 Title: FloatingIP Agent Gateway Port is created for Non-DVR Routers Status in OpenStack Neutron (virtual network service): New Bug description: The FloatingIP Agent Gateway Port is only required for DVR routers. A recent patch to remove the RPC dependency for creating the FloatingIP Agent Gateway Port has introduced a bug that creates a FloatingIP Agent Gateway Port for non-DVR routers. Change-Id: Ieaa79c8bf2b1e03bc352f9252ce22286703e3715 This might generate an error when trying to get the L3 agent information from a Compute Node that is not running an L3 agent in DVR mode in a multi-node scenario. This issue may not be visible in a single-node deployment; there we might just see FloatingIP Agent Gateway Ports for legacy routers which are never utilized. Only DVR routers require an L3 agent to be present on the Compute Node.
To manage notifications about this bug go to: https://bugs.launchpad.net/neutron/+bug/1423422/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp
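The implied fix is a guard on the router's distributed flag before creating the per-host gateway port. The following Python sketch is illustrative only: `maybe_create_fip_agent_gw_port` and the injected `create_port` callable are hypothetical stand-ins for the plugin's creation path, though `'distributed'` is the flag neutron actually uses on router dicts.

```python
# Sketch: only DVR routers get a FloatingIP agent gateway port;
# legacy routers skip creation entirely. Names are hypothetical.
def maybe_create_fip_agent_gw_port(router, host, create_port):
    if not router.get('distributed'):
        return None  # legacy router: no per-host gateway port needed
    return create_port(router['id'], host)

created = []
def fake_create_port(router_id, host):
    port = {'router_id': router_id, 'host': host}
    created.append(port)
    return port

legacy = {'id': 'r-legacy', 'distributed': False}
dvr = {'id': 'r-dvr', 'distributed': True}
p1 = maybe_create_fip_agent_gw_port(legacy, 'compute1', fake_create_port)
p2 = maybe_create_fip_agent_gw_port(dvr, 'compute1', fake_create_port)
```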
[Yahoo-eng-team] [Bug 1421886] [NEW] FloatingIP agent gateway port should delete the FIP Agent gateway port based on the host and the external network id when there are multiple external networks.
Public bug reported:

The FloatingIP Agent Gateway port should be deleted based on the host and also on the external network id. In the multiple-external-network scenario there may be more than one FloatingIP Agent Gateway Port on the same host, so the port has to be selected for deletion by the External Network ID as well.

** Affects: neutron
   Importance: Undecided
   Status: New

** Tags: l3-dvr-backlog

--
https://bugs.launchpad.net/bugs/1421886

Title: FloatingIP agent gateway port should delete the FIP Agent gateway port based on the host and the external network id when there are multiple external networks.
Status in OpenStack Neutron (virtual network service): New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1421886/+subscriptions
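The selection logic the report asks for can be sketched as follows (an illustrative sketch: the port dicts use simplified Neutron-style keys, and this is not the real plugin code). With multiple external networks, one host can carry several FIP agent gateway ports, so deletion must match both the host and the external network id.

```python
# Illustrative sketch: filter FIP agent gateway ports by BOTH the binding
# host and the external network id, so that deleting ports for one external
# network never touches the host's ports on another external network.

def fip_agent_gw_ports_to_delete(ports, host, ext_net_id):
    """Return only the FIP agent gateway ports bound to `host` on `ext_net_id`."""
    return [
        p for p in ports
        if p["binding:host_id"] == host and p["network_id"] == ext_net_id
    ]
```

Filtering on the host alone would return every FIP agent gateway port on that compute node, which is exactly the over-deletion the bug describes.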
[Yahoo-eng-team] [Bug 1421497] [NEW] Gateway clear generates a TRACE - AttributeError in get_int_device_name in DVR routers
Public bug reported:

A recent change in the agent code has introduced this problem. When a gateway is cleared from the router, even though there are no existing floating IPs, the external_gateway_removed function in agent.py is calling process_floatingips. That may be the reason for this failure.

2015-02-11 23:12:15.307 2809 ERROR neutron.agent.l3.dvr [-] DVR: removed snat failed
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr Traceback (most recent call last):
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr   File "/opt/stack/new/neutron/neutron/agent/l3/dvr.py", line 197, in _snat_redirect_remove
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr     ns_ipd.route.delete_gateway(table=snat_idx)
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 415, in delete_gateway
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr     self._as_root(*args)
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 253, in _as_root
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr     kwargs.get('use_root_namespace', False))
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 83, in _as_root
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr     log_fail_as_error=self.log_fail_as_error)
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr   File "/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 95, in _execute
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr     log_fail_as_error=log_fail_as_error)
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr   File "/opt/stack/new/neutron/neutron/agent/linux/utils.py", line 83, in execute
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr     raise RuntimeError(m)
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr RuntimeError:
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr Command: ['sudo', '/usr/local/bin/neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'exec', 'qrouter-1cfe7654-a669-4f73-a21d-d5110d7c0297', 'ip', 'route', 'del', 'default', 'dev', 'qr-467e8832-93', 'table', '547711270']
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr Exit code: 2
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr Stdout:
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr Stderr: RTNETLINK answers: No such process
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr

2015-02-11 23:12:18.846 2809 ERROR neutron.agent.l3.agent [-] 'NoneType' object has no attribute 'get_int_device_name'
2015-02-11 23:12:18.846 2809 TRACE neutron.agent.l3.agent Traceback (most recent call last):
2015-02-11 23:12:18.846 2809 TRACE neutron.agent.l3.agent   File "/opt/stack/new/neutron/neutron/common/utils.py", line 342, in call
2015-02-11 23:12:18.846 2809 TRACE neutron.agent.l3.agent     return func(*args, **kwargs)
2015-02-11 23:12:18.846 2809 TRACE neutron.agent.l3.agent   File "/opt/stack/new/neutron/neutron/agent/l3/agent.py", line 602, in process_router
2015-02-11 23:12:18.846 2809 TRACE neutron.agent.l3.agent     self._process_external(ri)
2015-02-11 23:12:18.846 2809 TRACE neutron.agent.l3.agent   File "/opt/stack/new/neutron/neutron/agent/l3/agent.py", line 565, in _process_external
2015-02-11 23:12:18.846 2809 TRACE neutron.agent.l3.agent     self._process_external_gateway(ri)
2015-02-11 23:12:18.846 2809 TRACE neutron.agent.l3.agent   File "/opt/stack/new/neutron/neutron/agent/l3/agent.py", line 503, in _process_external_gateway
2015-02-11 23:12:18.846 2809 TRACE neutron.agent.l3.agent     self.external_gateway_removed(ri, ri.ex_gw_port, interface_name)
2015-02-11 23:12:18.846 2809 TRACE neutron.agent.l3.agent   File "/opt/stack/new/neutron/neutron/agent/l3/agent.py", line 905, in external_gateway_removed
2015-02-11 23:12:18.846 2809 TRACE neutron.agent.l3.agent     ri, ex_gw_port)
2015-02-11 23:12:18.846 2809 TRACE neutron.agent.l3.agent   File "/opt/stack/new/neutron/neutron/agent/l3/agent.py", line 694, in _get_external_device_interface_name
2015-02-11 23:12:18.846 2809 TRACE neutron.agent.l3.agent     fip_int = ri.fip_ns.get_int_device_name(ri.router_id)
2015-02-11 23:12:18.846 2809 TRACE neutron.agent.l3.agent AttributeError: 'NoneType' object has no attribute 'get_int_device_name'
2015-02-11 23:12:18.846 2809 TRACE neutron.agent.l3.agent

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/eventlet/greenpool.py", line 82, in _spawn_n_impl
    func(*args, **kwargs)
  File "/opt/stack/new/neutron/neutron/agent/l3/agent.py", line 1137, in _process_router_update
    self._router_removed(update.id)
  File "/opt/stack/new/neutron/neutron/agent/l3/agent.py", line 409, in _router_removed
    self.process_router(ri)
  File
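The AttributeError at the bottom of the trace points at `ri.fip_ns` being None when the gateway is cleared on a router without floating IPs. A minimal sketch of the kind of guard this calls for (using simplified stand-in classes, not the real neutron `RouterInfo`/FIP-namespace implementations):

```python
# Sketch: only look up the FIP interface name when the router actually has
# a FIP namespace object; a router that never hosted floating IPs has none.

class FipNamespace:
    def get_int_device_name(self, router_id):
        # The real code derives the device name from a prefix plus a
        # truncated router id; this is an illustrative approximation.
        return "rfp-" + router_id[:11]

class RouterInfo:
    def __init__(self, router_id, fip_ns=None):
        self.router_id = router_id
        self.fip_ns = fip_ns  # None when no floating IPs were processed

def get_fip_interface_name(ri):
    if ri.fip_ns is None:
        # No FIP namespace exists for this router, so there is no internal
        # device to clean up; return None instead of raising AttributeError.
        return None
    return ri.fip_ns.get_int_device_name(ri.router_id)
```

With the guard, clearing a gateway on a floating-IP-less DVR router becomes a no-op for FIP device cleanup rather than a crash in `_get_external_device_interface_name`.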
[Yahoo-eng-team] [Bug 1421011] [NEW] Remove unused RPC methods from the L3_rpc
Public bug reported:

Remove unused RPC methods from the L3_rpc. The get_snat_router_interface_ports method is defined but not currently used by any agents, so it needs to be cleaned up.

** Affects: neutron
   Importance: Undecided
   Assignee: Swaminathan Vasudevan (swaminathan-vasudevan)
   Status: In Progress

** Tags: l3-dvr-backlog

** Tags added: l3-dvr-backlog

** Changed in: neutron
   Assignee: (unassigned) => Swaminathan Vasudevan (swaminathan-vasudevan)

--
https://bugs.launchpad.net/bugs/1421011

Title: Remove unused RPC methods from the L3_rpc
Status in OpenStack Neutron (virtual network service): In Progress

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1421011/+subscriptions
[Yahoo-eng-team] [Bug 1417386] [NEW] AttributeError: _oslo_messaging_localcontext errors found in neutron l3-agent logs
Public bug reported:

This TRACE is seen in many places in the neutron l3-agent logs from the jenkins logs.

2015-02-02 23:29:13.916 ERROR oslo_messaging.rpc.dispatcher [req-ce57a6b0-04fc-41dd-a114-5b69c0ebcf6d FloatingIPsNegativeTestJSON-267704990 FloatingIPsNegativeTestJSON-594738290] Exception during message handling: _oslo_messaging_localcontext_fdc8b1dcad1246c49200370f4281e5d5
2015-02-02 23:29:13.916 3051 TRACE oslo_messaging.rpc.dispatcher Traceback (most recent call last):
2015-02-02 23:29:13.916 3051 TRACE oslo_messaging.rpc.dispatcher   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply
2015-02-02 23:29:13.916 3051 TRACE oslo_messaging.rpc.dispatcher     executor_callback))
2015-02-02 23:29:13.916 3051 TRACE oslo_messaging.rpc.dispatcher   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 188, in _dispatch
2015-02-02 23:29:13.916 3051 TRACE oslo_messaging.rpc.dispatcher     localcontext.clear_local_context()
2015-02-02 23:29:13.916 3051 TRACE oslo_messaging.rpc.dispatcher   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/localcontext.py", line 55, in clear_local_context
2015-02-02 23:29:13.916 3051 TRACE oslo_messaging.rpc.dispatcher     delattr(_STORE, _KEY)
2015-02-02 23:29:13.916 3051 TRACE oslo_messaging.rpc.dispatcher AttributeError: _oslo_messaging_localcontext_fdc8b1dcad1246c49200370f4281e5d5
2015-02-02 23:29:13.916 3051 TRACE oslo_messaging.rpc.dispatcher

I am not sure if there are any other similar bugs that have been reported.

** Affects: neutron
   Importance: Undecided
   Status: New

--
https://bugs.launchpad.net/bugs/1417386

Title: AttributeError: _oslo_messaging_localcontext errors found in neutron l3-agent logs
Status in OpenStack Neutron (virtual network service): New
To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1417386/+subscriptions
[Yahoo-eng-team] [Bug 1415522] [NEW] DVR Tempest Job check-tempest-dsvm-neutron-dvr not stable when compared to the neutron job
Public bug reported:

The DVR Tempest job check-tempest-dsvm-neutron-dvr is unstable when compared to the legacy router job. Stabilizing it is critical for making the DVR job gating, so we need to find out the actual subtest that is causing the failure.

** Affects: neutron
   Importance: Undecided
   Status: New

** Tags: l3-dvr-backlog

** Tags added: l3-dvr-backlog

--
https://bugs.launchpad.net/bugs/1415522

Title: DVR Tempest Job check-tempest-dsvm-neutron-dvr not stable when compared to the neutron job
Status in OpenStack Neutron (virtual network service): New

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1415522/+subscriptions