[Yahoo-eng-team] [Bug 1716401] Re: FWaaS: Ip tables rules do not get updated in case of distributed virtual routers (DVR)

2019-10-01 Thread Swaminathan Vasudevan
*** This bug is a duplicate of bug 1845557 ***
https://bugs.launchpad.net/bugs/1845557

This bug is also a duplicate of
https://bugs.launchpad.net/neutron/+bug/1845557

** This bug is no longer a duplicate of bug 1715395
   FWaaS: Firewall creation fails in case of distributed routers (Pike)
** This bug has been marked a duplicate of bug 1845364
   [fullstack] Race condition when updating the router port information and 
updating the network MTU

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1716401

Title:
  FWaaS: Ip tables rules do not get updated in case of distributed
  virtual routers (DVR)

Status in neutron:
  New

Bug description:
  I have set up an HA/DVR deployment of OpenStack Pike on Ubuntu 16.04
  and enabled FWaaS v1. After applying the Fix from Bug #1715395,
  firewall rules get created in case of HA/DVR, but updates do not have
  any effect, e.g. when you disassociate a firewall from a distributed
  router.

  Use Case:
  1. Set up an HA/DVR deployment of OpenStack Pike.

  2. Create a firewall rule.
  $ neutron firewall-rule-create --name test-rule --protocol icmp --action 
reject
  Created a new firewall_rule:
  +-------------------------+--------------------------------------+
  | Field                   | Value                                |
  +-------------------------+--------------------------------------+
  | action                  | reject                               |
  | description             |                                      |
  | destination_ip_address  |                                      |
  | destination_port        |                                      |
  | enabled                 | True                                 |
  | firewall_policy_id      |                                      |
  | id                      | 6c2516cb-b69d-46b6-958e-e47c1cf1709e |
  | ip_version              | 4                                    |
  | name                    | test-rule                            |
  | position                |                                      |
  | project_id              | ed2d2efd86dd40e7a45491d8502318d3     |
  | protocol                | icmp                                 |
  | shared                  | False                                |
  | source_ip_address       |                                      |
  | source_port             |                                      |
  | tenant_id               | ed2d2efd86dd40e7a45491d8502318d3     |
  +-------------------------+--------------------------------------+

  3. Create a firewall policy.
  $ neutron firewall-policy-create --firewall-rules test-rule test-policy
  Created a new firewall_policy:
  +----------------+--------------------------------------+
  | Field          | Value                                |
  +----------------+--------------------------------------+
  | audited        | False                                |
  | description    |                                      |
  | firewall_rules | 6c2516cb-b69d-46b6-958e-e47c1cf1709e |
  | id             | 53a8d733-e81c-4113-9354-d40b5b426e00 |
  | name           | test-policy                          |
  | project_id     | ed2d2efd86dd40e7a45491d8502318d3     |
  | shared         | False                                |
  | tenant_id      | ed2d2efd86dd40e7a45491d8502318d3     |
  +----------------+--------------------------------------+

  4. Create a firewall.
  $  neutron firewall-create --name test-firewall test-policy
  Created a new firewall:
  +--------------------+--------------------------------------+
  | Field              | Value                                |
  +--------------------+--------------------------------------+
  | admin_state_up     | True                                 |
  | description        |                                      |
  | firewall_policy_id | 53a8d733-e81c-4113-9354-d40b5b426e00 |
  | id                 | a468caca-c555-4f89-adbc-bcdbb06a3fca |
  | name               | test-firewall                        |
  | project_id         | ed2d2efd86dd40e7a45491d8502318d3     |
  | router_ids         |                                      |
  | status             | INACTIVE                             |
  | tenant_id          | ed2d2efd86dd40e7a45491d8502318d3     |
  +--------------------+--------------------------------------+

  5. Assign the firewall to a distributed router.
  $ neutron firewall-update --router demo-router test-firewall
  Updated firewall: test-firewall

  6. Spawn a virtual machine and assign a floating ip.

  7. Check namespaces on the compute node hosting the virtual machine.
  $ ip netns
  fip-4a3959c3-b011-4bd0-8f4f-f405be92d9ac
  qrouter-09a379b5-907f-4e3e-b29a-8701b82f2641

  8. Check ip tables rules in the router's namespace.
  $ ip netns exec qrouter-09a379b5-907f-4e3e-b29a-8701b82f2641 iptables -n -L -v
  Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
   pkts 

[Yahoo-eng-team] [Bug 1845557] [NEW] DVR: FWaaS rules created for a router after the FIP and VM created, not applied to routers rfp port on router-update

2019-09-26 Thread Swaminathan Vasudevan
Public bug reported:

This was seen in Rocky.

When the network, subnet, router and a VM instance with a FloatingIP are
created before the Firewall rules are attached to the router, the Firewall
rules are not applied to the 'rfp' port for north-south routing when using
Firewall-as-a-Service in legacy 'iptables' mode.

After applying the Firewall rules to the Router, it is expected that the
router-update would trigger adding the Firewall rules to the existing
routers, but the logic is not right.

Any new VM added to the subnet on a new compute host gets the Firewall
rules applied to its 'rfp' interface.

So the only way to get around this problem is to restart the 'l3-agent'.
Once the 'l3-agent' is restarted, the Firewall rules are applied again.

This is also true when Firewall rules are removed after the VM and
routers are in place: since the update is not handled properly, the
firewall rules may stay there until we restart the l3-agent.

How to reproduce this problem:

This is FWaaS v2 with legacy 'iptables':

1. Create a Network
2. Create a Subnet
3. Create a Router (DVR)
4. Attach the Subnet to the router.
5. Assign the gateway to the router.
6. Create a VM on the given private network.
7. Create a FloatingIP and associate the FloatingIP to the VM's private IP.
8. Now the VM, router, fipnamespace are all in place.
9. Now create Firewall rules 
 neutron firewall-rule-create --protocol icmp --action allow --name allow-icmp
 neutron firewall-rule-create --protocol tcp --destination-port 80 --action 
deny --name deny-http
 neutron firewall-rule-create --protocol tcp --destination-port 22 --action 
allow --name allow-ssh
10. Then create firewall policy
  neutron firewall-policy-create --firewall-rules "allow-icmp deny-http 
allow-ssh" policy-fw
11. Create a firewall
   neutron firewall-create policy-fw --name user-fw
12. Check if the firewall was created:
   neutron firewall-show user-fw
13. If the firewall was created after the router has been created, based on 
the documentation you need to manually update the router.
  $ neutron firewall-update --router <router-1> --router <router-2> user-fw

14. After the update we would expect both existing routers, router-1 and router-2, 
to have the firewall rules.

But they are not configured on router-1, which was created before the firewall,
and so the VM is not protected by the Firewall rules.
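
A hedged way to verify whether the update actually reached an existing router is to
look, on the compute node hosting the VM, for FWaaS rules that reference the router's
'rfp-' interface (the router UUID below is a placeholder):

 $ ip netns exec qrouter-<router-1-uuid> iptables -S | grep 'rfp-'

After the firewall-update the FWaaS chains should be attached to the 'rfp-' interface,
but on router-1 they only show up after the l3-agent restart mentioned above.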

** Affects: neutron
 Importance: Undecided
 Assignee: Swaminathan Vasudevan (swaminathan-vasudevan)
 Status: Confirmed


** Tags: fwaas l3-dvr-backlog

** Changed in: neutron
 Assignee: (unassigned) => Swaminathan Vasudevan (swaminathan-vasudevan)

** Changed in: neutron
   Status: New => Confirmed

** Summary changed:

- DVR: FWaaS rules created for a router after the FIP and VM created not 
applied to routers rfp port
+ DVR: FWaaS rules created for a router after the FIP and VM created, not 
applied to routers rfp port on router-update

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1845557

Title:
  DVR: FWaaS rules created for a router after the FIP and VM created,
  not applied to routers rfp port on router-update

Status in neutron:
  Confirmed

Bug description:
  This was seen in Rocky.

  When the network, subnet, router and a VM instance with a FloatingIP are
  created before the Firewall rules are attached to the router, the Firewall
  rules are not applied to the 'rfp' port for north-south routing when using
  Firewall-as-a-Service in legacy 'iptables' mode.

  After applying the Firewall rules to the Router, it is expected that
  the router-update would trigger adding the Firewall rules to the
  existing routers, but the logic is not right.

  Any new VM added to the subnet on a new compute host gets the Firewall
  rules applied to its 'rfp' interface.

  So the only way to get around this problem is to restart the
  'l3-agent'. Once the 'l3-agent' is restarted, the Firewall rules are
  applied again.

  This is also true when Firewall rules are removed after the VM and
  routers are in place: since the update is not handled properly, the
  firewall rules may stay there until we restart the l3-agent.

  How to reproduce this problem:

  This is FWaaS v2 with legacy 'iptables':

  1. Create a Network
  2. Create a Subnet
  3. Create a Router (DVR)
  4. Attach the Subnet to the router.
  5. Assign the gateway to the router.
  6. Create a VM on the given private network.
  7. Create a FloatingIP and associate the FloatingIP to the VM's private IP.
  8. Now the VM, router, fipnamespace are all in place.
  9. Now create Firewall rules 
   neutron firewall-rule-create --protocol icmp --action allow --name allow-icmp
   neutron firewall-rule-create --protocol tcp --destination-port 80 --action 
deny --name deny-http
   neutron firewall-rule-create --protocol tcp --destination-port 22 --action 
allow --name allow-ssh
  10. Then create firewall po

[Yahoo-eng-team] [Bug 1840979] Re: [L2] [opinion] update the port DB status directly in agent-side

2019-08-22 Thread Swaminathan Vasudevan
** Changed in: neutron
   Status: New => Opinion

** Changed in: neutron
   Importance: Undecided => Wishlist

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1840979

Title:
  [L2] [opinion] update the port DB status directly in agent-side

Status in neutron:
  Opinion

Bug description:
  When the ovs-agent is done processing a port, it calls neutron-server to make 
some DB updates.
  Especially when the ovs-agent is restarted, all ports on that agent repeat such 
RPC and DB updates to make the port status consistent. When a large number of 
agents restart concurrently, neutron-server may not cope well.
  So how about making the following DB updates locally on the neutron agent side 
directly? There may be some mechanism driver notification involved; IMO, this can 
also be done on the agent side.

  # Agent-to-server RPC calls that drive the port status DB updates in question:
  def update_device_down(self, context, device, agent_id, host=None):
      cctxt = self.client.prepare()
      return cctxt.call(context, 'update_device_down', device=device,
                        agent_id=agent_id, host=host)

  def update_device_up(self, context, device, agent_id, host=None):
      cctxt = self.client.prepare()
      return cctxt.call(context, 'update_device_up', device=device,
                        agent_id=agent_id, host=host)

  def update_device_list(self, context, devices_up, devices_down,
      ret = cctxt.call(context, 'update_device_list',

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1840979/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1824571] Re: l3agent can't create router if there are multiple external networks

2019-07-08 Thread Swaminathan Vasudevan
** Changed in: neutron
   Status: Confirmed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1824571

Title:
  l3agent can't create router if there are multiple external networks

Status in neutron:
  Fix Released

Bug description:
  When there is more than one external network, the l3 agent is unable
  to create routers, failing with the following error:

  2019-04-12 17:33:18.844 103 ERROR neutron.agent.l3.agent Traceback (most 
recent call last):
  2019-04-12 17:33:18.844 103 ERROR neutron.agent.l3.agent   File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/neutron/agent/l3/agent.py",
 line 701, in _process_routers_if_compatible
  2019-04-12 17:33:18.844 103 ERROR neutron.agent.l3.agent 
self._process_router_if_compatible(router)
  2019-04-12 17:33:18.844 103 ERROR neutron.agent.l3.agent   File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/neutron/agent/l3/agent.py",
 line 548, in _process_router_if_compatible
  2019-04-12 17:33:18.844 103 ERROR neutron.agent.l3.agent target_ex_net_id 
= self._fetch_external_net_id()
  2019-04-12 17:33:18.844 103 ERROR neutron.agent.l3.agent   File 
"/var/lib/kolla/venv/local/lib/python2.7/site-packages/neutron/agent/l3/agent.py",
 line 376, in _fetch_external_net_id
  2019-04-12 17:33:18.844 103 ERROR neutron.agent.l3.agent raise 
Exception(msg)
  2019-04-12 17:33:18.844 103 ERROR neutron.agent.l3.agent Exception: The 
'gateway_external_network_id' option must be configured for this agent as 
Neutron has more than one external network.

  It happens in the DVR scenario on both dvr and dvr_snat agents, and it
  started after upgrading from Rocky to Stein; before the upgrade it
  worked fine. The gateway_external_network_id is not set in my config,
  because I want the l3 agent to be able to use multiple external
  networks.
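
  For reference, the option named in the exception lives in the L3 agent's
  configuration; a minimal fragment (file path and layout are assumptions) that pins
  the agent to a single external network and therefore avoids the traceback, at the
  cost of the multi-external-network setup described above:

  # /etc/neutron/l3_agent.ini (assumed location)
  [DEFAULT]
  # UUID of the one external network this agent should serve (placeholder value)
  gateway_external_network_id = <external-net-uuid>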

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1824571/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1824566] Re: DVR-Nexthop static routes are not getting configured in FIP Namespace when disassociating and reassociating a FloatingIP in Ocata

2019-07-08 Thread Swaminathan Vasudevan
** Changed in: neutron
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1824566

Title:
  DVR-Nexthop static routes are not getting configured in FIP Namespace
  when disassociating and reassociating a FloatingIP in Ocata

Status in neutron:
  Fix Released

Bug description:
  Nexthop static routes for external network are not getting configured in the 
FIP Namespace table, after disassociating and re-associating a FloatingIP.
  This is seen in Ocata and Newton. Not seen in Pike and later branches.

  Steps to reproduce this problem.
  1. Deploy the devstack cloud with DVR routers
  2. Create a VM
  3. Assign a FloatingIP to the VM.
  4. Now configure a Nexthop static route for the external Network.
  5. Make sure the Nexthop routes are seen in the SNAT Namespace and in the FIP 
Namespace under router  specific lookup table-id.
  6. Now Disassociate the floatingIP.
  7. Make sure that the Nexthop routes are cleared from the FIP Namespace, if this 
is the only FloatingIP, under the router specific lookup table-id.
  8. Now re-associate the FloatingIP.
  9. Now you will see the 'Nexthop static routes' will be missing in the FIP 
Namespaces router specific lookup table-id.
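
  To observe the problem directly, the router-specific lookup table inside the FIP
  namespace can be dumped before and after step 8 (the namespace name and table id
  below are placeholders):

  $ ip netns exec fip-<ext-net-uuid> ip rule list
  $ ip netns exec fip-<ext-net-uuid> ip route show table <router-table-id>

  After re-associating the FloatingIP, the static route configured in step 4 is
  expected in that table but is missing.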

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1824566/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1823314] Re: ha router sometime goes in standby mode in all controllers

2019-07-08 Thread Swaminathan Vasudevan
** Changed in: neutron
   Status: Confirmed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1823314

Title:
  ha router sometime goes in standby mode in all controllers

Status in neutron:
  Fix Released

Bug description:
  Sometimes when 2 HA routers are created for the same tenant in a very short
  time, it may happen that both routers get the same vr_id assigned; keepalived
  then treats them as the same VRRP instance, and only one of those
  routers will be active on some hosts.

  When I spotted it it looked like:

  [stack@undercloud-0 ~]$ neutron l3-agent-list-hosting-router router-2
  +--------------------------------------+--------------------------+----------------+-------+----------+
  | id                                   | host                     | admin_state_up | alive | ha_state |
  +--------------------------------------+--------------------------+----------------+-------+----------+
  | 0d654b7c-da42-4847-a24f-6d1df804ca3b | controller-1.localdomain | True           | :-)   | standby  |
  | 242e1e81-7e4e-466e-8354-a9c46982ff88 | controller-0.localdomain | True           | :-)   | active   |
  | 3d241b02-031a-4623-a179-88e1953b3889 | controller-2.localdomain | True           | :-)   | standby  |
  +--------------------------------------+--------------------------+----------------+-------+----------+
  [stack@undercloud-0 ~]$ neutron l3-agent-list-hosting-router router-1
  +--------------------------------------+--------------------------+----------------+-------+----------+
  | id                                   | host                     | admin_state_up | alive | ha_state |
  +--------------------------------------+--------------------------+----------------+-------+----------+
  | 3d241b02-031a-4623-a179-88e1953b3889 | controller-2.localdomain | True           | :-)   | standby  |
  | 0d654b7c-da42-4847-a24f-6d1df804ca3b | controller-1.localdomain | True           | :-)   | standby  |
  | 242e1e81-7e4e-466e-8354-a9c46982ff88 | controller-0.localdomain | True           | :-)   | standby  |
  +--------------------------------------+--------------------------+----------------+-------+----------+

  
  And in db it looks like:

  MariaDB [ovs_neutron]> select * from router_extra_attributes;
  +--------------------------------------+-------------+----------------+----+----------+-------------------------+
  | router_id                            | distributed | service_router | ha | ha_vr_id | availability_zone_hints |
  +--------------------------------------+-------------+----------------+----+----------+-------------------------+
  | 6ba430d7-2f9d-4e8e-a59f-4d4fb5644a8e |           0 |              0 |  1 |        1 | []                      |
  | ace64e85-5f3b-4815-aeae-3b54c75ef5eb |           0 |              0 |  1 |        1 | []                      |
  | cd6b61e1-60c9-47da-8866-169ca29ece20 |           1 |              0 |  0 |        0 | []                      |
  +--------------------------------------+-------------+----------------+----+----------+-------------------------+
  3 rows in set (0.01 sec)

  MariaDB [ovs_neutron]> select * from ha_router_vrid_allocations;
  +--------------------------------------+-------+
  | network_id                           | vr_id |
  +--------------------------------------+-------+
  | 45aaae94-ce16-412d-bd74-b3812b16ff6f |     1 |
  +--------------------------------------+-------+
  1 row in set (0.01 sec)

  So indeed there is a possible race during such creation of 2 different
  routers in a very short time.

  But when I then created another router, it was created properly with
  a new vr_id and all worked fine for it:

  [stack@undercloud-0 ~]$ neutron l3-agent-list-hosting-router router-3
  +--------------------------------------+--------------------------+----------------+-------+----------+
  | id                                   | host                     | admin_state_up | alive | ha_state |
  +--------------------------------------+--------------------------+----------------+-------+----------+
  | 0d654b7c-da42-4847-a24f-6d1df804ca3b | controller-1.localdomain | True           | :-)   | standby  |
  | 242e1e81-7e4e-466e-8354-a9c46982ff88 | controller-0.localdomain | True           | :-)   | active   |
  | 3d241b02-031a-4623-a179-88e1953b3889 | controller-2.localdomain | True           | :-)   | standby  |
  +--------------------------------------+--------------------------+----------------+-------+----------+

  MariaDB [ovs_neutron]> select * from ha_router_vrid_allocations;
  +--------------------------------------+-------+
  | network_id                           | vr_id |
  +--------------------------------------+-------+
  | 45aaae94-ce16-412d-bd74-b3812b16ff6f |     1 |
  | 45aaae94-ce16-412d-bd74-b3812b16ff6f |     2 |
  +--------------------------------------+-------+

  
  I 

[Yahoo-eng-team] [Bug 1815676] Re: DVR: External process monitor for keepalived should be removed when external gateway is removed for DVR HA routers

2019-05-29 Thread Swaminathan Vasudevan
** Changed in: neutron
   Status: In Progress => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1815676

Title:
  DVR: External process monitor for keepalived should be removed when
  external gateway is removed for DVR HA routers

Status in neutron:
  Invalid

Bug description:
  External process monitor for keepalived state change should be removed when 
the External Gateway is removed for DVR HA routers.
  We have seen under certain conditions, when the SNAT namespace is missing, that the 
External process Monitor tries to respawn the keepalived state change monitor 
process within the namespace.
  But the External process monitor does not check for the SNAT namespace and it 
is up to the process that calls it.

  The 'delete' ha-router takes care of cleaning the external process
  monitor subscription for the keepalived state change, but the external
  gateway remove function is not calling this function.

  This is how I was able to reproduce the problem:
  Create HA/DVR routers
  Delete the SNAT Namespace of the routers.
  Also delete the PID files for the ip_monitor under 
/opt/stack/data/neutron/external/pids/ip_monitor pid

  Once deleted I was able to see the log message in the
  neutron-l3.service logs.

  `
  Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: ERROR 
neutron.agent.linux.external_process [-] ip_monitor for router with uuid
  04fabe76-9316-4270-a99f-4f0ccffb8feb not found. The process should not have 
died
  Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: WARNING 
neutron.agent.linux.external_process [-] Respawning ip_monitor for uui
  d 04fabe76-9316-4270-a99f-4f0ccffb8feb
  Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: DEBUG 
neutron.agent.linux.utils [-] Unable to access /opt/stack/data/neutron/e
  xternal/pids/04fabe76-9316-4270-a99f-4f0ccffb8feb.monitor.pid {{(pid=12153) 
get_value_from_file /opt/stack/neutron/neutron/agent/linux/utils
  .py:250}}
  Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: DEBUG 
neutron.agent.linux.utils [-] Running command (rootwrap daemon): ['ip',
  'netns', 'exec', 'snat-04fabe76-9316-4270-a99f-4f0ccffb8feb', 
'neutron-keepalived-state-change', '--router_id=04fabe76-9316-4270-a99f-4f0ccf
  fb8feb', '--namespace=snat-04fabe76-9316-4270-a99f-4f0ccffb8feb', 
'--conf_dir=/opt/stack/data/neutron/ha_confs/04fabe76-9316-4270-a99f-4f0cc
  ffb8feb', '--monitor_interface=ha-4af17105-bd', 
'--monitor_cidr=169.254.0.1/24', 
'--pid_file=/opt/stack/data/neutron/external/pids/04fabe76-
  9316-4270-a99f-4f0ccffb8feb.monitor.pid', 
'--state_path=/opt/stack/data/neutron', '--user=1000', '--group=1004'] 
{{(pid=12153) execute_rootw
  rap_daemon /opt/stack/neutron/neutron/agent/linux/utils.py:103}}
  Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: ERROR 
neutron.agent.linux.utils [-] Exit code: 1; Stdin: ; Stdout: ; Stderr: C
  annot open network namespace "snat-04fabe76-9316-4270-a99f-4f0ccffb8feb": No 
such file or directory
  Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]:
  Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: DEBUG 
oslo_concurrency.lockutils [-] Lock "_check_child_processes" released by
  "neutron.agent.linux.external_process._check_child_processes" :: held 0.007s 
{{(pid=12153) inner /usr/local/lib/python2.7/dist-packages/osl
  o_concurrency/lockutils.py:285}}
  Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: Traceback (most 
recent call last):
  Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: File 
"/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 460
  , in fire_timers

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1815676/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1824566] [NEW] DVR-Nexthop static routes are not getting configured in FIP Namespace when disassociating and reassociating a FloatingIP in Ocata

2019-04-12 Thread Swaminathan Vasudevan
Public bug reported:

Nexthop static routes for external network are not getting configured in the 
FIP Namespace table, after disassociating and re-associating a FloatingIP.
This is seen in Ocata and Newton. Not seen in Pike and later branches.

Steps to reproduce this problem.
1. Deploy the devstack cloud with DVR routers
2. Create a VM
3. Assign a FloatingIP to the VM.
4. Now configure a Nexthop static route for the external Network.
5. Make sure the Nexthop routes are seen in the SNAT Namespace and in the FIP 
Namespace under router  specific lookup table-id.
6. Now Disassociate the floatingIP.
7. Make sure that the Nexthop routes are cleared from the FIP Namespace, if this 
is the only FloatingIP, under the router specific lookup table-id.
8. Now re-associate the FloatingIP.
9. Now you will see the 'Nexthop static routes' will be missing in the FIP 
Namespaces router specific lookup table-id.

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: ocata-backport-potential

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1824566

Title:
  DVR-Nexthop static routes are not getting configured in FIP Namespace
  when disassociating and reassociating a FloatingIP in Ocata

Status in neutron:
  New

Bug description:
  Nexthop static routes for external network are not getting configured in the 
FIP Namespace table, after disassociating and re-associating a FloatingIP.
  This is seen in Ocata and Newton. Not seen in Pike and later branches.

  Steps to reproduce this problem.
  1. Deploy the devstack cloud with DVR routers
  2. Create a VM
  3. Assign a FloatingIP to the VM.
  4. Now configure a Nexthop static route for the external Network.
  5. Make sure the Nexthop routes are seen in the SNAT Namespace and in the FIP 
Namespace under router  specific lookup table-id.
  6. Now Disassociate the floatingIP.
  7. Make sure that the Nexthop routes are cleared from the FIP Namespace, if 
this is the only FloatingIP, under the router specific lookup table-id.
  8. Now re-associate the FloatingIP.
  9. Now you will see the 'Nexthop static routes' will be missing in the FIP 
Namespaces router specific lookup table-id.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1824566/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1821815] [NEW] Gate jobs are failing for stable/ocata

2019-03-26 Thread Swaminathan Vasudevan
Public bug reported:

Some Gate jobs are failing for stable/ocata; are there any known issues
with the stable/ocata branch?

See the patch for details.
https://review.openstack.org/#/c/640176/
https://review.openstack.org/#/c/642363/

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: gate-failure

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1821815

Title:
  Gate jobs are failing for stable/ocata

Status in neutron:
  New

Bug description:
  Some Gate jobs are failing for stable/ocata; are there any known issues
  with the stable/ocata branch?

  See the patch for details.
  https://review.openstack.org/#/c/640176/
  https://review.openstack.org/#/c/642363/

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1821815/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1816698] [NEW] DVR-HA: Removing a router from an agent, does not clear the namespaces on the agent

2019-02-19 Thread Swaminathan Vasudevan
Public bug reported:

Removing an active or a standby ha-router from an agent does not clear the 
router namespace and the SNAT namespace.
This sometimes leads to having two Active HA routers and two 'ha-interface' ports in 
the snat namespace for DVR routers.
This can be very easily reproduced.

1. Create a HA-DVR router. ( minimum two node setup with 'dvr_snat' agent mode)
2. Attach interface to the router
3. Attach gateway to the router.
4. Now check the l3-agent-list-hosting-router for router.
5. Then remove the router from one of the agent ( l3-agent-router-remove )
6. The expected result is that the router namespace and snat namespace are removed 
(but they are not removed; see the quick check sketched after this list).
7. At the minimum we should clear the HA interfaces for that agent so that the 
HA router does not get into Active mode again.
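
A quick check of the leftover state (the IDs below are placeholders):

 $ neutron l3-agent-router-remove <l3-agent-id> <router-id>
 $ ip netns | grep -E "qrouter-<router-id>|snat-<router-id>"

Both namespaces are expected to disappear on that agent, but they are still listed,
and the stale 'ha-' interface can let the removed instance become Active again.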

This bug might have been introduced by this patch.
https://review.openstack.org/#/c/522362/7

This bug is seen since Ocata/Pike and probably also in master branch.

** Affects: neutron
 Importance: High
 Status: Confirmed


** Tags: l3-dvr-backlog l3-ha

** Changed in: neutron
   Status: New => Confirmed

** Changed in: neutron
   Importance: Undecided => High

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1816698

Title:
  DVR-HA: Removing a router from an agent, does not clear the namespaces
  on the agent

Status in neutron:
  Confirmed

Bug description:
  Removing an active or a standby ha-router from an agent does not clear the 
router namespace and the SNAT namespace.
  This sometimes leads to having two Active HA routers and two 'ha-interface' ports 
in the snat namespace for DVR routers.
  This can be very easily reproduced.

  1. Create a HA-DVR router. ( minimum two node setup with 'dvr_snat' agent 
mode)
  2. Attach interface to the router
  3. Attach gateway to the router.
  4. Now check the l3-agent-list-hosting-router for router.
  5. Then remove the router from one of the agent ( l3-agent-router-remove )
  6. The expected result is that the router namespace and snat namespace are removed 
(but they are not removed).
  7. At the minimum we should clear the HA interfaces for that agent so that 
the HA router does not get into Active mode again.

  This bug might have been introduced by this patch.
  https://review.openstack.org/#/c/522362/7

  This bug is seen since Ocata/Pike and probably also in master branch.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1816698/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1815676] [NEW] DVR: External process monitor for keepalived should be removed when external gateway is removed for DVR HA routers

2019-02-12 Thread Swaminathan Vasudevan
Public bug reported:

External process monitor for keepalived state change should be removed when the 
External Gateway is removed for DVR HA routers.
We have seen under certain conditions, when the SNAT namespace is missing, that the 
External process Monitor tries to respawn the keepalived state change monitor 
process within the namespace.
But the External process monitor does not check for the SNAT namespace and it 
is up to the process that calls it.

The 'delete' ha-router takes care of cleaning the external process
monitor subscription for the keepalived state change, but the external
gateway remove function is not calling this function.

This is how I was able to reproduce the problem:
Create HA/DVR routers
Delete the SNAT Namespace of the routers.
Also delete the PID files for the ip_monitor under 
/opt/stack/data/neutron/external/pids/ip_monitor pid

Once deleted I was able to see the log message in the neutron-l3.service
logs.

`
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: ERROR 
neutron.agent.linux.external_process [-] ip_monitor for router with uuid
04fabe76-9316-4270-a99f-4f0ccffb8feb not found. The process should not have died
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: WARNING 
neutron.agent.linux.external_process [-] Respawning ip_monitor for uui
d 04fabe76-9316-4270-a99f-4f0ccffb8feb
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: DEBUG 
neutron.agent.linux.utils [-] Unable to access /opt/stack/data/neutron/e
xternal/pids/04fabe76-9316-4270-a99f-4f0ccffb8feb.monitor.pid {{(pid=12153) 
get_value_from_file /opt/stack/neutron/neutron/agent/linux/utils
.py:250}}
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: DEBUG 
neutron.agent.linux.utils [-] Running command (rootwrap daemon): ['ip',
'netns', 'exec', 'snat-04fabe76-9316-4270-a99f-4f0ccffb8feb', 
'neutron-keepalived-state-change', '--router_id=04fabe76-9316-4270-a99f-4f0ccf
fb8feb', '--namespace=snat-04fabe76-9316-4270-a99f-4f0ccffb8feb', 
'--conf_dir=/opt/stack/data/neutron/ha_confs/04fabe76-9316-4270-a99f-4f0cc
ffb8feb', '--monitor_interface=ha-4af17105-bd', 
'--monitor_cidr=169.254.0.1/24', 
'--pid_file=/opt/stack/data/neutron/external/pids/04fabe76-
9316-4270-a99f-4f0ccffb8feb.monitor.pid', 
'--state_path=/opt/stack/data/neutron', '--user=1000', '--group=1004'] 
{{(pid=12153) execute_rootw
rap_daemon /opt/stack/neutron/neutron/agent/linux/utils.py:103}}
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: ERROR 
neutron.agent.linux.utils [-] Exit code: 1; Stdin: ; Stdout: ; Stderr: C
annot open network namespace "snat-04fabe76-9316-4270-a99f-4f0ccffb8feb": No 
such file or directory
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]:
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: DEBUG 
oslo_concurrency.lockutils [-] Lock "_check_child_processes" released by
"neutron.agent.linux.external_process._check_child_processes" :: held 0.007s 
{{(pid=12153) inner /usr/local/lib/python2.7/dist-packages/osl
o_concurrency/lockutils.py:285}}
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: Traceback (most 
recent call last):
Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: File 
"/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 460
, in fire_timers

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: l3-dvr-backlog

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1815676

Title:
  DVR: External process monitor for keepalived should be removed when
  external gateway is removed for DVR HA routers

Status in neutron:
  New

Bug description:
  External process monitor for keepalived state change should be removed when 
the External Gateway is removed for DVR HA routers.
  We have seen under certain conditions, when the SNAT namespace is missing, that the 
External process Monitor tries to respawn the keepalived state change monitor 
process within the namespace.
  But the External process monitor does not check for the SNAT namespace and it 
is up to the process that calls it.

  The 'delete' ha-router takes care of cleaning the external process
  monitor subscription for the keepalived state change, but the external
  gateway remove function is not calling this function.

  This is how I was able to reproduce the problem:
  Create HA/DVR routers
  Delete the SNAT Namespace of the routers.
  Also delete the PID files for the ip_monitor under 
/opt/stack/data/neutron/external/pids/ip_monitor pid

  Once deleted I was able to see the log message in the
  neutron-l3.service logs.

  `
  Oct 04 23:43:39 ubuntu-18-ctlr-rocky neutron-l3-agent[12153]: ERROR 
neutron.agent.linux.external_process [-] ip_monitor for router with uuid
  04fabe76-9316-4270-a99f-4f0ccffb8feb not found. The process should not have 
died

[Yahoo-eng-team] [Bug 1814002] [NEW] Packets getting lost during SNAT with too many connections using the same source and destination on Network Node

2019-01-30 Thread Swaminathan Vasudevan
Public bug reported:

Probably we have a problem with SNAT, with too many connections using the same 
source / destination, on the network nodes.
 
We have reproduced the bug with DNS requests, but we assume that it affects 
other packets as well.
 
When we send a lot of DNS requests, we see that sometimes a packet does not 
pass through the NAT and simply "gets lost".

 
In addition, we can see in the conntrack statistics that the "insert_failed" 
counter increases.
 
ip netns exec snat-848819dc-efa2-45d9-9bc3-d96f093fa87a conntrack -S | grep 
insert_failed | grep -v insert_failed=0
cpu=0   searched=1166140 found=5587918 new=6659 invalid=5 ignore=0 delete=27726 
delete_list=27712 insert=6645 insert_failed=14 drop=0 early_drop=0 error=0 
search_restart=0
cpu=2   searched=12015 found=64626 new=2467 invalid=0 ignore=0 delete=15205 
delete_list=15204 insert=2466 insert_failed=1 drop=0 early_drop=0 error=0 
search_restart=0
cpu=3   searched=1348502 found=6097345 new=4093 invalid=0 ignore=0 delete=23200 
delete_list=23173 insert=4066 insert_failed=27 drop=0 early_drop=0 error=0 
search_restart=0
cpu=4   searched=1068516 found=5398514 new=3299 invalid=0 ignore=0 delete=14144 
delete_list=14126 insert=3281 insert_failed=18 drop=0 early_drop=0 error=0 
search_restart=0
cpu=5   searched=2280948 found=9908854 new=6770 invalid=0 ignore=0 delete=17224 
delete_list=17185 insert=6731 insert_failed=39 drop=0 early_drop=0 error=0 
search_restart=0
cpu=6   searched=1123341 found=5264368 new=9749 invalid=0 ignore=0 delete=17272 
delete_list=17247 insert=9724 insert_failed=25 drop=0 early_drop=0 error=0 
search_restart=0
cpu=7   searched=1553934 found=7234262 new=8734 invalid=0 ignore=0 delete=15658 
delete_list=15634 insert=8710 insert_failed=24 drop=0 early_drop=0 error=0 
search_restart=0

This might be a generic problem with conntrack and linux. 
We suspect that we encounter the following "limitation / bug" in the kernel:
https://github.com/torvalds/linux/blob/24de3d377539e384621c5b8f8f8d8d01852dddc8/net/netfilter/nf_nat_core.c#L290-L291
 
There seems to be a workaround to alleviate this behavior by setting the 
--random-fully flag in iptables. Unfortunately, this is only available since 
iptables 1.6.2.

Also this is not currently supported in neutron for the SNAT rules, it
just uses the --to-source.
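
For illustration only (the chain name and addresses below are simplified placeholders,
not the exact rule the agent installs), the difference would look roughly like this:

 # rule as installed today (simplified)
 iptables -t nat -A neutron-l3-agent-snat -o qg-<id> -j SNAT --to-source <router-gw-ip>
 # variant with fully randomized source-port selection, available from iptables 1.6.2
 iptables -t nat -A neutron-l3-agent-snat -o qg-<id> -j SNAT --to-source <router-gw-ip> --random-fully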

** Affects: neutron
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1814002

Title:
  Packets getting lost during SNAT with too many connections using the
  same source and destination on Network Node

Status in neutron:
  New

Bug description:
  Probably we have a problem with SNAT, with too many connections using the 
same source / destination, on the network nodes.
   
  We have reproduced the bug with DNS requests, but we assume that it affects 
other packets as well.
   
  When we send a lot of DNS requests, we see that sometimes a packet does not 
pass through the NAT and simply "gets lost".

   
  In addition, we can see in the conntrack statistics that the "insert_failed" 
counter increases.
   
  ip netns exec snat-848819dc-efa2-45d9-9bc3-d96f093fa87a conntrack -S | grep 
insert_failed | grep -v insert_failed=0
  cpu=0   searched=1166140 found=5587918 new=6659 invalid=5 ignore=0 
delete=27726 delete_list=27712 insert=6645 insert_failed=14 drop=0 early_drop=0 
error=0 search_restart=0
  cpu=2   searched=12015 found=64626 new=2467 invalid=0 ignore=0 delete=15205 
delete_list=15204 insert=2466 insert_failed=1 drop=0 early_drop=0 error=0 
search_restart=0
  cpu=3   searched=1348502 found=6097345 new=4093 invalid=0 ignore=0 
delete=23200 delete_list=23173 insert=4066 insert_failed=27 drop=0 early_drop=0 
error=0 search_restart=0
  cpu=4   searched=1068516 found=5398514 new=3299 invalid=0 ignore=0 
delete=14144 delete_list=14126 insert=3281 insert_failed=18 drop=0 early_drop=0 
error=0 search_restart=0
  cpu=5   searched=2280948 found=9908854 new=6770 invalid=0 ignore=0 
delete=17224 delete_list=17185 insert=6731 insert_failed=39 drop=0 early_drop=0 
error=0 search_restart=0
  cpu=6   searched=1123341 found=5264368 new=9749 invalid=0 ignore=0 
delete=17272 delete_list=17247 insert=9724 insert_failed=25 drop=0 early_drop=0 
error=0 search_restart=0
  cpu=7   searched=1553934 found=7234262 new=8734 invalid=0 ignore=0 
delete=15658 delete_list=15634 insert=8710 insert_failed=24 drop=0 early_drop=0 
error=0 search_restart=0

  This might be a generic problem with conntrack and linux. 
  We suspect that we encounter the following "limitation / bug" in the kernel:
  
https://github.com/torvalds/linux/blob/24de3d377539e384621c5b8f8f8d8d01852dddc8/net/netfilter/nf_nat_core.c#L290-L291
   
  There seems to be a workaround to alleviate this behavior by setting the 
--random-fully flag in iptables. Unfortunately, this is only available since 
iptables 1.6.2.

  Also this is not currently 

[Yahoo-eng-team] [Bug 1804136] Re: Industry Standard approach for DVR E/W routing issue of port/mac movement by vlan based mac learning

2018-11-19 Thread Swaminathan Vasudevan
So if I understand your recommendation correctly, are you suggesting that we
completely ignore the HOST MAC change that we make today and just use
VLAN+MAC learning, so that the packets will leave the host with their
own MAC?

What will happen to the MAC learning in the intermediate physical
switches that connect the hosts?

** Changed in: neutron
   Status: New => Opinion

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1804136

Title:
  Industry Standard approach for DVR E/W routing issue of port/mac
  movement by vlan based mac learning

Status in neutron:
  Opinion

Bug description:
  Problem statement:

  In the current implementation of DVR E/W Routing when the DVR instance
  having same mac running in multiple compute node will create mac
  movement in the br-int bridge. The way we addressed this issue doesn't
  follow any l2/l3 standard. I am proposing a simpler solution for this.

  Proposal: Keep br-int as vlan+mac based learning switch. And, set DVR
  port connected with br-int as tagged.

  Scenario: Please refer https://assafmuller.com/2015/04/ for a
  diagrammatic view. Say, blue host running in left compute node trying
  to reach orange host running in right compute node. Both the compute
  node running DVR and do E/W routing.  Blue host subnet vlan is 10, and
  Orange host subnet vlan is 20.

  Packet Forwarding:

  1. When vlan based mac learning happens in both br-int bridges, there will be 
two entries with the same DVR mac, one with vlan 10 and the other with vlan 20. 
Thus no mac-movement issue will arise.
   
  2. When a packet sent by the blue host with vlan 10 reaches the left DVR, it 
will route the packet and send it out with vlan 20 to the Orange host.

  3. br-int on the right side will also have two mac entries for the same
  MAC, one for vlan 10 and another for vlan 20.

  4. Since the DVR is connected to both vlans, packets from the
  blue/orange host only have to hop through the DVR on their own compute node.
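
  For what it's worth, the per-VLAN learning table that this proposal relies on can
  be inspected on a compute node with OVS's fdb dump (bridge name as used above):

  $ sudo ovs-appctl fdb/show br-int

  With vlan+mac based learning the same DVR MAC would legitimately show up once per
  VLAN in this table instead of "moving" between ports.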

  Please review this proposal: will it work and simplify the DVR E/W
  routing?

  Thanks
  Subbu
  iimksu...@gmail.com

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1804136/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1606741] Re: Metadata service for instances is unavailable when the l3-agent on the compute host is dvr_snat mode

2018-10-26 Thread Swaminathan Vasudevan
** Changed in: neutron
   Status: In Progress => Confirmed

** Changed in: neutron
   Status: Confirmed => Won't Fix

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1606741

Title:
  Metadata service for instances is unavailable when the l3-agent on the
  compute host  is dvr_snat mode

Status in neutron:
  Won't Fix

Bug description:
  In my mitaka environment, there are five nodes here, including
  controller, network1, network2, computer1, computer2 node. I start
  l3-agents with dvr_snat mode in all network and compute nodes and set
  enable_metadata_proxy to true in l3-agent.ini. It works well for most
  neutron services except the metadata proxy service. When I run the command
  "curl http://169.254.169.254" in an instance booting from cirros, it
  returns "curl: couldn't connect to host" and the instance can't fetch
  metadata in its first booting.

  * Pre-conditions: start l3-agent with dvr_snat mode in all computer
  and network nodes and set enable_metadata_proxy to true in
  l3-agent.ini.

  * Step-by-step reproduction steps:
  1.create a network and a subnet under this network;
  2.create a router;
  3.add the subnet to the router
  4.create an instance with cirros (or other images) on this subnet
  5.open the console for this instance and run command 'curl 
http://169.254.169.254' in bash, waiting for result.

  * Expected output: this command should return the true metadata info
  with the command  'curl http://169.254.169.254'

  * Actual output:  the command actually returns "curl: couldn't connect
  to host"

  * Version:
    ** Mitaka
    ** All hosts are centos7

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1606741/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1797037] [NEW] Extra routes configured on routers are not set in the router namespace and snat namespace with DVR-HA routers

2018-10-10 Thread Swaminathan Vasudevan
Public bug reported:

When DVR routers are configured for HA and if we try to add an extra
route to the DVR routers, the extra route is not set in the router
namespace or in the snat namespace.

Configure for HA and DVR
1. Create Router
2. Attach Interface
3. Try to add an extra route with destination and nexthop.
4. You can see the routes in the router dict, but it is missing in the router 
namespace on the 'dvr-snat' node.
The routes are handled properly on the compute nodes that are running as 'dvr' 
or 'dvr_no_external' agent modes.
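
A hedged illustration of the check, with placeholder values, using the classic
neutron CLI route syntax:

 $ neutron router-update <router> --routes type=dict list=true destination=<cidr>,nexthop=<ip>
 $ ip netns exec qrouter-<router-id> ip route   # route is missing on the 'dvr_snat' node
 $ ip netns exec snat-<router-id> ip route      # route is missing here as well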

** Affects: neutron
 Importance: Medium
 Status: Confirmed


** Tags: l3-dvr-backlog l3-ha

** Changed in: neutron
   Status: New => Confirmed

** Changed in: neutron
   Importance: Undecided => Medium

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1797037

Title:
  Extra routes configured on routers are not set in the router namespace
  and snat namespace with DVR-HA routers

Status in neutron:
  Confirmed

Bug description:
  When DVR routers are configured for HA and if we try to add an extra
  route to the DVR routers, the extra route is not set in the router
  namespace or in the snat namespace.

  Configure for HA and DVR
  1. Create Router
  2. Attach Interface
  3. Try to add an extra route with destination and nexthop.
  4. You can see the routes in the router dict, but it is missing in the router 
namespace on the 'dvr-snat' node.
  The routes are handled properly on the compute nodes that are running as 
'dvr' or 'dvr_no_external' agent modes.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1797037/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1716782] Re: DVR multinode job has linuxbridge agent mech driver defined

2018-08-29 Thread Swaminathan Vasudevan
** Changed in: neutron
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1716782

Title:
  DVR multinode job has linuxbridge agent mech driver defined

Status in neutron:
  Fix Released

Bug description:
  There's an ML2 port binding error being generated in the DVR multinode
  job.

  http://logs.openstack.org/50/502850/2/check/gate-grenade-dsvm-neutron-
  dvr-multinode-ubuntu-xenial-
  nv/9d7ab88/logs/screen-q-svc.txt.gz#_Sep_12_16_36_00_230807

  DEBUG neutron.plugins.ml2.drivers.mech_agent [None req-a68ce697-14fd-
  497a-9bb6-b55a899a8d54 None None] Port
  1f89c2a7-de09-49f7-a298-37b22d37192c on network
  3608f5e9-c01b-4791-9a18-90461a614fa8 not bound, no agent of type Linux
  bridge agent registered on host ubuntu-xenial-2-node-rax-dfw-10898269
  {{(pid=26134) bind_port
  /opt/stack/new/neutron/neutron/plugins/ml2/drivers/mech_agent.py:102}}

  ERROR neutron.plugins.ml2.managers [None req-a68ce697-14fd-497a-
  9bb6-b55a899a8d54 None None] Failed to bind port
  1f89c2a7-de09-49f7-a298-37b22d37192c on host ubuntu-xenial-2-node-rax-
  dfw-10898269 for vnic_type normal using segments []

  This is happening because by default, devstack enables the linuxbridge
  mechanism driver when DVR mode != legacy, even though the linuxbridge
  agent doesn't work with DVR.

  Devstack needs to change to not enable the driver in this case.
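
  For context, a sketch of the ML2 configuration the fix moves the job towards on
  DVR setups (file path and exact driver list are assumptions):

  # /etc/neutron/plugins/ml2/ml2_conf.ini (assumed location)
  [ml2]
  mechanism_drivers = openvswitch,l2population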

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1716782/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1779194] [NEW] neutron-lbaas haproxy agent, when configured with allow_automatic_lbaas_agent_failover = True, after failover, when the failed agent restarts or reconnects to Rabb

2018-06-28 Thread Swaminathan Vasudevan
Public bug reported:

When we configure two or more lbaas haproxy agents with high
availability by setting allow_automatic_lbaas_agent_failover to
True for failover, the LBaaS fails over to an available active
agent, either when the agent is not responsive or when the agent has
lost its connection with RabbitMQ.

This works exactly as per the expectation.

But when the dead agent comes back up and tries to re-sync its
state with the server, the agent finds that the LBaaS configured or
associated with that agent is an 'Orphan' and tries to clean up the
Orphan LBaaS.

In the process of cleaning it up, it tries to unplug the VIF port, which
affects the other agent that is hosting the LBaaS.

When the VIF port is unplugged, the port device_owner changes and it
causes other issues.

So there should be a check before the VIF port is removed, to make sure
there is no active agent still using the port. If there is, the VIF port
should not be unplugged.
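
For reference, the failover behaviour described above is controlled by this option on
the neutron server side (the exact configuration file is an assumption; it may be
neutron.conf or the neutron-lbaas service configuration):

 [DEFAULT]
 allow_automatic_lbaas_agent_failover = True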

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: lbaas

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1779194

Title:
  neutron-lbaas haproxy agent, when configured with
  allow_automatic_lbaas_agent_failover = True,  after failover, when the
  failed agent restarts or reconnects to RabbitMQ, it tries to unplug
  the vif port without checking if it is used by other agent

Status in neutron:
  New

Bug description:
  When we configure two or more lbaas haproxy agents with high
  availability by setting allow_automatic_lbaas_agent_failover to
  True for failover, the LBaaS fails over to an available active
  agent, either when the agent is not responsive or when the agent has
  lost its connection with RabbitMQ.

  This works exactly as per the expectation.

  But when the dead agent comes back up and tries to re-sync
  its state with the server, the agent finds that the LBaaS configured or
  associated with that agent is an 'Orphan' and tries to clean up the
  Orphan LBaaS.

  In the process of cleaning it up, it tries to unplug the VIF port,
  which affects the other agent that is hosting the LBaaS.

  When the VIF port is unplugged, the port device_owner changes and it
  causes other issues.

  So there should be a check before the VIF port is removed, to make
  sure there is no active agent still using the port. If there is, the VIF
  port should not be unplugged.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1779194/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1778643] [NEW] DVR: Fip gateway port is tagged as DEAD port by OVS when external_bridge is configured

2018-06-26 Thread Swaminathan Vasudevan
Public bug reported:

When external bridge is configured in Neutron, the FIP Agent Gateway
port 'fg-' is tagged as a DEAD port with Vlan id of 4095.

This issue is seen in Pike.

It seems that there was a fix that recently merged in neutron, shown below:
https://review.openstack.org/#/c/564825/10

Based on this patch, the 4095 vlan tag for the 'qg-' is removed when
external bridge is configured. But it has not been handled for the DVR
FIP agent gateway port.

So we are seeing the port as DEAD always, when external bridge such as
'br-vlan1087' is configured.

Bridge "br-vlan1087"
Port "br-vlan1087"
Interface "br-vlan1087"
type: internal
Port "vlan1087"
Interface "vlan1087"
Port "fg-0a4a425d-d5"
tag: 4095
Interface "fg-0a4a425d-d5"
type: internal
ovs_version: "2.7.0"
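
The tag can also be read back directly from OVSDB (port name taken from the output
above):

 $ sudo ovs-vsctl get Port fg-0a4a425d-d5 tag
 4095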

** Affects: neutron
 Importance: High
 Status: Confirmed


** Tags: l3-dvr-backlog

** Changed in: neutron
   Status: New => Confirmed

** Changed in: neutron
   Importance: Undecided => High

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1778643

Title:
  DVR: Fip gateway port is tagged as DEAD port by OVS when
  external_bridge is configured

Status in neutron:
  Confirmed

Bug description:
  When external bridge is configured in Neutron, the FIP Agent Gateway
  port 'fg-' is tagged as a DEAD port with Vlan id of 4095.

  This issue is seen in Pike.

  It seems that there was a fix that recently merged in neutron, shown below:
  https://review.openstack.org/#/c/564825/10

  Based on this patch, the 4095 vlan tag for the 'qg-' is removed when
  external bridge is configured. But it has not been handled for the DVR
  FIP agent gateway port.

  So we are seeing the port as DEAD always, when external bridge such as
  'br-vlan1087' is configured.

  Bridge "br-vlan1087"
  Port "br-vlan1087"
  Interface "br-vlan1087"
  type: internal
  Port "vlan1087"
  Interface "vlan1087"
  Port "fg-0a4a425d-d5"
  tag: 4095
  Interface "fg-0a4a425d-d5"
  type: internal
  ovs_version: "2.7.0"

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1778643/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1776984] [NEW] DVR: Self recover from the loss of 'fg' ports in FIP Namespace

2018-06-14 Thread Swaminathan Vasudevan
Public bug reported:

Sometimes we have seen the 'fg' ports within the fip-namespace either go 
down, not get created in time, or get deleted due to some race conditions.
When this happens, the code tries to recover itself after a couple of exceptions 
when there is a router_update message.

But after recovery we could see that the fip-namespace is recreated and
the 'fg-' port is plugged in and active, but the 'fpr' and the 'rfp'
ports are missing which leads to the FloatingIP failure.

So we need to fix this issue, if this happens, then it should check for
all the ports within the 'fipnamespace' and recreate the necessary
plumbing.

Here is the error log we have been seeing when the 'fg' port was
missing.

http://paste.openstack.org/show/723505/
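
A hedged way to spot the half-recovered state (the namespace names below are
placeholders) is to list the devices on both ends of the veth pair; the 'fg-' device
is present while the 'fpr-'/'rfp-' pair is not:

 $ ip netns exec fip-<ext-net-id> ip -o link | grep -E 'fg-|fpr-'
 $ ip netns exec qrouter-<router-id> ip -o link | grep 'rfp-'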

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: l3-dvr-backlog pike-backport-potential queens-backport-potential

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1776984

Title:
  DVR: Self recover from the loss of 'fg' ports in FIP Namespace

Status in neutron:
  New

Bug description:
  Sometimes we have seen the 'fg' ports within the fip-namespace either go 
down, not get created in time, or get deleted due to some race conditions.
  When this happens, the code tries to recover itself after a couple of 
exceptions when there is a router_update message.

  But after recovery we could see that the fip-namespace is recreated
  and the 'fg-' port is plugged in and active, but the 'fpr' and the
  'rfp' ports are missing which leads to the FloatingIP failure.

  So we need to fix this issue, if this happens, then it should check
  for all the ports within the 'fipnamespace' and recreate the necessary
  plumbing.

  Here is the error log we have been seeing when the 'fg' port was
  missing.

  http://paste.openstack.org/show/723505/

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1776984/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1776566] [NEW] DVR: FloatingIP create throws an error if the L3 agent is not running in the given host

2018-06-12 Thread Swaminathan Vasudevan
Public bug reported:

FloatingIP creation throws an error if the L3 agent is not running on the
given host for DVR routers.
This can be reproduced as follows:
1. Configure the global router settings so that routers default to 'legacy' (CVR) routers.
2. Then create a DVR router by manually setting '--distributed=True' from the
CLI.
3. Create a network
4. Create a Subnet
5. Attach the subnet to the DVR router
6. Configure the Gateway for the Router.
7. Then create a VM on the created Subnet
8. Now create a FloatingIP and associate it with the VM port.
9. You would see an 'Internal Server Error' while creating the FloatingIP.

~/devstack$ neutron floatingip-associate 1cafc567-c6fc-4424-9c44-ab7d90bc6ce0 
5c95fa16-a8cc-4d93-8f31-988f692e01ae
neutron CLI is deprecated and will be removed in the future. Use openstack CLI 
instead.
Request Failed: internal server error while processing your request.


The reason is that, before creating the 'FloatingIP Agent Gateway Port', the
server looks up the L3 agent on the given host, and it raises an exception
because the agent is not running on that compute host.

This is basically a test error, but we should still handle the condition
gracefully instead of returning an Internal Server Error.
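
A hedged sketch of the kind of handling that would avoid the 500. The method
name 'create_fip_agent_gw_port_if_not_exists' is quoted from the DVR code
path from memory, and the idea is simply to translate the agent-lookup
failure into a client-visible error; the exact exception raised by the
server may differ:

from neutron_lib import exceptions as n_exc

def create_fip_agent_gw_port_safe(plugin, context, network_id, host):
    """Create the FIP agent gateway port only if the lookup on host succeeds."""
    try:
        return plugin.create_fip_agent_gw_port_if_not_exists(
            context, network_id, host)
    except Exception as exc:  # e.g. an AgentNotFoundByTypeHost-style error
        # Surface a 4xx-style error instead of an internal server error.
        raise n_exc.BadRequest(
            resource='floatingip',
            msg='No L3 agent available on host %s: %s' % (host, exc))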

** Affects: neutron
 Importance: Low
 Status: Confirmed


** Tags: l3-dvr-backlog

** Changed in: neutron
   Status: New => Confirmed

** Changed in: neutron
   Importance: Undecided => Critical

** Changed in: neutron
   Importance: Critical => High

** Changed in: neutron
   Importance: High => Low

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1776566

Title:
  DVR: FloatingIP create throws an error if the L3 agent is not running
  in the given host

Status in neutron:
  Confirmed

Bug description:
  FloatingIP creation throws an error if the L3 agent is not running on the
given host for DVR routers.
  This can be reproduced as follows:
  1. Configure the global router settings so that routers default to 'legacy' (CVR) routers.
  2. Then create a DVR router by manually setting '--distributed=True' from
the CLI.
  3. Create a network
  4. Create a Subnet
  5. Attach the subnet to the DVR router
  6. Configure the Gateway for the Router.
  7. Then create a VM on the created Subnet
  8. Now create a FloatingIP and associate it with the VM port.
  9. You would see an 'Internal Server Error' while creating the FloatingIP.

  ~/devstack$ neutron floatingip-associate 1cafc567-c6fc-4424-9c44-ab7d90bc6ce0 
5c95fa16-a8cc-4d93-8f31-988f692e01ae
  neutron CLI is deprecated and will be removed in the future. Use openstack 
CLI instead.
  Request Failed: internal server error while processing your request.

  
  The reason is that, before creating the 'FloatingIP Agent Gateway Port',
the server looks up the L3 agent on the given host, and it raises an
exception because the agent is not running on that compute host.

  This is basically a test error, but we should still handle the condition
  gracefully instead of returning an Internal Server Error.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1776566/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1774463] [NEW] RFE: Add support for IPv6 on DVR Routers for the Fast-path exit

2018-05-31 Thread Swaminathan Vasudevan
Public bug reported:

This RFE is to add support for IPv6 on DVR routers for the Fast-Path-Exit.
Today DVR supports Fast-Path-Exit through the FIP namespace, but the FIP
namespace does not support IPv6 link-local addresses and no RA proxy is
enabled in it.
This RFE should address those issues.

1. Update the link-local addresses for the 'rfp' and 'fpr' ports to support
both IPv4 and IPv6.
2. Enable an RA proxy in the FIP namespace and also assign an IPv6 address to
the FIP gateway port (see the sketch after this list).
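
A minimal sketch of what step 2 could look like at the namespace level,
assuming the standard 'ip' tooling and the 'proxy_ndp' sysctl; the namespace,
device and address values are purely illustrative:

import subprocess

def enable_ipv6_fast_path(fip_ns, fg_dev, ipv6_addr_cidr):
    """Illustrative only: add an IPv6 address to the FIP gateway port and
    enable NDP proxying inside the FIP namespace."""
    base = ["ip", "netns", "exec", fip_ns]
    # Assign an IPv6 address to the 'fg-' gateway port.
    subprocess.check_call(base + ["ip", "-6", "addr", "add",
                                  ipv6_addr_cidr, "dev", fg_dev])
    # Let the namespace proxy neighbour discovery for the routed prefixes.
    subprocess.check_call(base + ["sysctl", "-w",
                                  "net.ipv6.conf.%s.proxy_ndp=1" % fg_dev])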

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: l3-dvr-backlog

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1774463

Title:
  RFE: Add support for IPv6 on DVR Routers for the Fast-path exit

Status in neutron:
  New

Bug description:
  This RFE is to add support for IPv6 on DVR routers for the Fast-Path-Exit.
  Today DVR supports Fast-Path-Exit through the FIP namespace, but the FIP
namespace does not support IPv6 link-local addresses and no RA proxy is
enabled in it.
  This RFE should address those issues.

  1. Update the link-local addresses for the 'rfp' and 'fpr' ports to support
both IPv4 and IPv6.
  2. Enable an RA proxy in the FIP namespace and also assign an IPv6 address
to the FIP gateway port.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1774463/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1774459] [NEW] RFE: Update permanent ARP entries for allowed_address_pair IPs in DVR Routers

2018-05-31 Thread Swaminathan Vasudevan
Public bug reported:

We have a long-standing issue with allowed_address_pairs IPs associated with
unbound ports and DVR routers.
The ARP entry for the allowed_address_pair IP does not change based on the
GARP issued by a keepalived instance.

Since DVR updates the ARP table through the control plane, and does not
allow any ARP traffic to leave the node (to prevent the router IP/MAC from
polluting the network), there has always been an issue with this.

A recent patch in master https://review.openstack.org/#/c/550676/ to
address this issue was not successful.

This patch helped update the ARP entry dynamically from the GARP message,
but the entry has to be temporary (NUD state 'reachable'). Only when it is
set to 'reachable' were we able to update it on the fly from the GARP
message without using any external tools.

The problem is that when VMs reside in two different subnets (Subnet A and
Subnet B), and a VM from Subnet B on a different, isolated node tries to
ping the VRRP IP in Subnet A, the packet from the VM reaches the router
namespace where the ARP entry for the VRRP IP is present as 'reachable'.
While the entry is reachable the VM can send a couple of pings, but within
about 15 seconds the pings time out.

The reason is that the router then tries to verify whether the IP/MAC
combination for the VRRP IP is still valid, since the entry in the ARP table
is "REACHABLE" and not "PERMANENT".
When it re-ARPs for the IP, the ARP requests are blocked by the DVR flow
rules in br-tun, so the ARP times out and the ARP entry in the router
namespace becomes incomplete.

Option A:
One way to address this situation is to run a GARP sniffer tool/utility in
the router namespace that filters on a specific IP. If that IP is seen in a
GARP message, the tool resets the ARP entry for the VRRP IP to permanent (a
rough sketch follows below). This is likely performance intensive, so it
should probably be made configurable so that operators enable it only if
required.
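
A very rough sketch of such a sniffer (Option A), assuming Scapy for the
capture and the standard 'ip neigh replace ... nud permanent' command to pin
the entry; namespace, device and interface names are illustrative:

import subprocess
from scapy.all import ARP, sniff  # assumes scapy is available on the node

def pin_arp_entry(namespace, ip, mac, device):
    """Force a permanent ARP entry for ip/mac inside the router namespace."""
    subprocess.check_call(
        ["ip", "netns", "exec", namespace, "ip", "neigh", "replace",
         ip, "lladdr", mac, "dev", device, "nud", "permanent"])

def watch_for_garp(vrrp_ip, namespace, qr_device, sniff_iface):
    """Sniff GARP messages for vrrp_ip and pin the advertised MAC."""
    def _handle(pkt):
        # Gratuitous ARP: sender and target IP both equal the VRRP address.
        if ARP in pkt and pkt[ARP].psrc == vrrp_ip == pkt[ARP].pdst:
            pin_arp_entry(namespace, vrrp_ip, pkt[ARP].hwsrc, qr_device)
    sniff(iface=sniff_iface, filter="arp", prn=_handle, store=False)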

Option B:
The other option is, instead of running the sniffer on all nodes and in
every router namespace, to run it only in the network node's router
namespace (or on the network node host) and then notify neutron that the
IP/MAC mapping has changed; neutron would then tell all hosts to update the
ARP entry for the given IP/MAC. (Just an idea; not sure how simple it is
compared to the former.)


Any ideas or thoughts would be helpful.

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: l3-dvr-backlog

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1774459

Title:
  RFE: Update permanent ARP entries for allowed_address_pair IPs in DVR
  Routers

Status in neutron:
  New

Bug description:
  We have a long-standing issue with allowed_address_pairs IPs associated
with unbound ports and DVR routers.
  The ARP entry for the allowed_address_pair IP does not change based on the
GARP issued by a keepalived instance.

  Since DVR updates the ARP table through the control plane, and does not
  allow any ARP traffic to leave the node (to prevent the router IP/MAC from
  polluting the network), there has always been an issue with this.

  A recent patch in master https://review.openstack.org/#/c/550676/ to
  address this issue was not successful.

  This patch helped update the ARP entry dynamically from the GARP message,
  but the entry has to be temporary (NUD state 'reachable'). Only when it is
  set to 'reachable' were we able to update it on the fly from the GARP
  message without using any external tools.

  The problem is that when VMs reside in two different subnets (Subnet A and
  Subnet B), and a VM from Subnet B on a different, isolated node tries to
  ping the VRRP IP in Subnet A, the packet from the VM reaches the router
  namespace where the ARP entry for the VRRP IP is present as 'reachable'.
  While the entry is reachable the VM can send a couple of pings, but within
  about 15 seconds the pings time out.

  The reason is that the router then tries to verify whether the IP/MAC
combination for the VRRP IP is still valid, since the entry in the ARP table
is "REACHABLE" and not "PERMANENT".
  When it re-ARPs for the IP, the ARP requests are blocked by the DVR flow
rules in br-tun, so the ARP times out and the ARP entry in the router
namespace becomes incomplete.

  Option A:
  So the way to address this situation is to make use of some GARP sniffer 
tool/utility that would be running in the router namespace to sniff a GARP 
packet with a specific IP as a filter. If that IP is seen in the GARP 

[Yahoo-eng-team] [Bug 1768919] [NEW] PCI-Passthrough fails when we have Flavor configured and provide a port with vnic_type=direct-physical

2018-05-03 Thread Swaminathan Vasudevan
Public bug reported:

PCI passthrough of a NIC device to a VM fails when the flavor is configured
with a PCI alias and a network port with 'vnic_type=direct-physical' is also
provided.


The comment in the source code linked below explains where the requests come from:

https://github.com/openstack/nova/blob/644ac5ec37903b0a08891cc403c8b3b63fc2a91c/nova/compute/api.py#L812
# PCI requests come from two sources: instance flavor and
# requested_networks. The first call in below returns an
# InstancePCIRequests object which is a list of InstancePCIRequest
# objects. The second call in below creates an InstancePCIRequest
# object for each SR-IOV port, and append it to the list in the
# InstancePCIRequests object

In this case there are two PCI requests for the same device, and _test_pci
fails when the compute node checks the claims.

088d81f6653242318245b137b1ef91c7] _test_pci 
/opt/stack/venv/nova-20180424T164716Z/lib/python2.7/site-packages/nova/compute/claims.py:201
2018-04-30 22:17:06.058 13396 DEBUG nova.compute.claims 
[req-c7689c16-227a-462e-aad5-4c462036051c df7bd0a08ee64da981574d7a7d76970a 
088d81f6653242318245b137b1ef91c7] pci requests: 
[InstancePCIRequest(alias_name='intel10fb',count=1,is_new=False,request_id=None,spec=[{dev_type='type-PF',product_id='10fb',vendor_id='8086'}]),
 
InstancePCIRequest(alias_name=None,count=1,is_new=False,request_id=13befe5f-478f-4f4c-aa72-78cce84d942d,spec=[{dev_type='type-PF',physical_network='physnet2'}])]
 _test_pci 
/opt/stack/venv/nova-20180424T164716Z/lib/python2.7/site-packages/nova/compute/claims.py:202
2018-04-30 22:17:06.059 13396 DEBUG nova.compute.claims 
[req-c7689c16-227a-462e-aad5-4c462036051c df7bd0a08ee64da981574d7a7d76970a 
088d81f6653242318245b137b1ef91c7] PCI request stats failed  _test_pci 
/opt/stack/venv/nova-20180424T164716Z/lib/python2.7/site-packages/nova/compute/claims.py:206
2018-04-30 22:17:06.059 13396 DEBUG oslo_concurrency.lockutils 
[req-c7689c16-227a-462e-aad5-4c462036051c df7bd0a08ee64da981574d7a7d76970a 
088d81f6653242318245b137b1ef91c7] Lock "compute_resources" released by 
"nova.compute.resource_tracker.instance_claim" :: held 0.059s inner 
/opt/stack/venv/nova-20180424T164716Z/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:282
2018-04-30 22:17:06.060 13396 DEBUG nova.compute.manager 
[req-c7689c16-227a-462e-aad5-4c462036051c df7bd0a08ee64da981574d7a7d76970a 
088d81f6653242318245b137b1ef91c7] [instance: 
39ad3a47-66dc-4114-9653-fee5ee0c87dc] Insufficient compute resources: Claim pci 
failed.. 

It is not clear why the PCI claim fails when the same device ends up being
requested twice.

If the device requested via the flavor alias and via the network port is the
same, the two requests should probably be collapsed into a single entry
since they are identical.
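
A hedged sketch of the kind of de-duplication being suggested. The requests
are represented here as plain dicts rather than InstancePCIRequest objects,
and the matching rule (identical spec) is an assumption; a real fix would
have to decide whether the flavor alias request and the SR-IOV port request
truly target the same device:

def collapse_duplicate_pci_requests(requests):
    """Drop PCI requests whose device spec duplicates an earlier request."""
    seen = set()
    collapsed = []
    for req in requests:
        # Normalise the spec (a list of dicts) into something hashable.
        key = tuple(sorted(
            (k, v) for spec in req["spec"] for k, v in sorted(spec.items())))
        if key not in seen:
            seen.add(key)
            collapsed.append(req)
    return collapsed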

** Affects: nova
 Importance: Undecided
 Status: New


** Tags: pci

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1768919

Title:
  PCI-Passthrough fails when we have Flavor configured and provide a
  port with vnic_type=direct-physical

Status in OpenStack Compute (nova):
  New

Bug description:
  PCI passthrough of a NIC device to a VM fails when the flavor is
  configured with a PCI alias and a network port with
  'vnic_type=direct-physical' is also provided.

  
  The comment in the source code linked below explains where the requests come from:

  
https://github.com/openstack/nova/blob/644ac5ec37903b0a08891cc403c8b3b63fc2a91c/nova/compute/api.py#L812
  # PCI requests come from two sources: instance flavor and
  # requested_networks. The first call in below returns an
  # InstancePCIRequests object which is a list of InstancePCIRequest
  # objects. The second call in below creates an InstancePCIRequest
  # object for each SR-IOV port, and append it to the list in the
  # InstancePCIRequests object

  In this case there are two PCI requests for the same device, and _test_pci
  fails when the compute node checks the claims.

  088d81f6653242318245b137b1ef91c7] _test_pci 
/opt/stack/venv/nova-20180424T164716Z/lib/python2.7/site-packages/nova/compute/claims.py:201
  2018-04-30 22:17:06.058 13396 DEBUG nova.compute.claims 
[req-c7689c16-227a-462e-aad5-4c462036051c df7bd0a08ee64da981574d7a7d76970a 
088d81f6653242318245b137b1ef91c7] pci requests: 
[InstancePCIRequest(alias_name='intel10fb',count=1,is_new=False,request_id=None,spec=[{dev_type='type-PF',product_id='10fb',vendor_id='8086'}]),
 
InstancePCIRequest(alias_name=None,count=1,is_new=False,request_id=13befe5f-478f-4f4c-aa72-78cce84d942d,spec=[{dev_type='type-PF',physical_network='physnet2'}])]
 _test_pci 
/opt/stack/venv/nova-20180424T164716Z/lib/python2.7/site-packages/nova/compute/claims.py:202
  2018-04-30 22:17:06.059 13396 DEBUG nova.compute.claims 
[req-c7689c16-227a-462e-aad5-4c462036051c df7bd0a08ee64da981574d7a7d76970a 
088d81f6653242318245b137b1ef91c7] PCI request stats 

[Yahoo-eng-team] [Bug 1768917] [NEW] PCI-Passthrough documentation is incorrect while trying to pass through a NIC

2018-05-03 Thread Swaminathan Vasudevan
Public bug reported:

As per the documentation shown below

https://docs.openstack.org/nova/pike/admin/pci-passthrough.html

In order to achieve PCI passthrough of a network device, it states that we
should create a flavor that references the alias and then use that flavor
when creating the server.

Steps to follow:

Create an Alias:
[pci]
alias = { "vendor_id":"8086", "product_id":"154d", "device_type":"type-PF", 
"name":"a1" }

Create a Flavor:
[pci]
alias = { "vendor_id":"8086", "product_id":"154d", "device_type":"type-PF", 
"name":"a1" }

Add a whitelist:
[pci]
passthrough_whitelist = { "address": ":41:00.0" }

Create a Server with the Flavor:

# openstack server create --flavor m1.large --image
cirros-0.3.5-x86_64-uec --wait test-pci


With the above command, the VM creation errors out and we see a 
PortBindingFailure.

The reason for the PortBindingFailure is that 'vif_type' is always set to
'BINDING_FAILED'.

This happens because the flavor carries no 'vnic_type=direct-physical'
information, and without it the SR-IOV mechanism driver is unable to bind
the port.

It is not clear whether there is any way to specify this information via the
flavor alone; one possible workaround is sketched below.
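
One possible workaround (not taken from the documentation above, just a
sketch using openstacksdk) is to pre-create a port with the right vnic_type
and boot the server with that port instead of relying on the flavor alias
alone; the cloud, network and server names are illustrative:

import openstack

conn = openstack.connect(cloud="mycloud")  # assumes a clouds.yaml entry

network = conn.network.find_network("physnet2-net")  # illustrative name
port = conn.network.create_port(
    network_id=network.id,
    binding_vnic_type="direct-physical",  # what the SR-IOV/PF driver needs
)

server = conn.compute.create_server(
    name="test-pci",
    flavor_id=conn.compute.find_flavor("m1.large").id,
    image_id=conn.compute.find_image("cirros-0.3.5-x86_64-uec").id,
    networks=[{"port": port.id}],
)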

** Affects: nova
 Importance: Undecided
 Status: New


** Tags: pci

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1768917

Title:
  PCI-Passthrough documentation is incorrect while trying to pass
  through a NIC

Status in OpenStack Compute (nova):
  New

Bug description:
  As per the documentation shown below

  https://docs.openstack.org/nova/pike/admin/pci-passthrough.html

  In order to achieve PCI passthrough of a network device, it states that we
  should create a flavor that references the alias and then use that flavor
  when creating the server.

  Steps to follow:

  Create an Alias:
  [pci]
  alias = { "vendor_id":"8086", "product_id":"154d", "device_type":"type-PF", 
"name":"a1" }

  Create a Flavor:
  [pci]
  alias = { "vendor_id":"8086", "product_id":"154d", "device_type":"type-PF", 
"name":"a1" }

  Add a whitelist:
  [pci]
  passthrough_whitelist = { "address": ":41:00.0" }

  Create a Server with the Flavor:

  # openstack server create --flavor m1.large --image
  cirros-0.3.5-x86_64-uec --wait test-pci

  
  With the above command, the VM creation errors out and we see a 
PortBindingFailure.

  The reason for the PortBindingFailure is that 'vif_type' is always set to
  'BINDING_FAILED'.

  This happens because the flavor carries no 'vnic_type=direct-physical'
  information, and without it the SR-IOV mechanism driver is unable to bind
  the port.

  It is not clear whether there is any way to specify this information via
  the flavor alone.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1768917/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1761260] [NEW] DVR: Add a check for the item_allocator IP before trying to release it, since we see a KeyError sometimes, when the item is not there anymore.

2018-04-04 Thread Swaminathan Vasudevan
Public bug reported:

We have seen this traceback in a Pike-based installation while trying to
clean up a gateway with DVR routers.

2018-04-03 20:30:10.081 9672 DEBUG neutron.agent.l3.dvr_fip_ns [-] Delete FIP 
link interfaces for router: e415276a-4f37-4ee0-ba48-12d3909153c7 
delete_rtr_2_fip_link 
/opt/stack/venv/neutron-20180328T152147Z/lib/python2.7/site-pac
kages/neutron/agent/l3/dvr_fip_ns.py:364
2018-04-03 20:30:10.082 9672 DEBUG neutron.agent.linux.utils [-] Running 
command (rootwrap daemon): ['ip', 'netns', 'exec', 
'qrouter-e415276a-4f37-4ee0-ba48-12d3909153c7', 'ip', '-o', 'link', 'show', 
'rfp-e415276a-4'] execute_ro
otwrap_daemon 
/opt/stack/venv/neutron-20180328T152147Z/lib/python2.7/site-packages/neutron/agent/linux/utils.py:108
2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info [-] 
u'e415276a-4f37-4ee0-ba48-12d3909153c7': KeyError: 
u'e415276a-4f37-4ee0-ba48-12d3909153c7'
2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info Traceback (most 
recent call last):
2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info   File 
"/opt/stack/venv/neutron-20180328T152147Z/lib/python2.7/site-packages/neutron/common/utils.py",
 line 186, in call
2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info return 
func(*args, **kwargs)
2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info   File 
"/opt/stack/venv/neutron-20180328T152147Z/lib/python2.7/site-packages/neutron/agent/l3/router_info.py",
 line 1118, in process_delete
2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info 
self._process_external_on_delete()
2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info   File 
"/opt/stack/venv/neutron-20180328T152147Z/lib/python2.7/site-packages/neutron/agent/l3/router_info.py",
 line 890, in _process_external_on_delete
2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info 
self._process_external_gateway(ex_gw_port)
2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info   File 
"/opt/stack/venv/neutron-20180328T152147Z/lib/python2.7/site-packages/neutron/agent/l3/router_info.py",
 line 799, in _process_external_gateway
2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info 
self.external_gateway_removed(self.ex_gw_port, interface_name)
2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info   File 
"/opt/stack/venv/neutron-20180328T152147Z/lib/python2.7/site-packages/neutron/agent/l3/dvr_local_router.py",
 line 513, in external_gateway_removed
2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info 
self.fip_ns.delete_rtr_2_fip_link(self)
2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info   File 
"/opt/stack/venv/neutron-20180328T152147Z/lib/python2.7/site-packages/neutron/agent/l3/dvr_fip_ns.py",
 line 402, in delete_rtr_2_fip_link
2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info 
self.local_subnets.release(ri.router_id)
2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info   File 
"/opt/stack/venv/neutron-20180328T152147Z/lib/python2.7/site-packages/neutron/agent/l3/item_allocator.py",
 line 116, in release
2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info 
self.pool.add(self.allocations.pop(key))
2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info KeyError: 
u'e415276a-4f37-4ee0-ba48-12d3909153c7'
2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info

A check that the key exists before releasing it would probably be a good
idea.

We should also see whether this can be reproduced on the master branch.
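
A minimal sketch of the defensive check being suggested, mirroring the
release() path shown in the traceback (the class layout is simplified):

class ItemAllocator(object):
    """Simplified stand-in for neutron.agent.l3.item_allocator.ItemAllocator."""

    def __init__(self):
        self.allocations = {}   # key (router id) -> allocated item
        self.pool = set()       # items available for reuse

    def release(self, key):
        # Guard against double-release: the KeyError above comes from
        # popping a key that has already been removed.
        if key not in self.allocations:
            return
        self.pool.add(self.allocations.pop(key))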

** Affects: neutron
 Importance: Low
 Status: New


** Tags: l3-dvr-backlog

** Changed in: neutron
   Importance: Undecided => Critical

** Changed in: neutron
   Importance: Critical => Low

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1761260

Title:
  DVR: Add a check for the item_allocator IP before trying to release
  it, since we see a KeyError sometimes, when the item is not there
  anymore.

Status in neutron:
  New

Bug description:
  We have seen this traceback in a Pike-based installation while trying to
  clean up a gateway with DVR routers.

  2018-04-03 20:30:10.081 9672 DEBUG neutron.agent.l3.dvr_fip_ns [-] Delete FIP 
link interfaces for router: e415276a-4f37-4ee0-ba48-12d3909153c7 
delete_rtr_2_fip_link 
/opt/stack/venv/neutron-20180328T152147Z/lib/python2.7/site-pac
  kages/neutron/agent/l3/dvr_fip_ns.py:364
  2018-04-03 20:30:10.082 9672 DEBUG neutron.agent.linux.utils [-] Running 
command (rootwrap daemon): ['ip', 'netns', 'exec', 
'qrouter-e415276a-4f37-4ee0-ba48-12d3909153c7', 'ip', '-o', 'link', 'show', 
'rfp-e415276a-4'] execute_ro
  otwrap_daemon 
/opt/stack/venv/neutron-20180328T152147Z/lib/python2.7/site-packages/neutron/agent/linux/utils.py:108
  2018-04-03 20:30:10.179 9672 ERROR neutron.agent.l3.router_info [-] 

[Yahoo-eng-team] [Bug 1759694] Re: DHCP agent doesn't respawn metadata when enable_isolated_metadata and gateway removed

2018-03-28 Thread Swaminathan Vasudevan
*** This bug is a duplicate of bug 1753540 ***
https://bugs.launchpad.net/bugs/1753540

Cherry-picked to stable/pike
https://review.openstack.org/#/c/557536/

** This bug has been marked a duplicate of bug 1753540
   When isolated metadata is enabled, metadata proxy doesn't get automatically 
started/stopped when needed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1759694

Title:
  DHCP agent doesn't respawn metadata when enable_isolated_metadata and
  gateway removed

Status in neutron:
  New

Bug description:
  Hi,

  We are running Neutron Pike with OVS and DVR.

  When enable_isolated_metadata is True and we remove the gateway port
  for a network from a router, a metadata process is not respawned to
  start serving metadata.

  How to replicate :

  [root@5c1fced0888e /]# openstack network create test_nw
  +---+--+
  | Field | Value|
  +---+--+
  | admin_state_up| UP   |
  | availability_zone_hints   |  |
  | availability_zones|  |
  | created_at| 2018-03-28T21:18:29Z |
  | description   |  |
  | dns_domain|  |
  | id| d19dabb2-f8c8-4608-8387-1f356a9f0f14 |
  | ipv4_address_scope| None |
  | ipv6_address_scope| None |
  | is_default| False|
  | is_vlan_transparent   | None |
  | mtu   | 1500 |
  | name  | test_nw  |
  | port_security_enabled | True |
  | project_id| c053ae2460e741008fa0ea908ae7da8c |
  | provider:network_type | vxlan|
  | provider:physical_network | None |
  | provider:segmentation_id  | 65035|
  | qos_policy_id | None |
  | revision_number   | 2|
  | router:external   | Internal |
  | segments  | None |
  | shared| False|
  | status| ACTIVE   |
  | subnets   |  |
  | tags  |  |
  | updated_at| 2018-03-28T21:18:30Z |
  +---+--+
  [root@5c1fced0888e /]# openstack subnet create --network 
d19dabb2-f8c8-4608-8387-1f356a9f0f14 --subnet-range 10.10.10.0/24 --gateway 
10.10.10.254 test_sn
  +-+--+
  | Field   | Value|
  +-+--+
  | allocation_pools| 10.10.10.1-10.10.10.253  |
  | cidr| 10.10.10.0/24|
  | created_at  | 2018-03-28T21:20:03Z |
  | description |  |
  | dns_nameservers |  |
  | enable_dhcp | True |
  | gateway_ip  | 10.10.10.254 |
  | host_routes |  |
  | id  | 1cd9d1f4-8c43-411b-85db-9514fe7b5e06 |
  | ip_version  | 4|
  | ipv6_address_mode   | None |
  | ipv6_ra_mode| None |
  | name| test_sn  |
  | network_id  | d19dabb2-f8c8-4608-8387-1f356a9f0f14 |
  | project_id  | c053ae2460e741008fa0ea908ae7da8c |
  | revision_number | 0|
  | segment_id  | None |
  | service_types   |  |
  | subnetpool_id   | None |
  | tags|  |
  | updated_at  | 2018-03-28T21:20:03Z |
  | use_default_subnet_pool | None 

[Yahoo-eng-team] [Bug 1758093] [NEW] DVR: RPC error handling missing for get_network_info_for_id

2018-03-22 Thread Swaminathan Vasudevan
Public bug reported:

To avoid exceptions in the L2 agent when it calls 'get_network_info_for_id'
against a server that has not yet been upgraded, we need to handle the error
case where oslo_messaging reports that the API endpoint is not found.
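
A hedged sketch of the handling being proposed on the agent side. The RPC
method name comes from this report, the client plumbing is generic
oslo.messaging usage, and the exact exception raised for a missing endpoint
may differ in practice:

import oslo_messaging
from oslo_log import log as logging

LOG = logging.getLogger(__name__)

def get_network_info_safe(rpc_client, context, network_id):
    """Call get_network_info_for_id, tolerating servers without the API."""
    cctxt = rpc_client.prepare()
    try:
        return cctxt.call(context, 'get_network_info_for_id',
                          network_id=network_id)
    except oslo_messaging.RemoteError as exc:
        # Older servers do not expose this endpoint yet; degrade gracefully.
        LOG.warning("get_network_info_for_id not supported by server: %s", exc)
        return []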

** Affects: neutron
 Importance: Undecided
 Assignee: Swaminathan Vasudevan (swaminathan-vasudevan)
 Status: In Progress


** Tags: l3-dvr-backlog

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1758093

Title:
  DVR: RPC error handling missing for get_network_info_for_id

Status in neutron:
  In Progress

Bug description:
  To avoid exceptions in the L2 agent when it calls
  'get_network_info_for_id' against a server that has not yet been upgraded,
  we need to handle the error case where oslo_messaging reports that the API
  endpoint is not found.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1758093/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1757188] Re: some L3 HA routers does not work

2018-03-22 Thread Swaminathan Vasudevan
** Changed in: neutron
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1757188

Title:
  some L3 HA routers does not work

Status in neutron:
  Invalid

Bug description:
  Pike
  DVR + L3_HA
  L2population enabled

  Some of our L3 HA routers are not working correctly. They are not reachable 
from instances.
  After deep investigation, I've found that "HA port tenant " ports 
are in state DOWN.
  They are DOWN because they don't have binding information.
  They don't have binding information because 'HA network tenant ' 
network is corrupted.

  I mean it does not have provider:network_type and
  provider:segmentation_id parameters set.

  The weird thing is that this network was OK and worked but in some
  point in time has been corrupted. I don't have any logs from this
  point in time.

  For comparison working HA tenant network:

  
+---++
  | Field | Value   
   |
  
+---++
  | admin_state_up| True
   |
  | availability_zone_hints   | 
   |
  | availability_zones| nova
   |
  | created_at| 2018-02-16T16:52:31Z
   |
  | description   | 
   |
  | id| fa2fea5c-ccaa-4116-bb0c-ff59bbd8229a
   |
  | ipv4_address_scope| 
   |
  | ipv6_address_scope| 
   |
  | mtu   | 9000
   |
  | name  | HA network tenant 
afeeb372d7934795b63868330eca0dfe |
  | port_security_enabled | True
   |
  | project_id| 
   |
  | provider:network_type | vxlan   
   |
  | provider:physical_network | 
   |
  | provider:segmentation_id  | 35  
   |
  | revision_number   | 3   
   |
  | router:external   | False   
   |
  | shared| False   
   |
  | status| ACTIVE  
   |
  | subnets   | 5cbc612d-13cf-4889-88fb-02d1debe5f8d
   |
  | tags  | 
   |
  | tenant_id | 
   |
  | updated_at| 2018-02-16T16:52:31Z
   |
  
+---++

  and not working HA tenant network:

  
+---++
  | Field | Value   
   |
  
+---++
  | admin_state_up| True
   |
  | availability_zone_hints   | 
   |
  | availability_zones| 
   |
  | created_at| 2018-01-26T12:24:15Z
   |
  | description   | 
   |
  | id| 6390c381-871e-4945-bfa0-00828bb519bc
   |
  | ipv4_address_scope| 
   |
  | ipv6_address_scope| 
   |
  | mtu   | 9000
   |
  | name  | HA network tenant 
3e88cffb9dbb4e1fba96ee72a02e012e |
  | port_security_enabled | True
   |
  | project_id| 
   |
  | provider:network_type | 
   |
  | provider:physical_network | 
   |
  | provider:segmentation_id  | 
   |
  | revision_number   | 5   
   |
  | 

[Yahoo-eng-team] [Bug 1757495] Re: Using dvr and centralized routers in same network fails

2018-03-22 Thread Swaminathan Vasudevan
** Changed in: neutron
   Status: Incomplete => Invalid

** Changed in: neutron
   Status: Invalid => Opinion

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1757495

Title:
  Using dvr and centralized routers in same network fails

Status in neutron:
  Opinion

Bug description:
  Brief overview and reproducing steps:
  1. Create tenant network, let's say 10.3.2.0/24.
  2. Create centralized HA router. Attach it at 10.3.2.1
  3. Boot VM and ping 10.3.2.1 - works.
  4. Create distributed, no-snat router. Attach it at any free IP, e.g. 10.3.2.5
  5. Try to ping 10.3.2.1 from VM - fails. Ping 10.3.2.5 - works.

  I can reproduce this consistently. The setup might be a bit of a corner case:
   - deployment with openstack kolla. Openvswitch.
   - openstack pike. neutron 11.0.2
   - tenant provider networks are vlan
   - there are 2 neutron nodes to host HA routers
   - all compute nodes configured for DVR

  No errors in logs.
  On the compute node hosting the VM, I can see dropped packages on integration 
bridge br-int.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1757495/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1756406] [NEW] DVR: Fix dvr mac address format to be backward compatible with non native openflow interface

2018-03-16 Thread Swaminathan Vasudevan
Public bug reported:

A DVR MAC address is configured on the server for every node that is set up
to run in one of the DVR agent modes (dvr, dvr_snat and dvr_no_external).

The DVR MAC addresses are stored in the 'AA-BB-CC-DD-EE-FF' format.

When the agent programs the DVR MAC addresses into the OpenFlow rules using
the native OpenFlow interface driver, they are fine.

But when the non-native (ovs-ofctl based) interface driver is used, this
throws an error as shown below.

Unable to execute ['ovs-ofctl', 'add-flows', 'br-vlan1078', '-'].
Exception: Exit code: 1; Stdin:
hard_timeout=0,idle_timeout=0,priority=2,table=3,cookie=12002607947458125225,dl_src=FA-16
-3F-AA-78-20,actions=output:2; Stdout: ; Stderr: ovs-ofctl: -:1: FA-16
-3F-AA-78-20: invalid Ethernet address.

This is also seen in the master branch.
To remain backward compatible, we need a patch that converts the MAC format
before it is handed over to the OpenFlow interface driver.
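
The conversion itself is trivial; a sketch of the normalisation that would
need to happen before the flow is handed to the ovs-ofctl based driver:

def normalize_dvr_mac(mac):
    """Convert the stored 'AA-BB-CC-DD-EE-FF' form into the
    'aa:bb:cc:dd:ee:ff' form that ovs-ofctl accepts."""
    return mac.replace('-', ':').lower()

assert normalize_dvr_mac('FA-16-3F-AA-78-20') == 'fa:16:3f:aa:78:20'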

** Affects: neutron
 Importance: Undecided
 Assignee: Swaminathan Vasudevan (swaminathan-vasudevan)
 Status: In Progress


** Tags: l3-dvr-backlog

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1756406

Title:
  DVR: Fix dvr mac address format to be backward compatible with non
  native openflow interface

Status in neutron:
  In Progress

Bug description:
  A DVR MAC address is configured on the server for every node that is set
  up to run in one of the DVR agent modes (dvr, dvr_snat and
  dvr_no_external).

  The DVR MAC addresses are stored in the 'AA-BB-CC-DD-EE-FF' format.

  When the agent programs the DVR MAC addresses into the OpenFlow rules
  using the native OpenFlow interface driver, they are fine.

  But when the non-native (ovs-ofctl based) interface driver is used, this
  throws an error as shown below.

  Unable to execute ['ovs-ofctl', 'add-flows', 'br-vlan1078', '-'].
  Exception: Exit code: 1; Stdin:
  
hard_timeout=0,idle_timeout=0,priority=2,table=3,cookie=12002607947458125225,dl_src=FA-16
  -3F-AA-78-20,actions=output:2; Stdout: ; Stderr: ovs-ofctl: -:1: FA-16
  -3F-AA-78-20: invalid Ethernet address.

  This is also seen in the master branch.
  To remain backward compatible, we need a patch that converts the MAC
  format before it is handed over to the OpenFlow interface driver.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1756406/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1657981] Re: FloatingIPs not reachable after restart of compute node (DVR)

2018-03-12 Thread Swaminathan Vasudevan
** Changed in: neutron
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1657981

Title:
  FloatingIPs not reachable after restart of compute node (DVR)

Status in neutron:
  Invalid

Bug description:
  I am running OpenStack Newton on Ubuntu 16.04 using DVR. When I restart a
  compute node, the FloatingIPs of the VMs running on this node become
  unreachable. A manual restart of the "neutron-l3-agent" or
  "neutron-vpn-agent" service running on that node solves the issue.

  I think there must be a race condition at startup.

  I get the following error in the neutron-vpn-agent.log:
  2017-01-20 07:04:52.379 2541 INFO neutron.common.config [-] Logging enabled!
  2017-01-20 07:04:52.379 2541 INFO neutron.common.config [-] 
/usr/bin/neutron-vpn-agent version 9.0.0
  2017-01-20 07:04:52.380 2541 WARNING stevedore.named [-] Could not load 
neutron.agent.linux.interface.OVSInterfaceDriver
  2017-01-20 07:04:53.112 2541 WARNING stevedore.named 
[req-a9e10331-51ab-4c67-bfdd-0f6296510594 - - - - -] Could not load 
neutron_fwaas.services.firewall.drivers.linux.iptables_fwaas.IptablesFwaasDriver
  2017-01-20 07:04:53.127 2541 INFO neutron.agent.agent_extensions_manager 
[req-a9e10331-51ab-4c67-bfdd-0f6296510594 - - - - -] Loaded agent extensions: 
['fwaas']
  2017-01-20 07:04:53.128 2541 INFO neutron.agent.agent_extensions_manager 
[req-a9e10331-51ab-4c67-bfdd-0f6296510594 - - - - -] Initializing agent 
extension 'fwaas'
  2017-01-20 07:04:53.163 2541 WARNING oslo_config.cfg 
[req-bdd95fb9-bcd7-473e-a350-3bd8d6be8758 - - - - -] Option 
"external_network_bridge" from group "DEFAULT" is deprecated for removal.  Its 
value may be silently ignored in the future.
  2017-01-20 07:04:53.165 2541 WARNING stevedore.named 
[req-bdd95fb9-bcd7-473e-a350-3bd8d6be8758 - - - - -] Could not load 
neutron_vpnaas.services.vpn.device_drivers.strongswan_ipsec.StrongSwanDriver
  2017-01-20 07:04:53.236 2541 INFO eventlet.wsgi.server [-] (2541) wsgi 
starting up on http:/var/lib/neutron/keepalived-state-change
  2017-01-20 07:04:53.261 2541 INFO neutron.agent.l3.agent [-] Agent has just 
been revived. Doing a full sync.
  2017-01-20 07:04:53.373 2541 INFO neutron.agent.l3.agent [-] L3 agent started
  2017-01-20 07:05:22.832 2541 ERROR neutron.agent.linux.utils [-] Exit code: 
1; Stdin: ; Stdout: ; Stderr: Cannot find device "fg-67afaa06-bb"

  2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info [-] Exit 
code: 1; Stdin: ; Stdout: ; Stderr: Cannot find device "fg-67afaa06-bb"
  2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info Traceback 
(most recent call last):
  2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info   File 
"/usr/lib/python2.7/dist-packages/neutron/common/utils.py", line 239, in call
  2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info return 
func(*args, **kwargs)
  2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info   File 
"/usr/lib/python2.7/dist-packages/neutron/agent/l3/router_info.py", line 1062, 
in process
  2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info 
self.process_external(agent)
  2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info   File 
"/usr/lib/python2.7/dist-packages/neutron/agent/l3/dvr_local_router.py", line 
515, in process_external
  2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info 
self.create_dvr_fip_interfaces(ex_gw_port)
  2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info   File 
"/usr/lib/python2.7/dist-packages/neutron/agent/l3/dvr_local_router.py", line 
546, in create_dvr_fip_interfaces
  2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info 
self.fip_ns.update_gateway_port(fip_agent_port)
  2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info   File 
"/usr/lib/python2.7/dist-packages/neutron/agent/l3/dvr_fip_ns.py", line 239, in 
update_gateway_port
  2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info 
ipd.route.add_gateway(gw_ip)
  2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info   File 
"/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_lib.py", line 702, in 
add_gateway
  2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info 
self._as_root([ip_version], tuple(args))
  2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info   File 
"/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_lib.py", line 373, in 
_as_root
  2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info 
use_root_namespace=use_root_namespace)
  2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info   File 
"/usr/lib/python2.7/dist-packages/neutron/agent/linux/ip_lib.py", line 95, in 
_as_root
  2017-01-20 07:05:22.833 2541 ERROR neutron.agent.l3.router_info 
log_fail_as_error=self.log_fail_as_error)
  2017-01-20 07:05:22.833 2541 ERROR 

[Yahoo-eng-team] [Bug 1716194] Re: IPTables rules are not updated if there is a change in the FWaaS rules when FWaaS is deployed in DVR mode

2018-03-12 Thread Swaminathan Vasudevan
*** This bug is a duplicate of bug 1715395 ***
https://bugs.launchpad.net/bugs/1715395

** This bug is no longer a duplicate of bug 1716401
   FWaaS: Ip tables rules do not get updated in case of distributed virtual 
routers (DVR)
** This bug has been marked a duplicate of bug 1715395
   FWaaS: Firewall creation fails in case of distributed routers (Pike)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1716194

Title:
  IPTables rules are not updated if there is a change in the FWaaS rules
  when FWaaS is deployed in DVR mode

Status in neutron:
  New

Bug description:
  Please see https://bugs.launchpad.net/neutron/+bug/1715395/comments/4
  and https://bugs.launchpad.net/neutron/+bug/1716401 for more
  information about this issue

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1716194/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1751396] [NEW] DVR: Inter Tenant Traffic between two networks and connected through a shared network not reachable with DVR routers

2018-02-23 Thread Swaminathan Vasudevan
Public bug reported:

Inter-tenant traffic between two tenants on two different private networks,
connected through a common shared network (created by the admin), is not
routable through DVR routers.

Steps to reproduce it:

(NOTE: no external network, just a shared network.)
This is only reproducible in a multinode scenario (1 controller, 2 computes).
Make sure that the two VMs land on two different compute nodes.

openstack network create --share shared_net

openstack subnet create shared_net_sn --network shared_net --subnet-
range 172.168.10.0/24


openstack network create net_A
openstack subnet create net_A_sn --network net_A --subnet-range 10.1.0.0/24


openstack network create net_B
openstack subnet create net_B_sn --network net_B --subnet-range 10.2.0.0/24


openstack router create router_A

openstack port create --network=shared_net --fixed-ip 
subnet=shared_net_sn,ip-address=172.168.10.20 port_router_A_shared_net
openstack router add port router_A port_router_A_shared_net
openstack router add subnet router_A net_A_sn

openstack router create router_B
openstack port create --network=shared_net --fixed-ip 
subnet=shared_net_sn,ip-address=172.168.10.30 port_router_B_shared_net
openstack router add port router_B port_router_B_shared_net
openstack router add subnet router_B net_B_sn

openstack server create server_A --flavor m1.tiny --image cirros --nic 
net-id=net_A
openstack server create server_B --flavor m1.tiny --image cirros --nic 
net-id=net_B
  
Add static routes to the router.
openstack router set router_A --route 
destination=10.1.0.0/24,gateway=172.168.10.20
openstack router set router_B --route 
destination=10.2.0.0/24,gateway=172.168.10.30

Ping from one instance to the other times out

** Affects: neutron
 Importance: Undecided
 Status: Confirmed


** Tags: l3-dvr-backlog

** Changed in: neutron
   Status: New => Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1751396

Title:
  DVR: Inter Tenant Traffic between two networks and connected through a
  shared network not reachable with DVR routers

Status in neutron:
  Confirmed

Bug description:
  Inter-tenant traffic between two tenants on two different private
  networks, connected through a common shared network (created by the
  admin), is not routable through DVR routers.

  Steps to reproduce it:

  (NOTE: no external network, just a shared network.)
  This is only reproducible in a multinode scenario (1 controller, 2 computes).
  Make sure that the two VMs land on two different compute nodes.

  openstack network create --share shared_net

  openstack subnet create shared_net_sn --network shared_net --subnet-
  range 172.168.10.0/24

  
  openstack network create net_A
  openstack subnet create net_A_sn --network net_A --subnet-range 10.1.0.0/24

  
  openstack network create net_B
  openstack subnet create net_B_sn --network net_B --subnet-range 10.2.0.0/24

  
  openstack router create router_A

  openstack port create --network=shared_net --fixed-ip 
subnet=shared_net_sn,ip-address=172.168.10.20 port_router_A_shared_net
  openstack router add port router_A port_router_A_shared_net
  openstack router add subnet router_A net_A_sn

  openstack router create router_B
  openstack port create --network=shared_net --fixed-ip 
subnet=shared_net_sn,ip-address=172.168.10.30 port_router_B_shared_net
  openstack router add port router_B port_router_B_shared_net
  openstack router add subnet router_B net_B_sn

  openstack server create server_A --flavor m1.tiny --image cirros --nic 
net-id=net_A
  openstack server create server_B --flavor m1.tiny --image cirros --nic 
net-id=net_B

  Add static routes to the router.
  openstack router set router_A --route 
destination=10.1.0.0/24,gateway=172.168.10.20
  openstack router set router_B --route 
destination=10.2.0.0/24,gateway=172.168.10.30

  Ping from one instance to the other times out

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1751396/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1749577] Re: DVR: Static routes are not configured in snat-namespace for DVR Routers

2018-02-14 Thread Swaminathan Vasudevan
User error.

** Changed in: neutron
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1749577

Title:
  DVR: Static routes are not configured in snat-namespace for DVR
  Routers

Status in neutron:
  Invalid

Bug description:
  Static routes are not configured in snat-namespace for DVR routers.

  Steps to reproduce:
  1. Create Network
  2. Create Subnet
  3. Create Router
  4. Add interface to Router
  5. Set gateway for the Router
  6. Add a static route (next hop to the Router)
  7. Go check the 'snat-namespace' if the static routes are configured in there.

  stack@ubuntu-ctlr:~/devstack$ neutron router-update router2-alt-demo --route 
destination=10.3.0.0/24,nexthop=192.168.100.20
  neutron CLI is deprecated and will be removed in the future. Use openstack 
CLI instead.
  Updated router: router2-alt-demo
  stack@ubuntu-ctlr:~/devstack$ 
  stack@ubuntu-ctlr:~/devstack$ 
  stack@ubuntu-ctlr:~/devstack$ sudo ip netns exec 
snat-152504be-c68e-4918-bf8f-d8c4d8c27d4d bash
  root@ubuntu-ctlr:~/devstack# ip route
  default via 192.168.100.9 dev qg-c5919234-7c 
  10.2.0.0/24 via 192.168.100.12 dev qg-c5919234-7c 
  192.168.100.0/24 dev qg-c5919234-7c  proto kernel  scope link  src 
192.168.100.20 
  root@ubuntu-ctlr:~/devstack# 
  root@ubuntu-ctlr:~/devstack# 
  root@ubuntu-ctlr:~/devstack# 
  root@ubuntu-ctlr:~/devstack# ifconfig
  loLink encap:Local Loopback  
inet addr:127.0.0.1  Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING  MTU:65536  Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1 
RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

  qg-c5919234-7c Link encap:Ethernet  HWaddr fa:16:3e:b7:2c:72  
inet addr:192.168.100.20  Bcast:192.168.100.255  Mask:255.255.255.0
inet6 addr: fe80::f816:3eff:feb7:2c72/64 Scope:Link
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
RX packets:81 errors:0 dropped:3 overruns:0 frame:0
TX packets:74 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1 
RX bytes:5334 (5.3 KB)  TX bytes:5801 (5.8 KB)

  sg-23b90333-cc Link encap:Ethernet  HWaddr fa:16:3e:87:bb:ac  
inet addr:10.2.0.8  Bcast:10.2.0.255  Mask:255.255.255.0
inet6 addr: fe80::f816:3eff:fe87:bbac/64 Scope:Link
UP BROADCAST RUNNING MULTICAST  MTU:1450  Metric:1
RX packets:2770 errors:0 dropped:0 overruns:0 frame:0
TX packets:45 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1 
RX bytes:219841 (219.8 KB)  TX bytes:4028 (4.0 KB)

  root@ubuntu-ctlr:~/devstack# ip route
  default via 192.168.100.9 dev qg-c5919234-7c 
  10.2.0.0/24 via 192.168.100.12 dev qg-c5919234-7c 
  192.168.100.0/24 dev qg-c5919234-7c  proto kernel  scope link  src 
192.168.100.20 
  root@ubuntu-ctlr:~/devstack# 
  root@ubuntu-ctlr:~/devstack# exit
  exit
  stack@ubuntu-ctlr:~/devstack$ sudo ip netns
  snat-6a6fdb6e-8284-4439-b5aa-f574d91948ae
  qrouter-6a6fdb6e-8284-4439-b5aa-f574d91948ae
  snat-152504be-c68e-4918-bf8f-d8c4d8c27d4d
  qrouter-152504be-c68e-4918-bf8f-d8c4d8c27d4d
  qdhcp-a0e5b756-7b19-42b0-8e53-4f23f1adcc72
  snat-9e989be2-b0af-4f0b-8532-a6f307fb40b4
  fip-205f29cd-359c-4f7c-b29e-d276d199640e
  qrouter-9e989be2-b0af-4f0b-8532-a6f307fb40b4
  qdhcp-03a725ab-04b5-4071-884f-00d7a7549e12
  stack@ubuntu-ctlr:~/devstack$ sudo ip netns exec 
qrouter-152504be-c68e-4918-bf8f-d8c4d8c27d4d bash
  root@ubuntu-ctlr:~/devstack# ip route
  10.2.0.0/24 dev qr-d26ef7c2-18  proto kernel  scope link  src 10.2.0.1 
  169.254.109.46/31 dev rfp-152504be-c  proto kernel  scope link  src 
169.254.109.46 
  root@ubuntu-ctlr:~/devstack# exit
  exit
  stack@ubuntu-ctlr:~/devstack$ sudo ip netns
  snat-6a6fdb6e-8284-4439-b5aa-f574d91948ae
  qrouter-6a6fdb6e-8284-4439-b5aa-f574d91948ae
  snat-152504be-c68e-4918-bf8f-d8c4d8c27d4d
  qrouter-152504be-c68e-4918-bf8f-d8c4d8c27d4d
  qdhcp-a0e5b756-7b19-42b0-8e53-4f23f1adcc72
  snat-9e989be2-b0af-4f0b-8532-a6f307fb40b4
  fip-205f29cd-359c-4f7c-b29e-d276d199640e
  qrouter-9e989be2-b0af-4f0b-8532-a6f307fb40b4
  qdhcp-03a725ab-04b5-4071-884f-00d7a7549e12
  stack@ubuntu-ctlr:~/devstack$ sudo ip netns exec 
fip-205f29cd-359c-4f7c-b29e-d276d199640e bash
  root@ubuntu-ctlr:~/devstack# ip route
  169.254.93.94/31 dev fpr-6a6fdb6e-8  proto kernel  scope link  src 
169.254.93.95 
  169.254.106.114/31 dev fpr-9e989be2-b  proto kernel  scope link  src 
169.254.106.115 
  169.254.109.46/31 dev fpr-152504be-c  proto kernel  scope link  src 
169.254.109.47 
  192.168.100.0/24 dev fg-a6777b4d-f7  proto kernel  scope link  src 
192.168.100.11 
  root@ubuntu-ctlr:~/devstack# ip rule
  0:from all lookup local 
  

[Yahoo-eng-team] [Bug 1749577] [NEW] DVR: Static routes are not configured in snat-namespace for DVR Routers

2018-02-14 Thread Swaminathan Vasudevan
Public bug reported:

Static routes are not configured in snat-namespace for DVR routers.

Steps to reproduce:
1. Create Network
2. Create Subnet
3. Create Router
4. Add interface to Router
5. Set gateway for the Router
6. Add a static route (next hop to the Router)
7. Go check the 'snat-namespace' if the static routes are configured in there.

stack@ubuntu-ctlr:~/devstack$ neutron router-update router2-alt-demo --route 
destination=10.3.0.0/24,nexthop=192.168.100.20
neutron CLI is deprecated and will be removed in the future. Use openstack CLI 
instead.
Updated router: router2-alt-demo
stack@ubuntu-ctlr:~/devstack$ 
stack@ubuntu-ctlr:~/devstack$ 
stack@ubuntu-ctlr:~/devstack$ sudo ip netns exec 
snat-152504be-c68e-4918-bf8f-d8c4d8c27d4d bash
root@ubuntu-ctlr:~/devstack# ip route
default via 192.168.100.9 dev qg-c5919234-7c 
10.2.0.0/24 via 192.168.100.12 dev qg-c5919234-7c 
192.168.100.0/24 dev qg-c5919234-7c  proto kernel  scope link  src 
192.168.100.20 
root@ubuntu-ctlr:~/devstack# 
root@ubuntu-ctlr:~/devstack# 
root@ubuntu-ctlr:~/devstack# 
root@ubuntu-ctlr:~/devstack# ifconfig
loLink encap:Local Loopback  
  inet addr:127.0.0.1  Mask:255.0.0.0
  inet6 addr: ::1/128 Scope:Host
  UP LOOPBACK RUNNING  MTU:65536  Metric:1
  RX packets:0 errors:0 dropped:0 overruns:0 frame:0
  TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1 
  RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

qg-c5919234-7c Link encap:Ethernet  HWaddr fa:16:3e:b7:2c:72  
  inet addr:192.168.100.20  Bcast:192.168.100.255  Mask:255.255.255.0
  inet6 addr: fe80::f816:3eff:feb7:2c72/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
  RX packets:81 errors:0 dropped:3 overruns:0 frame:0
  TX packets:74 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1 
  RX bytes:5334 (5.3 KB)  TX bytes:5801 (5.8 KB)

sg-23b90333-cc Link encap:Ethernet  HWaddr fa:16:3e:87:bb:ac  
  inet addr:10.2.0.8  Bcast:10.2.0.255  Mask:255.255.255.0
  inet6 addr: fe80::f816:3eff:fe87:bbac/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:1450  Metric:1
  RX packets:2770 errors:0 dropped:0 overruns:0 frame:0
  TX packets:45 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1 
  RX bytes:219841 (219.8 KB)  TX bytes:4028 (4.0 KB)

root@ubuntu-ctlr:~/devstack# ip route
default via 192.168.100.9 dev qg-c5919234-7c 
10.2.0.0/24 via 192.168.100.12 dev qg-c5919234-7c 
192.168.100.0/24 dev qg-c5919234-7c  proto kernel  scope link  src 
192.168.100.20 
root@ubuntu-ctlr:~/devstack# 
root@ubuntu-ctlr:~/devstack# exit
exit
stack@ubuntu-ctlr:~/devstack$ sudo ip netns
snat-6a6fdb6e-8284-4439-b5aa-f574d91948ae
qrouter-6a6fdb6e-8284-4439-b5aa-f574d91948ae
snat-152504be-c68e-4918-bf8f-d8c4d8c27d4d
qrouter-152504be-c68e-4918-bf8f-d8c4d8c27d4d
qdhcp-a0e5b756-7b19-42b0-8e53-4f23f1adcc72
snat-9e989be2-b0af-4f0b-8532-a6f307fb40b4
fip-205f29cd-359c-4f7c-b29e-d276d199640e
qrouter-9e989be2-b0af-4f0b-8532-a6f307fb40b4
qdhcp-03a725ab-04b5-4071-884f-00d7a7549e12
stack@ubuntu-ctlr:~/devstack$ sudo ip netns exec 
qrouter-152504be-c68e-4918-bf8f-d8c4d8c27d4d bash
root@ubuntu-ctlr:~/devstack# ip route
10.2.0.0/24 dev qr-d26ef7c2-18  proto kernel  scope link  src 10.2.0.1 
169.254.109.46/31 dev rfp-152504be-c  proto kernel  scope link  src 
169.254.109.46 
root@ubuntu-ctlr:~/devstack# exit
exit
stack@ubuntu-ctlr:~/devstack$ sudo ip netns
snat-6a6fdb6e-8284-4439-b5aa-f574d91948ae
qrouter-6a6fdb6e-8284-4439-b5aa-f574d91948ae
snat-152504be-c68e-4918-bf8f-d8c4d8c27d4d
qrouter-152504be-c68e-4918-bf8f-d8c4d8c27d4d
qdhcp-a0e5b756-7b19-42b0-8e53-4f23f1adcc72
snat-9e989be2-b0af-4f0b-8532-a6f307fb40b4
fip-205f29cd-359c-4f7c-b29e-d276d199640e
qrouter-9e989be2-b0af-4f0b-8532-a6f307fb40b4
qdhcp-03a725ab-04b5-4071-884f-00d7a7549e12
stack@ubuntu-ctlr:~/devstack$ sudo ip netns exec 
fip-205f29cd-359c-4f7c-b29e-d276d199640e bash
root@ubuntu-ctlr:~/devstack# ip route
169.254.93.94/31 dev fpr-6a6fdb6e-8  proto kernel  scope link  src 
169.254.93.95 
169.254.106.114/31 dev fpr-9e989be2-b  proto kernel  scope link  src 
169.254.106.115 
169.254.109.46/31 dev fpr-152504be-c  proto kernel  scope link  src 
169.254.109.47 
192.168.100.0/24 dev fg-a6777b4d-f7  proto kernel  scope link  src 
192.168.100.11 
root@ubuntu-ctlr:~/devstack# ip rule
0:  from all lookup local 
32766:  from all lookup main 
32767:  from all lookup default 
2852019551: from all iif fpr-6a6fdb6e-8 lookup 2852019551 
2852022899: from all iif fpr-9e989be2-b lookup 2852022899 
2852023599: from all iif fpr-152504be-c lookup 2852023599 
root@ubuntu-ctlr:~/devstack# ip route s t 2852019551
default via 192.168.100.9 dev fg-a6777b4d-f7 
10.3.0.0/24 via 192.168.100.20 dev fg-a6777b4d-f7 
root@ubuntu-ctlr:~/devstack#

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: 

[Yahoo-eng-team] [Bug 1667877] Re: [RFE] Allow DVR for E/W while leaving N/S centralized

2017-10-05 Thread Swaminathan Vasudevan
** Changed in: neutron
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1667877

Title:
  [RFE] Allow DVR for E/W while leaving N/S centralized

Status in neutron:
  Fix Released

Bug description:
  Use Case
  
  OpenStack is deployed in an L3 fabric so the external network cannot be 
extended to all compute nodes. Even though this means SNAT and floating IP 
traffic (North/South) will be run through a network node with external network 
access, the operator still wants the east/west routing offload offered by DVR.

  So even though the topology does not allow for the N/S DVR direct
  routing, we want to have a way to still take advantage of the E/W
  direct routing.


  Potential Solution
  ==

  Provide a configurable option to make floating IPs for DVR-based routers
reside either on the compute node or on the network node.
  Also proactively check the status of the agent on the destination node
and, if the agent is down, configure the floating IP on the network node.

  Provide a configuration Option in neutron.conf such as

  DVR_FLOATINGIP_CENTRALIZED = 'enforced/circumstantial'

  If DVR_FLOATINGIP_CENTRALIZED is configured as 'enforced', all floating IPs
will be configured on the network node.
  If DVR_FLOATINGIP_CENTRALIZED is configured as 'circumstantial', the floating
IP will be configured either on the compute node or on the network node,
depending on the agent health.

  If this option is not configured, floating IPs will be distributed for
  all bound ports, and only for unbound ports will the floating IP be
  implemented on the network node.
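
  If implemented, the option could be registered with oslo.config roughly as
below (a sketch only; the option name and choices mirror the proposal above
and are not an existing neutron option):

  # Sketch: registering the proposed option with oslo.config.  The option
  # name and choices come from the proposal above, not from neutron itself.
  from oslo_config import cfg

  dvr_fip_opts = [
      cfg.StrOpt('dvr_floatingip_centralized',
                 choices=['enforced', 'circumstantial'],
                 help="'enforced': always place floating IPs on the network "
                      "node; 'circumstantial': fall back to the network node "
                      "only when the destination L3 agent is down."),
  ]

  cfg.CONF.register_opts(dvr_fip_opts)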

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1667877/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1635554] Re: Delete Router / race condition

2017-10-03 Thread Swaminathan Vasudevan
In that case we should close this bug.

** Changed in: neutron
   Status: Incomplete => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1635554

Title:
  Delete Router /  race condition

Status in neutron:
  Invalid

Bug description:
  When deleting a router, the log file fills up with errors like the following.

  
  CentOS7
  Newton(RDO)


  2016-10-21 09:45:02.526 16200 DEBUG neutron.agent.linux.utils [-] Exit code: 
0 execute /usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:140
  2016-10-21 09:45:02.526 16200 WARNING neutron.agent.l3.namespaces [-] 
Namespace qrouter-8cf5-5c5c-461c-84f3-c8abeca8f79a does not exist. Skipping 
delete
  2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent [-] Error while 
deleting router 8cf5-5c5c-461c-84f3-c8abeca8f79a
  2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent Traceback (most 
recent call last):
  2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent   File 
"/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 357, in 
_safe_router_removed
  2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent 
self._router_removed(router_id)
  2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent   File 
"/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 376, in 
_router_removed
  2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent ri.delete(self)
  2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent   File 
"/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 381, in 
delete
  2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent 
self.destroy_state_change_monitor(self.process_monitor)
  2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent   File 
"/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 325, in 
destroy_state_change_monitor
  2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent pm = 
self._get_state_change_monitor_process_manager()
  2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent   File 
"/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 296, in 
_get_state_change_monitor_process_manager
  2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent 
default_cmd_callback=self._get_state_change_monitor_callback())
  2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent   File 
"/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 299, in 
_get_state_change_monitor_callback
  2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent ha_device = 
self.get_ha_device_name()
  2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent   File 
"/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 137, in 
get_ha_device_name
  2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent return 
(HA_DEV_PREFIX + self.ha_port['id'])[:self.driver.DEV_NAME_LEN]
  2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent TypeError: 
'NoneType' object has no attribute '__getitem__'
  2016-10-21 09:45:02.527 16200 ERROR neutron.agent.l3.agent
  2016-10-21 09:45:02.528 16200 DEBUG neutron.agent.l3.agent [-] Finished a 
router update for 8cf5-5c5c-461c-84f3-c8abeca8f79a _process_router_update 
/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py:504

  
  See full log
  http://paste.openstack.org/show/586656/
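
  The TypeError at the end of the trace simply means self.ha_port was already
None by the time get_ha_device_name() ran; a minimal stand-alone illustration
of that failure mode (plain Python, not neutron code):

  # Minimal illustration of the failure mode above: subscripting None.
  # (Python 2 reports "'NoneType' object has no attribute '__getitem__'";
  # Python 3 reports "'NoneType' object is not subscriptable".)
  HA_DEV_PREFIX = 'ha-'
  ha_port = None  # HA port info is already gone while the router is deleted

  try:
      name = (HA_DEV_PREFIX + ha_port['id'])[:14]  # 14 ~ driver.DEV_NAME_LEN
  except TypeError as exc:
      print(exc)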

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1635554/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1712795] Re: Fail to startup neutron-l3-agent

2017-10-03 Thread Swaminathan Vasudevan
Right now there are no bug fixes supported for the mitaka branch. Since
this bug is not seen in the current master and stable branches, I am
closing this bug.

** Changed in: neutron
   Status: Incomplete => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1712795

Title:
  Fail to startup neutron-l3-agent

Status in neutron:
  Invalid

Bug description:
  When trying to start neutron-l3-agent, its log output is:

  2017-08-17 08:00:09.601 2381 ERROR oslo.messaging._drivers.impl_rabbit 
[req-aa7132bf-38d3-4e1f-9158-8743e2c5d163 - - - - -] AMQP server on 
192.168.25.1:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 1 
seconds.
  2017-08-17 08:00:10.610 2381 ERROR oslo.messaging._drivers.impl_rabbit 
[req-aa7132bf-38d3-4e1f-9158-8743e2c5d163 - - - - -] AMQP server on 
192.168.25.2:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 20 
seconds.
  2017-08-17 08:00:30.640 2381 INFO oslo.messaging._drivers.impl_rabbit 
[req-aa7132bf-38d3-4e1f-9158-8743e2c5d163 - - - - -] Reconnected to AMQP server 
on 192.168.25.1:5672 via [amqp] client
  2017-08-17 08:00:30.724 2381 INFO eventlet.wsgi.server [-] (2381) wsgi 
starting up on http:/var/lib/neutron/keepalived-state-change
  2017-08-17 08:00:30.766 2381 INFO neutron.agent.l3.agent [-] L3 agent started
  2017-08-17 08:00:30.770 2381 INFO neutron.agent.l3.agent [-] Agent has just 
been revived. Doing a full sync.
  2017-08-17 08:00:35.352 2381 INFO oslo_rootwrap.client [-] Spawned new 
rootwrap daemon process with pid=25789
  2017-08-17 08:00:35.486 2381 ERROR oslo_service.periodic_task 
[req-104a5ce9-3d9d-4367-8bb1-edb0880ef96f - - - - -] Error during 
L3NATAgentWithStateReport.periodic_sync_routers_task
  2017-08-17 08:00:35.486 2381 ERROR oslo_service.periodic_task Traceback (most 
recent call last):
  2017-08-17 08:00:35.486 2381 ERROR oslo_service.periodic_task   File 
"/usr/lib/python2.7/site-packages/oslo_service/periodic_task.py", line 220, in 
run_periodic_tasks
  2017-08-17 08:00:35.486 2381 ERROR oslo_service.periodic_task task(self, 
context)
  2017-08-17 08:00:35.486 2381 ERROR oslo_service.periodic_task   File 
"/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 545, in 
periodic_sync_routers_task
  2017-08-17 08:00:35.486 2381 ERROR oslo_service.periodic_task 
self.fetch_and_sync_all_routers(context, ns_manager)
  2017-08-17 08:00:35.486 2381 ERROR oslo_service.periodic_task   File 
"/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 579, in 
fetch_and_sync_all_routers
  2017-08-17 08:00:35.486 2381 ERROR oslo_service.periodic_task r['id'], 
r.get(l3_constants.HA_ROUTER_STATE_KEY))
  2017-08-17 08:00:35.486 2381 ERROR oslo_service.periodic_task   File 
"/usr/lib/python2.7/site-packages/neutron/agent/l3/ha.py", line 120, in 
check_ha_state_for_router
  2017-08-17 08:00:35.486 2381 ERROR oslo_service.periodic_task if ri and 
current_state != TRANSLATION_MAP[ri.ha_state]:
  2017-08-17 08:00:35.486 2381 ERROR oslo_service.periodic_task   File 
"/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 76, in 
ha_state
  2017-08-17 08:00:35.486 2381 ERROR oslo_service.periodic_task 
ha_state_path = self.keepalived_manager.get_full_config_file_path(
  2017-08-17 08:00:35.486 2381 ERROR oslo_service.periodic_task AttributeError: 
'NoneType' object has no attribute 'get_full_config_file_path'
  2017-08-17 08:00:35.486 2381 ERROR oslo_service.periodic_task
  2017-08-17 08:00:35.495 2381 ERROR neutron.agent.linux.utils [-] Exit code: 
1; Stdin: ; Stdout: ; Stderr: Cannot open network namespace 
"snat-9ed03dce-1c07-4b65-abe1-ca4f0e8f5d04": No such file or directory

  2017-08-17 08:00:35.495 2381 ERROR neutron.agent.l3.agent [-] Failed to 
process compatible router '9ed03dce-1c07-4b65-abe1-ca4f0e8f5d04'
  2017-08-17 08:00:35.495 2381 ERROR neutron.agent.l3.agent Traceback (most 
recent call last):
  2017-08-17 08:00:35.495 2381 ERROR neutron.agent.l3.agent   File 
"/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 501, in 
_process_router_update
  2017-08-17 08:00:35.495 2381 ERROR neutron.agent.l3.agent 
self._process_router_if_compatible(router)
  2017-08-17 08:00:35.495 2381 ERROR neutron.agent.l3.agent   File 
"/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 438, in 
_process_router_if_compatible
  2017-08-17 08:00:35.495 2381 ERROR neutron.agent.l3.agent 
self._process_added_router(router)
  2017-08-17 08:00:35.495 2381 ERROR neutron.agent.l3.agent   File 
"/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 443, in 
_process_added_router
  2017-08-17 08:00:35.495 2381 ERROR neutron.agent.l3.agent 
self._router_added(router['id'], router)
  2017-08-17 08:00:35.495 2381 ERROR neutron.agent.l3.agent   File 
"/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 350, in 
_router_added
  2017-08-17 

[Yahoo-eng-team] [Bug 1718788] [NEW] DVR: Migrate centralized unbound floatingip to the respective host when the port is bound

2017-09-21 Thread Swaminathan Vasudevan
Public bug reported:

When unbound ports are associated with floatingIP in DVR, it implements the 
floatingIP in the dvr_snat node under the snat_namespace.
When the private ports are bound to a specific host, the floatingIPs are not 
moved or migrated to their respective hosts.

This can be reproduced by 
1. Create a network
2. Create a subnet
3. Create a router and associate the subnet to the router
4. Assign a gateway to the router.
5. Then create a port on the given network with a specific IP.
6. Now create a FloatingIP on the external network.
7. Associate the FloatingIP to the created port.
8. At this point the port is not bound and so the floatingIP gets implemented 
in the Snat_namespace in the dvr_snat node.
9. Then within a few seconds, we create a VM with the given port-id instead of 
network-id.
10. Now when the VM is built then the port gets bound.
11. Now the floatingIP is not seen on the host where the VM resides.

Theoretically the FloatingIP should be migrated to the host where it is
currently bound.
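
A rough script of the reproduction steps above, using the openstacksdk (the
cloud name, image/flavor ids and resource names are placeholders, and the
external network 'public' is assumed to already exist):

# Rough reproduction sketch for the steps above using openstacksdk.
# Cloud name, image/flavor ids and resource names are placeholders.
import openstack

conn = openstack.connect(cloud='devstack')

ext_net = conn.network.find_network('public')          # pre-existing external net
net = conn.network.create_network(name='fip-test-net')
subnet = conn.network.create_subnet(network_id=net.id, ip_version=4,
                                    cidr='10.20.0.0/24', name='fip-test-subnet')
# In a DVR deployment routers are distributed by default; otherwise an admin
# can pass distributed=True here.
router = conn.network.create_router(
    name='fip-test-router',
    external_gateway_info={'network_id': ext_net.id})
conn.network.add_interface_to_router(router, subnet_id=subnet.id)

# Port created first, while still unbound ...
port = conn.network.create_port(network_id=net.id,
                                fixed_ips=[{'ip_address': '10.20.0.50'}])
# ... floating IP associated with the unbound port: it lands in the
# snat namespace on the dvr_snat node.
fip = conn.network.create_ip(floating_network_id=ext_net.id, port_id=port.id)

# Boot a VM on the existing port shortly afterwards; the port becomes bound
# to the compute host, but the floating IP does not follow it.
server = conn.compute.create_server(name='fip-test-vm',
                                    image_id='IMAGE_ID',
                                    flavor_id='FLAVOR_ID',
                                    networks=[{'port': port.id}])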

** Affects: neutron
 Importance: Undecided
 Assignee: Swaminathan Vasudevan (swaminathan-vasudevan)
 Status: Confirmed


** Tags: l3-dvr-backlog

** Changed in: neutron
   Status: New => Confirmed

** Changed in: neutron
 Assignee: (unassigned) => Swaminathan Vasudevan (swaminathan-vasudevan)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1718788

Title:
  DVR: Migrate centralized unbound floatingip to the respective host
  when  the port is bound

Status in neutron:
  Confirmed

Bug description:
  When unbound ports are associated with floatingIP in DVR, it implements the 
floatingIP in the dvr_snat node under the snat_namespace.
  When the private ports are bound to a specific host, the floatingIPs are not 
moved or migrated to their respective hosts.

  This can be reproduced by 
  1. Create a network
  2. Create a subnet
  3. Create a router and associate the subnet to the router
  4. Assign a gateway to the router.
  5. Then create a port on the given network with a specific IP.
  6. Now create a FloatingIP on the external network.
  7. Associate the FloatingIP to the created port.
  8. At this point the port is not bound and so the floatingIP gets implemented 
in the Snat_namespace in the dvr_snat node.
  9. Then within a few seconds, we create a VM with the given port-id instead 
of network-id.
  10. Now when the VM is built then the port gets bound.
  11. Now the floatingIP is not seen on the host where the VM resides.

  Theoretically the FloatingIP should be migrated to the host where it
  is currently bound.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1718788/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1718585] Re: set floatingip status to DOWN during creation

2017-09-21 Thread Swaminathan Vasudevan
** Changed in: neutron
   Status: New => Opinion

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1718585

Title:
  set floatingip status to DOWN during creation

Status in neutron:
  Opinion

Bug description:
  floatingip status is not reliable as it is set to active during creation 
itself [1] rather than waiting for agent [2] to update it once agent finishes 
adding SNAT/DNAT rules.
  [1] https://github.com/openstack/neutron/blob/master/neutron/db/l3_db.py#L1234
  [2] 
https://github.com/openstack/neutron/blob/master/neutron/agent/l3/agent.py#L131

  A user can check the floatingip status after creation and initiate data
traffic before the agent has finished processing the floatingip, resulting in
connection failures. Fixing this would also help tempest tests to initiate
connections only after the agent has finished floatingip processing, and so
avoid failures.

  Also floatingip status has to be properly updated during migration of
  router.
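
  Once the status is only flipped to ACTIVE by the agent, clients and tempest
could gate traffic on it with a small wait loop; a sketch using the
openstacksdk (cloud name and floating IP id are placeholders):

  # Sketch: poll a floating IP until it is really ACTIVE before sending
  # traffic, rather than trusting the status returned at creation time.
  import time

  import openstack


  def wait_for_fip_active(conn, fip_id, timeout=60, interval=2):
      deadline = time.time() + timeout
      while time.time() < deadline:
          fip = conn.network.get_ip(fip_id)
          if fip.status == 'ACTIVE':
              return fip
          time.sleep(interval)
      raise TimeoutError('floating IP %s did not become ACTIVE' % fip_id)

  # Usage:
  # conn = openstack.connect(cloud='devstack')
  # wait_for_fip_active(conn, 'FLOATING_IP_ID')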

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1718585/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1717302] [NEW] Tempest floatingip scenario tests failing on DVR Multinode setup with HA

2017-09-14 Thread Swaminathan Vasudevan
Public bug reported:

neutron.tests.tempest.scenario.test_floatingip.FloatingIpSameNetwork and
neutron.tests.tempest.scenario.test_floatingip.FloatingIpSeparateNetwork are 
failing on every patch.

This trace is seen on the node-2 l3-agent.

Sep 13 07:16:43.404250 ubuntu-xenial-2-node-rax-dfw-10909819-895688 
neutron-keepalived-state-change[5461]: ERROR neutron.agent.linux.ip_lib [-] 
Failed sending gratuitous ARP to 172.24.5.3 on qg-bf79c157-e2 in namespace 
qrouter-796b8715-ca01-43ad-bc08-f81a0b4db8cc: Exit code: 2; Stdin: ; Stdout: ; 
Stderr: bind: Cannot assign requested address

   : ProcessExecutionError: Exit code: 2; Stdin: ; 
Stdout: ; Stderr: bind: Cannot assign requested address

   ERROR neutron.agent.linux.ip_lib Traceback (most 
recent call last):

   ERROR neutron.agent.linux.ip_lib   File 
"/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 1082, in _arping

   ERROR neutron.agent.linux.ip_lib 
ip_wrapper.netns.execute(arping_cmd, extra_ok_codes=[1])

   ERROR neutron.agent.linux.ip_lib   File 
"/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 901, in execute

   ERROR neutron.agent.linux.ip_lib 
log_fail_as_error=log_fail_as_error, **kwargs)

   ERROR neutron.agent.linux.ip_lib   File 
"/opt/stack/new/neutron/neutron/agent/linux/utils.py", line 151, in execute

   ERROR neutron.agent.linux.ip_lib raise 
ProcessExecutionError(msg, returncode=returncode)

   ERROR neutron.agent.linux.ip_lib 
ProcessExecutionError: Exit code: 2; Stdin: ; Stdout: ; Stderr: bind: Cannot 
assign requested address

   ERROR neutron.agent.linux.ip_lib

   ERROR neutron.agent.linux.ip_lib

If this is a DVR router, then the GARP should not go through the qg
interface for the floatingIP.

More information can be seen here.

http://logs.openstack.org/43/500143/5/check/gate-tempest-dsvm-neutron-
dvr-multinode-scenario-ubuntu-xenial-
nv/0a58fce/logs/subnode-2/screen-q-l3.txt.gz?level=TRACE#_Sep_13_07_16_47_864052
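
The 'bind: Cannot assign requested address' comes from arping trying to bind
to a floating IP that is not configured on the qg- device of the qrouter
namespace on a DVR router. A rough pre-check sketch using the same ip_lib
module seen in the trace (device and namespace names are taken from the log;
the exact ip_lib calls are an assumption about this release):

# Rough sketch: verify the address is actually configured on the device in
# the namespace before attempting a gratuitous ARP from it.
import netaddr
from neutron.agent.linux import ip_lib


def address_on_device(address, device_name, namespace):
    device = ip_lib.IPDevice(device_name, namespace=namespace)
    return any(netaddr.IPAddress(address) in netaddr.IPNetwork(entry['cidr'])
               for entry in device.addr.list())

# For a DVR router the floating IP is not on the qg- device of the qrouter
# namespace, so this returns False and the GARP should be skipped there:
# address_on_device('172.24.5.3', 'qg-bf79c157-e2',
#                   'qrouter-796b8715-ca01-43ad-bc08-f81a0b4db8cc')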

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: l3-dvr-backlog l3-ha

** Summary changed:

- Tempest floatingip scenario tests failing on DVR Multinode setup
+ Tempest floatingip scenario tests failing on DVR Multinode setup with HA

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1717302

Title:
  Tempest floatingip scenario tests failing on DVR Multinode setup with
  HA

Status in neutron:
  New

Bug description:
  neutron.tests.tempest.scenario.test_floatingip.FloatingIpSameNetwork and
  neutron.tests.tempest.scenario.test_floatingip.FloatingIpSeparateNetwork are 
failing on every patch.

  This trace is seen on the node-2 l3-agent.

  Sep 13 07:16:43.404250 ubuntu-xenial-2-node-rax-dfw-10909819-895688 
neutron-keepalived-state-change[5461]: ERROR neutron.agent.linux.ip_lib [-] 
Failed sending gratuitous ARP to 172.24.5.3 on qg-bf79c157-e2 in namespace 
qrouter-796b8715-ca01-43ad-bc08-f81a0b4db8cc: Exit code: 2; Stdin: ; Stdout: ; 
Stderr: bind: Cannot assign requested address

 : ProcessExecutionError: Exit code: 2; Stdin: ; 
Stdout: ; Stderr: bind: Cannot assign requested address

 ERROR neutron.agent.linux.ip_lib Traceback (most 
recent call last):

 ERROR neutron.agent.linux.ip_lib   File 
"/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py", line 1082, in _arping

 ERROR 

[Yahoo-eng-team] [Bug 1716829] [NEW] Centralized floatingips not configured right with DVR and HA

2017-09-12 Thread Swaminathan Vasudevan
Public bug reported:

Centralized floating IPs are not configured correctly with DVR and HA.

add_centralized_floatingip and remove_centralized_floatingip should be
overridden in 'dvr_edge_ha_router.py' to configure the 'vips'.
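
Schematically, the overrides would hand the floating IP to keepalived as a VIP
on the snat gateway interface instead of configuring the address directly. A
rough sketch (the helpers _add_vip, _remove_vip and
get_snat_external_device_interface_name are assumptions about the surrounding
base classes, not verified signatures):

# Rough sketch of the overrides described above (for dvr_edge_ha_router.py).
from neutron.agent.l3 import dvr_edge_router
from neutron.agent.l3 import ha_router
from neutron_lib import constants as lib_constants


class SketchDvrEdgeHaRouter(dvr_edge_router.DvrEdgeRouter, ha_router.HaRouter):

    def add_centralized_floatingip(self, fip, fip_cidr):
        interface_name = self.get_snat_external_device_interface_name(
            self.get_ex_gw_port())
        # Hand the address to keepalived as a VIP instead of configuring it
        # directly, so it always lives on the current HA master.
        self._add_vip(fip_cidr, interface_name)
        return lib_constants.FLOATINGIP_STATUS_ACTIVE

    def remove_centralized_floatingip(self, fip_cidr):
        self._remove_vip(fip_cidr)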

** Affects: neutron
 Importance: Undecided
 Assignee: Swaminathan Vasudevan (swaminathan-vasudevan)
 Status: In Progress


** Tags: l3-dvr-backlog

** Changed in: neutron
   Status: New => Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1716829

Title:
  Centralized floatingips not configured right with DVR and HA

Status in neutron:
  In Progress

Bug description:
  Centralized floating IPs are not configured correctly with DVR and HA.

  add_centralized_floatingip and remove_centralized_floatingip should be
  overridden in 'dvr_edge_ha_router.py' to configure the 'vips'.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1716829/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1712728] [NEW] DVR: get_router_cidrs in dvr_edge_router not returning the centralized_floating_ip cidr

2017-08-23 Thread Swaminathan Vasudevan
Public bug reported:

get_router_cidrs, overridden in dvr_edge_router, is not returning the
centralized_floating_ip cidrs.
As a consequence, the DNAT rules are left over in the snat namespace when
the centralized_floating_ips are removed.
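
Conceptually the override needs to union the centralized floating IP CIDRs into
what the base class reports, so the agent knows to clean up their DNAT rules. A
rough sketch of the idea only (the attribute holding the centralized CIDRs is
an assumption, not neutron API):

# Rough sketch: make get_router_cidrs() also report the centralized
# floating IP CIDRs so their stale DNAT rules get cleaned up.
from neutron.agent.l3 import dvr_edge_router


class SketchDvrEdgeRouter(dvr_edge_router.DvrEdgeRouter):

    def get_router_cidrs(self, device):
        cidrs = super(SketchDvrEdgeRouter, self).get_router_cidrs(device)
        # 'centralized_floatingips_set' is an assumed attribute tracking the
        # /32 CIDRs of floating IPs hosted in the snat namespace.
        centralized = getattr(self, 'centralized_floatingips_set', set())
        return set(cidrs) | set(centralized)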

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: l3-dvr-backlog

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1712728

Title:
  DVR: get_router_cidrs in dvr_edge_router not returning the
  centralized_floating_ip cidr

Status in neutron:
  New

Bug description:
  get_router_cidrs, overridden in dvr_edge_router, is not returning the
centralized_floating_ip cidrs.
  As a consequence, the DNAT rules are left over in the snat namespace when
the centralized_floating_ips are removed.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1712728/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1702790] [NEW] DVR Router update task fails when agent restarts

2017-07-06 Thread Swaminathan Vasudevan
Public bug reported:

When there is a DVR router with a gateway enabled and the agent
restarts, the router_update fails and an error is logged in
l3_agent.log.

** Affects: neutron
 Importance: Undecided
 Assignee: Swaminathan Vasudevan (swaminathan-vasudevan)
 Status: In Progress


** Tags: l3-dvr-backlog

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1702790

Title:
  DVR Router update task fails when agent restarts

Status in neutron:
  In Progress

Bug description:
  When there is a DVR router with a gateway enabled and the agent
  restarts, the router_update fails and an error is logged in
  l3_agent.log.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1702790/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1702769] [NEW] Binding info for DVR port not found error seen when notify_l2pop_port_wiring is called with DVR routers

2017-07-06 Thread Swaminathan Vasudevan
Public bug reported:

A recent upstream change, Icd4cd4e3f735e88299e86468380c5f786e7628fe, might have
introduced this problem.
Here 'get_bound_port_context' is being called for non-HA ports, and with the
given context it is not able to retrieve the port binding and throws the
error.

** Affects: neutron
 Importance: Undecided
 Assignee: Swaminathan Vasudevan (swaminathan-vasudevan)
 Status: Confirmed


** Tags: l3-dvr-backlog

** Changed in: neutron
 Assignee: (unassigned) => Swaminathan Vasudevan (swaminathan-vasudevan)

** Changed in: neutron
   Status: New => Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1702769

Title:
  Binding info for DVR port not found error seen when
  notify_l2pop_port_wiring is called with DVR routers

Status in neutron:
  Confirmed

Bug description:
  A recent upstream change, Icd4cd4e3f735e88299e86468380c5f786e7628fe, might
have introduced this problem.
  Here 'get_bound_port_context' is being called for non-HA ports, and with the
given context it is not able to retrieve the port binding and throws the
error.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1702769/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1701288] [NEW] In scale testing RPC timeout error seen in the ovs_neutron_agent when update_device_list is called with DVR routers

2017-06-29 Thread Swaminathan Vasudevan
Public bug reported:

At large scale testing when trying to deploy around 8000 VMs with DVR
routers, we are seeing an RPC Timeout error in ovs_neutron_agent.

This RPC timeout error occurs when the ovs_neutron_agent tries to bind the vif
port.
On further analysis it seems that update_port_status takes much longer to
return at large scale, so the ovs_neutron_agent times out waiting for the
reply.

Looking into the update_port_status code path, after the port status
update occurs it calls the update_port_postcommit call. Since L2pop is
enabled by default with DVR, the update_port_postcommit calls
_create_agent_fdb entries for the agent, if this is the first port
associated with the agent.

In _create_agent_fdb it tries to retrieve all the PortInfo associated
with the network; this DB call is very expensive, and in some instances
we have seen it take up to 3900s.

2017-06-15 17:48:30.651 9320 DEBUG neutron.agent.linux.utils 
[req-51df1df5-8a51-4679-938b-895545a225c2 - - - - -] Exit code: 0 execute 
/opt/stack/venv/neutron-20170421T033908Z/lib/python2.7/site-packages/neutron/agent/linux/utils.py:146

2017-06-15 17:48:38.420 9320 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent 
[req-ece15133-1294-46c0-b0b5-cab785d4314b - - - - -] Error while processing VIF 
ports
2017-06-15 17:48:38.420 9320 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent Traceback (most 
recent call last):
2017-06-15 17:48:38.420 9320 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File 
"/opt/stack/venv/neutron-20170421T033908Z/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py",
 line 2044, in rpc_loop
2017-06-15 17:48:38.420 9320 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent port_info, 
ovs_restarted)
2017-06-15 17:48:38.420 9320 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File 
"/opt/stack/venv/neutron-20170421T033908Z/lib/python2.7/site-packages/osprofiler/profiler.py",
 line 154, in wrapper
2017-06-15 17:48:38.420 9320 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent return 
f(*args, **kwargs)
2017-06-15 17:48:38.420 9320 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File 
"/opt/stack/venv/neutron-20170421T033908Z/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py",
 line 1648, in process_network_ports
2017-06-15 17:48:38.420 9320 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent 
failed_devices['added'] |= self._bind_devices(need_binding_devices)
2017-06-15 17:48:38.420 9320 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File 
"/opt/stack/venv/neutron-20170421T033908Z/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py",
 line 888, in _bind_devices
2017-06-15 17:48:38.420 9320 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent 
self.conf.host)
2017-06-15 17:48:38.420 9320 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File 
"/opt/stack/venv/neutron-20170421T033908Z/lib/python2.7/site-packages/neutron/agent/rpc.py",
 line 181, in update_device_list
2017-06-15 17:48:38.420 9320 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent 
agent_id=agent_id, host=host)
2017-06-15 17:48:38.420 9320 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File 
"/opt/stack/venv/neutron-20170421T033908Z/lib/python2.7/site-packages/neutron/common/rpc.py",
 line 185, in call
2017-06-15 17:48:38.420 9320 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent 
time.sleep(wait)
2017-06-15 17:48:38.420 9320 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File 
"/opt/stack/venv/neutron-20170421T033908Z/lib/python2.7/site-packages/oslo_utils/excutils.py",
 line 220, in __exit__
2017-06-15 17:48:38.420 9320 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent 
self.force_reraise()
2017-06-15 17:48:38.420 9320 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File 
"/opt/stack/venv/neutron-20170421T033908Z/lib/python2.7/site-packages/oslo_utils/excutils.py",
 line 196, in force_reraise
2017-06-15 17:48:38.420 9320 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent 
six.reraise(self.type_, self.value, self.tb)
2017-06-15 17:48:38.420 9320 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent   File 
"/opt/stack/venv/neutron-20170421T033908Z/lib/python2.7/site-packages/neutron/common/rpc.py",
 line 162, in call
2017-06-15 17:48:38.420 9320 ERROR 
neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent return 
self._original_context.call(ctxt, method, **kwargs)
2017-06-15 17:48:38.420 9320 ERROR 

[Yahoo-eng-team] [Bug 1695101] [NEW] DVR Router ports and gateway ports are not bound to any host and no snat namespace created

2017-06-01 Thread Swaminathan Vasudevan
Public bug reported:

In the Pike cycle there was some refactoring of the DVR db classes and
resource handler mixin.
This led to a regression where the SNAT namespace was not created for
DVR routers that have a gateway configured.

The only namespace seen was the fipnamespace.

This was the patch set that caused the regression.
https://review.openstack.org/#/c/457592/5

On further debugging it was found that the snat ports and the
distributed router ports were not host bound; neutron was trying to
bind them to a 'null' host.

The '_build_routers_list' function in the l3_dvr_db.py was not called
and hence the host binding was missing.

We have seen a similar issue a while back: #1369012 (Fix KeyError on
missing gw_port_host for L3 agent in DVR mode).

The issue here is the inheritance order of the classes. If the
inheritance order is wrong, the overridden functions are not called in
the right order, or are skipped entirely.

So with this we have seen the same problem, where the
'_build_routers_list' in the l3_db_gwmode.py was called and not the one
in the 'l3_dvr_db.py' file.

This is the current order of inheritance.

class L3_NAT_with_dvr_db_mixin(l3_db.L3_NAT_db_mixin,
                               l3_attrs_db.ExtraAttributesMixin,
                               DVRResourceOperationHandler,
                               _DVRAgentInterfaceMixin):

If the order is shuffled, it works fine and here is the shuffled order.

class L3_NAT_with_dvr_db_mixin(DVRResourceOperationHandler,
                               _DVRAgentInterfaceMixin,
                               l3_attrs_db.ExtraAttributesMixin,
                               l3_db.L3_NAT_db_mixin):

This seems to fix the problem.
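
The effect of the base-class order is plain Python method resolution order; a
minimal stand-alone demonstration of why swapping the order changes which
_build_routers_list() runs (toy classes, not neutron code):

# Toy demonstration of the MRO effect described above.
class L3NATDbMixin(object):
    def _build_routers_list(self):
        return 'l3_db_gwmode version (no host binding)'


class DVRAgentInterfaceMixin(object):
    def _build_routers_list(self):
        return 'l3_dvr_db version (adds gw_port_host binding)'


class BrokenOrder(L3NATDbMixin, DVRAgentInterfaceMixin):
    pass


class FixedOrder(DVRAgentInterfaceMixin, L3NATDbMixin):
    pass


print(BrokenOrder()._build_routers_list())  # l3_db_gwmode version (no host binding)
print(FixedOrder()._build_routers_list())   # l3_dvr_db version (adds gw_port_host binding)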

** Affects: neutron
 Importance: Undecided
 Assignee: Swaminathan Vasudevan (swaminathan-vasudevan)
 Status: Confirmed


** Tags: l3-dvr-backlog

** Tags added: l3-dvr-backlog

** Changed in: neutron
 Assignee: (unassigned) => Swaminathan Vasudevan (swaminathan-vasudevan)

** Changed in: neutron
   Status: New => Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1695101

Title:
  DVR Router ports and gateway ports are not bound to any host and no
  snat namespace created

Status in neutron:
  Confirmed

Bug description:
  In the Pike cycle there was some refactoring of the DVR db classes and
resource handler mixin.
  This led to a regression where the SNAT namespace was not created for
DVR routers that have a gateway configured.

  The only namespace seen was the fipnamespace.

  This was the patch set that caused the regression.
  https://review.openstack.org/#/c/457592/5

  On further debugging it was found that the snat ports and the
  distributed router ports were not host bound; neutron was trying to
  bind them to a 'null' host.

  The '_build_routers_list' function in the l3_dvr_db.py was not called
  and hence the host binding was missing.

  We have seen a similar issue a while back: #1369012 (Fix KeyError on
  missing gw_port_host for L3 agent in DVR mode).

  The issue here is the inheritance order of the classes. If the
  inheritance order is wrong, the overridden functions are not called
  in the right order, or are skipped entirely.

  So with this we have seen the same problem, where the
  '_build_routers_list' in the l3_db_gwmode.py was called and not the
  one in the 'l3_dvr_db.py' file.

  This is the current order of inheritance.

  class L3_NAT_with_dvr_db_mixin(l3_db.L3_NAT_db_mixin,
                                 l3_attrs_db.ExtraAttributesMixin,
                                 DVRResourceOperationHandler,
                                 _DVRAgentInterfaceMixin):

  If the order is shuffled, it works fine and here is the shuffled
  order.

  class L3_NAT_with_dvr_db_mixin(DVRResourceOperationHandler,
                                 _DVRAgentInterfaceMixin,
                                 l3_attrs_db.ExtraAttributesMixin,
                                 l3_db.L3_NAT_db_mixin):

  This seems to fix the problem.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1695101/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1667877] [NEW] [RFE] DVR support for Configuring Floatingips in Network Node or in the Compute Node based on Config option.

2017-02-24 Thread Swaminathan Vasudevan
Public bug reported:

Provide a Configurable option to configure Floatingips for DVR based routers to 
reside on Compute Node or on Network Node.
Also proactively check the status of the agent on the destination node and if 
the agent health is down, then configure the Floatingip on the Network Node.

Provide a configuration Option in neutron.conf such as

DVR_FLOATINGIP_CENTRALIZED = 'enforced/circumstantial'

If DVR_FLOATINGIP_CENTRALIZED is configured as 'enforced', all floating IPs
will be configured on the network node.
If DVR_FLOATINGIP_CENTRALIZED is configured as 'circumstantial', the floating
IP will be configured either on the compute node or on the network node,
depending on the agent health.

If this option is not configured, floating IPs will be distributed for
all bound ports, and only for unbound ports will the floating IP be
implemented on the network node.

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: l3-dvr-backlog

** Summary changed:

- [RFE] DVR support for Configurable Floatingips in Network Node or in the 
Compute Node.
+ [RFE] DVR support for Configuring Floatingips in Network Node or in the 
Compute Node based on Config option.

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1667877

Title:
  [RFE] DVR support for Configuring Floatingips in Network Node or in
  the Compute Node based on Config option.

Status in neutron:
  New

Bug description:
  Provide a Configurable option to configure Floatingips for DVR based routers 
to reside on Compute Node or on Network Node.
  Also proactively check the status of the agent on the destination node and if 
the agent health is down, then configure the Floatingip on the Network Node.

  Provide a configuration Option in neutron.conf such as

  DVR_FLOATINGIP_CENTRALIZED = 'enforced/circumstantial'

  If DVR_FLOATINGIP_CENTRALIZED is configured as 'enforced', all floating IPs
will be configured on the network node.
  If DVR_FLOATINGIP_CENTRALIZED is configured as 'circumstantial', the floating
IP will be configured either on the compute node or on the network node,
depending on the agent health.

  If this option is not configured, floating IPs will be distributed for
  all bound ports, and only for unbound ports will the floating IP be
  implemented on the network node.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1667877/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1524020] Re: DVRImpact: dvr_vmarp_table_update and dvr_update_router_add_vm is called for every port update instead of only when host binding or mac-address changes occur

2017-01-11 Thread Swaminathan Vasudevan
** Changed in: neutron
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1524020

Title:
  DVRImpact:  dvr_vmarp_table_update and dvr_update_router_add_vm is
  called for every port update instead of only when host binding or mac-
  address changes occur

Status in neutron:
  Fix Released
Status in neutron kilo series:
  Fix Released

Bug description:
  The DVR arp update (dvr_vmarp_table_update) and dvr_update_router_add_vm
  are called for every update_port if the mac_address changes or when
  update_device_up is true.

  These functions should be called from _notify_l3_agent_port_update,
  only when a host binding for a service port changes or when a
  mac_address for the service port changes.
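
  The guard being asked for amounts to comparing the old and new port before
notifying; a rough sketch of the condition (it uses the standard port dict
keys, but this is not the actual patch):

  # Rough sketch of the intended guard: only trigger the DVR arp/router
  # notifications when the binding host or the MAC address actually changed.
  def dvr_port_update_requires_notify(original_port, updated_port):
      host_changed = (original_port.get('binding:host_id') !=
                      updated_port.get('binding:host_id'))
      mac_changed = (original_port.get('mac_address') !=
                     updated_port.get('mac_address'))
      return host_changed or mac_changed

  # In _notify_l3_agent_port_update (conceptually):
  #     if dvr_port_update_requires_notify(original_port, updated_port):
  #         dvr_vmarp_table_update(...)
  #         dvr_update_router_add_vm(...)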

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1524020/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1554876] Re: router not found warning logs in the L3 agent

2017-01-11 Thread Swaminathan Vasudevan
** Changed in: neutron
   Status: Fix Committed => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1554876

Title:
  router not found warning logs in the L3 agent

Status in neutron:
  Fix Released

Bug description:
  The L3 agent during a normal tempest run will be filled with warnings
  like the following:

  2016-03-08 10:10:30.465 18962 WARNING neutron.agent.l3.agent [-] Info for 
router 3688a110-8cfe-41c6-84e3-bfd965238304 was not found. Performing router 
cleanup
  2016-03-08 10:10:34.197 18962 WARNING neutron.agent.l3.agent [-] Info for 
router 3688a110-8cfe-41c6-84e3-bfd965238304 was not found. Performing router 
cleanup
  2016-03-08 10:10:35.535 18962 WARNING neutron.agent.l3.agent [-] Info for 
router 3688a110-8cfe-41c6-84e3-bfd965238304 was not found. Performing router 
cleanup
  2016-03-08 10:10:43.025 18962 WARNING neutron.agent.l3.agent [-] Info for 
router 3688a110-8cfe-41c6-84e3-bfd965238304 was not found. Performing router 
cleanup
  2016-03-08 10:10:47.029 18962 WARNING neutron.agent.l3.agent [-] Info for 
router 3688a110-8cfe-41c6-84e3-bfd965238304 was not found. Performing router 
cleanup

  
  This is completely normal as routers are deleted from the server during the 
data retrieval process of the L3 agent and should not be at the warning level.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1554876/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1631513] [NEW] DVR: Fix race conditions when trying to add default gateway for fip gateway port.

2016-10-07 Thread Swaminathan Vasudevan
Public bug reported:

There seems to be a race condition when trying to add default gateway
route in fip namespace for the fip agent gateway port.

The way it happens, during high-scale testing: while a router update is
in progress for Router-A, which has a floating IP, a fip namespace is
being created and gateway ports are being plugged into the external
bridge in the context of that fip namespace. If another router update
for the same Router-A arrives while this is still in progress, it calls
'update-gateway-port', tries to set the default gateway, and fails.

We do find a log message in the l3-agent with  'Failed to process compatible 
router' and also a TRACE in the l3-agent.
Traceback (most recent call last):
   File 
"/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/agent/l3/agent.py",
 line 501, in _process_router_update
 self._process_router_if_compatible(router)
   File 
"/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/agent/l3/agent.py",
 line 440, in _process_router_if_compatible
 self._process_updated_router(router)
   File 
"/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/agent/l3/agent.py",
 line 454, in _process_updated_router
 ri.process(self)
   File 
"/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/agent/l3/dvr_local_router.py",
 line 538, in process
 super(DvrLocalRouter, self).process(agent)
   File 
"/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/agent/l3/dvr_router_base.py",
 line 31, in process
 super(DvrRouterBase, self).process(agent)
   File 
"/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/common/utils.py",
 line 396, in call
 self.logger(e)
   File 
"/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/oslo_utils/excutils.py",
 line 220, in __exit__
 self.force_reraise()
   File 
"/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/oslo_utils/excutils.py",
 line 196, in force_reraise
 six.reraise(self.type_, self.value, self.tb)
   File 
"/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/common/utils.py",
 line 393, in call
 return func(*args, **kwargs)
   File 
"/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/agent/l3/router_info.py",
 line 989, in process
 self.process_external(agent)
   File 
"/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/agent/l3/dvr_local_router.py",
 line 491, in process_external
 self.create_dvr_fip_interfaces(ex_gw_port)
   File 
"/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/agent/l3/dvr_local_router.py",
 line 522, in create_dvr_fip_interfaces
 self.fip_ns.update_gateway_port(fip_agent_port)
   File 
"/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/agent/l3/dvr_fip_ns.py",
 line 243, in update_gateway_port
 ipd.route.add_gateway(gw_ip)
   File 
"/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py",
 line 690, in add_gateway
 self._as_root([ip_version], tuple(args))
   File 
"/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py",
 line 361, in _as_root
 use_root_namespace=use_root_namespace)
   File 
"/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py",
 line 94, in _as_root
 log_fail_as_error=self.log_fail_as_error)
   File 
"/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py",
 line 103, in _execute
 log_fail_as_error=log_fail_as_error)
   File 
"/opt/stack/venv/neutron-20160927T090820Z/lib/python2.7/site-packages/neutron/agent/linux/utils.py",
 line 140, in execute
 raise RuntimeError(msg)

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: l3-dvr-backlog mitaka-backport-potential newton-backport-potential

** Summary changed:

- Fix race conditions when trying to add default gateway for fip gateway port.
+ DVR: Fix race conditions when trying to add default gateway for fip gateway 
port.

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1631513

Title:
  DVR: Fix race conditions when trying to add default gateway for fip
  gateway port.

Status in neutron:
  New

Bug description:
  There seems to be a race condition when trying to add default gateway
  route in fip namespace for the fip agent gateway port.

  The way it happens is at high scale testing, when there is a router
  update that is currently happening for the Router-A which has a
  floatingip, a fip namespace is getting created and gateway ports
  plugged to the external bridge in the context of the fip namespace.
  While it is getting 

[Yahoo-eng-team] [Bug 1593354] Re: SNAT HA failed because of missing nat rule in snat namespace iptable

2016-10-06 Thread Swaminathan Vasudevan
I did verify it in Mitaka and I don't see any issues with the 'sg' port
and related rules with respect to failover.

So we can close this issue as we discussed last week.

** Changed in: neutron
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1593354

Title:
  SNAT HA failed because of missing nat rule in snat namespace iptable

Status in neutron:
  Invalid

Bug description:
  I have a mitaka openstack deployment with neutron DVR enabled. When I
  tried to test the snat HA failover I found that, even though the snat
  namespace was created on the backup node, it doesn't have any NAT rules
  in its iptables. And if you run "ip a" in the snat namespace you will
  find the sg port is missing.

  Here is what I found on the second neutron network node

  sandy-pistachio:/opt/openstack # ip netns
  qrouter-e25b81f9-8810-4654-9be0-ebac09c700fb
  qdhcp-abe36e89-f7a5-4cbd-a7e4-852d80ed92d6
  snat-e25b81f9-8810-4654-9be0-ebac09c700fb

  sandy-pistachio:/opt/openstack # ip netns exec 
snat-e25b81f9-8810-4654-9be0-ebac09c700fb ip a
  1: lo:  mtu 65536 qdisc noqueue state UNKNOWN group 
default
  link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
  inet 127.0.0.1/8 scope host lo
     valid_lft forever preferred_lft forever
  inet6 ::1/128 scope host
     valid_lft forever preferred_lft forever
  70: qg-cc3b2f8c-b7:  mtu 1500 qdisc noqueue 
state UNKNOWN group default
  link/ether fa:16:3e:cb:27:cd brd ff:ff:ff:ff:ff:ff
  inet 10.240.117.98/28 brd 10.240.117.111 scope global qg-cc3b2f8c-b7
     valid_lft forever preferred_lft forever
  inet6 fe80::f816:3eff:fecb:27cd/64 scope link
     valid_lft forever preferred_lft forever

  sandy-pistachio:/opt/openstack # ip netns exec 
snat-e25b81f9-8810-4654-9be0-ebac09c700fb iptables -L -n -v -t nat
  Chain PREROUTING (policy ACCEPT 0 packets, 0 bytes)
   pkts bytes target prot opt in out source   
destination

  Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
   pkts bytes target prot opt in out source   
destination

  Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
   pkts bytes target prot opt in out source   
destination

  Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
   pkts bytes target prot opt in out source   
destination

  Here are the package information:

  provo-pistachio:/opt/openstack # zypper info openstack-neutron
  Loading repository data...
  Reading installed packages...

  
  Information for package openstack-neutron:
  --
  Repository: Mitaka
  Name: openstack-neutron
  Version: 8.1.1~a0~dev32-2.1
  Arch: noarch
  Vendor: obs://build.opensuse.org/Cloud:OpenStack
  Installed: Yes
  Status: up-to-date
  Installed Size: 235.1 KiB
  Summary: OpenStack Network
  Description: 
Neutron is a virtual network service for Openstack.

Just like OpenStack Nova provides an API to dynamically request and
configure virtual servers, Neutron provides an API to dynamically
request and configure virtual networks. These networks connect
"interfaces" from other OpenStack services (e.g., vNICs from Nova VMs).
The Neutron API supports extensions to provide advanced network
capabilities (e.g., QoS, ACLs, network monitoring, etc)

  
  provo-pistachio:/opt/openstack # zypper info 
openstack-neutron-openvswitch-agent
  Loading repository data...
  Reading installed packages...

  
  Information for package openstack-neutron-openvswitch-agent:
  
  Repository: Mitaka
  Name: openstack-neutron-openvswitch-agent
  Version: 8.1.1~a0~dev32-2.1
  Arch: noarch
  Vendor: obs://build.opensuse.org/Cloud:OpenStack
  Installed: Yes
  Status: up-to-date
  Installed Size: 14.9 KiB
  Summary: OpenStack Network - Open vSwitch
  Description: 
This package provides the OpenVSwitch Agent.

  
  provo-pistachio:/opt/openstack # zypper info openstack-neutron-l3-agent
  Loading repository data...
  Reading installed packages...

  
  Information for package openstack-neutron-l3-agent:
  ---
  Repository: Mitaka
  Name: openstack-neutron-l3-agent
  Version: 8.1.1~a0~dev32-2.1
  Arch: noarch
  Vendor: obs://build.opensuse.org/Cloud:OpenStack
  Installed: Yes
  Status: up-to-date
  Installed Size: 24.7 KiB
  Summary: OpenStack Network Service (Neutron) - L3 Agent
  Description: 
This package provides the L3 Agent.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1593354/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More 

[Yahoo-eng-team] [Bug 1476469] Re: with DVR, a VM can't use floatingIP and VPN at the same time

2016-09-28 Thread Swaminathan Vasudevan
VPN is a centralized service, not a distributed one. The VPN service only runs
in the SNAT namespace, not in the router or fip namespace.
So the fip traffic flowing through the fip namespace or router namespace may
not go through the IPsec driver that is running in the SNAT namespace.

This is working as per design. If we need to make the VPN for DVR
routers to work with FIP, then we need to first work on running
distributed VPN service.

Until then I would not recommend doing it.

** Changed in: neutron
   Status: Confirmed => Opinion

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1476469

Title:
  with DVR, a VM can't use floatingIP and VPN at the same time

Status in neutron:
  Opinion

Bug description:
  Now the VPN service is available for distributed routers via patch
https://review.openstack.org/#/c/143203/,
  but there is another problem: with DVR, a VM can't use a floating IP and VPN
at the same time.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1476469/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1609217] Re: DVR: dvr router should not exist in not-binded network node

2016-08-31 Thread Swaminathan Vasudevan
** Changed in: neutron
   Status: In Progress => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1609217

Title:
  DVR: dvr router should not exist in not-binded network node

Status in neutron:
  Invalid

Bug description:
  ENV:
  stable/mitaka
  hosts:
  compute1 (nova-compute, l3-agent (dvr), metadata-agent)
  compute2 (nova-compute, l3-agent (dvr), metadata-agent)
  network1 (l3-agent (dvr_snat), metadata-agent, dhcp-agent)
  network2 (l3-agent (dvr_snat), metadata-agent, dhcp-agent)

  How to reproduce? (scenario 1)
  set: dhcp_agents_per_network = 2

  1. create a DVR router:
  neutron router-create --ha False --distributed True test1

  2. Create a network & subnet with dhcp enabled.
  neutron net-create test1
  neutron subnet-create --enable-dhcp test1 --name test1 192.168.190.0/24

  3. Attach the router and subnet
  neutron router-interface-add test1 subnet=test1

  Then the router test1 will exist in both network1 and network2. But in
  the DB routerl3agentbindings, there is only one record for DVR router
  to one l3 agent.

  http://paste.openstack.org/show/547695/

  And for another scenario 2:
  change the network2 node deployment to only run metadata-agent and dhcp-agent.
  Both the qdhcp namespace and the VM could ping each other.
  So the qrouter namespace on the network node the router is not bound to is
not used, and should not exist.

  Code:
  The function in following position should not return the DVR router id in 
scenario 1.
  
https://github.com/openstack/neutron/blob/master/neutron/db/l3_dvrscheduler_db.py#L263

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1609217/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1614337] Re: L3 agent fails on FIP when DVR and HA both enabled in router

2016-08-29 Thread Swaminathan Vasudevan
** Changed in: neutron
   Status: Confirmed => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1614337

Title:
  L3 agent fails on FIP when DVR and HA both enabled in router

Status in neutron:
  Invalid

Bug description:
  I have a vlan-based Neutron configuration.  My tenant networks are
  vlans, and my shared external network (br-ex) is a flat network.
  Neutron is configured for DVR+SNAT mode.  In testing floating IPs,
  I've run into issues with my neutron router, and I've traced it back
  to a single scenario: when the router is both distributed AND ha.  To
  be clear, I've tested all four possibilities:

  "--distributed False --ha False"  == works
  "--distributed True --ha False"  == works
  "--distributed False --ha True"  == works
  "--distributed True --ha True"  == fails

  * I can reproduce this again and again, just by deleting the router I
  have (which implies first clearing its gateway, and removing any
  associated ports), then re-creating the router in any of the four
  configurations above.  Then I boot some VMs, associate a FIP to any
  one of them, and attempt to reach the FIP.  Results are the same
  whether I create the router on the CLI or from within Horizon.

  * Expected result is that I should be able to associate a floating IP
  to a running VM and then ping that floating IP (and ultimately other
  kinds of activity, such as SSH access to the VM).

  
  * Actual result is that the floating IP is completely unreachable from other 
valid IPs within same L2 space.  Additionally, in /var/log/neutron/l3-agent.log 
on the compute node hosting the VM whose associated FIP I can't reach, I find 
this:

  
  2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent [-] Failed to process compatible router '13356ddb-8e36-4f54-b8b2-6a62a5aecf86'
  2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent Traceback (most recent call last):
  2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 501, in _process_router_update
  2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent     self._process_router_if_compatible(router)
  2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 440, in _process_router_if_compatible
  2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent     self._process_updated_router(router)
  2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 454, in _process_updated_router
  2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent     ri.process(self)
  2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/dvr_edge_ha_router.py", line 92, in process
  2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent     super(DvrEdgeHaRouter, self).process(agent)
  2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/dvr_local_router.py", line 488, in process
  2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent     super(DvrLocalRouter, self).process(agent)
  2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/dvr_router_base.py", line 30, in process
  2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent     super(DvrRouterBase, self).process(agent)
  2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 386, in process
  2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent     super(HaRouter, self).process(agent)
  2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/common/utils.py", line 385, in call
  2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent     self.logger(e)
  2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
  2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent     self.force_reraise()
  2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
  2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent     six.reraise(self.type_, self.value, self.tb)
  2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/common/utils.py", line 382, in call
  2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent     return func(*args, **kwargs)
  2016-08-17 22:33:25.512 11369 ERROR neutron.agent.l3.agent   File

[Yahoo-eng-team] [Bug 1611964] [NEW] SNAT redirect rules should be removed only on Gateway clear.

2016-08-10 Thread Swaminathan Vasudevan
Public bug reported:

SNAT redirect rules should be removed only on gateway clear, not on a gateway 
move or gateway reschedule.
Removing them on a move makes the snat_node unreachable from the DVR service 
ports on the originating node.

How to reproduce it.

1. Create a two network node setup (dvr_snat)
2. Create a network
3. Create a subnet
4. Create a router and attach the subnet to the router.
5. Set gateway to the router.
6. Now try to reschedule the router to the secondary node, or do a manual move 
to a second node.
7. In this case 'external_gateway_removed' is called through the 
'external_gateway_updated' function and tries to call snat_redirect_remove.

8. After you move the SNAT, the router namespace no longer has the routing rule 
for the 'csnat' port.
9. The rule is cleared and you only see the base rules.

Expected:
root@ubuntu-ctlr:~/devstack# ip rule
0:  from all lookup local 
32766:  from all lookup main 
32767:  from all lookup default 
167772161:  from 10.0.0.1/24 lookup 167772161 
root@ubuntu-ctlr:~/devstack# ip route s t 167772161
default via 10.0.0.9 dev qr-18deeb39-3b 

But Actual:
root@ubuntu-ctlr:~/devstack# ip rule
0:  from all lookup local 
32766:  from all lookup main 
32767:  from all lookup default
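
For illustration, a minimal sketch (assuming root access; the namespace name and
CIDR are placeholders, and this is not neutron code) of checking whether the
per-subnet redirect rule survived a gateway move:

    import subprocess

    def snat_redirect_rule_present(namespace, internal_cidr):
        """Return True if an 'ip rule' entry for internal_cidr still exists."""
        output = subprocess.check_output(
            ['ip', 'netns', 'exec', namespace, 'ip', 'rule'],
            universal_newlines=True)
        # A healthy DVR qrouter keeps a rule such as:
        #   167772161:  from 10.0.0.1/24 lookup 167772161
        return any('from ' + internal_cidr in line
                   for line in output.splitlines())

    # Example (namespace name is a placeholder):
    # snat_redirect_rule_present('qrouter-<uuid>', '10.0.0.1/24')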

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: l3-dvr-backlog

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1611964

Title:
  SNAT redirect rules should be removed only on Gateway clear.

Status in neutron:
  New

Bug description:
  SNAT redirect rules should be removed only on gateway clear, not on a gateway 
move or gateway reschedule.
  Removing them on a move makes the snat_node unreachable from the DVR service 
ports on the originating node.

  How to reproduce it.

  1. Create a two network node setup (dvr_snat)
  2. Create a network
  3. Create a subnet
  4. Create a router and attach the subnet to the router.
  5. Set gateway to the router.
  6. Now try to reschedule the router to the secondary node, or do a manual move 
to a second node.
  7. In this case 'external_gateway_removed' is called through the 
'external_gateway_updated' function and tries to call snat_redirect_remove.

  8. After you move the SNAT, the router namespace no longer has the routing 
rule for the 'csnat' port.
  9. The rule is cleared and you only see the base rules.

  Expected:
  root@ubuntu-ctlr:~/devstack# ip rule
  0:  from all lookup local 
  32766:  from all lookup main 
  32767:  from all lookup default 
  167772161:  from 10.0.0.1/24 lookup 167772161 
  root@ubuntu-ctlr:~/devstack# ip route s t 167772161
  default via 10.0.0.9 dev qr-18deeb39-3b 

  But Actual:
  root@ubuntu-ctlr:~/devstack# ip rule
  0:  from all lookup local 
  32766:  from all lookup main 
  32767:  from all lookup default

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1611964/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1611513] [NEW] ip_lib: Add support for 'Flush' command in iproute

2016-08-09 Thread Swaminathan Vasudevan
Public bug reported:

This would be an enhancement to the ip_lib iproute library to provide
additional support for the 'Flush' command, which is not available right
now.

This is a dependency for a fix in DVR to clean up the gateway rules. 
Ref: Bug #1599287
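
As a rough sketch of what the missing helper could look like (the names and
placement are illustrative only, not the actual ip_lib API), the flush
operation boils down to running 'ip route flush table <id>' inside the right
namespace:

    import subprocess

    def flush_routes(namespace, table, ip_version=4):
        """Flush all routes in the given routing table inside a namespace."""
        subprocess.check_call(
            ['ip', 'netns', 'exec', namespace,
             'ip', '-%d' % ip_version, 'route', 'flush', 'table', str(table)])

    # Example: drop a per-subnet DVR redirect table (placeholders):
    # flush_routes('qrouter-<uuid>', 167772161)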

** Affects: neutron
 Importance: Undecided
 Assignee: Swaminathan Vasudevan (swaminathan-vasudevan)
 Status: In Progress


** Tags: l3-dvr-backlog

** Summary changed:

- ip_lib: Add support for 'Flush' command for iproute
+ ip_lib: Add support for 'Flush' command in iproute

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1611513

Title:
  ip_lib: Add support for 'Flush' command in iproute

Status in neutron:
  In Progress

Bug description:
  This would be an enhancement to the ip_lib iproute library to provide
  additional support for the 'Flush' command, which is not available right
  now.

  This is a dependency for a fix in DVR to clean up the gateway rules. 
  Ref: Bug #1599287

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1611513/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1599287] [NEW] Cleanup snat redirect rules when agent restarts after stale snat namespace is cleaned.

2016-07-05 Thread Swaminathan Vasudevan
Public bug reported:

If the gateway is removed while the L3 agent is dead, the snat namespace
and its rules are not properly cleaned up when the agent restarts.

Even though the patch https://review.openstack.org/#/c/326729/ addresses
the cleanup of the snat namespace, it does not remove the redirect rules
and the gateway device from the router namespace when the gateway is
disabled.

When the agent restarts it does not get the gateway data from the
server, so it is not possible for the agent to clean up properly.

In order to clean up the snat redirect rules, the gateway data should be
cached on the local file system and reused later when necessary.
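
A minimal sketch of that caching idea (the file location and fields are
assumptions, not the actual neutron implementation):

    import json
    import os

    STATE_DIR = '/var/lib/neutron/dvr_gateway_cache'  # hypothetical path

    def save_gateway_info(router_id, gw_port_info):
        # Called while the gateway is still known, so a later restart can
        # clean up even if the server no longer reports the gateway.
        if not os.path.isdir(STATE_DIR):
            os.makedirs(STATE_DIR)
        with open(os.path.join(STATE_DIR, router_id + '.json'), 'w') as f:
            json.dump(gw_port_info, f)

    def load_gateway_info(router_id):
        path = os.path.join(STATE_DIR, router_id + '.json')
        if not os.path.exists(path):
            return None  # nothing cached; cleanup cannot be done reliably
        with open(path) as f:
            return json.load(f)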

** Affects: neutron
 Importance: Undecided
 Assignee: Swaminathan Vasudevan (swaminathan-vasudevan)
 Status: In Progress


** Tags: l3-dvr-backlog

** Tags added: l3-dvr-backlog

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1599287

Title:
  Cleanup snat redirect rules when agent restarts after stale snat
  namespace is cleaned.

Status in neutron:
  In Progress

Bug description:
  If the gateway is removed while the L3 agent is dead, the snat
  namespace and its rules are not properly cleaned up when the agent restarts.

  Even though the patch https://review.openstack.org/#/c/326729/
  addresses the cleanup of the snat namespace, it does not remove the
  redirect rules and the gateway device from the router namespace when
  the gateway is disabled.

  When the agent restarts it does not get the gateway data from the
  server, so it is not possible for the agent to clean up properly.

  In order to clean up the snat redirect rules, the gateway data should be
  cached on the local file system and reused later when necessary.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1599287/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1583694] [NEW] [RFE] DVR support for Allowed_address_pair port that are bound to multiple ACTIVE VM ports used by Octavia

2016-05-19 Thread Swaminathan Vasudevan
Public bug reported:

DVR support for Allowed_address_pair ports with FloatingIP that are
unbound and assigned to multiple VMs that are active.

Problem Statement:

When a FloatingIP is assigned to an Allowed_address_pair port that is assigned to 
multiple ACTIVE VMs connected to DVR (Distributed Virtual Router) routers, the 
FloatingIP is not functional. 
The use case here is to provide redundancy to the VMs that are serviced by the 
DVR routers.
This feature works well for legacy (centralized) routers.

Theory:
Distributed Virtual Routers were designed for scalability and performance and 
to reduce the load on the single network node.

Distributed Virtual Routers are created on each Compute node dynamically
on demand and removed when not required. Distributed Virtual Routers
heavily depend on the port binding to identify the requirement of a DVR
service on a particular node.

Today we only create/update/delete floatingip based on the router and
the host in which the floatingip service is required. So the 'host' part
is very critical for the operation of the DVR.

In the above-mentioned use case, we are dealing with an
Allowed_address_pair port, which is not bound to any specific host and is
also assigned to multiple VMs that are ACTIVE at the same time.

We have a workaround today that inherits the parent VM's port binding
properties for the allowed_address_pair port if the parent VM's port is
ACTIVE. The limitation is that we assume there is only one "ACTIVE" VM
port associated with the allowed_address_pair port for this to work.

The reason is that if we have multiple "ACTIVE" VM ports associated
with the same allowed_address_pair port, and the allowed_address_pair
port has a FloatingIP associated with it, we can't provide the
FloatingIP service on all the nodes where the VMs' ports are ACTIVE. This
would create an issue because the same FloatingIP would be
advertised (GARP) from all nodes, so users on the external
network would be confused about where the actual "ACTIVE" port is.

Why is it working with Legacy Routers:

In the case of legacy routers, the routers are always located at the
network node and the DNAT is also done in the router_namespace on the
network node. They don't depend on the host binding, since all the
traffic has to flow through the centralized router in the network node.
Also in the case of centralized routers there is no issue of
FloatingIP GARP, since it is always going to come in through a
single node.

So in the background, the allowed_address_pair port MAC is being
dynamically switched from one VM to another VM by the keepalived that
runs in the VM. So neutron does not need to know about any of those and
it works as expected.


Why it is not working with DVR Routers:
1. The Allowed_address_pair port does not have a host binding.
2. If we were to inherit the VMs' host binding, there are multiple VMs that 
are ACTIVE, so we can't have a single host binding for these 
allowed_address_pair ports.
3. Even if we ignore the port binding on the allowed_address_pair port and try 
to provide the plumbing for the FloatingIP on multiple nodes based on 
the VMs it is associated with, the same FloatingIP would be 
GARPed from different compute nodes, which would cause confusion.

How we can make it work with DVR:

Option 1:
Neutron should have some visibility into the state of the VM port when the 
switch between ACTIVE and STANDBY happens. Today the switch is handled by keepalived 
on the VM and so it is not logged anywhere.
If keepalived can record the event on the neutron port, neutron can use it to 
determine when to allow or block FloatingIP traffic for a particular node, and 
then send the GARP from the respective node. A sketch of this signalling path 
is shown after Option 2 below. Some delay is introduced by this as well.

(Desired) Low-hanging fruit.

Option 2:

Option 2 basically negates the distributed nature of DVR and makes it 
centralized for North-South traffic.
The other option is to have the FloatingIP functionality centralized for such 
features. But this would be more complex, since we need to introduce config 
options for agents and floatingips. Also, in this case we can't have both 
local floatingip and centralized floatingip support on the same node; a 
compute node can only have either localized floatingip or centralized 
floatingip.

Complex (negates the purpose of DVR).
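
To make Option 1 concrete, here is a hypothetical sketch of the signalling path
(the credentials, port id and the mechanism that relays the keepalived
transition off the instance are all assumptions; this is not an existing
neutron feature):

    import socket
    from neutronclient.v2_0 import client as neutron_client

    def on_keepalived_master(vrrp_port_id):
        neutron = neutron_client.Client(
            username='admin', password='secret',        # placeholder creds
            tenant_name='admin',
            auth_url='http://controller:5000/v2.0')
        # Record which node just became MASTER by re-binding the
        # allowed_address_pair (VRRP) port there, so only that node plumbs
        # the FloatingIP and sends the GARP.
        neutron.update_port(
            vrrp_port_id,
            {'port': {'binding:host_id': socket.gethostname()}})

    # keepalived's notify_master hook (or an agent reacting to it) would
    # call on_keepalived_master('<vrrp-port-uuid>').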

References:
Some references to the patches that we already have to support a single use 
case for the Allowed_address_pair with FloatingIP in DVR.

https://review.openstack.org/#/c/254439/
https://review.openstack.org/#/c/301410/
https://review.openstack.org/#/c/304905/

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: l3-dvr-backlog lbaas neutron

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1583694

Title:
  [RFE] DVR support for 

[Yahoo-eng-team] [Bug 1578866] Re: test_user_update_own_password failing intermittently

2016-05-11 Thread Swaminathan Vasudevan
** Also affects: neutron
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1578866

Title:
  test_user_update_own_password failing intermittently

Status in OpenStack Identity (keystone):
  Confirmed
Status in neutron:
  New

Bug description:
  test_user_update_own_password is failing intermittently on a variety
  of jobs

  stack trace:
  Traceback (most recent call last):
File "tempest/api/identity/v2/test_users.py", line 71, in 
test_user_update_own_password
  self.non_admin_users_client.token)
File 
"/opt/stack/new/tempest/.tox/tempest/local/lib/python2.7/site-packages/testtools/testcase.py",
 line 480, in assertRaises
  self.assertThat(our_callable, matcher)
File 
"/opt/stack/new/tempest/.tox/tempest/local/lib/python2.7/site-packages/testtools/testcase.py",
 line 493, in assertThat
  raise mismatch_error
  testtools.matchers._impl.MismatchError: > returned {u'token': {u'expires': u'2016-05-06T00:13:53Z', 
u'issued_at': u'2016-05-05T23:13:54.00Z', u'audit_ids': 
[u'mbdiQZcNT5GxEUebXZqKOA', u'BAlcCwKLS9Co8C3jg2vfAw'], u'id': 
u'gABXK9Oyhuw7yBJJehrIIGlzIB8VTbgnM_M5Cve9q0BEHeZ2xNohJ_lkVqp7kicVbNgZ93p2dcLHfUfXWCcPvO4BWkTIry1mAGSvhzeLI7SYxSS6CBpeGK0FH3Uf_5vhHTCWFvcDvKOSzajGImeN7GaYts91H1zsXV7B1HRs0xN-4LADokI'},
 u'metadata': {u'roles': [], u'is_admin': 0}, u'serviceCatalog': [], u'user': 
{u'roles_links': [], u'username': u'tempest-IdentityUsersTest-972219078', 
u'name': u'tempest-IdentityUsersTest-972219078', u'roles': [], u'id': 
u'97a1836c5a2c40c99575e46aa37b8b50'}}

  
  example failures: 
http://logs.openstack.org/17/311617/1/gate/gate-tempest-dsvm-neutron-linuxbridge/084f25d/logs/testr_results.html.gz

  and http://logs.openstack.org/91/312791/2/check/gate-tempest-dsvm-
  full/88d9fff/logs/testr_results.html.gz

To manage notifications about this bug go to:
https://bugs.launchpad.net/keystone/+bug/1578866/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1569918] [NEW] Allowed_address_pair fixed_ip configured with FloatingIP after getting associated with a VM port does not work with DVR routers

2016-04-13 Thread Swaminathan Vasudevan
Public bug reported:

An Allowed_address_pair fixed_ip that is given a FloatingIP after the
port is associated with the VM port is not reachable through a DVR router.

The current code adds the proper ARP update and port host binding
inheritance for the Allowed_address_pair port only if the
port has a FloatingIP configured before it is associated with a VM port.

When the FloatingIP is added later, it fails.

How to reproduce.

1. Create networks
2. Create vrrp-net.
3. Create vrrp-subnet.
4. Create a DVR router.
5. Attach the vrrp-subnet to the router.
6. Create a VM on the vrrp-subnet
7. Create a VRRP port.
8. Attach the VRRP port with the VM.
9. Now assign a FloatingIP to the VRRP port.
10. Now check the ARP table entry in the router_namespace and also the VRRP 
port details. The VRRP port is still unbound and so the DVR cannot handle 
unbound ports.
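
A quick way to confirm step 10 (sketch only; the credentials and port id are
placeholders) is to look at the VRRP port's binding, which stays empty and is
why DVR cannot plumb the FloatingIP for it:

    from neutronclient.v2_0 import client as neutron_client

    neutron = neutron_client.Client(
        username='admin', password='secret',
        tenant_name='admin', auth_url='http://controller:5000/v2.0')

    port = neutron.show_port('<vrrp-port-uuid>')['port']
    print(port.get('binding:host_id'))  # empty -> unbound, no host for DVR
    print(port.get('device_owner'))     # typically empty for a pure AAP port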

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: l3-dvr-backlog

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1569918

Title:
  Allowed_address_pair fixed_ip configured with FloatingIP after getting
  associated with a VM port does not work with DVR routers

Status in neutron:
  New

Bug description:
  An Allowed_address_pair fixed_ip that is given a FloatingIP after the
  port is associated with the VM port is not reachable through a DVR
  router.

  The current code adds the proper ARP update and port host binding
  inheritance for the Allowed_address_pair port only
  if the port has a FloatingIP configured before it is associated with a
  VM port.

  When the FloatingIP is added later, it fails.

  How to reproduce.

  1. Create networks
  2. Create vrrp-net.
  3. Create vrrp-subnet.
  4. Create a DVR router.
  5. Attach the vrrp-subnet to the router.
  6. Create a VM on the vrrp-subnet
  7. Create a VRRP port.
  8. Attach the VRRP port with the VM.
  9. Now assign a FloatingIP to the VRRP port.
  10. Now check the ARP table entry in the router_namespace and also the VRRP 
port details. The VRRP port is still unbound and so the DVR cannot handle 
unbound ports.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1569918/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1566046] [NEW] Fix TypeError when trying to update an arp entry for ports with allowed_address_pairs on DVR router

2016-04-04 Thread Swaminathan Vasudevan
Public bug reported:

TypeError is seen when trying to update an arp entry for ports with 
allowed_address_pairs on DVR router.
This was seen in the master branch while I was testing the allowed_address_pair 
with floatingips on DVR router.


plugin.update_arp_entry_for_dvr_service_port(context, port)
2016-03-30 12:06:00.910 TRACE neutron.callbacks.manager   File "/opt/stack/neutron/neutron/db/l3_dvr_db.py", line 775, in update_arp_entry_for_dvr_service_port
2016-03-30 12:06:00.910 TRACE neutron.callbacks.manager     self.l3_rpc_notifier.add_arp_entry)
2016-03-30 12:06:00.910 TRACE neutron.callbacks.manager   File "/opt/stack/neutron/neutron/db/l3_dvr_db.py", line 729, in _generate_arp_table_and_notify_agent
2016-03-30 12:06:00.910 TRACE neutron.callbacks.manager     ip_address = fixed_ip['ip_address']
2016-03-30 12:06:00.910 TRACE neutron.callbacks.manager TypeError: string indices must be integers


How to reproduce it.

1. Create a vrrp-network
2. Create a vrrp-subnet
3. Create a dvr router
4. Attach the vrrp-subnet to the router
5. Create security group rules for the vrrp-net and add rules to it.
6. Now create a VM on the vrrp-subnet.
7. Now create a vrrp-port (allowed_address_pair) on the vrrp-subnet.
8. Associate a floatingip to the vrrp-port.
9. Now update the VM port with the allowed_address_pair IP.

You should see this in the neutron-server logs.
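
A minimal illustration of the failure mode in the traceback above (not the
actual neutron fix): fixed_ip arrives as a plain string, so indexing it with
'ip_address' raises "string indices must be integers"; handling both shapes
avoids the crash:

    def extract_ip(fixed_ip):
        # allowed_address_pair entries can surface as a bare IP string
        # instead of the usual {'ip_address': ..., 'subnet_id': ...} dict.
        if isinstance(fixed_ip, dict):
            return fixed_ip['ip_address']
        return fixed_ip

    print(extract_ip({'ip_address': '10.1.0.5', 'subnet_id': 'abc'}))  # 10.1.0.5
    print(extract_ip('10.1.0.5'))                                      # 10.1.0.5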

** Affects: neutron
 Importance: Undecided
 Assignee: Swaminathan Vasudevan (swaminathan-vasudevan)
 Status: New


** Tags: l3-dvr-backlog

** Changed in: neutron
 Assignee: (unassigned) => Swaminathan Vasudevan (swaminathan-vasudevan)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1566046

Title:
  Fix TypeError when trying to update an arp entry for ports with
  allowed_address_pairs on DVR router

Status in neutron:
  New

Bug description:
  TypeError is seen when trying to update an arp entry for ports with 
allowed_address_pairs on DVR router.
  This was seen in the master branch while I was testing the 
allowed_address_pair with floatingips on DVR router.

  
  plugin.update_arp_entry_for_dvr_service_port(context, port)
  2016-03-30 12:06:00.910 TRACE neutron.callbacks.manager   File "/opt/stack/neutron/neutron/db/l3_dvr_db.py", line 775, in update_arp_entry_for_dvr_service_port
  2016-03-30 12:06:00.910 TRACE neutron.callbacks.manager     self.l3_rpc_notifier.add_arp_entry)
  2016-03-30 12:06:00.910 TRACE neutron.callbacks.manager   File "/opt/stack/neutron/neutron/db/l3_dvr_db.py", line 729, in _generate_arp_table_and_notify_agent
  2016-03-30 12:06:00.910 TRACE neutron.callbacks.manager     ip_address = fixed_ip['ip_address']
  2016-03-30 12:06:00.910 TRACE neutron.callbacks.manager TypeError: string indices must be integers

  
  How to reproduce it.

  1. Create a vrrp-network
  2. Create a vrrp-subnet
  3. Create a dvr router
  4. Attach the vrrp-subnet to the router
  5. Create security group rules for the vrrp-net and add rules to it.
  6. Now create a VM on the vrrp-subnet.
  7. Now create a vrrp-port (allowed_address_pair) on the vrrp-subnet.
  8. Associate a floatingip to the vrrp-port.
  9. Now update the VM port with the allowed_address_pair IP.

  You should see this in the neutron-server logs.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1566046/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1564776] [NEW] DVR l3 agent should check for snat namespace existence before adding or deleting anything from the namespace

2016-04-01 Thread Swaminathan Vasudevan
Public bug reported:

Check for snat_namespace existence in the node before any operation
in the namespace.

Today we check self.snatnamespace, which may or may not reflect
the actual state of the system.

If the snat_namespace is accidentally deleted and we try to
remove the gateway from the router, the agent throws a bunch of
error messages and goes into a loop, constantly spewing error
messages.

Here is the link to the error message.

http://paste.openstack.org/show/492700/

This can be easily reproduced.

1. Create a network
2. Create a subnet
3. Create a router ( dvr)
4. Attach the subnet to the router.
5. Configure default gateway to the router.
6. Now verify the namespaces in the 'dvr_snat' node.
7. You should see
a. snat_namespace
b. router_namespace
c. dhcp namespace.
8. Now delete the snat_namespace.
9. Try to remove the gateway from the router.
10. Watch the L3 agent logs
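
A sketch of the proposed guard (the names are illustrative, not the ip_lib
API): ask the kernel whether the namespace exists instead of trusting cached
agent state before touching anything inside it:

    import subprocess

    def namespace_exists(name):
        output = subprocess.check_output(['ip', 'netns', 'list'],
                                         universal_newlines=True)
        # Newer iproute2 may append an id, e.g. "snat-<uuid> (id: 3)".
        return any(line.split()[0] == name
                   for line in output.splitlines() if line.strip())

    def remove_gateway(snat_ns_name):
        if not namespace_exists(snat_ns_name):
            # Namespace already gone (e.g. deleted by hand); skip quietly
            # instead of looping on errors.
            return
        # ... proceed with the usual gateway teardown inside the namespace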

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: l3-dvr-backlog

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1564776

Title:
  DVR l3 agent should check for snat namespace existence before adding
  or deleting anything from the namespace

Status in neutron:
  New

Bug description:
  Check for snat_namespace existence in the node before any operation
  in the namespace.

  Today we check self.snatnamespace, which may or may not reflect
  the actual state of the system.

  If the snat_namespace is accidentally deleted and we try to
  remove the gateway from the router, the agent throws a bunch of
  error messages and goes into a loop, constantly spewing error
  messages.

  Here is the link to the error message.

  http://paste.openstack.org/show/492700/

  This can be easily reproduced.

  1. Create a network
  2. Create a subnet
  3. Create a router ( dvr)
  4. Attach the subnet to the router.
  5. Configure default gateway to the router.
  6. Now verify the namespaces in the 'dvr_snat' node.
  7. You should see
  a. snat_namespace
  b. router_namespace
  c. dhcp namespace.
  8. Now delete the snat_namespace.
  9. Try to remove the gateway from the router.
  10. Watch the L3 agent logs

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1564776/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1564575] [NEW] DVR router namespaces are deleted when we manually move a DVR router from one SNAT_node to another SNAT_node even though there are active VMs in the node

2016-03-31 Thread Swaminathan Vasudevan
Public bug reported:

DVR router namespaces are deleted when we manually move the router from
one dvr_snat node to another dvr_snat node.

Only the snat_namespace should be deleted, not the
router_namespace, since there are 'dhcp' ports and 'vm' ports still
serviced by DVR.

How to reproduce:

Configure a two node setup:

1. I have one node with  Controller, compute and networking node with dhcp 
running in dvr_snat mode.
2. I have another node with  compute and networking node without dhcp running 
in dvr_snat mode.
3. Now create network
4. Create a subnet
5. Create a router and attach the subnet to the router.
6. Also set a gateway to the router.
7. Now you should see that there are three namespaces in the first node.
a. snat_namespace
b. qrouter_namespace
c. dhcp_namespace
8. Now create a VM on the first node.
9. Now try to remove the router from the first agent and assign it to the 
second agent in the second node.
neutron l3-agent-router-remove agent-id  router-id

This currently removes both the snat_namespace and the router_namespace
when there is still a valid vm and dhcp port.


I suspect that the check for available DVR service ports might be causing the 
issue here; a sketch of that check is shown below.

Will try to find out the root cause.
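
A sketch of the kind of check that should gate the qrouter namespace removal
(the client setup, device_owner prefixes and ids are assumptions): keep the
namespace if the host still has DVR-serviceable ports on the router's subnets:

    from neutronclient.v2_0 import client as neutron_client

    DVR_SERVICED_PREFIXES = ('compute:', 'network:dhcp')  # assumption

    def host_still_needs_qrouter(neutron, host, subnet_ids):
        ports = neutron.list_ports(**{'binding:host_id': host})['ports']
        for port in ports:
            if not port['device_owner'].startswith(DVR_SERVICED_PREFIXES):
                continue
            if any(ip['subnet_id'] in subnet_ids
                   for ip in port['fixed_ips']):
                return True   # a VM or DHCP port still needs the qrouter
        return False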

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: l3-dvr-backlog mitaka-rc-potential

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1564575

Title:
  DVR router namespaces are deleted when we manually move a DVR router
  from one SNAT_node to another SNAT_node even though there are active
  VMs in the node

Status in neutron:
  New

Bug description:
  DVR router namespaces are deleted when we manually move the router
  from one dvr_snat node to another dvr_snat node.

  Only the snat_namespace should be deleted, not the
  router_namespace, since there are 'dhcp' ports and 'vm' ports still
  serviced by DVR.

  How to reproduce:

  Configure a two node setup:

  1. I have one node with  Controller, compute and networking node with dhcp 
running in dvr_snat mode.
  2. I have another node with  compute and networking node without dhcp running 
in dvr_snat mode.
  3. Now create network
  4. Create a subnet
  5. Create a router and attach the subnet to the router.
  6. Also set a gateway to the router.
  7. Now you should see that there are three namespaces in the first node.
  a. snat_namespace
  b. qrouter_namespace
  c. dhcp_namespace
  8. Now create a VM on the first node.
  9. Now try to remove the router from the first agent and assign it to the 
second agent in the second node.
  neutron l3-agent-router-remove agent-id  router-id

  This currently removes both the snat_namespace and the
  router_namespace when there is still a valid vm and dhcp port.

  
  I suspect that the check for available DVR service ports might be causing the 
issue here.

  Will try to find out the root cause.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1564575/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1563879] [NEW] [RFE] DVR should route packets to Instances behind the L2 Gateway

2016-03-30 Thread Swaminathan Vasudevan
Public bug reported:

L2 Gateway bridges the neutron network with hardware-based VxLAN
gateways. The DVR routers in neutron cannot forward traffic to an
instance that is behind the VxLAN gateways since they cannot 'ARP' for
those instances.

DVR currently prepopulates ARP entries for all instances created
with DVR serviceable ports. We should be able to populate the
ARP entries of instances behind the VxLAN gateway on all DVR nodes
so that traffic can flow between them.
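
As a sketch of the missing plumbing (the namespace, device, IP and MAC below
are placeholders), seeding the neighbour entry is the same operation DVR
already performs for its own serviceable ports:

    import subprocess

    def add_permanent_arp(namespace, device, ip, mac):
        subprocess.check_call(
            ['ip', 'netns', 'exec', namespace,
             'ip', 'neigh', 'replace', ip, 'lladdr', mac,
             'dev', device, 'nud', 'permanent'])

    # add_permanent_arp('qrouter-<uuid>', 'qr-18deeb39-3b',
    #                   '10.0.0.42', 'fa:16:3e:00:00:42')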

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: l3-dvr-backlog

** Tags added: l3-dvr-backlog

** Summary changed:

- [RFE] DVR should route packets to Instances on the L2 Gateway
+ [RFE] DVR should route packets to Instances behind the L2 Gateway

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1563879

Title:
  [RFE] DVR should route packets to Instances behind the L2 Gateway

Status in neutron:
  New

Bug description:
  L2 Gateway bridges the neutron network with hardware-based VxLAN
  gateways. The DVR routers in neutron cannot forward traffic to an
  instance that is behind the VxLAN gateways since they cannot 'ARP'
  for those instances.

  DVR currently prepopulates ARP entries for all instances created
  with DVR serviceable ports. We should be able to populate
  the ARP entries of instances behind the VxLAN gateway on all DVR nodes
  so that traffic can flow between them.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1563879/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1562110] [NEW] link-local address allocator for DVR has a limit of 256 address pairs per node

2016-03-25 Thread Swaminathan Vasudevan
Public bug reported:

The current link-local address allocator for DVR routers has a limit of
256 routers per node.

This should be configurable and not just limited to 256 routers per node.
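
A back-of-the-envelope sketch of why the limit follows from the size of the
base range (the CIDRs below are examples, not the allocator's actual
defaults): each router consumes one /31 address pair, so a configurable base
CIDR directly controls the ceiling:

    import ipaddress

    def available_pairs(base_cidr):
        net = ipaddress.ip_network(base_cidr)
        return net.num_addresses // 2   # one /31 (two addresses) per router

    print(available_pairs('169.254.64.0/23'))   # 256 pairs -> today's limit
    print(available_pairs('169.254.64.0/18'))   # 8192 pairs with a wider range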

** Affects: neutron
 Importance: Undecided
 Status: Confirmed


** Tags: l3-dvr-backlog

** Changed in: neutron
   Status: New => Confirmed

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1562110

Title:
  link-local address allocator for DVR has a limit of 256 address pairs
  per node

Status in neutron:
  Confirmed

Bug description:
  The current link-local address allocator for DVR routers has a limit
  of 256 routers per node.

  This should be configurable and not just limited to 256 routers per
  node.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1562110/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1499045] Re: get_snat_port_for_internal_port called twice when an interface is added or removed by the l3 agent in the case of DVR routers.

2016-03-22 Thread Swaminathan Vasudevan
** Changed in: neutron
   Status: In Progress => Opinion

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1499045

Title:
  get_snat_port_for_internal_port called twice when an interface is
  added or removed by the l3 agent in the case of DVR routers.

Status in neutron:
  Opinion

Bug description:
  get_snat_port_for_internal_port retrieves the internal snat port created for 
each router interface added to a DVR router.
  But this function is called twice in the L3 agent code.

  For every interface add or delete on the router, it is called by
  'dvr_local_router.py' and again by
  'dvr_edge_router.py'.

  This can be reduced to a single call to improve the control plane
  performance.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1499045/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1538369] Re: refactor add_router_interface in l3_dvr_db.py

2016-03-22 Thread Swaminathan Vasudevan
** Changed in: neutron
   Status: New => Opinion

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1538369

Title:
  refactor add_router_interface in l3_dvr_db.py

Status in neutron:
  Opinion

Bug description:
  A lot of code is repeated in add_router_interface in l3_db.py and
  l3_dvr_db.py.

  It would be nice to refactor the code and have one common function, 
_add_router_interface, which is 
  called from add_router_interface in both l3_db.py and l3_dvr_db.py.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1538369/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1558097] [NEW] DVR SNAT HA - Documentation for Networking guide

2016-03-18 Thread Swaminathan Vasudevan
Public bug reported:

DVR SNAT HA - Documentation for Networking guide for Mitaka.

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: l3-dvr-backlog

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1558097

Title:
  DVR SNAT HA - Documentation for Networking guide

Status in neutron:
  New

Bug description:
  DVR SNAT HA - Documentation for Networking guide for Mitaka.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1558097/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1554392] Re: Set extra route for DVR might cause error

2016-03-08 Thread Swaminathan Vasudevan
This is a known issue: the router does not have an external
network interface in the router (qrouter) namespace, so configuring an
extra route whose next hop has no corresponding
interface in that namespace fails.

This was a decision we made to avoid complicating things
too much: external routes are not added in the router namespace and are
only added in the snat_namespace.
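
An illustrative sketch of that reachability constraint (namespace names are
placeholders; this is not the l3-agent code): only a namespace whose connected
routes cover the nexthop can accept the extra route, which is why the qrouter
namespace rejects 172.24.4.6 with "Network is unreachable":

    import ipaddress
    import subprocess

    def nexthop_reachable(namespace, nexthop):
        output = subprocess.check_output(
            ['ip', 'netns', 'exec', namespace, 'ip', 'route'],
            universal_newlines=True)
        hop = ipaddress.ip_address(nexthop)
        for line in output.splitlines():
            if not line.strip():
                continue
            prefix = line.split()[0]
            if '/' in prefix and hop in ipaddress.ip_network(prefix,
                                                             strict=False):
                return True
        return False

    # nexthop_reachable('snat-<uuid>', '172.24.4.6')     -> True (has qg- port)
    # nexthop_reachable('qrouter-<uuid>', '172.24.4.6')  -> False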

** Changed in: neutron
   Status: New => Opinion

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1554392

Title:
  Set extra route for DVR might cause error

Status in neutron:
  Opinion

Bug description:
  With a DVR router. I have 
  external network: 172.24.4.0/24
  internal network: 10.0.0.0/24

  I want to set an extra route for it, so I execute the following
  command:

  neutron router-update router1 --route
  destination=20.0.0.0/24,nexthop=172.24.4.6

  But I get this error at the output of neutron-l3-agent.

  ERROR neutron.agent.linux.utils [-] Exit code: 2; Stdin: ; Stdout: ;
  Stderr: RTNETLINK answers: Network is unreachable

  The reason for it is that the DVR router will set the extra route in both
  the snat and qrouter namespaces. However, the qrouter namespace does not
  have a route to the external network, so an error is reported when the
  l3-agent tries to add a route whose nexthop is on the external network to
  the qrouter namespace.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1554392/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1549511] [NEW] "test_volume_backed_live_migration" test failures seen in the gate

2016-02-24 Thread Swaminathan Vasudevan
Public bug reported:

Recently we have seen the "Test_volume_backed_live_migration" fail with
Multinode gate setup.

This test failure is seen in nova/neutron etc.,

http://logs.openstack.org/17/258417/6/check/gate-tempest-dsvm-multinode-
full/0d516d3/console.html#_2016-02-24_17_43_48_123

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1549511

Title:
  "test_volume_backed_live_migration" test failures seen in the gate

Status in OpenStack Compute (nova):
  New

Bug description:
  Recently we have seen the "Test_volume_backed_live_migration" fail
  with Multinode gate setup.

  This test failure is seen in nova/neutron etc.,

  http://logs.openstack.org/17/258417/6/check/gate-tempest-dsvm-
  multinode-full/0d516d3/console.html#_2016-02-24_17_43_48_123

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1549511/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1541714] Re: DVR routers are not created on a compute node that runs agent in 'dvr' mode

2016-02-04 Thread Swaminathan Vasudevan
It was an invalid user configuration.

The "dvr"node was not configured with the right agent mode, and so this
issue was seen.

Please ignore this bug.

** Changed in: neutron
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1541714

Title:
  DVR routers are not created on a compute node that runs agent in 'dvr'
  mode

Status in neutron:
  Invalid

Bug description:
  DVR routers are not created on a compute node that is running L3 agent
  in "dvr" mode.

  This might have been introduced by the latest patch that changed the 
scheduling behavior.
  https://review.openstack.org/#/c/254837/

  Steps to reproduce:

  1. Stack up two nodes. ( dvr_snat node) and (dvr node)
  2. Create a Network
  3. Create a Subnet
  4. Create a Router
  5. Add Subnet to the Router
  6. Create a VM on the "dvr_snat" node.
  Everything works fine here. We can see the router-namespace, snat-namespace 
and the dhcp-namespace.

  7. Now Create a VM and force the VM to be created on the second node ( dvr 
node).
- nova boot --flavor xyz --image abc --net net-id yyy-id 
--availability-zone nova:dvr-node myinstance2

  Now see the image is created in the second node.
  But the router namespace is missing in the second node.

  The router is scheduled to the dvr-snat node, but not to the compute
  node.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1541714/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1541714] [NEW] DVR routers are not created on a compute node that runs agent in 'dvr' mode

2016-02-03 Thread Swaminathan Vasudevan
Public bug reported:

DVR routers are not created on a compute node that is running L3 agent
in "dvr" mode.

This might have been introduced by the latest patch that changed the scheduling 
behavior.
https://review.openstack.org/#/c/254837/

Steps to reproduce:

1. Stack up two nodes. ( dvr_snat node) and (dvr node)
2. Create a Network
3. Create a Subnet
4. Create a Router
5. Add Subnet to the Router
6. Create a VM on the "dvr_snat" node.
Everything works fine here. We can see the router-namespace, snat-namespace and 
the dhcp-namespace.

7. Now Create a VM and force the VM to be created on the second node ( dvr 
node).
  - nova boot --flavor xyz --image abc --net net-id yyy-id --availability-zone 
nova:dvr-node myinstance2

Now see the image is created in the second node.
But the router namespace is missing in the second node.

The router is scheduled to the dvr-snat node, but not to the compute
node.

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: l3-dvr-backlog

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1541714

Title:
  DVR routers are not created on a compute node that runs agent in 'dvr'
  mode

Status in neutron:
  New

Bug description:
  DVR routers are not created on a compute node that is running L3 agent
  in "dvr" mode.

  This might have been introduced by the latest patch that changed the 
scheduling behavior.
  https://review.openstack.org/#/c/254837/

  Steps to reproduce:

  1. Stack up two nodes. ( dvr_snat node) and (dvr node)
  2. Create a Network
  3. Create a Subnet
  4. Create a Router
  5. Add Subnet to the Router
  6. Create a VM on the "dvr_snat" node.
  Everything works fine here. We can see the router-namespace, snat-namespace 
and the dhcp-namespace.

  7. Now Create a VM and force the VM to be created on the second node ( dvr 
node).
- nova boot --flavor xyz --image abc --net net-id yyy-id 
--availability-zone nova:dvr-node myinstance2

  Now see the image is created in the second node.
  But the router namespace is missing in the second node.

  The router is scheduled to the dvr-snat node, but not to the compute
  node.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1541714/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1535928] [NEW] Duplicate IPtables rule detected warning message seen in L3 agent

2016-01-19 Thread Swaminathan Vasudevan
Public bug reported:

In recent L3 agent logs in the gate we have been seeing this warning message 
associated with the DVR router jobs. 
Right now none of the jobs are failing, but we need to see why this warning 
message is showing up in the logs, as it might be due to some hidden issue.

http://logs.openstack.org/89/255989/11/check/gate-tempest-dsvm-neutron-
dvr/e3464a5/logs/screen-q-l3.txt.gz?level=WARNING#_2016-01-18_13_34_52_764

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: l3-dvr-backlog

** Summary changed:

- Duplicate IPtables rule detected warning message seen in L3 agent for DVR 
Routers
+ Duplicate IPtables rule detected warning message seen in L3 agent

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1535928

Title:
  Duplicate IPtables rule detected warning message seen in L3 agent

Status in neutron:
  New

Bug description:
  In recent L3 agent logs in the gate we have been seeing this warning message 
associated with the DVR router jobs. 
  Right now none of the jobs are failing, but we need to see why this warning 
message is showing up in the logs, as it might be due to some hidden issue.

  http://logs.openstack.org/89/255989/11/check/gate-tempest-dsvm-
  neutron-
  dvr/e3464a5/logs/screen-q-l3.txt.gz?level=WARNING#_2016-01-18_13_34_52_764

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1535928/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1524020] [NEW] DVRImpact: dvr_vmarp_table_update and dvr_update_router_add_vm is called for every port update instead of only when host binding or mac-address changes occur

2015-12-08 Thread Swaminathan Vasudevan
Public bug reported:

DVR ARP update (dvr_vmarp_table_update) and dvr_update_router_add_vm are
called for every update_port if the mac_address changes or when
update_device_up is true.

These functions should be called from _notify_l3_agent_port_update, only
when a host binding for a service port changes or when a mac_address for
the service port changes.
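
A sketch (not the actual callback code) of the filtering being requested: only
notify when the relevant attributes actually changed between the original and
the updated port:

    def needs_dvr_notification(original_port, updated_port):
        relevant = ('binding:host_id', 'mac_address')
        return any(original_port.get(k) != updated_port.get(k)
                   for k in relevant)

    orig = {'binding:host_id': 'compute1', 'mac_address': 'fa:16:3e:aa:bb:cc'}
    moved = dict(orig, **{'binding:host_id': 'compute2'})  # binding changed
    print(needs_dvr_notification(orig, moved))       # True  -> send the update
    print(needs_dvr_notification(orig, dict(orig)))  # False -> skip the RPC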

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: l3-dvr-backlog

** Summary changed:

- DVR Arp update and dvr_update_router_add_vm is called for every port update 
instead of only when host binding or mac-address changes occur
+ DVRImpact:  dvr_vmarp_table_update and dvr_update_router_add_vm is called for 
every port update instead of only when host binding or mac-address changes occur

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1524020

Title:
  DVRImpact:  dvr_vmarp_table_update and dvr_update_router_add_vm is
  called for every port update instead of only when host binding or mac-
  address changes occur

Status in neutron:
  New

Bug description:
  DVR ARP update (dvr_vmarp_table_update) and dvr_update_router_add_vm are
  called for every update_port if the mac_address changes or when
  update_device_up is true.

  These functions should be called from _notify_l3_agent_port_update,
  only when a host binding for a service port changes or when a
  mac_address for the service port changes.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1524020/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1515360] [NEW] Add more verbose to Tempest Test Errors that causes "SSHTimeout" seen in CVR and DVR

2015-11-11 Thread Swaminathan Vasudevan
Public bug reported:

Today "SSHTimeout" Errors are seen both in CVR ( Centralized Virtual Routers) 
and DVR ( Distributed Virtual Routers).
The frequency of occurence is more on DVR than the CVR.

But the problem here, is the error statement that is returned and the data that 
is dumped.
SSHTimeout may have occured due to several reasons, since in all our tempest 
test we are trying to ssh to the VM using the public IP ( FloatingIP) 
1. VM did not come up
2. VM does not have a private IP address
3. Security rules in the VM was not applied properly
4. Setting up of Floating IP
5. DNAT rules in the Router Namespace.
6. Scheduling.
7. Namespace Errors etc.,


We need a way to identify through the tempest test exactly were and what went 
wrong.

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: gate-failure l3-dvr-backlog

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1515360

Title:
  Add more verbose to Tempest Test Errors that causes "SSHTimeout" seen
  in CVR and DVR

Status in neutron:
  New

Bug description:
  Today "SSHTimeout" Errors are seen both in CVR ( Centralized Virtual Routers) 
and DVR ( Distributed Virtual Routers).
  The frequency of occurence is more on DVR than the CVR.

  But the problem here, is the error statement that is returned and the data 
that is dumped.
  SSHTimeout may have occured due to several reasons, since in all our tempest 
test we are trying to ssh to the VM using the public IP ( FloatingIP) 
  1. VM did not come up
  2. VM does not have a private IP address
  3. Security rules in the VM was not applied properly
  4. Setting up of Floating IP
  5. DNAT rules in the Router Namespace.
  6. Scheduling.
  7. Namespace Errors etc.,

  
  We need a way to identify through the tempest test exactly were and what went 
wrong.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1515360/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1513678] [NEW] At scale router scheduling takes a long time with DVR routers with multiple compute nodes hosting thousands of VMs

2015-11-05 Thread Swaminathan Vasudevan
Public bug reported:

At scale, when we have 100s of compute nodes and 1000s of VMs in networks that are 
routed by Distributed Virtual Routers, we are seeing a control plane performance 
issue.
It takes a while for all the routers to be scheduled on the nodes.

_schedule_router calls _get_candidates, which internally calls
get_l3_agent_candidates. In the case of DVR routers, all the active
agents are passed to get_l3_agent_candidates, which iterates through
the agents and for each agent tries to find out if there are any
dvr_service ports available on the routed subnet.

This might be taking a lot more time.

So we need to figure out the issue and reduce the time taken for the
scheduling.
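
A toy illustration of the scaling concern (not neutron code): the current path
issues one "any DVR service ports on this host?" lookup per agent, so the cost
grows with the number of agents, whereas collecting the hosts that have
service ports once keeps it to a single query:

    def candidates_per_agent_query(agents, has_service_port_on_host):
        # O(number of agents) DB round-trips -- the behaviour described above.
        return [a for a in agents if has_service_port_on_host(a['host'])]

    def candidates_single_query(agents, hosts_with_service_ports):
        # One query up front, then cheap set membership per agent.
        hosts = set(hosts_with_service_ports)
        return [a for a in agents if a['host'] in hosts]

    agents = [{'host': 'compute%d' % i} for i in range(5)]
    print(candidates_single_query(agents, ['compute1', 'compute3']))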

** Affects: neutron
 Importance: Undecided
 Assignee: Swaminathan Vasudevan (swaminathan-vasudevan)
 Status: In Progress


** Tags: l3-dvr-backlog

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1513678

Title:
  At scale router scheduling takes a long time with DVR routers with
  multiple compute nodes hosting thousands of VMs

Status in neutron:
  In Progress

Bug description:
  At scale, when we have 100s of compute nodes and 1000s of VMs in networks that 
are routed by Distributed Virtual Routers, we are seeing a control plane 
performance issue.
  It takes a while for all the routers to be scheduled on the nodes.

  _schedule_router calls _get_candidates, which internally calls
  get_l3_agent_candidates. In the case of DVR routers, all the
  active agents are passed to get_l3_agent_candidates, which iterates
  through the agents and for each agent tries to find out if there
  are any dvr_service ports available on the routed subnet.

  This might be taking a lot more time.

  So we need to figure out the issue and reduce the time taken for the
  scheduling.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1513678/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1512199] Re: change vm fixed ips will cause unable to communicate to vm in other network

2015-11-03 Thread Swaminathan Vasudevan
Not able to reproduce; I could see the ARP table update in the router
namespaces on both nodes.

I tried to modify the ports on both the 10.2.0.X and 10.0.0.X subnets.
In this example I changed 10.2.0.4 to 10.2.0.25 and 10.0.0.8 to 10.0.0.20. 
In both cases I saw that the ARP entry was updated.

One thing that is true in both our tests is that the VM is not
able to get the new IP until I reboot the VM. (This might be filed as a
different bug against nova.)

ARP output from Node 2:
root@ubuntu-new-compute:~/devstack# arp -a 
? (10.2.0.4) at fa:16:3e:7e:0b:48 [ether] PERM on qr-b25bad4f-5f
? (10.0.0.6) at fa:16:3e:7a:78:fe [ether] PERM on qr-66c29926-29
? (10.2.0.3) at fa:16:3e:b6:19:da [ether] PERM on qr-b25bad4f-5f
? (10.0.0.2) at fa:16:3e:91:1a:d2 [ether] PERM on qr-b2b8c9a4-68
? (10.0.0.2) at fa:16:3e:91:1a:d2 [ether] PERM on qr-66c29926-29
? (10.0.0.6) at fa:16:3e:7a:78:fe [ether] PERM on qr-b2b8c9a4-68
? (10.2.0.25) at fa:16:3e:7e:0b:48 [ether] PERM on qr-b25bad4f-5f ( changed arp 
info)
? (10.0.0.7) at fa:16:3e:5d:12:fd [ether] PERM on qr-66c29926-29
? (10.2.0.2) at fa:16:3e:b6:84:91 [ether] PERM on qr-b25bad4f-5f
? (10.0.0.8) at fa:16:3e:a1:cc:87 [ether] PERM on qr-66c29926-29
? (10.0.0.8) at fa:16:3e:a1:cc:87 [ether] PERM on qr-b2b8c9a4-68
? (10.0.0.20) at fa:16:3e:a1:cc:87 [ether] PERM on qr-66c29926-29 ( changed arp 
info)
? (10.0.0.7) at fa:16:3e:5d:12:fd [ether] PERM on qr-b2b8c9a4-68
? (10.0.0.3) at fa:16:3e:fd:a1:d6 [ether] PERM on qr-66c29926-29
root@ubuntu-new-compute:~/devstack#

ARP Info from Node 1:
root@ubuntu-ctlr:~/devstack# arp -a
? (10.0.0.3) at fa:16:3e:fd:a1:d6 [ether] PERM on qr-66c29926-29
? (10.2.0.3) at fa:16:3e:b6:19:da [ether] PERM on qr-b25bad4f-5f
? (10.0.0.7) at fa:16:3e:5d:12:fd [ether] PERM on qr-66c29926-29
? (10.0.0.2) at fa:16:3e:91:1a:d2 [ether] PERM on qr-b2b8c9a4-68
? (10.0.0.2) at fa:16:3e:91:1a:d2 [ether] PERM on qr-66c29926-29
? (10.2.0.4) at fa:16:3e:7e:0b:48 [ether] PERM on qr-b25bad4f-5f
? (10.0.0.6) at fa:16:3e:7a:78:fe [ether] PERM on qr-66c29926-29
? (10.2.0.25) at fa:16:3e:7e:0b:48 [ether] PERM on qr-b25bad4f-5f
? (10.2.0.5) at  on qr-b25bad4f-5f
? (10.0.0.5) at  on qr-66c29926-29
? (10.0.0.20) at fa:16:3e:a1:cc:87 [ether] PERM on qr-66c29926-29
? (10.0.0.8) at fa:16:3e:a1:cc:87 [ether] PERM on qr-66c29926-29
? (10.2.0.2) at fa:16:3e:b6:84:91 [ether] PERM on qr-b25bad4f-5f
? (10.0.0.4) at  on qr-66c29926-29
root@ubuntu-ctlr:~/devstack# 



** Changed in: neutron
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1512199

Title:
  change vm fixed ips will cause unable to communicate to vm in other
  network

Status in neutron:
  Invalid

Bug description:
  I use dvr+kilo,  vxlan.  The environment is like:

  vm2-2<- compute1  --vxlan-  comupte2 ->vm2-1
  vm3-1<-

  vm2-1<- net2  -router1- net3 ->vm3-1
  vm2-2<-

  
  vm2-1(192.168.2.3) and vm2-2(192.168.2.4) are in the same net(net2 
192.168.2.0/24) but not assigned to the same compute node. vm3-1 is in 
net3(192.168.3.0/24). net2 and net3 are connected by router1. The three vms are 
in default security-group. Not use firewall.

  1. Using command below to change the ip of vm2-1.
  neutron port-update portID  --fixed-ip 
subnet_id=subnetID,ip_address=192.168.2.10 --fixed-ip 
subnet_id=subnetID,ip_address=192.168.2.20
  In vm2-1, using "sudo udhcpc" (cirros) to get an ip, the dhcp message is correct 
but the ip does not change.
  Then reboot vm2-1. The ip of vm2-1 turned out to be 192.168.2.20.

  2. From vm2-2 I could ping 192.168.2.20 successfully, but vm3-1 could
  not ping 192.168.2.20 successfully.

  By capturing packets and looking at related information, the reason may be:
  1. The new IP (192.168.2.20) and MAC of vm2-1 were not written to the ARP 
cache in the namespace of router1 on the compute1 node.
  2. In DVR mode, the ARP request from the gw port (192.168.2.1) on compute1 to 
vm2-1 was dropped by the flow table on compute2, so the ARP 
request (192.168.2.1->192.168.2.20) could not reach vm2-1.
  3. For vm2-2, the ARP request (192.168.2.4->192.168.2.20) was not dropped and 
it could connect with vm2-1.

  In my opinion, if both new fixed IPs of vm2-1 (192.168.2.10 and
  192.168.2.20) and the MAC were written to the ARP cache in the namespace of
  router1 on the compute1 node, the problem would be resolved. But only one
  IP (192.168.2.10) and the MAC are written.

  BTW, if only one fixed ip is set for vm2-1, it works fine. But if two
  fixed ips are set for vm2-1, the problem above most probably happens.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1512199/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1509004] [NEW] "test_dualnet_dhcp6_stateless_from_os" failures seen in the gate

2015-10-22 Thread Swaminathan Vasudevan
Public bug reported:

"test_dualnet_dhcp6_stateless_from_os" - This test fails in the gate
randomly both with DVR and non-DVR routers.

http://logs.openstack.org/79/230079/27/check/gate-tempest-dsvm-neutron-
full/1caed8b/logs/testr_results.html.gz

http://logs.openstack.org/85/238485/1/check/gate-tempest-dsvm-neutron-
dvr/1059e22/logs/testr_results.html.gz

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: ipv6

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1509004

Title:
  "test_dualnet_dhcp6_stateless_from_os" failures seen in the gate

Status in neutron:
  New

Bug description:
  "test_dualnet_dhcp6_stateless_from_os" - This test fails in the gate
  randomly both with DVR and non-DVR routers.

  http://logs.openstack.org/79/230079/27/check/gate-tempest-dsvm-
  neutron-full/1caed8b/logs/testr_results.html.gz

  http://logs.openstack.org/85/238485/1/check/gate-tempest-dsvm-neutron-
  dvr/1059e22/logs/testr_results.html.gz

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1509004/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1503847] [NEW] Python34 test failures in gate - Logging Error

2015-10-07 Thread Swaminathan Vasudevan
Public bug reported:

I am seeing "gate-neutron-python34" test failures again in neutron.

http://logs.openstack.org/82/228582/13/check/gate-neutron-
python34/5b36c34/console.html

http://logs.openstack.org/82/228582/13/check/gate-neutron-
python34/5b36c34/console.html#_2015-10-07_17_36_06_987

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: py34

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1503847

Title:
  Python34 test failures in gate - Logging Error

Status in neutron:
  New

Bug description:
  I am seeing "gate-neutron-python34" test failures again in neutron.

  http://logs.openstack.org/82/228582/13/check/gate-neutron-
  python34/5b36c34/console.html

  http://logs.openstack.org/82/228582/13/check/gate-neutron-
  python34/5b36c34/console.html#_2015-10-07_17_36_06_987

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1503847/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1501873] [NEW] FIP Namespace add/delete race condition seen in DVR router log

2015-10-01 Thread Swaminathan Vasudevan
e None None] 
Command: ['ip', 'netns', 'exec', 'fip-31689320-95d7-44f9-932a-cc82c1bca2b4', 
'sysctl', '-w', 'net.ipv4.ip_forward=1']
Exit code: 1
Stdin:
Stdout:
Stderr: seting the network namespace "fip-31689320-95d7-44f9-932a-cc82c1bca2b4" 
failed: Invalid argument

 
This leads to a series of failures.

This failure is seen only in the gate.

This can be reproduced by constantly adding and deleting floatingip to a
private IP, with multiple API worker threads.

For more information you can also look at the "logstash" output below.

http://logs.openstack.org/82/228582/8/check/gate-tempest-dsvm-neutron-
dvr/9053337/logs/screen-q-l3.txt.gz?level=TRACE#_2015-09-29_21_10_34_084
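
A rough reproduction sketch (assuming a floating IP and a port already
exist; the UUIDs are placeholders and the neutron CLI must already be
configured with credentials) that approximates several API workers
churning the same floating IP:

  import subprocess
  import threading

  FIP_ID = '<floatingip-uuid>'
  PORT_ID = '<port-uuid>'

  def churn(iterations=50):
      # Each associate/disassociate cycle makes the l3 agent create and
      # then tear down the FIP namespace, which is where the race shows up.
      for _ in range(iterations):
          subprocess.call(['neutron', 'floatingip-associate', FIP_ID, PORT_ID])
          subprocess.call(['neutron', 'floatingip-disassociate', FIP_ID])

  # Several client threads approximate multiple API workers hitting the server.
  threads = [threading.Thread(target=churn) for _ in range(4)]
  for t in threads:
      t.start()
  for t in threads:
      t.join()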

** Affects: neutron
 Importance: Undecided
 Assignee: Swaminathan Vasudevan (swaminathan-vasudevan)
 Status: In Progress


** Tags: l3-dvr-backlog

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1501873

Title:
  FIP Namespace add/delete race condition seen in DVR router log

Status in neutron:
  In Progress

Bug description:
  FIP Namespace add/delete race condition seen in DVR router log. This might 
cause the FIP functionality to fail.
  From the trace log it seems that when this happens, a bunch of tests related to 
FIP functionality fail with an SSH timeout waiting for a reply.

  
  Here is the output of a trace that shows the race condition.

  Exit code: 0
   execute /opt/stack/new/neutron/neutron/agent/linux/utils.py:156
  2015-09-29 21:10:33.433 7884 DEBUG neutron.agent.l3.dvr_local_router [-] 
Removed last floatingip, so requesting the server to delete Floatingip Agent 
Gateway port:{u'allowed_address_pairs': [], u'extra_dhcp_opts': [], 
u'device_owner': u'network:floatingip_agent_gateway', u'port_security_enabled': 
False, u'binding:profile': {}, u'fixed_ips': [{u'subnet_id': 
u'362e9033-db93-4193-9413-1073215ab326', u'prefixlen': 24, u'ip_address': 
u'172.24.5.9'}, {u'subnet_id': u'feb3aa76-53b1-4d4e-b136-412c747ffd30', 
u'prefixlen': 64, u'ip_address': u'2001:db8::a'}], u'id': 
u'044a8e2f-00eb-4231-b526-13cb46dcc42f', u'security_groups': [], 
u'binding:vif_details': {u'port_filter': True, u'ovs_hybrid_plug': True}, 
u'binding:vif_type': u'ovs', u'mac_address': u'fa:16:3e:7a:a6:85', u'status': 
u'DOWN', u'subnets': [{u'ipv6_ra_mode': None, u'cidr': u'2001:db8::/64', 
u'gateway_ip': u'2001:db8::2', u'id': u'feb3aa76-53b1-4d4e-b136-412c747ffd30', 
u'subnetpool_id': None}, {u'ipv6_ra_mode': None, u'cidr': u'172.
 24.5.0/24', u'gateway_ip': u'172.24.5.1', u'id': 
u'362e9033-db93-4193-9413-1073215ab326', u'subnetpool_id': None}], 
u'binding:host_id': u'devstack-trusty-hpcloud-b5-5153724', u'dns_assignment': 
[{u'hostname': u'host-172-24-5-9', u'ip_address': u'172.24.5.9', u'fqdn': 
u'host-172-24-5-9.openstacklocal.'}, {u'hostname': u'host-2001-db8--a', 
u'ip_address': u'2001:db8::a', u'fqdn': u'host-2001-db8--a.openstacklocal.'}], 
u'device_id': u'646bb18b-da52-4ead-a635-012c72c1ccf1', u'name': u'', 
u'admin_state_up': True, u'network_id': 
u'31689320-95d7-44f9-932a-cc82c1bca2b4', u'dns_name': u'', 
u'binding:vnic_type': u'normal', u'tenant_id': u'', u'extra_subnets': []} 
floating_ip_removed_dist 
/opt/stack/new/neutron/neutron/agent/l3/dvr_local_router.py:148

  2015-09-29 21:10:34.031 7884 DEBUG neutron.agent.linux.utils [-]
  Running command (rootwrap daemon): ['ip', 'netns', 'delete',
  'fip-31689320-95d7-44f9-932a-cc82c1bca2b4'] execute_rootwrap_daemon
  /opt/stack/new/neutron/neutron/agent/linux/utils.py:101

  
  2015-09-29 21:10:34.043 DEBUG neutron.agent.l3.dvr_local_router 
[req-33413b07-784c-469e-8a35-0e20312a157e None None] FloatingIP agent gateway 
port received from the plugin: {u'allowed_address_pairs': [], 
u'extra_dhcp_opts': [], u'device_owner': u'network:floatingip_agent_gateway', 
u'port_security_enabled': False, u'binding:profile': {}, u'fixed_ips': 
[{u'subnet_id': u'362e9033-db93-4193-9413-1073215ab326', u'prefixlen': 24, 
u'ip_address': u'172.24.5.9'}, {u'subnet_id': 
u'feb3aa76-53b1-4d4e-b136-412c747ffd30', u'prefixlen': 64, u'ip_address': 
u'2001:db8::a'}], u'id': u'044a8e2f-00eb-4231-b526-13cb46dcc42f', 
u'security_groups': [], u'binding:vif_details': {u'port_filter': True, 
u'ovs_hybrid_plug': True}, u'binding:vif_type': u'ovs', u'mac_address': 
u'fa:16:3e:7a:a6:85', u'status': u'ACTIVE', u'subnets': [{u'ipv6_ra_mode': 
None, u'cidr': u'172.24.5.0/24', u'gateway_ip': u'172.24.5.1', u'id': 
u'362e9033-db93-4193-9413-1073215ab326', u'subnetpool_id': None}, 
{u'ipv6_ra_mode': None, u'ci
 dr': u'2001:db8::/64', u'gateway_ip': u'2001:db8::2', u'id': 
u'feb3aa76-53b1-4d4e-b136-412c747ffd30', u'subnetpool_id': None}], 
u'binding:host_id': u'devstack-trusty-hpcloud-b5-5153724', u'dns_assignment': 
[{u'hostname': u'host-172-24-5-9', u'ip_address': u'172.24.5.9', u'fqdn': 
u'host-172-24-5-9.openstacklocal.'}, {u'hostname': u'host-2001-db8--a', 
u'ip_address': u'200

[Yahoo-eng-team] [Bug 1501086] [NEW] ARP entries dropped by DVR routers when the qr device is not ready or present

2015-09-29 Thread Swaminathan Vasudevan
Public bug reported:

The ARP entries are dropped by DVR routers when the 'qr' device does not
exist in the namespace.

There are two ways the ARP entries are updated in the L3 agent.
First, when an internal csnat port is created, ARP entries are added from the 
'dvr_local_router' by calling "set_subnet_arp_info", which in turn calls 
"_update_arp_entry".

Second, when an ARP update RPC message comes from the
server to the agent as "add_arp_entry" or "delete_arp_entry", which in
turn calls "_update_arp_entry".

We have seen log traces showing that the ARP update message comes
before the "qr" device is ready, so those ARP messages are dropped.

We need to cache those ARP messages and update the router
namespace once the "qr" device is ready.

As the message below shows, we check for the device and
log a warning that the device is not ready, but the ARP
entries are not saved anywhere. They are dropped.

2015-09-24 18:45:30.150 WARNING neutron.agent.l3.dvr_local_router [req-
0565ce3a-905d-43fa-a6f3-1a07df6c6c2b None None] Arp operation add failed
for device qr-b672ffde-cd, since the device does not exist anymore. The
device might have been concurrently deleted or not created yet.

As seen below, the internal network 'qr' device is added later.

2015-09-24 18:45:30.367 DEBUG neutron.agent.l3.router_info [req-
7e5722e4-5fef-4889-9372-8cf1218522a2 None None] adding internal network:
prefix(qr-), port(b672ffde-cd80-49eb-9817-58436fa8e8fd)
_internal_network_added
/opt/stack/new/neutron/neutron/agent/l3/router_info.py:300
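
A minimal sketch of the caching idea described above (illustrative only;
the class and method names are not from the neutron tree, and
update_arp_entry stands in for _update_arp_entry in the agent):

  class PendingArpEntries(object):
      """Hold ARP updates that arrive before their qr device exists."""

      def __init__(self):
          self._pending = {}   # device name -> [(ip, mac, operation), ...]

      def defer(self, device_name, ip, mac, operation):
          # Called instead of dropping the entry when the device is missing.
          self._pending.setdefault(device_name, []).append((ip, mac, operation))

      def replay(self, device_name, update_arp_entry):
          # Called once the qr device has been plugged into the namespace.
          for ip, mac, operation in self._pending.pop(device_name, []):
              update_arp_entry(ip, mac, device_name, operation)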

** Affects: neutron
 Importance: Undecided
 Assignee: Swaminathan Vasudevan (swaminathan-vasudevan)
 Status: In Progress


** Tags: l3-dvr-backlog

** Changed in: neutron
   Status: New => Confirmed

** Changed in: neutron
 Assignee: (unassigned) => Swaminathan Vasudevan (swaminathan-vasudevan)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1501086

Title:
  ARP entries dropped by DVR routers when the qr device is not ready or
  present

Status in neutron:
  In Progress

Bug description:
  The ARP entries are dropped by DVR routers when the 'qr' device does
  not exist in the namespace.

  There are two ways the ARP entries are updated in the L3 agent.
  First, when an internal csnat port is created, ARP entries are added from the 
'dvr_local_router' by calling "set_subnet_arp_info", which in turn calls 
"_update_arp_entry".

  Second, when an ARP update RPC message comes from the
  server to the agent as "add_arp_entry" or "delete_arp_entry", which in
  turn calls "_update_arp_entry".

  We have seen log traces showing that the ARP update message comes
  before the "qr" device is ready, so those ARP messages are dropped.

  We need to cache those ARP messages and update the router
  namespace once the "qr" device is ready.

  As the message below shows, we check for the device and
  log a warning that the device is not ready, but the ARP
  entries are not saved anywhere. They are dropped.

  2015-09-24 18:45:30.150 WARNING neutron.agent.l3.dvr_local_router
  [req-0565ce3a-905d-43fa-a6f3-1a07df6c6c2b None None] Arp operation add
  failed for device qr-b672ffde-cd, since the device does not exist
  anymore. The device might have been concurrently deleted or not
  created yet.

  As seen below, the internal network 'qr' device is added later.

  2015-09-24 18:45:30.367 DEBUG neutron.agent.l3.router_info [req-
  7e5722e4-5fef-4889-9372-8cf1218522a2 None None] adding internal
  network: prefix(qr-), port(b672ffde-cd80-49eb-9817-58436fa8e8fd)
  _internal_network_added
  /opt/stack/new/neutron/neutron/agent/l3/router_info.py:300

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1501086/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1499787] [NEW] Static routes are attempted to add to SNAT Namespace of DVR routers without checking for Router Gateway.

2015-09-25 Thread Swaminathan Vasudevan
Public bug reported:

In DVR routers, static routes are now only added to the snat namespace.
But before adding them to the snat namespace, the router is not checked for the 
existence of a gateway.
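
A sketch of the missing guard (illustrative names only; it assumes the
router payload carries its gateway port under 'gw_port' as in the l3
agent code, and apply_routes stands in for whatever pushes the routes
into the snat namespace):

  def update_snat_routes(router, apply_routes):
      # Without a gateway there is no snat namespace to update, so the
      # static routes must not be pushed there.
      if not router.get('gw_port'):
          return
      apply_routes(router.get('routes', []))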

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: l3-dvr-backlog

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1499787

Title:
  Static routes are attempted to add to SNAT Namespace of DVR routers
  without checking for Router Gateway.

Status in neutron:
  New

Bug description:
  In DVR routers, static routes are now only added to the snat namespace.
  But before adding them to the snat namespace, the router is not checked for the 
existence of a gateway.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1499787/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1499785] [NEW] Static routes are not added to the qrouter namespace for DVR routers

2015-09-25 Thread Swaminathan Vasudevan
Public bug reported:

Static routes are not added to the qrouter namespace when routers are
added.

Initially the routes were configured in the qrouter namespace but not in the 
SNAT namespace.
A recent patch moved the routes from the qrouter namespace to the SNAT namespace 
and caused this regression.

2bb48eb58ad28a629dd12c434b83680aa3f240a4

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: l3-dvr-backlog

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1499785

Title:
  Static routes are not added to the qrouter namespace for DVR routers

Status in neutron:
  New

Bug description:
  Static routes are not added to the qrouter namespace when routers are
  added.

  Initially the routes were configured in the qrouter namespace but not in the 
SNAT namespace.
  A recent patch moved the routes from the qrouter namespace to the SNAT 
namespace and caused this regression.

  2bb48eb58ad28a629dd12c434b83680aa3f240a4

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1499785/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1499045] [NEW] get_snat_port_for_internal_port called twice when an interface is added or removed by the l3 agent in the case of DVR routers.

2015-09-23 Thread Swaminathan Vasudevan
Public bug reported:

get_snat_port_for_internal_port retrieves the internal snat port created for 
each router interface added to a DVR router.
But this function is called twice in the L3 agent code.

For every interface add or delete on the router, it is called by
'dvr_local_router.py' and again by
'dvr_edge_router.py'.

This can be reduced to a single call to improve control plane
performance.
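
A sketch of the suggested reduction (illustrative only, not the actual
agent code; local_handler and edge_handler stand in for the two DVR code
paths): do the lookup once per interface event and hand the result to
both paths.

  def process_interface_event(ri, port, local_handler, edge_handler):
      # Single lookup instead of one call from dvr_local_router.py and a
      # second one from dvr_edge_router.py.
      snat_port = ri.get_snat_port_for_internal_port(port)
      local_handler(port, snat_port)
      edge_handler(port, snat_port)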

** Affects: neutron
 Importance: Undecided
 Assignee: Swaminathan Vasudevan (swaminathan-vasudevan)
 Status: In Progress


** Tags: l3-dvr-backlog

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1499045

Title:
  get_snat_port_for_internal_port called twice when an interface is
  added or removed by the l3 agent in the case of DVR routers.

Status in neutron:
  In Progress

Bug description:
  get_snat_port_for_internal_port retrieves the internal snat port created for 
each router interface added to a DVR router.
  But this function is called twice in the L3 agent code.

  For every interface add or delete on the router, it is called by
  'dvr_local_router.py' and again by
  'dvr_edge_router.py'.

  This can be reduced to a single call to improve control plane
  performance.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1499045/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1419175] Re: Cannot find device "qr-" error message found in logtrace with DVR routers while trying to update arp entry

2015-09-21 Thread Swaminathan Vasudevan
** Summary changed:

- DVR qrouter created without OVS qr device
+ Cannot find device "qr-" error message found in logtrace with DVR routers 
while trying to update arp entry

** Changed in: neutron
   Status: Expired => Confirmed

** Changed in: neutron
 Assignee: (unassigned) => Swaminathan Vasudevan (swaminathan-vasudevan)

** Tags added: l3-dvr-backlog

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1419175

Title:
  Cannot find device "qr-" error message found in logtrace with DVR
  routers while trying to update arp entry

Status in neutron:
  In Progress

Bug description:
  We are running stable/juno with DVR enabled.
  During tests, we created a router, a gateway and an instance.

  There is one qrouter on one compute node was created with
  RuntimeError:

  Command: ['sudo', '/usr/bin/neutron-rootwrap', '/etc/neutron/rootwrap.conf', 
'ip', 'netns', 'exec', 'qrouter-086cf9e6-4c43-4b65-b623-fbd5d593f687', 'ip', 
'-4', 'neigh', 'replace', '10.100.100.13', 'lladdr', 'fa:16:3e:84:fe:e4', 
'nud', 'permanent', 'dev', 'qr-00d7d90b-01']
  Exit code: 1
  Stdout: ''
  Stderr: 'Cannot find device "qr-00d7d90b-01"\n'
  2015-02-05 20:48:11.834 27031 ERROR neutron.agent.l3_agent 
[req-2c71f61b-c036-4d90-bcfd-75ffdd5340ff None] DVR: Failed updating arp entry
  2015-02-05 20:48:11.834 27031 TRACE neutron.agent.l3_agent Traceback (most 
recent call last):
  2015-02-05 20:48:11.834 27031 TRACE neutron.agent.l3_agent   File 
"/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/agent/l3_agent.py",
 line 1719, in _update_arp_entry
  2015-02-05 20:48:11.834 27031 TRACE neutron.agent.l3_agent 
device.neigh.add(net.version, ip, mac)
  2015-02-05 20:48:11.834 27031 TRACE neutron.agent.l3_agent   File 
"/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py",
 line 515, in add
  2015-02-05 20:48:11.834 27031 TRACE neutron.agent.l3_agent 
options=[ip_version])
  2015-02-05 20:48:11.834 27031 TRACE neutron.agent.l3_agent   File 
"/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py",
 line 247, in _as_root
  2015-02-05 20:48:11.834 27031 TRACE neutron.agent.l3_agent 
kwargs.get('use_root_namespace', False))
  2015-02-05 20:48:11.834 27031 TRACE neutron.agent.l3_agent   File 
"/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py",
 line 79, in _as_root
  2015-02-05 20:48:11.834 27031 TRACE neutron.agent.l3_agent 
log_fail_as_error=self.log_fail_as_error)
  2015-02-05 20:48:11.834 27031 TRACE neutron.agent.l3_agent   File 
"/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py",
 line 91, in _execute
  2015-02-05 20:48:11.834 27031 TRACE neutron.agent.l3_agent 
log_fail_as_error=log_fail_as_error)
  2015-02-05 20:48:11.834 27031 TRACE neutron.agent.l3_agent   File 
"/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/agent/linux/utils.py",
 line 82, in execute
  2015-02-05 20:48:11.834 27031 TRACE neutron.agent.l3_agent raise 
RuntimeError(m)
  2015-02-05 20:48:11.834 27031 TRACE neutron.agent.l3_agent RuntimeError:

  As a result, all subsequent router updates failed as well. When the router was 
removed, the qrouter namespace was also left behind on the compute node because 
of this error:
  2015-02-05 20:48:11.834 27031 TRACE neutron.agent.l3_agent Stderr: 'Cannot 
find device "qr-00d7d90b-01"\n'

  Logs also can be read at: http://paste.openstack.org/show/168348/

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1419175/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1496578] [NEW] SNAT port not found for the given internal port error message seen when gateway is removed for DVR routers.

2015-09-16 Thread Swaminathan Vasudevan
Public bug reported:

Recently the logstash logs showed traces about "SNAT port not found for
the given internal port".

http://logs.openstack.org/22/219422/13/check/gate-tempest-dsvm-neutron-
dvr/e5243b2/logs/screen-q-l3.txt.gz?level=TRACE#_2015-09-15_12_28_08_880

By analyzing the failure, it seems that when a gateway is removed,
"get_snat_port_for_internal_port" is called without the cached value.

This bug was introduced by the patch shown below.

Icc099c1a97e3e68eeaf4690bc83167ba30d8099a

** Affects: neutron
 Importance: Undecided
 Assignee: Swaminathan Vasudevan (swaminathan-vasudevan)
 Status: In Progress


** Tags: l3-dvr-backlog

** Changed in: neutron
   Status: New => Confirmed

** Changed in: neutron
 Assignee: (unassigned) => Swaminathan Vasudevan (swaminathan-vasudevan)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1496578

Title:
  SNAT port not found for the given internal port error message seen
  when gateway is removed for DVR routers.

Status in neutron:
  In Progress

Bug description:
  Recently the logstash logs showed traces about "SNAT port not found
  for the given internal port".

  http://logs.openstack.org/22/219422/13/check/gate-tempest-dsvm-
  neutron-
  dvr/e5243b2/logs/screen-q-l3.txt.gz?level=TRACE#_2015-09-15_12_28_08_880

  By analyzing the failure, it seems that when a gateway is removed,
  "get_snat_port_for_internal_port" is called without the cached value.

  This bug was introduced by the patch shown below.

  Icc099c1a97e3e68eeaf4690bc83167ba30d8099a

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1496578/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1493524] [NEW] IPv6 support for DVR routers

2015-09-08 Thread Swaminathan Vasudevan
Public bug reported:

This bug would capture all the IPv6 related work on DVR routers going
forward.

** Affects: neutron
 Importance: Undecided
 Assignee: Swaminathan Vasudevan (swaminathan-vasudevan)
 Status: In Progress


** Tags: l3-dvr-backlog

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1493524

Title:
  IPv6 support for DVR routers

Status in neutron:
  In Progress

Bug description:
  This bug would capture all the IPv6 related work on DVR routers going
  forward.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1493524/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1475011] [NEW] FloatingIPsTestJson tests fail with DVR routers

2015-07-15 Thread Swaminathan Vasudevan
Public bug reported:

FloatingIPsTestJSON tests fail with DVR routers.
In this test suite, test_associate_already_associated_floating_ip and
test_associate_disassociate_floating_ip are the tests that fail with an
Internal Server Error when trying to delete the
floatingip_agent_gateway_port.

Deleting the floatingip_agent_gateway_port calls _delete_port, and with recent
changes the ML2 plugin throws an attribute-not-found error.

** Affects: neutron
 Importance: Undecided
 Assignee: Swaminathan Vasudevan (swaminathan-vasudevan)
 Status: New


** Tags: l3-dvr-backlog

** Changed in: neutron
 Assignee: (unassigned) => Swaminathan Vasudevan (swaminathan-vasudevan)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1475011

Title:
  FloatingIPsTestJson tests fail with DVR routers

Status in neutron:
  New

Bug description:
  FloatingIPsTestJSON tests fail with DVR routers.
  In this test suite, test_associate_already_associated_floating_ip and
  test_associate_disassociate_floating_ip are the tests that fail with an
  Internal Server Error when trying to delete the
  floatingip_agent_gateway_port.

  Deleting the floatingip_agent_gateway_port calls _delete_port, and with recent
  changes the ML2 plugin throws an attribute-not-found error.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1475011/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1456755] Re: Could not retrieve gateway port for subnet

2015-06-23 Thread Swaminathan Vasudevan
*** This bug is a duplicate of bug 1404823 ***
https://bugs.launchpad.net/bugs/1404823

** This bug is no longer a duplicate of bug 1456756
   Could not retrieve gateway port for subnet
** This bug has been marked a duplicate of bug 1404823
   router-interface-add port succeed but does not add corresponding flows

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1456755

Title:
  Could not retrieve gateway port for subnet

Status in OpenStack Neutron (virtual network service):
  New

Bug description:
  There is this trace at error level in the server logs when DVR is
  enabled by default:

  [req-179a109e-456a-4743-8395-58b2f322afe2 None None] Could not
  retrieve gateway port for subnet {'ipv6_ra_mode': None,
  'allocation_pools': [{'start': u'10.100.0.2', 'end': u'10.100.0.14'}],
  'host_routes': [], 'ipv6_address_mode': None, 'cidr':
  u'10.100.0.0/28', 'id': u'8a47789b-452d-4ac7-a85b-9e57838456f0',
  'subnetpool_id': None, 'name': u'', 'enable_dhcp': True, 'network_id':
  u'85389a7f-8f50-405c-a19c-c4ad7b35e9ff', 'tenant_id':
  u'ec2ad8998456415ea6e8f9a217b5c1d8', 'dns_nameservers': [],
  'gateway_ip': u'10.100.0.1', 'ip_version': 4L, 'shared': False}

  This is the logstash query:

  
http://logstash.openstack.org/#eyJzZWFyY2giOiJtZXNzYWdlOlwiQ291bGQgbm90IHJldHJpZXZlIGdhdGV3YXkgcG9ydCBmb3Igc3VibmV0XCIgQU5EIGJ1aWxkX25hbWU6XCJjaGVjay10ZW1wZXN0LWRzdm0tbmV1dHJvbi1kdnJcIiIsImZpZWxkcyI6W10sIm9mZnNldCI6MCwidGltZWZyYW1lIjoiNjA0ODAwIiwiZ3JhcGhtb2RlIjoiY291bnQiLCJ0aW1lIjp7InVzZXJfaW50ZXJ2YWwiOjB9LCJzdGFtcCI6MTQzMjA2MDU2NzU0MSwibW9kZSI6IiIsImFuYWx5emVfZmllbGQiOiIifQ==

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1456755/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1465434] [NEW] DVR issues with supporting multiple subnets per network on DVR routers

2015-06-15 Thread Swaminathan Vasudevan
Public bug reported:

DVR today has issues with supporting multiple subnets per network on its
routers.

** Affects: neutron
 Importance: Undecided
 Assignee: Swaminathan Vasudevan (swaminathan-vasudevan)
 Status: New


** Tags: l3-dvr-backlog

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1465434

Title:
  DVR issues with supporting multiple subnets per network on DVR routers

Status in OpenStack Neutron (virtual network service):
  New

Bug description:
  DVR today has issues with supporting multiple subnets per network on
  its routers.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1465434/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1426165] Re: DVR: Device or resource busy error seen when fip namespace is being deleted

2015-04-10 Thread Swaminathan Vasudevan
Let us go ahead and close this bug.

** Changed in: neutron
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1426165

Title:
  DVR: Device or resource busy error seen when fip namespace is being
  deleted

Status in OpenStack Neutron (virtual network service):
  Invalid

Bug description:
  How to reproduce -

  1. Assign 2 routers with network/subnet/etc sharing the same external network 
for FIPs to a single agent/host.
  2.  Disassociate all FIPs
  3.  FIP namespace should be deleted but the following trace is seen instead

  
  2015-02-26 15:38:34.457 ^[[00;32mDEBUG neutron.agent.l3.dvr_fip_ns 
[^[[00;36m-^[[00;32m] ^[[01;35m^[[00;32mDVR: destroy fip ns: 
fip-6473ee45-f14f-4b86-a7da-678845a10c08^[[00m ^[[00;33mfrom (pid=6207) destroy 
/opt/stack/neutron/neutron/agent/l3/dvr_fip_ns.py:153^[[00m
  2015-02-26 15:38:34.457 ^[[00;32mDEBUG neutron.agent.linux.utils 
[^[[00;36m-^[[00;32m] ^[[01;35m^[[00;32mRunning command: ['sudo', 
'/usr/local/bin/neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 
'delete', 'fip-6473ee45-f14f-4b86-a7da-678845a10c08']^[[00m ^[[00;33mfrom 
(pid=6207) create_process 
/opt/stack/neutron/neutron/agent/linux/utils.py:50^[[00m
  2015-02-26 15:38:34.651 ^[[01;31mERROR neutron.agent.linux.utils 
[^[[00;36m-^[[01;31m] ^[[01;35m^[[01;31m
  Command: ['sudo', '/usr/local/bin/neutron-rootwrap', 
'/etc/neutron/rootwrap.conf', 'ip', 'netns', 'delete', 
'fip-6473ee45-f14f-4b86-a7da-678845a10c08']
  Exit code: 1
  Stdout:
  Stderr: Cannot remove 
/var/run/netns/fip-6473ee45-f14f-4b86-a7da-678845a10c08: Device or resource busy
  ^[[00m
  2015-02-26 15:38:34.652 ^[[01;31mERROR neutron.agent.l3.dvr_fip_ns 
[^[[00;36m-^[[01;31m] ^[[01;35m^[[01;31mFailed trying to delete namespace: 
fip-6473ee45-f14f-4b86-a7da-678845a10c08^[[00m
  ^[[01;31m2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns 
^[[01;35m^[[00mTraceback (most recent call last):
  ^[[01;31m2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns 
^[[01;35m^[[00m  File /opt/stack/neutron/neutron/agent/l3/dvr_fip_ns.py, line 
157, in destroy
  ^[[01;31m2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns 
^[[01;35m^[[00mip_wrapper.netns.delete(ns)
  ^[[01;31m2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns 
^[[01;35m^[[00m  File /opt/stack/neutron/neutron/agent/linux/ip_lib.py, line 
541, in delete
  ^[[01;31m2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns 
^[[01;35m^[[00mself._as_root('delete', name, use_root_namespace=True)
  ^[[01;31m2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns 
^[[01;35m^[[00m  File /opt/stack/neutron/neutron/agent/linux/ip_lib.py, line 
250, in _as_root
  ^[[01;31m2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns 
^[[01;35m^[[00mkwargs.get('use_root_namespace', False))
  ^[[01;31m2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns 
^[[01;35m^[[00m  File /opt/stack/neutron/neutron/agent/linux/ip_lib.py, line 
72, in _as_root
  ^[[01;31m2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns 
^[[01;35m^[[00mlog_fail_as_error=self.log_fail_as_error)
  ^[[01;31m2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns 
^[[01;35m^[[00m  File /opt/stack/neutron/neutron/agent/linux/ip_lib.py, line 
84, in _execute
  ^[[01;31m2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns 
^[[01;35m^[[00mlog_fail_as_error=log_fail_as_error)
  ^[[01;31m2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns 
^[[01;35m^[[00m  File /opt/stack/neutron/neutron/agent/linux/utils.py, line 
86, in execute
  ^[[01;31m2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns 
^[[01;35m^[[00mraise RuntimeError(m)
  ^[[01;31m2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns 
^[[01;35m^[[00mRuntimeError:
  ^[[01;31m2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns 
^[[01;35m^[[00mCommand: ['sudo', '/usr/local/bin/neutron-rootwrap', 
'/etc/neutron/rootwrap.conf', 'ip', 'netns', 'delete', 
'fip-6473ee45-f14f-4b86-a7da-678845a10c08']
  ^[[01;31m2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns 
^[[01;35m^[[00mExit code: 1
  ^[[01;31m2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns 
^[[01;35m^[[00mStdout:
  ^[[01;31m2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns 
^[[01;35m^[[00mStderr: Cannot remove 
/var/run/netns/fip-6473ee45-f14f-4b86-a7da-678845a10c08: Device or resource busy
  ^[[01;31m2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns 
^[[01;35m^[[00m
  ^[[01;31m2015-02-26 15:38:34.652 TRACE neutron.agent.l3.dvr_fip_ns 
^[[01;35m^[[00m

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1426165/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1398446] Re: Nova compute failed to delete VM port with DVR

2015-04-01 Thread Swaminathan Vasudevan
** Changed in: neutron
   Status: In Progress => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1398446

Title:
  Nova compute failed to delete VM port with DVR

Status in OpenStack Neutron (virtual network service):
  Invalid

Bug description:
  This defect is hard to reproduce; it only happens when I have more than 3
  compute nodes with DVR enabled.

  With the following script, run several times, I can see one VM in
  ERROR state.

  
  neutron net-create demo-net
  netdemoid=$(neutron net-list | awk '{if($4=="'demo-net'"){print $2;}}')
  neutron subnet-create demo-net 10.100.100.0/24 --name demo-subnet
  subnetdemoid=$(neutron subnet-list | awk '{if($4=="'demo-subnet'"){print $2;}}')
  neutron router-create demo-router
  routerdemoid=$(neutron router-list | awk '{if($4=="'demo-router'"){print $2;}}')

  exnetid=$(neutron net-list | awk '{if($4=="'ext-net'"){print $2;}}')
  for i in `seq 1 10`; do
  #boot vm, and create floating ip
  nova boot --image cirros --flavor m1.tiny --nic net-id=$netdemoid cirrosdemo${i}
  cirrosdemoid[i]=$(nova list | awk '{if($4=="'cirrosdemo${i}'"){print $2;}}')
  output=$(neutron floatingip-create $exnetid)
  echo $output
  floatipid[i]=$(echo $output | awk '{if($2=="id"){print $4;}}')
  floatip[i]=$(echo $output | awk '{if($2=="floating_ip_address"){print $4;}}')
  done

  # Setup router
  neutron router-gateway-set $routerdemoid $exnetid
  neutron router-interface-add demo-router $subnetdemoid
  #wait for VM to be running
  sleep 30

  for i in `seq 1 10`; do
  cirrosfix=$(nova list | awk '{if($4=="'cirrosdemo${i}'"){print $12;}}')
  cirrosfixip=${cirrosfix#*=}
  output=$(neutron port-list | grep ${cirrosfixip})
  echo $output
  portid=$(echo $output | awk '{print $2;}')
  neutron floatingip-associate --fixed-ip-address $cirrosfixip ${floatipid[i]} $portid
  neutron floatingip-delete ${floatipid[i]}
  nova delete ${cirrosdemoid[i]}
  done

  
  neutron router-interface-delete demo-router $subnetdemoid
  neutron router-gateway-clear demo-router $netdemoid
  neutron router-delete demo-router
  neutron subnet-delete $subnetdemoid
  neutron net-delete $netdemoid

  Looking at log file:
  2014-11-20 17:25:56.258 31042 DEBUG neutron.openstack.common.lockutils 
[req-6eabf07e-2fe4-4960-89ca-f0ac3f04f7a5 None] Got semaphore db-access lock 
/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/openstack/common/lockutils.py:168
  2014-11-20 17:25:56.424 31042 ERROR neutron.api.v2.resource 
[req-6eabf07e-2fe4-4960-89ca-f0ac3f04f7a5 None] delete failed
  2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource Traceback (most 
recent call last):
  2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File 
/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/api/v2/resource.py,
 line 87, in resource
  2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource result = 
method(request=request, **args)
  2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File 
/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/api/v2/base.py,
 line 476, in delete
  2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource 
obj_deleter(request.context, id, **kwargs)
  2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File 
/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/plugins/ml2/plugin.py,
 line 1036, in delete_port
  2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource router_info = 
l3plugin.dvr_deletens_if_no_vm(context, id)
  2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File 
/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/db/l3_dvrscheduler_db.py,
 line 195, in dvr_deletens_if_no_vm
  2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource port_host)
  2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File 
/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/neutron/db/agents_db.py,
 line 136, in _get_agent_by_type_and_host
  2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource Agent.host == 
host).one()
  2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File 
/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py,
 line 2369, in one
  2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource ret = list(self)
  2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File 
/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/sqlalchemy/orm/query.py,
 line 2411, in _iter_
  2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource 
self.session._autoflush()
  2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource File 
/opt/stack/venvs/openstack/local/lib/python2.7/site-packages/sqlalchemy/orm/session.py,
 line 1198, in _autoflush
  2014-11-20 17:25:56.424 31042 TRACE neutron.api.v2.resource 

[Yahoo-eng-team] [Bug 1431077] [NEW] TRACE: attribute error when trying to fetch the router.snat_namespace.name

2015-03-11 Thread Swaminathan Vasudevan
Public bug reported:

TRACE seen in the vpn-agent log when configured with a DVR router.
A recent refactoring of the agent has introduced this problem.

http://logs.openstack.org/71/130471/6/check/check-tempest-dsvm-neutron-
dvr/10208dc/logs/screen-q-vpn.txt.gz?level=TRACE


2015-03-11 14:09:03.570 ERROR neutron.agent.l3.agent 
[req-1c27f913-7f3c-40ff-8b86-f915fdde4be9 None None] Failed to process 
compatible router '25160ab1-c55e-424a-b209-a98f6b2bf769'
2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent Traceback (most 
recent call last):
2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent   File 
/opt/stack/new/neutron/neutron/agent/l3/agent.py, line 895, in 
_process_router_update
2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent 
self._process_router_if_compatible(router)
2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent   File 
/opt/stack/new/neutron/neutron/agent/l3/agent.py, line 843, in 
_process_router_if_compatible
2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent 
self._process_added_router(router)
2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent   File 
/opt/stack/new/neutron/neutron/agent/l3/agent.py, line 854, in 
_process_added_router
2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent 
adv_svc.AdvancedService.after_router_added, ri)
2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent   File 
/opt/stack/new/neutron/neutron/agent/l3/event_observers.py, line 40, in notify
2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent getattr(observer, 
method_name)(*args, **kwargs)
2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent   File 
/opt/stack/new/neutron-vpnaas/neutron_vpnaas/services/vpn/vpn_service.py, 
line 61, in after_router_added
2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent 
device.sync(self.context, [ri.router])
2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent   File 
/usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py, line 
431, in inner
2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent return f(*args, 
**kwargs)
2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent   File 
/opt/stack/new/neutron-vpnaas/neutron_vpnaas/services/vpn/device_drivers/ipsec.py,
 line 773, in sync
2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent 
self._delete_vpn_processes(sync_router_ids, router_ids)
2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent   File 
/opt/stack/new/neutron-vpnaas/neutron_vpnaas/services/vpn/device_drivers/ipsec.py,
 line 795, in _delete_vpn_processes
2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent 
self.ensure_process(process_id)
2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent   File 
/opt/stack/new/neutron-vpnaas/neutron_vpnaas/services/vpn/device_drivers/ipsec.py,
 line 643, in ensure_process
2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent namespace = 
self.get_namespace(process_id)
2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent   File 
/opt/stack/new/neutron-vpnaas/neutron_vpnaas/services/vpn/device_drivers/ipsec.py,
 line 535, in get_namespace
2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent return 
router.snat_namespace.name
2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent AttributeError: 
'NoneType' object has no attribute 'name

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: vpnaas

** Tags removed: neutron-vpnaas
** Tags added: vpnaas

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1431077

Title:
  TRACE: attribute error when trying to fetch the
  router.snat_namespace.name

Status in OpenStack Neutron (virtual network service):
  New

Bug description:
  TRACE seen in the vpn-agent log when configured with a DVR router.
  A recent refactoring of the agent has introduced this problem.

  http://logs.openstack.org/71/130471/6/check/check-tempest-dsvm-
  neutron-dvr/10208dc/logs/screen-q-vpn.txt.gz?level=TRACE

  
  2015-03-11 14:09:03.570 ERROR neutron.agent.l3.agent 
[req-1c27f913-7f3c-40ff-8b86-f915fdde4be9 None None] Failed to process 
compatible router '25160ab1-c55e-424a-b209-a98f6b2bf769'
  2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent Traceback (most 
recent call last):
  2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent   File 
/opt/stack/new/neutron/neutron/agent/l3/agent.py, line 895, in 
_process_router_update
  2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent 
self._process_router_if_compatible(router)
  2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent   File 
/opt/stack/new/neutron/neutron/agent/l3/agent.py, line 843, in 
_process_router_if_compatible
  2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent 
self._process_added_router(router)
  2015-03-11 14:09:03.570 3909 TRACE neutron.agent.l3.agent   File 

[Yahoo-eng-team] [Bug 1423422] [NEW] FloatingIP Agent Gateway Port is created for Non-DVR Routers

2015-02-18 Thread Swaminathan Vasudevan
Public bug reported:

FloatingIP Agent Gateway Port is only required for the DVR Routers.

A recent patch that removed the RPC dependency for creating the FloatingIP Agent 
Gateway Port has introduced a bug that creates a FloatingIP Agent Gateway Port 
for non-DVR routers.
Change-Id: Ieaa79c8bf2b1e03bc352f9252ce22286703e3715

This might generate an error when trying to get the L3 agent information
from a Compute Node that is not running an L3 Agent in DVR mode in a
Multi Node Scenario. This issue may not be visible in a Single Node
deployment.

In a single node deployment we might see FloatingIP Agent Gateway Ports
for Legacy routers which are not utilized.

Only DVR routers require L3 agent to be present in the Compute Node.
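
A sketch of the intended behaviour (illustrative names only; create_gw_port_fn
stands in for whatever server-side helper actually creates the agent gateway
port, and the 'distributed' flag is the standard router attribute):

  def maybe_create_fip_agent_gw_port(router, host, ext_net_id, create_gw_port_fn):
      # Only distributed routers need a per-host FloatingIP agent gateway
      # port; legacy routers must be skipped.
      if not router.get('distributed'):
          return None
      return create_gw_port_fn(ext_net_id, host)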

** Affects: neutron
 Importance: Undecided
 Assignee: Swaminathan Vasudevan (swaminathan-vasudevan)
 Status: New


** Tags: l3-dvr-backlog

** Changed in: neutron
 Assignee: (unassigned) => Swaminathan Vasudevan (swaminathan-vasudevan)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1423422

Title:
  FloatingIP Agent Gateway Port is created for Non-DVR Routers

Status in OpenStack Neutron (virtual network service):
  New

Bug description:
  FloatingIP Agent Gateway Port is only required for the DVR Routers.

  A recent patch that removed the RPC dependency for creating the FloatingIP Agent 
Gateway Port has introduced a bug that creates a FloatingIP Agent Gateway Port 
for non-DVR routers.
  Change-Id: Ieaa79c8bf2b1e03bc352f9252ce22286703e3715

  This might generate an error when trying to get the L3 agent
  information from a Compute Node that is not running an L3 Agent in
  DVR mode in a Multi Node Scenario. This issue may not be visible in
  a Single Node deployment.

  In a single node deployment we might see FloatingIP Agent Gateway
  Ports for Legacy routers which are not utilized.

  Only DVR routers require L3 agent to be present in the Compute Node.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1423422/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1421886] [NEW] FloatingIP agent gateway port should delete the FIP Agent gateway port based on the host and the external network id when there are multiple external networks.

2015-02-13 Thread Swaminathan Vasudevan
Public bug reported:

FloatingIP Agent Gateway port should be deleted based on the host and
also based on the External network id.

In the multiple external network scenario, there might be
more than one FloatingIP Agent Gateway Port on the same host, so it
has to be deleted based on the External Network ID as well.
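
A sketch of the selection the delete should use (illustrative; the port
keys are the standard Neutron port attributes seen elsewhere in these
reports):

  def agent_gw_ports_to_delete(fip_agent_gw_ports, host, ext_net_id):
      # With multiple external networks one host can carry several agent
      # gateway ports, so both the host binding and the external network
      # id have to match before a port is removed.
      return [port for port in fip_agent_gw_ports
              if port['binding:host_id'] == host
              and port['network_id'] == ext_net_id]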

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: l3-dvr-backlog

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1421886

Title:
  FloatingIP agent gateway port should delete the FIP Agent gateway port
  based on the host and the external network id when there are multiple
  external networks.

Status in OpenStack Neutron (virtual network service):
  New

Bug description:
  FloatingIP Agent Gateway port should be deleted based on the host and
  also based on the External network id.

  In the multiple external network scenario, there might
  be more than one FloatingIP Agent Gateway Port on the same host, so
  it has to be deleted based on the External Network ID as well.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1421886/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1421497] [NEW] Gateway clear generates a TRACE - AttributeError in get_int_device_name in DVR routers

2015-02-12 Thread Swaminathan Vasudevan
Public bug reported:

A recent change in the agent code has introduced this problem.

When a Gateway is cleared from the router, even though there are no existing 
floating IPs, the external_gateway_removed function in agent.py is calling 
process_floatingips. 
That may be the reason for this failure.


Stderr: RTNETLINK answers: No such process

2015-02-11 23:12:15.307 2809 ERROR neutron.agent.l3.dvr [-] DVR: removed snat 
failed
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr Traceback (most recent 
call last):
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr   File 
/opt/stack/new/neutron/neutron/agent/l3/dvr.py, line 197, in 
_snat_redirect_remove
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr 
ns_ipd.route.delete_gateway(table=snat_idx)
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr   File 
/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py, line 415, in 
delete_gateway
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr self._as_root(*args)
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr   File 
/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py, line 253, in _as_root
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr 
kwargs.get('use_root_namespace', False))
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr   File 
/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py, line 83, in _as_root
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr 
log_fail_as_error=self.log_fail_as_error)
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr   File 
/opt/stack/new/neutron/neutron/agent/linux/ip_lib.py, line 95, in _execute
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr 
log_fail_as_error=log_fail_as_error)
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr   File 
/opt/stack/new/neutron/neutron/agent/linux/utils.py, line 83, in execute
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr raise 
RuntimeError(m)
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr RuntimeError: 
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr Command: ['sudo', 
'/usr/local/bin/neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 
'exec', 'qrouter-1cfe7654-a669-4f73-a21d-d5110d7c0297', 'ip', 'route', 'del', 
'default', 'dev', 'qr-467e8832-93', 'table', '547711270']
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr Exit code: 2
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr Stdout: 
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr Stderr: RTNETLINK 
answers: No such process
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr 
2015-02-11 23:12:15.307 2809 TRACE neutron.agent.l3.dvr 
2015-02-11 23:12:18.846 2809 ERROR neutron.agent.l3.agent [-] 'NoneType' object 
has no attribute 'get_int_device_name'
2015-02-11 23:12:18.846 2809 TRACE neutron.agent.l3.agent Traceback (most 
recent call last):
2015-02-11 23:12:18.846 2809 TRACE neutron.agent.l3.agent   File 
/opt/stack/new/neutron/neutron/common/utils.py, line 342, in call
2015-02-11 23:12:18.846 2809 TRACE neutron.agent.l3.agent return 
func(*args, **kwargs)
2015-02-11 23:12:18.846 2809 TRACE neutron.agent.l3.agent   File 
/opt/stack/new/neutron/neutron/agent/l3/agent.py, line 602, in process_router
2015-02-11 23:12:18.846 2809 TRACE neutron.agent.l3.agent 
self._process_external(ri)
2015-02-11 23:12:18.846 2809 TRACE neutron.agent.l3.agent   File 
/opt/stack/new/neutron/neutron/agent/l3/agent.py, line 565, in 
_process_external
2015-02-11 23:12:18.846 2809 TRACE neutron.agent.l3.agent 
self._process_external_gateway(ri)
2015-02-11 23:12:18.846 2809 TRACE neutron.agent.l3.agent   File 
/opt/stack/new/neutron/neutron/agent/l3/agent.py, line 503, in 
_process_external_gateway
2015-02-11 23:12:18.846 2809 TRACE neutron.agent.l3.agent 
self.external_gateway_removed(ri, ri.ex_gw_port, interface_name)
2015-02-11 23:12:18.846 2809 TRACE neutron.agent.l3.agent   File 
/opt/stack/new/neutron/neutron/agent/l3/agent.py, line 905, in 
external_gateway_removed
2015-02-11 23:12:18.846 2809 TRACE neutron.agent.l3.agent ri, ex_gw_port)
2015-02-11 23:12:18.846 2809 TRACE neutron.agent.l3.agent   File 
/opt/stack/new/neutron/neutron/agent/l3/agent.py, line 694, in 
_get_external_device_interface_name
2015-02-11 23:12:18.846 2809 TRACE neutron.agent.l3.agent fip_int = 
ri.fip_ns.get_int_device_name(ri.router_id)
2015-02-11 23:12:18.846 2809 TRACE neutron.agent.l3.agent AttributeError: 
'NoneType' object has no attribute 'get_int_device_name'
2015-02-11 23:12:18.846 2809 TRACE neutron.agent.l3.agent 
Traceback (most recent call last):
  File /usr/local/lib/python2.7/dist-packages/eventlet/greenpool.py, line 82, 
in _spawn_n_impl
func(*args, **kwargs)
  File /opt/stack/new/neutron/neutron/agent/l3/agent.py, line 1137, in 
_process_router_update
self._router_removed(update.id)
  File /opt/stack/new/neutron/neutron/agent/l3/agent.py, line 409, in 
_router_removed
self.process_router(ri)
  File 

[Yahoo-eng-team] [Bug 1421011] [NEW] Remove unused RPC methods from the L3_rpc

2015-02-11 Thread Swaminathan Vasudevan
Public bug reported:

Remove unused RPC methods from the L3_rpc.
The get_snat_router_interface_ports method is defined but not currently used by 
any agents, so it needs to be cleaned up.

** Affects: neutron
 Importance: Undecided
 Assignee: Swaminathan Vasudevan (swaminathan-vasudevan)
 Status: In Progress


** Tags: l3-dvr-backlog

** Tags added: l3-dvr-backlog

** Changed in: neutron
 Assignee: (unassigned) => Swaminathan Vasudevan (swaminathan-vasudevan)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1421011

Title:
  Remove unused RPC methods from the L3_rpc

Status in OpenStack Neutron (virtual network service):
  In Progress

Bug description:
  Remove unused RPC methods from the L3_rpc.
  The get_snat_router_interface_ports method is defined but not currently used by 
any agents, so it needs to be cleaned up.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1421011/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1417386] [NEW] AttributeError: _oslo_messaging_localcontext errors found in neutron l3-agent logs

2015-02-02 Thread Swaminathan Vasudevan
Public bug reported:

This TRACE is seen in many places in the neutron l3-agent logs from the
jenkins logs.

2015-02-02 23:29:13.916 ERROR oslo_messaging.rpc.dispatcher 
[req-ce57a6b0-04fc-41dd-a114-5b69c0ebcf6d FloatingIPsNegativeTestJSON-267704990 
FloatingIPsNegativeTestJSON-594738290] Exception during message handling: 
_oslo_messaging_localcontext_fdc8b1dcad1246c49200370f4281e5d5
2015-02-02 23:29:13.916 3051 TRACE oslo_messaging.rpc.dispatcher Traceback 
(most recent call last):
2015-02-02 23:29:13.916 3051 TRACE oslo_messaging.rpc.dispatcher   File 
/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py, line 
142, in _dispatch_and_reply
2015-02-02 23:29:13.916 3051 TRACE oslo_messaging.rpc.dispatcher 
executor_callback))
2015-02-02 23:29:13.916 3051 TRACE oslo_messaging.rpc.dispatcher   File 
/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py, line 
188, in _dispatch
2015-02-02 23:29:13.916 3051 TRACE oslo_messaging.rpc.dispatcher 
localcontext.clear_local_context()
2015-02-02 23:29:13.916 3051 TRACE oslo_messaging.rpc.dispatcher   File 
/usr/local/lib/python2.7/dist-packages/oslo_messaging/localcontext.py, line 
55, in clear_local_context
2015-02-02 23:29:13.916 3051 TRACE oslo_messaging.rpc.dispatcher 
delattr(_STORE, _KEY)
2015-02-02 23:29:13.916 3051 TRACE oslo_messaging.rpc.dispatcher 
AttributeError: _oslo_messaging_localcontext_fdc8b1dcad1246c49200370f4281e5d5
2015-02-02 23:29:13.916 3051 TRACE oslo_messaging.rpc.dispatcher 

I am not sure if there are any other similar bugs that have been
reported.

** Affects: neutron
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1417386

Title:
  AttributeError: _oslo_messaging_localcontext errors found in neutron
  l3-agent logs

Status in OpenStack Neutron (virtual network service):
  New

Bug description:
  This TRACE is seen in many places in the neutron l3-agent logs from
  the jenkins logs.

  2015-02-02 23:29:13.916 ERROR oslo_messaging.rpc.dispatcher 
[req-ce57a6b0-04fc-41dd-a114-5b69c0ebcf6d FloatingIPsNegativeTestJSON-267704990 
FloatingIPsNegativeTestJSON-594738290] Exception during message handling: 
_oslo_messaging_localcontext_fdc8b1dcad1246c49200370f4281e5d5
  2015-02-02 23:29:13.916 3051 TRACE oslo_messaging.rpc.dispatcher Traceback 
(most recent call last):
  2015-02-02 23:29:13.916 3051 TRACE oslo_messaging.rpc.dispatcher   File 
/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py, line 
142, in _dispatch_and_reply
  2015-02-02 23:29:13.916 3051 TRACE oslo_messaging.rpc.dispatcher 
executor_callback))
  2015-02-02 23:29:13.916 3051 TRACE oslo_messaging.rpc.dispatcher   File 
/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py, line 
188, in _dispatch
  2015-02-02 23:29:13.916 3051 TRACE oslo_messaging.rpc.dispatcher 
localcontext.clear_local_context()
  2015-02-02 23:29:13.916 3051 TRACE oslo_messaging.rpc.dispatcher   File 
/usr/local/lib/python2.7/dist-packages/oslo_messaging/localcontext.py, line 
55, in clear_local_context
  2015-02-02 23:29:13.916 3051 TRACE oslo_messaging.rpc.dispatcher 
delattr(_STORE, _KEY)
  2015-02-02 23:29:13.916 3051 TRACE oslo_messaging.rpc.dispatcher 
AttributeError: _oslo_messaging_localcontext_fdc8b1dcad1246c49200370f4281e5d5
  2015-02-02 23:29:13.916 3051 TRACE oslo_messaging.rpc.dispatcher 

  I am not sure if there are any other similar bugs that have been
  reported.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1417386/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1415522] [NEW] DVR Tempest Job check-tempest-dsvm-neutron-dvr not stable when compared to the neutron job

2015-01-28 Thread Swaminathan Vasudevan
Public bug reported:

DVR Tempest Job check-tempest-dsvm-neutron-dvr is unstable when compared to the 
legacy router job.
This is critical for making the DVR job gating.
So we need to find out the actual subtests that are causing the failures.

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: l3-dvr-backlog

** Tags added: l3-dvr-backlog

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1415522

Title:
  DVR Tempest Job check-tempest-dsvm-neutron-dvr not stable when
  compared to the neutron job

Status in OpenStack Neutron (virtual network service):
  New

Bug description:
  DVR Tempest Job check-tempest-dsvm-neutron-dvr is unstable when compared to 
the legacy router job.
  This is critical for making the DVR job gating.
  So we need to find out the actual subtests that are causing the failures.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1415522/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp

