[Yahoo-eng-team] [Bug 1923281] [NEW] failed to attach a volume with multiattach to an ironic instance

2021-04-09 Thread Simon Li
Public bug reported:

Description
Attaching a volume that has `multiattach` set to an ironic instance is not
supported.
It works if the volume has `multiattach` set to `false`.
Additionally, the storage back end for the volume supports the `multiattach` property.

Steps to reproduce:
* Attach a multiattach volume to an ironic instance:
  `openstack --os-compute-api-version 2.60 server add volume <server> <volume>`

Expected result
The volume is attached to the server on /dev/*

Actual result
Volume <volume> has 'multiattach' set, which is not supported for this instance. (HTTP 409)

Environment
* nova: 18.2.4
* cinder: 13.0.8
* ironic: 11.1.3
* storage type: G2 Series block storage of Inspur Inc.
* the volume is in the available state.
* the instance is active and powered on.
* the baremetal volume connector has been created and is confirmed to be available.

More
* Logs of nova-compute for ironic:
ERROR oslo_messaging.rpc.server [req-17d65e1f-0db3-442c-87eb-57d94ffa6940 421c1e16837b4189b9a6ae04ba4af86b 6e3d2c325bc94a5e8dbb41a7a73ae593 - default default] Exception during message handling: nova.exception.MultiattachNotSupportedByVirtDriver: Volume 3e9b3371-d56d-4b16-a180-00b835993662 has 'multiattach' set, which is not supported for this instance.
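
For context, the 409 is produced by Nova's own capability check (the exception is
nova.exception.MultiattachNotSupportedByVirtDriver, raised by nova-compute), not by
Cinder or the storage back end. A minimal sketch of that kind of gate, with
illustrative class and attribute names rather than Nova's actual code:

    # Sketch of a virt-driver capability gate for multiattach volumes.
    # All names here are illustrative; the real check lives inside Nova.

    class MultiattachNotSupportedByVirtDriver(Exception):
        def __init__(self, volume_id):
            super().__init__(
                "Volume %s has 'multiattach' set, which is not supported "
                "for this instance." % volume_id)


    class FakeBaremetalDriver:
        # A driver that does not advertise multiattach support.
        capabilities = {"supports_multiattach": False}


    def check_attach(driver, volume):
        """Reject multiattach volumes when the driver cannot handle them."""
        if volume.get("multiattach") and not driver.capabilities.get(
                "supports_multiattach", False):
            raise MultiattachNotSupportedByVirtDriver(volume["id"])


    try:
        check_attach(FakeBaremetalDriver(),
                     {"id": "3e9b3371-d56d-4b16-a180-00b835993662",
                      "multiattach": True})
    except MultiattachNotSupportedByVirtDriver as exc:
        print(exc)  # same message as the HTTP 409 above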

** Affects: nova
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1923281

Title:
  failed to attach a volume with multiattach to an ironic instance

Status in OpenStack Compute (nova):
  New

Bug description:
  Description
  Attaching a volume that has `multiattach` set to an ironic instance is not
supported.
  It works if the volume has `multiattach` set to `false`.
  Additionally, the storage back end for the volume supports the `multiattach` property.

  Steps to reproduce:
  * Attach a multiattach volume to an ironic instance:
    `openstack --os-compute-api-version 2.60 server add volume <server> <volume>`

  Expected result
  The volume is attached to the server on /dev/*

  Actual result
  Volume <volume> has 'multiattach' set, which is not supported for this instance. (HTTP 409)

  Environment
  * nova: 18.2.4
  * cinder: 13.0.8
  * ironic: 11.1.3
  * storage type: G2 Series block storage of Inspur Inc.
  * the volume is in the available state.
  * the instance is active and powered on.
  * the baremetal volume connector has been created and is confirmed to be available.

  More
  * Logs of nova-compute for ironic:
  ERROR oslo_messaging.rpc.server [req-17d65e1f-0db3-442c-87eb-57d94ffa6940 421c1e16837b4189b9a6ae04ba4af86b 6e3d2c325bc94a5e8dbb41a7a73ae593 - default default] Exception during message handling: nova.exception.MultiattachNotSupportedByVirtDriver: Volume 3e9b3371-d56d-4b16-a180-00b835993662 has 'multiattach' set, which is not supported for this instance.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1923281/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1923257] [NEW] diminished networking, packet issues during Bionic openstack deploys

2021-04-09 Thread Joshua Genet
Public bug reported:

Test run here:
https://solutions.qa.canonical.com/testruns/testRun/19f7492e-1a8c-4ec3-b0bb-0fc4f2453f7b

Artifacts/Logs/Bundles here:
https://oil-jenkins.canonical.com/artifacts/19f7492e-1a8c-4ec3-b0bb-0fc4f2453f7b/index.html

Juju Openstack model crashdump here:
https://oil-jenkins.canonical.com/artifacts/19f7492e-1a8c-4ec3-b0bb-0fc4f2453f7b/generated/generated/openstack/juju-crashdump-openstack-2021-04-09-19.57.18.tar.gz

---

We've seen this in several different environments. Some using Stein-
Bionic, some using Ussuri-Bionic. Ussuri-Focal had no issues.

Juju ssh-ing to any of these machines is incredibly slow and unresponsive (an
`ls` takes around 30-60 seconds).
As seen below, we're failing to grab LXD images for our instances, likely due
to the diminished networking. On an earlier manual run I was apt installing at
400 B/s.

---

Machine   State    DNS  Inst id  Series  AZ     Message
0         started       duision  bionic  zone1  Deployed
0/lxd/0   pending       pending  bionic         starting
0/lxd/1   pending       pending  bionic         starting
0/lxd/2   pending       pending  bionic         starting
0/lxd/3   pending       pending  bionic         acquiring LXD image
0/lxd/4   pending       pending  bionic         starting
0/lxd/5   down          pending  bionic         failed to start machine 0/lxd/5 (acquiring LXD image: no matching image found), retrying in 10s (10 more attempts)
0/lxd/6   pending       pending  bionic         Creating container spec
0/lxd/7   pending       pending  bionic         starting
0/lxd/8   pending       pending  bionic         starting
0/lxd/9   pending       pending  bionic         starting
0/lxd/10  pending       pending  bionic         starting
1         started       azurill  bionic  zone1  Deployed
1/lxd/0   pending       pending  bionic         starting
1/lxd/1   down          pending  bionic         failed to start machine 1/lxd/1 (acquiring LXD image: no matching image found), retrying in 10s (10 more attempts)
1/lxd/2   pending       pending  bionic         starting
1/lxd/3   pending       pending  bionic         starting
1/lxd/4   pending       pending  bionic         starting
1/lxd/5   pending       pending  bionic         starting
1/lxd/6   pending       pending  bionic         starting
1/lxd/7   down          pending  bionic         failed to start machine 1/lxd/7 (acquiring LXD image: Failed remote image download: Get https://cloud-images.ubuntu.com/releases/server/releases/bionic/release-20210325/ubuntu-18.04-server-cloudimg-amd64.squashfs: proxyconnect tcp: Unable to connect to: squid.internal), retrying in 10s (10 more attempts)
1/lxd/8   pending       pending  bionic         starting
1/lxd/9   pending       pending  bionic         starting
1/lxd/10  pending       pending  bionic         starting
2         started       meowth   bionic  zone2  Deployed
2/lxd/0   down          pending  bionic         failed to start machine 2/lxd/0 (acquiring LXD image: Failed remote image download: Get https://cloud-images.ubuntu.com/releases/bionic/release-20210325/ubuntu-18.04-server-cloudimg-amd64.squashfs: read tcp: read: connection reset by peer), retrying in 10s (10 more attempts)

---

kern.log from one of our machines shows a bunch of this:

---

Apr  9 20:52:22 azurill kernel: [ 4196.092358] br-eth0: received packet on eth0 with own address as source address
Apr  9 20:52:22 azurill kernel: [ 4196.317426] br-eth0: received packet on eth0 with own address as source address
Apr  9 20:52:22 azurill kernel: [ 4196.387884] br-eth0: received packet on eth0 with own address as source address
Apr  9 20:52:22 azurill kernel: [ 4196.423968] br-eth0: received packet on eth0 with own address as source address
Apr  9 20:52:22 azurill kernel: [ 4196.488808] br-eth0: received packet on eth0 with own address as source address
Apr  9 20:52:22 azurill kernel: [ 4196.919973] br-eth0: received packet on eth0 with own address as source address
Apr  9 20:52:23 azurill kernel: [ 4197.116675] br-eth0: received packet on eth0 with own address as source address
Apr  9 20:52:23 azurill kernel: [ 4197.331899] br-eth0: received packet on eth0 with own address as source address
Apr  9 20:52:23 azurill kernel: [ 4197.447991] br-eth0: received packet on eth0 with own address as source address
Apr  9 20:52:23 azurill kernel: [ 4197.513159] br-eth0: received packet on eth0 with own address as source address
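
A small diagnostic sketch (not part of the original report) that tallies these
kernel messages per bridge/port from a kern.log file, using the message format
shown above:

    # Count "received packet ... with own address as source address" events,
    # grouped by (bridge, port), from a kern.log like the excerpt above.
    import collections
    import re

    PATTERN = re.compile(
        r"(\S+): received packet on (\S+) with own address as source address")


    def count_own_address_events(path):
        counts = collections.Counter()
        with open(path, errors="replace") as log:
            for line in log:
                match = PATTERN.search(line)
                if match:
                    counts[match.groups()] += 1  # key is (bridge, port)
        return counts


    # Example: count_own_address_events("/var/log/kern.log")
    # would return Counter({('br-eth0', 'eth0'): 10}) for the excerpt above.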

** Affects: cloud-init
 Importance: Undecided
 Status: New


** Tags: cdo-qa cdo-release-blocker foundations-engine

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.

[Yahoo-eng-team] [Bug 1923206] [NEW] libvirt.libvirtError: internal error: unable to execute QEMU command 'device_del': Device $device is already in the process of unplug

2021-04-09 Thread Lee Yarwood
Public bug reported:

Description
===
This was initially reported downstream against QEMU in the following bug:

Get libvirtError "Device XX is already in the process of unplug" when detach 
device in OSP env
https://bugzilla.redhat.com/show_bug.cgi?id=1878659

I first saw the error crop up while testing q35 in TripleO in the
following job:

https://c6b36562677324bf8249-804f3f4695b3063292bbb3235f424ae0.ssl.cf1.rackcdn.com/785027/5/check/tripleo-ci-centos-8-standalone/6860050/logs/undercloud/var/log/containers/nova/nova-compute.log

2021-04-09 11:09:53.702 8 DEBUG nova.virt.libvirt.guest [req-4d0b64d5-a2cf-4a6e-a2f7-f6cc7ced4df1 7e2b737ed8f04b3ca819841a41be66c1 d4d933c7b10c462c8141820b0e70822b - default default] Attempting initial detach for device vdb detach_device_with_retry /usr/lib/python3.6/site-packages/nova/virt/libvirt/guest.py:455
[..]
2021-04-09 11:09:58.721 8 DEBUG nova.virt.libvirt.guest [req-4d0b64d5-a2cf-4a6e-a2f7-f6cc7ced4df1 7e2b737ed8f04b3ca819841a41be66c1 d4d933c7b10c462c8141820b0e70822b - default default] Start retrying detach until device vdb is gone. detach_device_with_retry /usr/lib/python3.6/site-packages/nova/virt/libvirt/guest.py:471
[..]
2021-04-09 11:09:58.729 8 ERROR oslo.service.loopingcall libvirt.libvirtError: internal error: unable to execute QEMU command 'device_del': Device virtio-disk1 is already in the process of unplug


Steps to reproduce
==
Unclear at present; it looks like a genuine QEMU bug that causes device_del to
fail when a repeated request to unplug a device comes in, instead of ignoring
the request as previously happened. I've asked for clarification in the
downstream QEMU bug.

Expected result
===
Repeated calls to device_del are ignored, or the failure, while still raised,
is ignored by Nova.

Actual result
=
Repeated calls to device_del lead to an error being raised to Nova via libvirt
that causes the detach to fail, even though it still succeeds asynchronously
within QEMU.
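
A minimal sketch of the mitigation described under "Expected result", i.e.
treating the repeated device_del failure as "unplug already in progress" and
continuing to poll until the device is gone. The classes below only simulate
the libvirt/QEMU behaviour; this is not Nova's actual fix:

    # Sketch: keep retrying a detach, but treat "already in the process of
    # unplug" as an in-flight unplug instead of a hard failure.
    import time


    class FakeLibvirtError(Exception):
        """Stand-in for libvirt.libvirtError in this simulation."""


    class FakeGuest:
        """Simulates a guest where the first detach starts an async unplug."""

        def __init__(self):
            self._unplug_started = False
            self._calls = 0

        def detach_device(self, dev):
            self._calls += 1
            if self._unplug_started:
                # Newer QEMU raises instead of silently ignoring the repeat.
                raise FakeLibvirtError(
                    "unable to execute QEMU command 'device_del': Device "
                    "%s is already in the process of unplug" % dev)
            self._unplug_started = True

        def device_is_gone(self, dev):
            # Pretend the asynchronous unplug completes after a few retries.
            return self._calls >= 3


    def detach_with_retry(guest, dev, attempts=5, interval=0.1):
        for _ in range(attempts):
            if guest.device_is_gone(dev):
                return True
            try:
                guest.detach_device(dev)
            except FakeLibvirtError as exc:
                if "already in the process of unplug" not in str(exc):
                    raise
                # The unplug is still in flight inside QEMU; keep waiting.
            time.sleep(interval)
        return guest.device_is_gone(dev)


    print(detach_with_retry(FakeGuest(), "virtio-disk1"))  # True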

Environment
===
1. Exact version of OpenStack you are running. See the following
  list for all releases: http://docs.openstack.org/releases/

   master

2. Which hypervisor did you use?
   (For example: Libvirt + KVM, Libvirt + XEN, Hyper-V, PowerKVM, ...)
   What's the version of that?

   libvirt + QEMU/KVM

3. Which storage type did you use?
   (For example: Ceph, LVM, GPFS, ...)
   What's the version of that?

   N/A

4. Which networking type did you use?
   (For example: nova-network, Neutron with OpenVSwitch, ...)

   N/A

Logs & Configs
==
See above.

** Affects: nova
 Importance: Undecided
 Assignee: Lee Yarwood (lyarwood)
 Status: New


** Tags: libvirt

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1923206

Title:
  libvirt.libvirtError: internal error: unable to execute QEMU command
  'device_del': Device $device is already in the process of unplug

Status in OpenStack Compute (nova):
  New

Bug description:
  Description
  ===
  This was initially reported downstream against QEMU in the following bug:

  Get libvirtError "Device XX is already in the process of unplug" when detach 
device in OSP env
  https://bugzilla.redhat.com/show_bug.cgi?id=1878659

  I first saw the error crop up while testing q35 in TripleO in the
  following job:

  
  https://c6b36562677324bf8249-804f3f4695b3063292bbb3235f424ae0.ssl.cf1.rackcdn.com/785027/5/check/tripleo-ci-centos-8-standalone/6860050/logs/undercloud/var/log/containers/nova/nova-compute.log

  2021-04-09 11:09:53.702 8 DEBUG nova.virt.libvirt.guest [req-4d0b64d5-a2cf-4a6e-a2f7-f6cc7ced4df1 7e2b737ed8f04b3ca819841a41be66c1 d4d933c7b10c462c8141820b0e70822b - default default] Attempting initial detach for device vdb detach_device_with_retry /usr/lib/python3.6/site-packages/nova/virt/libvirt/guest.py:455
  [..]
  2021-04-09 11:09:58.721 8 DEBUG nova.virt.libvirt.guest [req-4d0b64d5-a2cf-4a6e-a2f7-f6cc7ced4df1 7e2b737ed8f04b3ca819841a41be66c1 d4d933c7b10c462c8141820b0e70822b - default default] Start retrying detach until device vdb is gone. detach_device_with_retry /usr/lib/python3.6/site-packages/nova/virt/libvirt/guest.py:471
  [..]
  2021-04-09 11:09:58.729 8 ERROR oslo.service.loopingcall libvirt.libvirtError: internal error: unable to execute QEMU command 'device_del': Device virtio-disk1 is already in the process of unplug

  
  Steps to reproduce
  ==
  Unclear at present; it looks like a genuine QEMU bug that causes device_del to
  fail when a repeated request to unplug a device comes in, instead of ignoring
  the request as previously happened. I've asked for clarification in the
  downstream QEMU bug.

  Expected result
  ===
  Repeated calls to device_del are ignored, or the failure, while still raised,
  is ignored by Nova.

  Actual result
  =
  Repeated calls to device_del lead to an error being raised to Nova via libvirt
  that causes the detach to fail, even though it still succeeds asynchronously
  within QEMU.

[Yahoo-eng-team] [Bug 1923201] [NEW] neutron-centos-8-tripleo-standalone in periodic queue runs Neutron from Victoria release

2021-04-09 Thread Slawek Kaplonski
Public bug reported:

For example, this run from 09.04.2021:
https://e1066ed764372feffb00-43bef2b011f595598cf89d565fc6c894.ssl.cf5.rackcdn.com/periodic/opendev.org/openstack/neutron/master/neutron-centos-8-tripleo-standalone/7373ecf/logs/undercloud/var/log/containers/neutron/dhcp-agent.log

2021-04-09 07:58:04.387 140158 INFO neutron.common.config [-] /usr/bin/neutron-dhcp-agent version 17.1.0.dev626

This job should always test the latest master branch of Neutron.

** Affects: neutron
 Importance: High
 Assignee: Slawek Kaplonski (slaweq)
 Status: Confirmed


** Tags: tripleo

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1923201

Title:
  neutron-centos-8-tripleo-standalone in periodic queue runs Neutron
  from Victoria release

Status in neutron:
  Confirmed

Bug description:
  For example, this run from 09.04.2021:
  https://e1066ed764372feffb00-43bef2b011f595598cf89d565fc6c894.ssl.cf5.rackcdn.com/periodic/opendev.org/openstack/neutron/master/neutron-centos-8-tripleo-standalone/7373ecf/logs/undercloud/var/log/containers/neutron/dhcp-agent.log

  2021-04-09 07:58:04.387 140158 INFO neutron.common.config [-] /usr/bin/neutron-dhcp-agent version 17.1.0.dev626

  This job should always test the latest master branch of Neutron.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1923201/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1923198] [NEW] custom kill scripts don't work after migration to privsep

2021-04-09 Thread Slawek Kaplonski
Public bug reported:

It seems that custom kill scripts aren't working properly if they are installed
in a directory which isn't in the standard PATH.
When we were using rootwrap to run such scripts, it was fine for scripts in
e.g. the default path /etc/neutron/kill_scripts/, as this directory is added to
rootwrap's exec_dirs:
https://github.com/openstack/neutron/blob/07b7da2251fbb607d599d48e80e4a701fa6b394e/etc/rootwrap.conf#L13
and rootwrap looks for the binary to execute in the directories listed in that
config option.

But now that we have moved to privsep, we get errors like:

2021-04-09 12:01:19.348 176680 DEBUG oslo.privsep.daemon [-] privsep: Exception during request[140575473731280]: [Errno 2] No such file or directory: 'dnsmasq-kill': 'dnsmasq-kill' _process_cmd /usr/lib/python3.6/site-packages/oslo_privsep/daemon.py:490
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/oslo_privsep/daemon.py", line 485, in _process_cmd
    ret = func(*f_args, **f_kwargs)
  File "/usr/lib/python3.6/site-packages/oslo_privsep/priv_context.py", line 249, in _wrap
    return func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/neutron/privileged/agent/linux/utils.py", line 56, in execute_process
    obj, cmd = _create_process(cmd, addl_env=addl_env)
  File "/usr/lib/python3.6/site-packages/neutron/privileged/agent/linux/utils.py", line 83, in _create_process
    stdout=subprocess.PIPE, stderr=subprocess.PIPE)
  File "/usr/lib/python3.6/site-packages/eventlet/green/subprocess.py", line 58, in __init__
    subprocess_orig.Popen.__init__(self, args, 0, *argss, **kwds)
  File "/usr/lib64/python3.6/subprocess.py", line 729, in __init__
    restore_signals, start_new_session)
  File "/usr/lib64/python3.6/subprocess.py", line 1364, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'dnsmasq-kill': 'dnsmasq-kill'


Even though the dnsmasq-kill script is in the /etc/neutron/kill_scripts directory.

We didn't spot this in our CI jobs as we don't run any job with those
custom kill scripts. But the feature is used e.g. by TripleO, and they spotted
it in their jobs.
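
A minimal sketch of the kind of lookup the privsep code path would need before
spawning the process, resolving the script name against the kill-script
directory instead of relying on PATH. The helper names are illustrative; this
is not the actual Neutron patch:

    # Resolve a custom kill script to an absolute path before handing it to
    # subprocess, since privsep no longer benefits from rootwrap's exec_dirs.
    import os
    import shutil
    import subprocess

    KILL_SCRIPTS_DIR = "/etc/neutron/kill_scripts"  # default path from above


    def resolve_kill_script(name):
        """Return an absolute path for the kill script, or raise if missing."""
        candidate = os.path.join(KILL_SCRIPTS_DIR, name)
        if os.access(candidate, os.X_OK):
            return candidate
        # Fall back to a normal PATH lookup for scripts installed elsewhere.
        found = shutil.which(name)
        if found is None:
            raise FileNotFoundError(
                "[Errno 2] No such file or directory: %r" % name)
        return found


    def run_kill_script(name, *args):
        cmd = [resolve_kill_script(name)] + list(args)
        return subprocess.Popen(cmd, stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)


    # Example (assumes the script exists and is executable):
    # run_kill_script("dnsmasq-kill", "9", "12345")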

** Affects: neutron
 Importance: High
 Assignee: Slawek Kaplonski (slaweq)
 Status: Confirmed


** Tags: l3-ipam-dhcp

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1923198

Title:
  custom kill scripts don't work after migration to privsep

Status in neutron:
  Confirmed

Bug description:
  It seems that custom kill scripts aren't working properly if they are
installed in a directory which isn't in the standard PATH.
  When we were using rootwrap to run such scripts, it was fine for scripts in
e.g. the default path /etc/neutron/kill_scripts/, as this directory is added to
rootwrap's exec_dirs:
https://github.com/openstack/neutron/blob/07b7da2251fbb607d599d48e80e4a701fa6b394e/etc/rootwrap.conf#L13
and rootwrap looks for the binary to execute in the directories listed in that
config option.

  But now that we have moved to privsep, we get errors like:

  2021-04-09 12:01:19.348 176680 DEBUG oslo.privsep.daemon [-] privsep: Exception during request[140575473731280]: [Errno 2] No such file or directory: 'dnsmasq-kill': 'dnsmasq-kill' _process_cmd /usr/lib/python3.6/site-packages/oslo_privsep/daemon.py:490
  Traceback (most recent call last):
    File "/usr/lib/python3.6/site-packages/oslo_privsep/daemon.py", line 485, in _process_cmd
      ret = func(*f_args, **f_kwargs)
    File "/usr/lib/python3.6/site-packages/oslo_privsep/priv_context.py", line 249, in _wrap
      return func(*args, **kwargs)
    File "/usr/lib/python3.6/site-packages/neutron/privileged/agent/linux/utils.py", line 56, in execute_process
      obj, cmd = _create_process(cmd, addl_env=addl_env)
    File "/usr/lib/python3.6/site-packages/neutron/privileged/agent/linux/utils.py", line 83, in _create_process
      stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    File "/usr/lib/python3.6/site-packages/eventlet/green/subprocess.py", line 58, in __init__
      subprocess_orig.Popen.__init__(self, args, 0, *argss, **kwds)
    File "/usr/lib64/python3.6/subprocess.py", line 729, in __init__
      restore_signals, start_new_session)
    File "/usr/lib64/python3.6/subprocess.py", line 1364, in _execute_child
      raise child_exception_type(errno_num, err_msg, err_filename)
  FileNotFoundError: [Errno 2] No such file or directory: 'dnsmasq-kill': 'dnsmasq-kill'

  
  Even though the dnsmasq-kill script is in the /etc/neutron/kill_scripts directory.

  We didn't spot this in our CI jobs as we don't run any job with those
  custom kill scripts. But the feature is used e.g. by TripleO, and they
  spotted it in their jobs.

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1923198/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team

[Yahoo-eng-team] [Bug 1732428] Re: Unshelving a VM breaks instance metadata when using qcow2 backed images

2021-04-09 Thread Lee Yarwood
** Also affects: nova/ussuri
   Importance: Undecided
   Status: New

** Also affects: nova/train
   Importance: Undecided
   Status: New

** Changed in: nova/train
   Importance: Undecided => Medium

** Changed in: nova/ussuri
   Importance: Undecided => Medium

** Changed in: nova/ussuri
 Assignee: (unassigned) => Lee Yarwood (lyarwood)

** Changed in: nova/train
   Status: New => In Progress

** Changed in: nova/ussuri
   Status: New => In Progress

** Changed in: nova/train
 Assignee: (unassigned) => Lee Yarwood (lyarwood)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1732428

Title:
  Unshelving a VM breaks instance metadata when using qcow2 backed
  images

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) ocata series:
  Confirmed
Status in OpenStack Compute (nova) pike series:
  Confirmed
Status in OpenStack Compute (nova) train series:
  In Progress
Status in OpenStack Compute (nova) ussuri series:
  In Progress

Bug description:
  If you unshelve instances on compute nodes that use qcow2 backed
  instances, the instance image_ref will point to the original image the
  VM was launched from. The base file for
  /var/lib/nova/instances/uuid/disk will be the snapshot which was used
  for shelving. This causes errors with e.g. resizes and migrations.

  Steps to reproduce/what happens:
  Have at least 2 compute nodes configured with the standard qcow2 backed 
images.

  1) Launch an instance.
  2) Shelve the instance. In the background this should in practice create a 
flattened snapshot of the VM.

  3) Unshelve the instance. The instance will boot on one of the compute
  nodes. The /var/lib/nova/instances/uuid/disk should now have the
  snapshot as its base file. The instance metadata still claims that the
  image_ref is the original image which the VM was launched from, not
  the snapshot.

  4) Resize/migrate the instance. /var/lib/nova/instances/uuid/disk
  should be copied to the other compute node. If you resize to a flavor
  with the same size disk, go to 5); if you resize to a flavor with a
  larger disk, it probably causes an error here when it tries to grow
  the disk.

  5a) If the instance was running: When nova tries to start the VM, it
  will copy the original base image to the new compute node, not the
  snapshot base image. The instance can't boot, since it doesn't find
  its actual base file, and it goes to an ERROR state.

  5b) If the instance was shut down: You can confirm the resize, but the
  VM won't start. The snapshot base file may be removed from the source
  machine, causing data loss.

  What should have happened:
  Either the instance image_ref should be updated to the snapshot image, or the
snapshot image should be rebased to the original image, or it should force a
raw-only image after unshelve, or something else you smart people come up with.
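
  A small diagnostic sketch of the mismatch described above for qcow2 backed
  instances: qemu-img reports the disk's backing file, which after an unshelve
  no longer corresponds to the instance's image_ref (the paths in the comment
  are illustrative):

      import json
      import subprocess


      def qcow2_backing_file(disk_path):
          """Return the backing file of a qcow2 disk via qemu-img info."""
          out = subprocess.check_output(
              ["qemu-img", "info", "--output=json", disk_path])
          return json.loads(out).get("backing-filename")


      # Usage on a compute node (illustrative path):
      #   qcow2_backing_file("/var/lib/nova/instances/<uuid>/disk")
      # After unshelve this points at the shelve snapshot in the image cache,
      # while the instance's image_ref still names the original image.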

  Environment:
  RDO Newton with KVM

  rpm -qa |grep nova
  openstack-nova-common-14.0.6-1.el7.noarch
  python2-novaclient-6.0.1-1.el7.noarch
  python-nova-14.0.6-1.el7.noarch
  openstack-nova-compute-14.0.6-1.el7.noarch

  Also a big thank you to Toni Peltonen and Anton Aksola from nebula.fi
  for discovering and debugging this issue.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1732428/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp


[Yahoo-eng-team] [Bug 1923161] [NEW] DHCP notification could be optimized

2021-04-09 Thread Oleg Bondarev
Public bug reported:

DHCP notification is done after each create/update/delete for
network, subnet and port [1].

This notification currently has to retrieve the network from the DB each time,
which is quite a heavy DB request and hence affects the performance of
port and subnet CRUD [2].

Two proposals:
- do not fetch the network when it's not needed
- pass the network dict from the plugin

[1]
https://github.com/openstack/neutron/blob/bdd661d21898d573ef39448316860aa4c692b834/neutron/api/rpc/agentnotifiers/dhcp_rpc_agent_api.py#L111-L120

[2]
https://github.com/openstack/neutron/blob/bdd661d21898d573ef39448316860aa4c692b834/neutron/api/rpc/agentnotifiers/dhcp_rpc_agent_api.py#L200
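
A minimal sketch of the second proposal (class and method names are
illustrative, not Neutron's actual code): callers pass the network dict they
already have, and the notifier only falls back to a DB fetch when nothing was
passed:

    # Avoid a per-notification DB round trip by accepting a pre-fetched
    # network dict from the caller; fetch from the plugin only as a fallback.

    class DhcpAgentNotifier:
        def __init__(self, plugin):
            self.plugin = plugin

        def notify_port_update(self, context, port, network=None):
            if network is None:
                # Fallback: the caller did not have the network in hand.
                network = self.plugin.get_network(context, port["network_id"])
            self._cast_to_agents(context, "port_update_end", port, network)

        def _cast_to_agents(self, context, method, resource, network):
            print("notify %s for network %s" % (method, network["id"]))


    class StubPlugin:
        calls = 0

        def get_network(self, context, network_id):
            StubPlugin.calls += 1
            return {"id": network_id}


    notifier = DhcpAgentNotifier(StubPlugin())
    notifier.notify_port_update(None, {"id": "p1", "network_id": "net-1"},
                                network={"id": "net-1"})
    assert StubPlugin.calls == 0  # no extra DB fetch when the dict is passed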

** Affects: neutron
 Importance: Wishlist
 Assignee: Oleg Bondarev (obondarev)
 Status: In Progress


** Tags: loadimpact

** Changed in: neutron
   Status: New => In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1923161

Title:
  DHCP notification could be optimized

Status in neutron:
  In Progress

Bug description:
  DHCP notification is done after each create/update/delete for
  network, subnet and port [1].

  This notification currently has to retrieve the network from the DB each time,
  which is quite a heavy DB request and hence affects the performance of
  port and subnet CRUD [2].

  Two proposals:
  - do not fetch the network when it's not needed
  - pass the network dict from the plugin

  [1]
  
https://github.com/openstack/neutron/blob/bdd661d21898d573ef39448316860aa4c692b834/neutron/api/rpc/agentnotifiers/dhcp_rpc_agent_api.py#L111-L120

  [2]
  
https://github.com/openstack/neutron/blob/bdd661d21898d573ef39448316860aa4c692b834/neutron/api/rpc/agentnotifiers/dhcp_rpc_agent_api.py#L200

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1923161/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp