[Openstack-operators] [manila] manila operator's feedback forum etherpad available
Cross-posting from openstack-dev because Tom is unable to post to this list yet. Manila operators, please note the session at the Forum next week.

Thanks,
Goutham

-- Forwarded message --
From: Tom Barron
Date: Thu, May 17, 2018 at 10:57 AM
Subject: [openstack-dev] [manila] manila operator's feedback forum etherpad available
To: openstack-operators@lists.openstack.org, openstack-...@lists.openstack.org

Next week at the Summit there is a forum session dedicated to Manila operators' feedback, on Thursday from 1:50-2:30pm [1], for which we have started an etherpad [2]. Please come and help manila developers do the right thing! We're particularly interested in experiences running the OpenStack share service at scale and in overcoming any obstacles to deployment, but we want any and all feedback from real deployments so that we can tailor our development and maintenance efforts to real-world needs. Please feel free and encouraged to add to the etherpad starting now. See you there!

-- Tom Barron
Manila PTL
irc: tbarron

[1] https://www.openstack.org/summit/vancouver-2018/summit-schedule/events/21780/manila-ops-feedback-running-at-scale-overcoming-barriers-to-deployment
[2] https://etherpad.openstack.org/p/YVR18-manila-forum-ops-feedback

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
[Openstack-operators] [nova] FYI on changes that might impact out of tree scheduler filters
CERN has upgraded to Cells v2 and is doing performance testing of the scheduler; they reported some findings today which got us back to this bug [1]. So I've started pushing some patches related to that, and also to an older blueprint I created [2].

In summary, we do quite a bit of DB work just to load up a list of instance objects per host that the in-tree filters don't even use.

The first change [3] is a simple optimization to avoid the default joins on the instance_info_caches and security_groups tables. If you have out-of-tree filters that, for whatever reason, rely on the HostState.instances objects to have info_cache or security_groups set, they'll continue to work, but they will have to round-trip to the DB to lazy-load those fields, which is going to be a performance penalty on that filter. See the change for details.

The second change in the series [4] is more drastic: we'll do away with pulling the full Instance object per host, which means only a select set of optional fields can be lazy-loaded [5], and accessing the rest will result in an exception. The patch currently has a workaround config option to continue doing things the old way if you have out-of-tree filters that rely on this, but good citizens with only in-tree filters will get a performance improvement during scheduling.

There are other things we can do to optimize more of this flow, but this email is just about the changes that have patches up right now.

[1] https://bugs.launchpad.net/nova/+bug/1737465
[2] https://blueprints.launchpad.net/nova/+spec/put-host-manager-instance-info-on-a-diet
[3] https://review.openstack.org/#/c/569218/
[4] https://review.openstack.org/#/c/569247/
[5] https://github.com/openstack/nova/blob/de52fefa1fd52ccaac6807e5010c5f2a2dcbaab5/nova/objects/instance.py#L66

-- Thanks, Matt
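As a rough illustration of the lazy-load cost Matt describes, here is a toy sketch in plain Python (this is NOT nova code; the class and field names are invented for illustration): fields the scheduler pre-loads are free to read, while any other field triggers a simulated per-access DB round trip on first use.

```python
class LazyInstance:
    """Toy stand-in for an instance object with lazy-loaded fields.

    Hypothetical illustration only -- real nova Instance objects use
    oslo.versionedobjects and an actual DB query for lazy loads.
    """

    def __init__(self, **preloaded):
        self._fields = dict(preloaded)  # what the scheduler loaded up front
        self.db_round_trips = 0         # counts simulated lazy-load queries

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails, i.e. for
        # anything not set in __init__.
        if name not in self._fields:
            self.db_round_trips += 1            # simulated DB round trip
            self._fields[name] = "<fetched from DB>"
        return self._fields[name]


inst = LazyInstance(uuid="fake-uuid", host="compute-1")

# A filter reading only pre-loaded fields costs nothing extra.
print(inst.uuid, inst.db_round_trips)   # fake-uuid 0

# A filter touching an unloaded field pays one round trip, once.
_ = inst.info_cache
print(inst.db_round_trips)              # 1
_ = inst.info_cache                     # already cached
print(inst.db_round_trips)              # 1
```

Under change [4], the analogue of the `__getattr__` fallback would raise for fields outside the allowed set instead of fetching them, which is why out-of-tree filters relying on those fields need the workaround option.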
Re: [Openstack-operators] Need feedback for nova aborting cold migration function
On 5/15/2018 3:48 AM, saga...@nttdata.co.jp wrote:
> We store the service logs which are created by VM on that storage.

I don't mean to be glib, but have you considered maybe not doing that?

-- Thanks, Matt
Re: [Openstack-operators] attaching network cards to VMs taking a very long time
We have other scheduled tests that perform end-to-end checks (assign floating IP, ssh, ping outside) and have never had an issue. I think we turned it off because the callback code was initially buggy and nova would wait forever while things were in fact ok, but I'll change "vif_plugging_is_fatal = True" and "vif_plugging_timeout = 300" and run another large test, just to confirm. We usually run these large tests after a version upgrade to exercise the APIs under load.

On Thu, May 17, 2018 at 11:42 AM, Matt Riedemann wrote:
> On 5/17/2018 9:46 AM, George Mihaiescu wrote:
>> and large rally tests of 500 instances complete with no issues.
>
> Sure, except you can't ssh into the guests.
>
> The whole reason the vif plugging fatal/timeout and callback code was added was because the upstream CI was unstable without it. The server would report as ACTIVE but the ports weren't wired up, so ssh would fail. Having an ACTIVE guest that you can't actually do anything with is kind of pointless.
>
> -- Thanks, Matt
Re: [Openstack-operators] attaching network cards to VMs taking a very long time
On 5/17/2018 9:46 AM, George Mihaiescu wrote:
> and large rally tests of 500 instances complete with no issues.

Sure, except you can't ssh into the guests.

The whole reason the vif plugging fatal/timeout and callback code was added was because the upstream CI was unstable without it. The server would report as ACTIVE but the ports weren't wired up, so ssh would fail. Having an ACTIVE guest that you can't actually do anything with is kind of pointless.

-- Thanks, Matt
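For reference, the two options being debated in this thread live in nova.conf on the compute nodes. The values below are the upstream defaults as I understand them for this era of nova (verify against your release's configuration reference before relying on them):

```ini
[DEFAULT]
# If True, fail the instance boot (instead of reporting ACTIVE) when
# Neutron never delivers the network-vif-plugged event.
vif_plugging_is_fatal = True
# Seconds to wait for the network-vif-plugged event before timing out.
vif_plugging_timeout = 300
```

Setting vif_plugging_is_fatal = False and vif_plugging_timeout = 0, as George's site does, makes nova stop waiting on the callback entirely, which is exactly the trade-off Matt warns about: instances can go ACTIVE before their ports are actually wired up.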
Re: [Openstack-operators] attaching network cards to VMs taking a very long time
We use "vif_plugging_is_fatal = False" and "vif_plugging_timeout = 0", as well as "no-ping" in dnsmasq-neutron.conf, and large rally tests of 500 instances complete with no issues.

These are some good blog posts about Neutron performance:
https://www.mirantis.com/blog/openstack-neutron-performance-and-scalability-testing-summary/
https://www.mirantis.com/blog/improving-dhcp-performance-openstack/

I would run a large rally test like this one and see where most of the time is spent:

{
    "NovaServers.boot_and_delete_server": [
        {
            "args": {
                "flavor": {
                    "name": "c2.small"
                },
                "image": {
                    "name": "^Ubuntu 16.04 - latest$"
                },
                "force_delete": false
            },
            "runner": {
                "type": "constant",
                "times": 500,
                "concurrency": 100
            }
        }
    ]
}

Cheers,
George

On Thu, May 17, 2018 at 7:49 AM, Radu Popescu | eMAG, Technology <radu.pope...@emag.ro> wrote:
> Hi,
>
> unfortunately, I didn't get the reply in my inbox, so I'm answering from the link here:
> http://lists.openstack.org/pipermail/openstack-operators/2018-May/015270.html
> (hopefully, my reply will go to the same thread)
>
> Anyway, I can see the neutron openvswitch agent logs processing the interface way after the VM is up (in this case, 30 minutes), and after the vif plugin timeout of 5 minutes (currently 10 minutes). After searching the logs, I came up with an example here (nova compute hostname replaced with "nova.compute.hostname"):
>
> http://paste.openstack.org/show/1VevKuimoBMs4G8X53Eu/
>
> As you can see, the request for the VM starts around 3:27 AM. Ports get created, openvswitch has the command to do it, has DHCP, but apparently Neutron server sends the callback after the Neutron Openvswitch agent finishes. The callback is at 2018-05-10 03:57:36.177, while the Neutron Openvswitch agent says it completed the setup and configuration at 2018-05-10 03:57:35.247.
>
> So, my question is: why is the Neutron Openvswitch agent processing the request 30 minutes after the VM is started? And where can I search for logs for whatever happens during those 30 minutes?
>
> And yes, we're using libvirt. At some point, we added some new nova compute nodes; the new ones came with v3.2.0 and broke migration between hosts. That's why we downgraded (and versionlocked) everything at v2.0.0.
>
> Thanks,
> Radu
Re: [Openstack-operators] attaching network cards to VMs taking a very long time
Hi,

unfortunately, I didn't get the reply in my inbox, so I'm answering from the link here: http://lists.openstack.org/pipermail/openstack-operators/2018-May/015270.html (hopefully, my reply will go to the same thread)

Anyway, I can see the neutron openvswitch agent logs processing the interface way after the VM is up (in this case, 30 minutes), and after the vif plugin timeout of 5 minutes (currently 10 minutes). After searching the logs, I came up with an example here (nova compute hostname replaced with "nova.compute.hostname"):

http://paste.openstack.org/show/1VevKuimoBMs4G8X53Eu/

As you can see, the request for the VM starts around 3:27 AM. Ports get created, openvswitch has the command to do it, has DHCP, but apparently Neutron server sends the callback after the Neutron Openvswitch agent finishes. The callback is at 2018-05-10 03:57:36.177, while the Neutron Openvswitch agent says it completed the setup and configuration at 2018-05-10 03:57:35.247.

So, my question is: why is the Neutron Openvswitch agent processing the request 30 minutes after the VM is started? And where can I search for logs for whatever happens during those 30 minutes?

And yes, we're using libvirt. At some point, we added some new nova compute nodes; the new ones came with v3.2.0 and broke migration between hosts. That's why we downgraded (and versionlocked) everything at v2.0.0.

Thanks,
Radu