Folks, we'll take a look at whether we observed this issue during MOS testing on a 200-node lab. Sadly, I don't remember seeing this pattern, although we don't keep many of the VMs around for long during the synthetic tests.
Cheers,
Dina

On Fri, Nov 13, 2015 at 2:34 AM, Joshua Harlow <[email protected]> wrote:

> Ok, so the following is starting to form:
>
> https://etherpad.openstack.org/p/remote-conductor-performance
>
> Hopefully we can get to the bottom of this (especially for clouds that run
> a large number of computes in a single cell/only one cell).
>
> Andrew Laski wrote:
>
>> On 11/12/15 at 10:53am, Clint Byrum wrote:
>>
>>> Excerpts from Joshua Harlow's message of 2015-11-12 10:35:21 -0800:
>>>
>>>> Mike Dorman wrote:
>>>> > We do have a backlog story to investigate this more deeply; we just
>>>> > have not had the time to do it yet. For us, it's been easier/faster
>>>> > to add more hardware to conductor to get over the hump temporarily.
>>>> >
>>>> > We have that work earmarked for after the Liberty upgrade, in hopes
>>>> > that maybe it'll be fixed there.
>>>> >
>>>> > If anybody else has done even some trivial troubleshooting already,
>>>> > it'd be great to get that info as a starting point, i.e. which
>>>> > specific calls to conductor are causing the load, etc.
>>>> >
>>>> > Mike
>>>>
>>>> +1 I think we in the #openstack-performance channel really need to
>>>> investigate this, because it worries me personally to hear so many
>>>> rumors about the remote conductor falling over. Please join us there
>>>> and we can work through a plan to figure out what to do about this
>>>> situation. It would be great if the nova people also joined (because
>>>> in the end, something in nova will likely need to be fixed or changed
>>>> to resolve what appears to be a problem for many operators).
>>>>
>>> Falling over is definitely a bad sign. ;)
>>>
>>> The concept of pushing messages over a bus instead of just making local
>>> calls shouldn't result in much extra load. Perhaps we just have too many
>>> layers of unoptimized encapsulation. I have to wonder if something like
>>> protobuf would help.
>>>
>> Falling over is also a very broad description and doesn't tell us what
>> the actual issue is.
>>
>> From my experience, the performance concern with conductor has been in
>> not understanding the ratio of conductor nodes to computes that is
>> necessary for our usage. Conductor doesn't add much extra load, but it
>> concentrates it on a smaller number of services. If we ran one conductor
>> per compute I suspect we would have no performance issues, but that's a
>> lot of capacity to use for this.
>>
>> I am curious what conductor/compute ratios others are trying to achieve,
>> given equal hardware types for each, and what the barriers to that are.
>>

--
Best regards,
Dina Belova
Software Engineer
Mirantis Inc.
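[Editor's note: a rough, stdlib-only sketch of one way to approach Mike's question about which specific conductor calls are causing the load: tally RPC method names seen in the nova-conductor debug log. The log path and the regex are assumptions — oslo.messaging debug output varies by release and logging config — so adjust them to match what your deployment actually emits.]

```python
#!/usr/bin/env python3
"""Count which conductor RPC methods appear most often in a nova-conductor
debug log. Log path and line format are assumptions; tune the regex to your
deployment's actual oslo.messaging debug output."""

import re
import sys
from collections import Counter

# Hypothetical pattern: debug lines that mention the RPC method name,
# e.g. "... method: 'object_class_action' ...".
METHOD_RE = re.compile(r"method[:=]\s*'?(?P<method>[A-Za-z_][A-Za-z0-9_]*)'?")

def count_methods(path):
    counts = Counter()
    with open(path, errors="replace") as fh:
        for line in fh:
            match = METHOD_RE.search(line)
            if match:
                counts[match.group("method")] += 1
    return counts

if __name__ == "__main__":
    # Usage: python count_conductor_calls.py /var/log/nova/nova-conductor.log
    log_path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/nova/nova-conductor.log"
    for method, count in count_methods(log_path).most_common(20):
        print(f"{count:8d}  {method}")
```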
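[Editor's note: on Clint's point about layers of unoptimized encapsulation, here is a minimal back-of-envelope sketch for measuring the per-call cost of the JSON (de)serialization layer alone. The payload below is a made-up stand-in of roughly instance-record size, not nova's actual versioned objects.]

```python
#!/usr/bin/env python3
"""Measure JSON encode/decode cost per call for an instance-sized payload.
The payload is a hypothetical stand-in; the point is the measurement, not
the exact numbers."""

import json
import time

# Hypothetical payload roughly the shape/size of an instance record.
payload = {
    "uuid": "00000000-0000-0000-0000-000000000000",
    "host": "compute-001",
    "metadata": {f"key_{i}": "x" * 64 for i in range(50)},
    "system_metadata": {f"image_prop_{i}": "y" * 64 for i in range(50)},
}

N = 10_000
start = time.perf_counter()
for _ in range(N):
    json.loads(json.dumps(payload))
elapsed = time.perf_counter() - start

print(f"{N} encode/decode round trips in {elapsed:.2f}s "
      f"({elapsed / N * 1e6:.1f} us per call)")
```

If this per-call cost multiplied by the observed conductor call rate is only a small fraction of conductor CPU time, serialization is probably not the layer to optimize first.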
_______________________________________________
OpenStack-operators mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
