@Michael Okay, I'll focus on 'thousands' now; I know 'millions' is not a good metaphor here. I also know that the 'cells' functionality is nova's solution for large-scale deployments. But it also makes sense to find and reproduce large-scale problems in a relatively small-scale deployment.
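As a rough illustration of reproducing scale effects on one machine, here is a minimal sketch of many fake compute nodes that answer spawn requests after an artificial delay. Everything in it is an assumption made for illustration: `FakeComputeNode`, `run_simulation`, the delay range, and the round-robin placement are hypothetical names and choices, and none of it is nova's FakeDriver or any real nova API.

```python
import random
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class FakeComputeNode:
    """Hypothetical stand-in for a compute node (not nova's FakeDriver)."""

    def __init__(self, name, min_delay=0.001, max_delay=0.005, seed=None):
        self.name = name
        self.min_delay = min_delay
        self.max_delay = max_delay
        self._rng = random.Random(seed)
        self._lock = threading.Lock()
        self.instances = []

    def spawn(self, instance_id):
        # Pick the delay and record the instance under a lock, because
        # several worker threads may hit the same node.
        with self._lock:
            delay = self._rng.uniform(self.min_delay, self.max_delay)
        # Sleep to imitate the time a real hypervisor call would take.
        time.sleep(delay)
        with self._lock:
            self.instances.append(instance_id)
        return self.name, instance_id


def run_simulation(node_count, requests, workers=50):
    """Send `requests` spawn calls to `node_count` fake nodes and time it."""
    nodes = [FakeComputeNode("node-%d" % i, seed=i) for i in range(node_count)]
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Round-robin placement is an assumed stand-in for a real scheduler.
        futures = [
            pool.submit(nodes[i % node_count].spawn, i) for i in range(requests)
        ]
        results = [f.result() for f in futures]
    elapsed = time.monotonic() - start
    return nodes, results, elapsed


if __name__ == "__main__":
    nodes, results, elapsed = run_simulation(node_count=1000, requests=5000)
    print("%d spawns on %d fake nodes in %.2fs" % (len(results), len(nodes), elapsed))
```

This only exercises the caller side; a real test would still need the API, scheduler, and DB in the loop to find where the load actually hurts.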
@Sandy "All-in-all, I think you'd be better off load testing each piece independently on a fixed hardware platform and faking out all the incoming/outgoing services...."

I understand, and this is what I want to know: is anyone doing work like this? If so, I would like to join :)

On Fri, Nov 28, 2014 at 8:36 AM, Sandy Walsh <[email protected]> wrote:

> >From: Michael Still [[email protected]] Thursday, November 27, 2014 6:57 PM
> >To: OpenStack Development Mailing List (not for usage questions)
> >Subject: Re: [openstack-dev] [nova] is there a way to simulate thousands or millions of compute nodes?
> >
> >I would say that supporting millions of compute nodes is not a current
> >priority for nova... We are actively working on improving support for
> >thousands of compute nodes, but that is via cells (so each nova deploy
> >except the top is still in the hundreds of nodes).
>
> <ramble on>
>
> Agreed, it wouldn't make much sense to simulate this on a single machine.
>
> That said, if one *was* to simulate this, there are the well known
> bottlenecks:
>
> 1. the API. How much can one node handle with given hardware specs? Which
>    operations hit the DB the hardest?
> 2. the Scheduler. There's your API bottleneck and big load on the DB for
>    Create operations.
> 3. the Conductor. Shouldn't be too bad, essentially just a proxy.
> 4. child-to-global-cell updates. Assuming a two-cell deployment.
> 5. the virt driver. YMMV.
> ... and that's excluding networking, volumes, etc.
>
> The virt driver should be load tested independently. So FakeDriver would
> be fine (with some delays added for common operations as Gareth suggests).
> Something like Bees-with-MachineGuns could be used to get a baseline metric
> for the API. Then it comes down to DB performance in the scheduler and
> conductor (for a single cell). Finally, inter-cell loads. Who blows out the
> queue first?
>
> All-in-all, I think you'd be better off load testing each piece
> independently on a fixed hardware platform and faking out all the
> incoming/outgoing services. Test the API with fake everything. Test the
> Scheduler with fake API calls and fake compute nodes. Test the conductor
> with fake compute nodes (not FakeDriver). Test the compute node directly.
>
> Probably all going to come down to the DB and I think there is some good
> performance data around that already?
>
> But I'm just spit-ballin' ... and I agree, not something I could see the
> Nova team taking on in the near term ;)
>
> -S
>
> _______________________________________________
> OpenStack-dev mailing list
> [email protected]
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

--
Gareth

*Cloud Computing, OpenStack, Distributed Storage, Fitness, Basketball*
*OpenStack contributor, kun_huang@freenode*
*My promise: if you find any spelling or grammar mistakes in my email from Mar 1 2013, notify me *
*and I'll donate $1 or ¥1 to an open organization you specify.*
