On 07/23/2013 06:00 PM, Clint Byrum wrote:
> This is really interesting work, thanks for sharing it with us. The
> discussion that has followed has brought up some thoughts I've had for
> a while about this choke point in what is supposed to be an extremely
> scalable cloud platform (OpenStack).
>
> I feel like the discussions have all been centered around making "the"
> scheduler(s) intelligent. There seems to be a commonly held belief that
> scheduling is a single step, and should be done with as much knowledge
> of the system as possible by a well informed entity.
>
> Can you name for me one large scale system that has a single entity,
> human or computer, that knows everything about the system and can make
> good decisions quickly?
>
> This problem is screaming to be broken up, de-coupled, and distributed.
>
> I keep asking myself these questions:
>
> Why are all of the compute nodes informing all of the schedulers?
>
> Why are all of the schedulers expecting to know about all of the compute
> nodes?
>
> Can we break this problem up into simpler problems and distribute the load to
> the entire system?
>
> This has been bouncing around in my head for a while now, but as a
> shallow observer of nova dev, I feel like there are some well known
> scaling techniques which have not been brought up. Here is my idea,
> forgive me if I have glossed over something or missed a huge hole:
>
> * Schedulers break up compute nodes by hash table, only caring about
>   those in their hash table.
> * Schedulers, upon claiming a compute node by hash table, poll compute
>   node directly for its information.
> * Requests to boot go into fanout.
> * Schedulers get request and try to satisfy using only their own compute
>   nodes.
> * Failure to boot results in re-insertion in the fanout.
>
> This gives up the certainty that the scheduler will find a compute node
> for a boot request on the first try. It is also possible that a request
> gets unlucky and takes a long time to find the one scheduler that has
> the one last "X" resource that it is looking for. There are some further
> optimization strategies that can be employed (like queues based on hashes
> already tried.. etc).
>
> Anyway, I don't see any point in trying to hot-rod the intelligent
> scheduler to go super fast, when we can just optimize for having many
> many schedulers doing the same body of work without blocking and without
> pounding a database.
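
To make the bulleted scheme above concrete, here is a rough single-process
sketch in Python. It is only an illustration under several assumptions, not
nova code: the names (owner, Scheduler, build_schedulers, dispatch) are
hypothetical, free RAM stands in for all resources, a plain in-memory queue
stands in for the message bus, and "poll the compute node" is reduced to
recording a number. A request that a scheduler cannot satisfy is re-inserted
and offered to the next scheduler, and it is reported as failed once every
scheduler has tried it.

```python
import hashlib
from collections import deque


def owner(node_name, num_schedulers):
    """Map a compute node to a scheduler index using a stable hash."""
    digest = hashlib.sha256(node_name.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_schedulers


class Scheduler:
    """Hypothetical scheduler that only knows about the nodes it owns."""

    def __init__(self, index):
        self.index = index
        self.free_ram = {}  # node name -> free RAM in MB (owned nodes only)

    def claim(self, node_name, free_ram_mb):
        # Stand-in for polling the compute node directly for its state.
        self.free_ram[node_name] = free_ram_mb

    def try_boot(self, ram_mb):
        """Place the request on an owned node, or return None on failure."""
        for name, free in self.free_ram.items():
            if free >= ram_mb:
                self.free_ram[name] = free - ram_mb
                return name
        return None


def build_schedulers(nodes, num_schedulers):
    """Partition nodes (name -> free RAM in MB) across schedulers by hash."""
    schedulers = [Scheduler(i) for i in range(num_schedulers)]
    for name, free in nodes.items():
        schedulers[owner(name, num_schedulers)].claim(name, free)
    return schedulers


def dispatch(schedulers, requests):
    """Offer each (request id, RAM in MB) to one scheduler at a time.

    A failed attempt re-inserts the request at the back of the queue for
    the next scheduler; after all schedulers have tried, it is given up.
    """
    n = len(schedulers)
    queue = deque(
        (req_id, ram, i % n if n else 0, 0)
        for i, (req_id, ram) in enumerate(requests)
    )
    placed, failed = {}, []
    while queue:
        req_id, ram, start, tried = queue.popleft()
        if tried == n:
            failed.append(req_id)  # every scheduler has already tried
            continue
        node = schedulers[(start + tried) % n].try_boot(ram)
        if node is None:
            queue.append((req_id, ram, start, tried + 1))  # re-insert
        else:
            placed[req_id] = node
    return placed, failed
```

Calling build_schedulers() with a dict of node names to free RAM and then
dispatch() with a list of (request id, RAM) pairs returns the placements and
the requests nobody could satisfy. The "tried" counter is a crude version of
the "queues based on hashes already tried" optimization mentioned above, and
it also shows the cost being traded away: a request may visit several
schedulers before it lands.
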

These are some *very* good observations. I'd like all of the nova folks
interested in this area to give some deep consideration to this type of
approach.

-- 
Russell Bryant

_______________________________________________
OpenStack-dev mailing list
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
