Hi Don,

Adding [nova] to the subject too, because otherwise we could miss some people here.


On 08/09/2014 07:24, Dugger, Donald D wrote:

As I mentioned in a prior email, although we agree in principle on what needs to be done before splitting out the scheduler into the Gantt project, I believe we have different views on what that agreement actually is. Given that multiple people actively want to work on this split, I would like to try to put down the specifics of what needs to be accomplished.

As I see it, the top-level issue is cleaning up the internal interfaces between the Nova core code and the scheduler, specifically:

1) The client interface

a. Done – we’ve created and pushed a patch to address this interface

+1. Scheduler-lib is now merged, but it still uses JSON dicts to pass updates to the scheduler. The main point of this blueprint was to create a new interface for sending stats updates to the Scheduler, because the RT (Resource Tracker) previously sent DB modifications directly to the conductor (without even using objects yet).


2) Database access

a. Ongoing – we’ve created a patch that missed the Juno deadline; we will try again in Kilo


The isolate-scheduler-db blueprint was based on the Extensible Resource Tracker (ERT). Because ERT is not yet fully merged upstream (the scheduler part is still under review) and does not provide a clear interface for stats (it just adds nested dicts to a big JSON string), we decided to look at other ways of sending the updates that the filters need in order to look at HostState instead of directly calling other Nova objects.
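To illustrate what "filters looking at HostState" could mean, here is a minimal sketch. It is not the actual Nova code: the HostState fields, the RamFilter class, the filter_hosts() helper and the request dict are all simplified assumptions made up for this example. The only point is that the filter decides from the state it is handed and never reaches out to the DB or to other objects.

```python
from dataclasses import dataclass, field


@dataclass
class HostState:
    """Toy snapshot of a host as the scheduler sees it (hypothetical fields)."""
    host: str
    free_ram_mb: int
    stats: dict = field(default_factory=dict)


class RamFilter:
    """Hypothetical filter that relies only on the HostState it receives."""

    def host_passes(self, host_state, request):
        # No DB or conductor call here: everything needed is in host_state.
        return host_state.free_ram_mb >= request["memory_mb"]


def filter_hosts(host_states, filters, request):
    """Keep the hosts accepted by every filter."""
    return [
        hs for hs in host_states
        if all(f.host_passes(hs, request) for f in filters)
    ]
```

With something like this, the scheduler only needs its HostState view to be kept up to date, which is exactly what the stats updates are for.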


3) Resource Tracker

a. Identify what data is sent from compute to scheduler

b. Track that data inside the scheduler

c. Not started yet (being discussed)


To be precise, we need to clearly define the interface that the Scheduler exposes. As I said above, there is now a new method called update_resource_stats(name, stats), which provides a first endpoint for sending updates to the Scheduler, but we need to strengthen this method with validation and typing instead of accepting a blob.
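As a rough sketch of what "validation and typing instead of a blob" could look like: the ResourceStats class, its field names and the in-memory SchedulerClient below are assumptions for illustration only, not the merged scheduler-lib code. Only the update_resource_stats(name, stats) signature comes from the discussion above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceStats:
    """Hypothetical typed payload; the field names are illustrative only."""
    vcpus_used: int
    memory_mb_used: int
    local_gb_used: int

    def __post_init__(self):
        for field_name, value in vars(self).items():
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int):
                raise TypeError(
                    f"{field_name} must be an int, got {type(value).__name__}")
            if value < 0:
                raise ValueError(
                    f"{field_name} must be non-negative, got {value}")


class SchedulerClient:
    """Toy stand-in for the scheduler client; it keeps updates in memory."""

    def __init__(self):
        self.host_stats = {}

    def update_resource_stats(self, name, stats):
        # Refuse free-form dicts: only the typed, validated payload is accepted.
        if not isinstance(stats, ResourceStats):
            raise TypeError("stats must be a ResourceStats instance")
        self.host_stats[name] = stats
```

A bad update then fails on the sender side, at construction time, instead of being silently stored as part of a JSON string.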

On the other hand, we also need to make sure that the claiming mechanism is robust enough to support various kinds of claims (the NUMA patches that were sent showed that there is room for improvement here). Ideally, claiming should be done in the Scheduler itself (with a distributed transactional model for handling concurrent scheduling requests to multiple schedulers).


These, to me, are the critical items for the split. Yes, there are lots of other areas/interfaces inside Nova that should be cleaned up, but the goal here is to split out the scheduler, not to refactor every interface inside Nova.


Indeed, we need to keep in mind the objective of splitting out the Scheduler as soon as possible. That's why I'm proposing a strategy of updating stats in the Scheduler by passing Nova objects (ComputeNode here) to the Scheduler using the update_resource_stats() method mentioned above, and by adding the instance_claim and resize_claim methods to the ComputeNode object itself, so that a select_destinations() call from the conductor could issue a call to each ComputeNode it elects to make sure it has enough resources. The current RT claims would be kept for backwards compatibility and double-checking until we consider the new workflow good enough to remove them.
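To make the proposed workflow a bit more concrete, here is a toy sketch. The ComputeNode fields, the ClaimFailed exception and the "most free memory first" ranking are assumptions for illustration; the real Nova object and select_destinations() are more involved, and resize_claim is left out for brevity.

```python
from dataclasses import dataclass


class ClaimFailed(Exception):
    """Raised when a node cannot satisfy a claim (hypothetical name)."""


@dataclass
class ComputeNode:
    """Toy ComputeNode carrying its own claim logic."""
    host: str
    vcpus: int
    memory_mb: int
    vcpus_used: int = 0
    memory_mb_used: int = 0

    def instance_claim(self, vcpus, memory_mb):
        # Check and reserve in one step so usage never exceeds capacity.
        if (self.vcpus_used + vcpus > self.vcpus
                or self.memory_mb_used + memory_mb > self.memory_mb):
            raise ClaimFailed(f"{self.host} cannot fit the request")
        self.vcpus_used += vcpus
        self.memory_mb_used += memory_mb


def select_destinations(nodes, vcpus, memory_mb):
    """Elect nodes in order and return the first host whose claim succeeds."""
    # Stand-in for filters and weighers: most free memory first.
    ranked = sorted(nodes,
                    key=lambda n: n.memory_mb - n.memory_mb_used,
                    reverse=True)
    for node in ranked:
        try:
            node.instance_claim(vcpus, memory_mb)
        except ClaimFailed:
            continue  # Try the next elected node.
        return node.host
    raise ClaimFailed("no compute node could satisfy the claim")
```

In this sketch the claim happens at election time, so a node that looked fine but is already full is simply skipped. The existing RT claim on the compute side would still run afterwards as a double check.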

The above strategy comes from a braindump, but I estimate it to be the lowest common denominator for all the necessary changes. I'm really concerned about any attempt at a big bang here, which could lead us to lose focus on splitting the scheduler.

Feel free to correct this email but I really want to make sure we all are in agreement on the same thing so that we can actually get something done.


Yeah, I assume that's quite frustrating because the design phase has not ended yet. IMHO we need to organize some sort of online meetup to discuss all the bits of the split, as we can't wait for the Kilo Summit. The Gantt meetings are obviously not the right place for discussing design and implementation, so we need to promote online tools for doing such work.

We need ideas and we need volunteers, so feel free to raise your hand (and your voice) if you, reader, are willing to work on that effort.

-Sylvain




--

Don Dugger

"Censeo Toto nos in Kansa esse decisse." - D. Gale

Ph: 303/443-3786



_______________________________________________
OpenStack-dev mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
