Please update your use cases here ......


> Hi all,
> Adding to the interesting discussion thread regarding the scheduler split
> and its importance, I would like to pitch in a couple of thoughts in favor
> of Gantt.  At the Icehouse summit in HKG, in one of the scheduler design
> sessions, I, along with a few others (cc’d), pitched a session on Smart
> Resource Placement
> (,
> where we pitched for a Smart Placement Decision Engine as a Service,
> addressing cross-service scheduling as one of the use cases.  We pitched the
> idea as to how a stand-alone service can act as a  smart resource placement
> engine, (see figure:
> that can use state data from all the services, and make a unified placement
> decision.   We even have proposed a separate blueprint
> ( with working
> code now here: called
> Smart Scheduler (Solver Scheduler), which has the goals of being able to do
> smart resource placement taking into account complex constraints
> incorporating compute(nova), storage(cinder), and network constraints.   The
> existing Filter Scheduler or the projects like Smart (Solver) Scheduler (for
> covering the complex constraints scenarios) could easily fulfill the
> decision making aspects of the placement engine.
> I believe the Gantt project is the right direction in terms of separating
> out the placement decision concern, and creating a separate scheduler as a
> service, so that it can freely talk to any of the other services, or use a
> unified global state repository and make the unified decision.  Projects
> like Smart(Solver) Scheduler can easily fit into the Gantt Project as
> pluggable drivers to add the additional smarts required.
> To make our Smart Scheduler as a service, we currently have prototyped this
> Scheduler as a service providing a RESTful interface to the smart scheduler,
> that is detached from Nova (loosely connected):
> For example a RESTful request like this (where I am requesting 2 VMs with
> a requirement of 1 GB disk, and another request for 1 VM of flavor
> 'm1.tiny' that also has a special requirement that it should be close to the
> volume with uuid "ef6348300bc511e4bc4cc03fd564d1bc" (Compute-Volume
> affinity constraint)):
> curl -i -H "Content-Type: application/json" -X POST -d
> '{"instance_requests": [{"num_instances": 2, "request_properties":
> {"instance_type": {"root_gb": 1}}}, {"num_instances": 1,
> "request_properties": {"flavor": "m1.tiny", "volume_affinity":
> "ef6348300bc511e4bc4cc03fd564d1bc"}}]}'
> http://<x.x.x.x>/smart-scheduler-as-a-service/v1.0/placement
> provides a placement decision something like this:
> {
>   "result": [
>     [
>       {
>         "host": {
>           "host": "Host1",
>           "nodename": "Node1"
>         },
>         "instance_uuid": "VM_ID_0_0"
>       },
>       {
>         "host": {
>           "host": "Host2",
>           "nodename": "Node2"
>         },
>         "instance_uuid": "VM_ID_0_1"
>       }
>     ],
>     [
>       {
>         "host": {
>           "host": "Host1",
>           "nodename": "Node1"
>         },
>         "instance_uuid": "VM_ID_1_0"
>       }
>     ]
>   ]
> }
> This placement result can be used by Nova to proceed and complete the
> scheduling.
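
A minimal Python sketch of how a caller might consume the placement decision shown above. The request and response bodies are copied from the example; the helper name `placements_by_instance` is hypothetical, and the assumption that `result` holds one inner list per entry in `instance_requests` is only inferred from the example (two placements, then one).

```python
import json

# Request body from the curl example above.
request_body = {
    "instance_requests": [
        {"num_instances": 2,
         "request_properties": {"instance_type": {"root_gb": 1}}},
        {"num_instances": 1,
         "request_properties": {
             "flavor": "m1.tiny",
             "volume_affinity": "ef6348300bc511e4bc4cc03fd564d1bc"}},
    ]
}

# Response body from the example above.
response_text = """
{"result": [
  [{"host": {"host": "Host1", "nodename": "Node1"}, "instance_uuid": "VM_ID_0_0"},
   {"host": {"host": "Host2", "nodename": "Node2"}, "instance_uuid": "VM_ID_0_1"}],
  [{"host": {"host": "Host1", "nodename": "Node1"}, "instance_uuid": "VM_ID_1_0"}]
]}
"""


def placements_by_instance(response):
    """Flatten the nested "result" lists into {instance_uuid: (host, nodename)}."""
    mapping = {}
    for group in response["result"]:       # assumed: one group per instance request
        for placement in group:            # one entry per requested instance
            host = placement["host"]
            mapping[placement["instance_uuid"]] = (host["host"], host["nodename"])
    return mapping


placements = placements_by_instance(json.loads(response_text))
for instance_uuid, (host, node) in sorted(placements.items()):
    print(instance_uuid, "->", host, node)
```

The flat mapping is just one convenient shape for a caller; nothing in the thread says how Nova would actually consume the result.
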
> This is where I see the potential for Gantt, which will be a stand-alone
> placement decision engine and can easily accommodate different pluggable
> engines (such as Smart Scheduler) to make smart
> placement decisions.
> Pointers:
> Smart Resource Placement overview:
> Figure:
> Nova Design Session Etherpad:
> Smart Scheduler Blueprint:
> Working code:
> Thanks,
> Yathi.
> Hi All,
> I’m sorry I am so late to this lively discussion – it looks a good one! Jay
> has been driving the debate a bit so most of this is in response to his
> comments. But please, anyone should chip in.
> On extensible resource tracking
> Jay, I am surprised to hear you say no one has explained to you why there is
> an extensible resource tracking blueprint. It’s simple, there was a
> succession of blueprints wanting to add data about this and that to the
> resource tracker and the scheduler and the database tables used to
> communicate. These included capabilities, all the stuff in the stats,
> rxtx_factor, the equivalent for cpu (which only works on one hypervisor I
> think), pci_stats and more were coming including,
> So, in short, your claim that there are no operators asking for additional
> stuff is simply not true.
> Around about the Icehouse summit (I think) it was suggested that we should
> stop the obvious trend and add a way to make resource tracking extensible,
> similar to metrics, which had just been added as an extensible way of
> collecting ongoing usage data (because that was also wanted).
> The json blob you refer to was down to the bad experience of the
> compute_node_stats table implemented for stats – which had a particular
> performance hit because it required an expensive join. This was dealt with
> by removing the table and adding a string field to contain the data as a
> json blob. A pure performance optimization. Clearly there is no need to
> store things in this way and with Nova objects being introduced there is a
> means to provide strict type checking on the data even if it is stored as
> json blobs in the database.
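
To illustrate the point about type checking data that is stored as a json blob, here is a small sketch using a plain Python dataclass. It is not the Nova objects API; the class name `HostStats` and its fields are illustrative assumptions (only `rxtx_factor` is named in the message above).

```python
import json
from dataclasses import asdict, dataclass, fields


@dataclass
class HostStats:
    # Hypothetical fields, chosen only for illustration.
    num_instances: int
    io_workload: int
    rxtx_factor: float

    def __post_init__(self):
        # Strict check: every value must match its declared type.
        # bool is rejected explicitly because it is a subclass of int.
        for f in fields(self):
            value = getattr(self, f.name)
            if isinstance(value, bool) or not isinstance(value, f.type):
                raise TypeError(
                    "%s must be %s, got %r" % (f.name, f.type.__name__, value))

    def to_json(self):
        """Serialize to the string that would be stored in a single column."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, blob):
        """Rebuild the typed object; unknown or mistyped keys raise TypeError."""
        return cls(**json.loads(blob))


stats = HostStats(num_instances=3, io_workload=0, rxtx_factor=1.0)
blob = stats.to_json()
restored = HostStats.from_json(blob)
print(blob)
```

The database still sees one string field, so the expensive join is avoided, while callers only ever handle the validated object.
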
> On scheduler split
> I have no particular position on splitting the scheduler. However, there was
> an interesting reaction to the network bandwidth entitlement blueprint
> listed above. The nova community felt it was a network thing and so nova
> should not provide it – neutron should. Of course, in nova, the nova
> scheduler makes placement decisions… can you see where this is going…? Nova
> needs to coordinate its placement decision with neutron to decide if a host
> has sufficient bandwidth available. Similar points are made about cinder –
> nova has no idea about cinder, but in some environments the location of a
> volume matters when you come to place an instance.
> I should re-iterate that I have no position on splitting out the scheduler,
> but some way to deal with information from outside nova is certainly
> desirable. Maybe other services have the same dilemma.
> On global resource tracker
> I have to say I am inclined to be against the idea of turning the scheduler
> into a “global resource tracker”. I do see the benefit of obtaining a
> resource claim up front, we have all seen that the scheduler can make
> incorrect choices because of the delay in reflecting resource allocation to
> the database and so to the scheduler – it operates on imperfect information.
> However, it is best to avoid a global service relying on synchronous
> interaction with compute nodes during the process of servicing a request. I
> have looked at your example code for the scheduler (global resource tracker)
> and it seems to make a choice from local information and then interact with
> the chosen compute node to obtain a claim and then try again if the claim
> fails. I get it – I see that it deals with the same list of hosts on the
> retry. I also see it has no better chance of getting it right.
> Your desire to have a claim is borne out by the persistent claims spec (I
> love the spec, I really don’t see why they have to be persistent). I think
> that is a great idea. Why not let the scheduler make placement suggestions
> (as a global service) and then allow conductors to obtain the claim and
> retry if the claim fails? Similar process to your code, but the scheduler
> only does its part and the conductors scale out the process by acting more
> locally and with more parallelism. (Of course, you could also be optimistic
> and allow the compute node to do the claim as part of the create as the
> degenerate case).
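
The division of labour suggested above (the scheduler only suggests, a conductor obtains the claim and retries) could be sketched roughly as follows. All names here (`ComputeNode`, `suggest_hosts`, `place_with_retry`, `ClaimFailed`) are hypothetical and the only resource modelled is RAM; this is an illustration of the idea, not code from Nova or from the persistent claims spec.

```python
class ClaimFailed(Exception):
    """Raised by a compute node that cannot satisfy a claim."""


class ComputeNode:
    def __init__(self, name, free_ram_mb):
        self.name = name
        self.free_ram_mb = free_ram_mb

    def claim(self, ram_mb):
        # The node holds the authoritative view of its own resources.
        if ram_mb > self.free_ram_mb:
            raise ClaimFailed(self.name)
        self.free_ram_mb -= ram_mb


def suggest_hosts(stale_view, ram_mb):
    """Scheduler part: rank hosts from possibly out-of-date information."""
    fitting = [name for name, free in stale_view.items() if free >= ram_mb]
    return sorted(fitting, key=lambda name: (-stale_view[name], name))


def place_with_retry(nodes, suggestions, ram_mb):
    """Conductor part: walk the suggestions until one claim succeeds."""
    for name in suggestions:
        try:
            nodes[name].claim(ram_mb)
            return name
        except ClaimFailed:
            continue  # stale suggestion, try the next host
    return None


# The scheduler still believes Host1 has 4096 MB free, but it has since
# been consumed down to 512 MB.
stale_view = {"Host1": 4096, "Host2": 2048}
nodes = {"Host1": ComputeNode("Host1", 512), "Host2": ComputeNode("Host2", 2048)}

suggestions = suggest_hosts(stale_view, ram_mb=1024)
chosen = place_with_retry(nodes, suggestions, ram_mb=1024)
print("suggested:", suggestions, "claimed on:", chosen)
```

Because the retry loop lives in the conductor, many conductors can run it in parallel while the scheduler itself never blocks on a compute node.
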
> To emphasize the point further, what would a cells scheduler do? Would that
> also make a synchronous operation to obtain the claim?
> My reaction to the global resource tracker idea has been quite negative. I
> want to like the idea because I like the thought of knowing I have the
> resources when I get my answer. It’s just that I think the persistent claims
> (without the persistent part :) ) give us a lot of what we need. But I am
> still open to be convinced.
> Paul
> On 07/14/2014 10:16 AM, Sylvain Bauza wrote:
>> Le 12/07/2014 06:07, Jay Pipes a écrit :
>>> On 07/11/2014 07:14 AM, John Garbutt wrote:
>>>> On 10 July 2014 16:59, Sylvain Bauza <sbauza at> wrote:
>>>>> Le 10/07/2014 15:47, Russell Bryant a écrit :
>>>>>> On 07/10/2014 05:06 AM, Sylvain Bauza wrote:
>>>>>>> Hi all,
>>>>>>> === tl;dr: Now that we agree on waiting for the split
>>>>>>> prereqs to be done, we debate on if ResourceTracker should
>>>>>>> be part of the scheduler code and consequently Scheduler
>>>>>>> should expose ResourceTracker APIs so that Nova wouldn't
>>>>>>> own compute nodes resources. I'm proposing to first come
>>>>>>> with RT as Nova resource in Juno and move ResourceTracker
>>>>>>> in Scheduler for K, so we at least merge some patches by
>>>>>>> Juno. ===
>>>>>>> Some debates occurred recently about the scheduler split, so
>>>>>>> I think it's important to loop back with you all to see
>>>>>>> where we are and what are the discussions. Again, feel free
>>>>>>> to express your opinions, they are welcome.
>>>>>> Where did this resource tracker discussion come up?  Do you
>>>>>> have any references that I can read to catch up on it?  I
>>>>>> would like to see more detail on the proposal for what should
>>>>>> stay in Nova vs. be moved.  What is the interface between
>>>>>> Nova and the scheduler here?
>>>>> Oh, missed the most important question you asked. So, about
>>>>> the interface in between scheduler and Nova, the original
>>>>> agreed proposal is in the spec
>>>>> (approved) where the
>>>>> Scheduler exposes:
>>>>>  - select_destinations(): for querying the scheduler to provide
>>>>>    candidates
>>>>>  - update_resource_stats(): for updating the scheduler internal
>>>>>    state (i.e. HostState)
>>>>> Here, update_resource_stats() is called by the
>>>>> ResourceTracker, see the implementations (in review)
>>>>> and
>>>>> The alternative that has just been raised this week is to
>>>>> provide a new interface where ComputeNode claims for resources
>>>>> and frees these resources, so that all the resources are fully
>>>>> owned by the Scheduler. An initial PoC has been raised here
>>>>> but I tried to see what
>>>>> would be a ResourceTracker proxified by a Scheduler client here
>>>>> : As the spec hasn't been
>>>>> written, the names of the interfaces are not properly defined
>>>>> but I made a proposal as:
>>>>>  - select_destinations(): same as above
>>>>>  - usage_claim(): claim a resource amount
>>>>>  - usage_update(): update a resource amount
>>>>>  - usage_drop(): free the resource amount
>>>>> Again, this is a dummy proposal, a spec has to be written if we
>>>>> consider moving the RT.
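
As a rough illustration of the alternative interface described above, the sketch below implements the four proposed calls against an in-memory table. Only the method names come from the message; the class name `InMemoryScheduler`, every parameter, the claim-id mechanism, and the single abstract "resource amount" per host are assumptions made for the example. The real `select_destinations()` in Nova takes different arguments.

```python
class InsufficientResources(Exception):
    pass


class InMemoryScheduler:
    """Hypothetical stand-in for a scheduler that owns all resource usage."""

    def __init__(self, capacity):
        self._free = dict(capacity)   # host -> free amount
        self._claims = {}             # claim_id -> (host, amount)
        self._next_claim_id = 0

    def select_destinations(self, amount, num_candidates=1):
        """Return candidate hosts that can fit `amount`, emptiest first."""
        fitting = [h for h, free in self._free.items() if free >= amount]
        fitting.sort(key=lambda h: (-self._free[h], h))
        return fitting[:num_candidates]

    def usage_claim(self, host, amount):
        """Claim a resource amount on a host and return a claim id."""
        if self._free.get(host, 0) < amount:
            raise InsufficientResources(host)
        self._free[host] -= amount
        claim_id = self._next_claim_id
        self._next_claim_id += 1
        self._claims[claim_id] = (host, amount)
        return claim_id

    def usage_update(self, claim_id, new_amount):
        """Grow or shrink an existing claim."""
        host, old_amount = self._claims[claim_id]
        delta = new_amount - old_amount
        if delta > self._free[host]:
            raise InsufficientResources(host)
        self._free[host] -= delta
        self._claims[claim_id] = (host, new_amount)

    def usage_drop(self, claim_id):
        """Free the resource amount held by a claim."""
        host, amount = self._claims.pop(claim_id)
        self._free[host] += amount

    def free(self, host):
        return self._free[host]


scheduler = InMemoryScheduler({"Host1": 4, "Host2": 8})
host = scheduler.select_destinations(amount=2)[0]
claim = scheduler.usage_claim(host, 2)
free_after_claim = scheduler.free(host)
scheduler.usage_update(claim, 3)
free_after_update = scheduler.free(host)
scheduler.usage_drop(claim)
print(host, free_after_claim, free_after_update, scheduler.free(host))
```
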
>>>> While I am not against moving the resource tracker, I feel we
>>>> could move this to Gantt after the core scheduling has been
>>>> moved.
>>> Big -1 from me on this, John.
>>> Frankly, I see no urgency whatsoever -- and actually very little
>>> benefit -- to moving the scheduler out of Nova. The Gantt project I
>>> think is getting ahead of itself by focusing on a split instead of
>>> focusing on cleaning up the interfaces between nova-conductor,
>>> nova-scheduler, and nova-compute.
>> -1 on saying there is no urgency. Don't you see the NFV group asking
>> at each meeting what the status of the scheduler split is?
> Frankly, I don't think a lot of the NFV use cases are well-defined.
> Even more frankly, I don't see any benefit to a split-out scheduler to a
> single NFV use case.
>> Don't you see, at each Summit, the many talks (and people attending
>> them) about how OpenStack should look at Pets vs. Cattle and
>> saying that the scheduler should be out of Nova?
> There have been no concrete benefits discussed to having the scheduler
> outside of Nova.
> I don't really care how many people say that the scheduler should be out
> of Nova unless those same people come to the table with concrete reasons
> why. Just saying something is a benefit does not make it a benefit, and
> I think I've outlined some of the very real dangers -- in terms of code
> and payload complexity -- of breaking the scheduler out of Nova until
> the interfaces are cleaned up and the scheduler actually owns the
> resources upon which it exercises placement decisions.
>> From an operator perspective, people have waited so long for a
>> scheduler doing "scheduling" and not only "resource placement".
> Could you elaborate a bit here? What operators are begging for the
> scheduler to do more than resource placement? And if they are begging
> for this, what use cases are they trying to address?
> I'm genuinely curious, so looking forward to your reply here! :)
> snip...
>>> As for the idea that things will get *easier* once scheduler code
>>> is broken out of Nova, I go back to my original statement that I
>>> don't really see the benefit of the split at this point, and I
>>> would just bring up the fact that Neutron/nova-network is a shining
>>> example of how things can easily backfire when splitting of code is
>>> done too early before interfaces are cleaned up and
>>> responsibilities between internal components are not clearly agreed
>>> upon.
>> Please, please, don't mix the rationale for extensible Resource
>> Tracker and the current efforts for moving out the Scheduler. Both of
>> them try to have an agnostic and heterogeneous scheduler, but both
>> efforts are independent.
>> The ResourceTracker is something pure Nova. Saying to Gantt "I want
>> to store this data" and "I want you to select a destination" is
>> agnostic enough that it does not require porting the
>> ResourceTracker to the Scheduler.
> Sorry, I'm not following you. Who is saying to Gantt "I want to store
> this data"?
> All I am saying is that the thing that places a resource on some
> provider of that resource should be the thing that owns the process of a
> requester *claiming* the resources on that provider, and in order to
> properly track resources in a race-free way in such a system, then the
> system needs to contain the resource tracker.
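
The race-free argument above can be illustrated with a toy model in which one component both picks the host and records the claim under a single lock, so no other request can be placed against the same free capacity in between. The class `OwningScheduler` and its `place_and_claim` method are hypothetical names, and a thread lock stands in for whatever atomicity mechanism a real service would use.

```python
import threading


class OwningScheduler:
    """Toy scheduler that owns the resource table it places against."""

    def __init__(self, capacity):
        self._free = dict(capacity)      # host -> free units
        self._lock = threading.Lock()

    def place_and_claim(self, amount):
        # Selection and claim happen in one critical section, so the
        # decision is never based on capacity that is already taken.
        with self._lock:
            for host in sorted(self._free):
                if self._free[host] >= amount:
                    self._free[host] -= amount
                    return host
        return None


scheduler = OwningScheduler({"Host1": 3, "Host2": 2})
results = []
results_lock = threading.Lock()


def request_one():
    host = scheduler.place_and_claim(1)
    with results_lock:
        results.append(host)


# Ten concurrent requests compete for five units of total capacity.
threads = [threading.Thread(target=request_one) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

placed = [h for h in results if h is not None]
print(len(placed), "placed,", results.count(None), "rejected")
```

Exactly five requests are placed regardless of thread interleaving, because no request can observe capacity that another request is in the middle of claiming.
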
>> While I approve of defining the interfaces now, there is no reason, though,
>> to say we would have to change anything in how Nova is doing that.
>> The role of Gantt is to define the interfaces, draw the line between
>> Scheduler and Nova, and forklift the Scheduler into a single project.
>> No big bang is needed here.
> Yeah, I just don't see the need to split the scheduler at this point,
> sorry. :(
> Best,
> -jay


