Hi all, a while ago I wrote up an analysis of how to use SolverScheduler with a policy engine:
https://docs.google.com/document/d/1RfP7jRsw1mXMjd7in72ARjK0fTrsQv1bqolOriIQB2Y

Basically, there should be a plugin that translates the policy into
constraints for the solver to solve. This was done using the Policy-Based
Engine [1], but it works well with Congress.

[1] https://blueprints.launchpad.net/nova/+spec/policy-based-scheduler

----- Original message -----
> From: "Tim Hinrichs" <thinri...@vmware.com>
> To: "ruby krishnaswamy" <ruby.krishnasw...@orange.com>
> Cc: "Prabhakar Kudva" <ku...@us.ibm.com>, "openstack-dev"
> <openstack-dev@lists.openstack.org>, "Gokul B Kandiraju"
> <go...@us.ibm.com>
> Sent: Thursday, 18 December 2014 18:24:59
> Subject: Re: [openstack-dev] [Congress] Re: Placement and Scheduling via
> Policy
>
> Hi all,
>
> Responses inline.
>
> On Dec 16, 2014, at 10:57 PM, <ruby.krishnasw...@orange.com> wrote:
>
> Hi Tim & All
>
> @Tim: I did not reply to openstack-dev. Do you think we could have an
> openstack list specific for “congress” to which anybody may subscribe?
>
> Sending to openstack-dev is the right thing, as long as we put [Congress] in
> the subject. Everyone I know sets up filters on openstack-dev so they only
> get the mail they care about. I think you’re the only one in the group who
> isn’t subscribed to that list.
>
> 1) Enforcement:
> By this we mean “how will the actions computed by the policy
> engine be executed by the concerned OpenStack functional module”.
>
> In this case, it is better to first work this out for a “simpler” case,
> e.g. your running example concerning the network/groups.
> Note: some actions concern only some database (e.g. insert the
> user within some group).
>
> 2) From Prabhakar’s mail
>
> “Enforcement. That is with a large number of constraints in place for
> placement and scheduling, how does the policy engine communicate and
> enforce the placement constraints to nova scheduler.
> ”
>
> Nova scheduler (current): It assigns VMs to servers based on the
> policy set by the administrator (through filters and host aggregates).
>
> The administrator also configures a scheduling heuristic (implemented as a
> driver), for example a “round-robin” driver.
> Then the computed assignment is sent back to the requestor (API server),
> which interacts with nova-compute to provision the VM.
> The current nova-scheduler has another function: it updates the
> allocation status of each compute node in the DB (through another
> indirection called nova-conductor).
>
> So it is correct to re-interpret your statement as follows:
>
> - What is the entity with which the policy engine interacts for either
> proactive or reactive placement management?
>
> - How will the output from the policy engine (for example the placement
> matrix) be communicated back?
>
> o Proactive: this gives the mapping of VM to host
>
> o Reactive: this gives the new mapping of running VMs to hosts
>
> - How, starting from the placement matrix, will the correct migration plan
> be executed? (for the reactive case)
>
> 3) Currently OpenStack does not have “automated management of reactive
> placement”. Hence, if the policy engine is used for reactive placement,
> there is a need for another “orchestrator” that can interpret the new
> proposed placement configuration (mapping of VMs to servers) and execute the
> reconfiguration workflow.
>
> 4) So with a policy-based “placement engine” that is integrated with
> external solvers, will this engine replace nova-scheduler?
>
> Could we converge on this?
>
> The notes from Yathiraj say that there is already a policy-based Nova
> scheduler we can use. I suggest we look into that. It could potentially
> simplify our problem to the point where we need only figure out how to
> convert a fragment of the Congress policy language into their policy
> language. But those of you who are experts in placement will know better.
>
> https://github.com/stackforge/nova-solver-scheduler
>
> Tim
>
> Regards
> Ruby
>
> From: Tim Hinrichs [mailto:thinri...@vmware.com]
> Sent: Tuesday, 16 December 2014 19:25
> To: Prabhakar Kudva
> Cc: KRISHNASWAMY Ruby IMT/OLPS; Ramki Krishnan (r...@brocade.com);
> Gokul B Kandiraju; openstack-dev
> Subject: [Congress] Re: Placement and Scheduling via Policy
>
> [Adding openstack-dev to this thread. For those of you just joining… We
> started kicking around ideas for how we might integrate a special-purpose VM
> placement engine into Congress.]
>
> Kudva: responses inline.
>
> On Dec 16, 2014, at 6:25 AM, Prabhakar Kudva <ku...@us.ibm.com> wrote:
>
> Hi,
>
> I am very interested in this.
>
> So, it looks like there are two parts to this:
> 1. Policy analysis when there is a significant mix of logical and builtin
> predicates (i.e., the runtime should identify a solution space when there
> are arithmetic operators). This will require linear programming/ILP type
> solvers. There might be a need to have a function in runtime.py that
> specifically deals with this (Tim?)
>
> I think it’s right that we expect there to be a mix of builtins and standard
> predicates. But what we’re considering here is having the linear solver be
> treated as if it were a domain-specific policy engine. So that solver
> wouldn’t necessarily be embedded into runtime.py. Rather, we’d delegate
> part of the policy to that domain-specific policy engine.
>
> 2. Enforcement.
> That is with a large number of constraints in place for placement and
> scheduling, how does the policy engine communicate and enforce the placement
> constraints to nova scheduler.
>
> I would imagine that we could delegate either enforcement or monitoring or
> both. Eventually we want enforcement here, but monitoring could be useful
> too.
>
> And yes you’re asking the right questions. I was trying to break the problem
> down into pieces in my bullet (1) below. But I think there is significant
> overlap in the questions we need to answer whether we’re delegating
> monitoring or enforcement.
>
> Both of these require some form of mathematical analysis.
>
> Would be happy and interested to discuss more on these lines.
>
> Maybe take a look at how I tried to break down the problem into separate
> questions in bullet (1) below and see if that makes sense.
>
> Tim
>
> Prabhakar
>
> From: Tim Hinrichs <thinri...@vmware.com>
> To: <ruby.krishnasw...@orange.com>
> Cc: "Ramki Krishnan" <r...@brocade.com>, Gokul B
> Kandiraju/Watson/IBM@IBMUS, Prabhakar Kudva/Watson/IBM@IBMUS
> Date: 12/15/2014 12:09 PM
> Subject: Re: Placement and Scheduling via Policy
> ________________________________
>
> [Adding Prabhakar and Gokul, in case they are interested.]
>
> 1) Ruby, thinking about the solver as taking 1 matrix of [vm, server] and
> returning another matrix helps me understand what we’re talking
> about. Thanks. I think you’re right that once we move from placement to
> optimization problems in general we’ll need to figure out how to deal with
> actions. But if it’s a placement-specific policy engine, then we can build
> VM-migration into it.
>
> It seems to me that the only part left is figuring out how to take an
> arbitrary policy, carve off the placement-relevant portion, and create the
> inputs the solver needs to generate that new matrix. Some thoughts...
>
> - My gut tells me that the placement-solver should basically say “I enforce
> policies having to do with the schema nova:location.” This way the Congress
> policy engine knows to give it policies relevant to nova:location
> (placement). If we do that, I believe we can carve off the right
> sub-theory.
>
> - That leaves taking a Datalog policy where we know nova:location is
> important and converting it to the input language required by a linear
> solver. We need to remember that the Datalog rules may reference tables
> from other services like Neutron, Ceilometer, etc. I think the key will be
> figuring out what class of policies we can actually do that for reliably.
> Cool: a concrete question.
>
> 2) We can definitely wait until January on this. I’ll be out of touch
> starting Friday too; it seems we all get back early January, which seems
> like the right time to resume our discussions. We have some concrete
> questions to answer, which was what I was hoping to accomplish before we all
> went on holiday.
>
> Happy Holidays!
> Tim
>
> On Dec 15, 2014, at 5:53 AM, <ruby.krishnasw...@orange.com> wrote:
>
> Hi Tim
>
> “Questions:
> 1) Is there any more data the solver needs? Seems like it needs something
> about CPU-load for each VM.
> 2) Which solver should we be using? What does the linear program that we
> feed it look like?
> How do we translate the results of the linear solver
> into a collection of ‘migrate_VM’ API calls?”
>
> Question (2) seems to me the first to address, in particular:
>
> “how to prepare the input (variables, constraints, goal) and invoke the
> solver”
> => We need rules that represent constraints to give the solver (e.g. a
> technical constraint that a VM should not be assigned to more than one
> server, or that no more than the maximum resource (cpu / mem …) of a server
> can be assigned).
>
> “how to translate the results of the linear solver into a collection of
> API calls”:
> => The output from the “solver” will give the new placement plan (respecting
> the constraints in input)?
> o E.g. a table of [vm, server, true/false]
> => Then this depends on how “action” is going to be implemented in Congress
> (whether an external solver is used or not)
> o Is the action presented as the “final” DB rows that the system must
> produce as a result of the actions?
> o E.g. if the current vm table is [vm3, host4] and the recomputed row says
> [vm3, host6], then the action is to move vm3 to host6?
>
> “how will the solver be invoked”?
> => When will the optimization call be invoked?
> => Is it “batched”, e.g. periodically invoke Congress to compute new
> assignments?
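The row-diff idea raised above (current [vm3, host4] versus recomputed [vm3, host6] means "move vm3 to host6") could be sketched roughly as follows. This is only an illustration: `plan_migrations` and the dict-based tables are hypothetical names, not Congress or Nova APIs, and a real implementation would still have to turn each tuple into an actual migration call.

```python
def plan_migrations(current, proposed):
    """Return (vm, source_host, target_host) for every VM whose host changes.

    Both arguments map a VM name to a host name. The dicts stand in for the
    [vm, server] tables discussed in the thread (an assumption, not a
    Congress data structure).
    """
    moves = []
    for vm in sorted(proposed):
        source = current.get(vm)
        target = proposed[vm]
        # VMs absent from the current table are new placements, not migrations.
        if source is not None and source != target:
            moves.append((vm, source, target))
    return moves


current = {"vm1": "host1", "vm2": "host4", "vm3": "host4"}
proposed = {"vm1": "host1", "vm2": "host4", "vm3": "host6"}
migrations = plan_migrations(current, proposed)
print(migrations)  # [('vm3', 'host4', 'host6')]
```

Diffing the two tables keeps the solver's output declarative (a target state) and leaves the ordering and execution of the moves to whatever component ends up enforcing them.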
>
> Which solver to use:
>
> http://www.coin-or.org/projects/
> and
> http://www.coin-or.org/projects/PuLP.xml
>
> I think it may be useful to pass through an interface (e.g. an LP modeler
> to generate LP files in standard formats accepted by prevalent
> solvers).
>
> The mathematical program:
> We (Orange) can contribute to writing down in an informal way the
> program for this precise use case, if this can wait until January.
> Perhaps the objective could be to “minimize the number of servers
> whose usage is less than 50%”, since the original policy “Not more
> than 1 server of type1 to have a load under 50%” need not necessarily
> have a solution.
>
> This may help to derive the “mappings” from Congress (rules to program
> equations, intermediary tables to program variables)?
>
> For the “migration” use case: it may be useful to add some constraint
> representing the cost of migration, such that the solver computes the new
> assignment plan without exceeding the maximum migration cost. To
> start with, perhaps the number of migrations?
>
> I will be away from the end of the week until 5th January. I will also
> discuss with colleagues to see how we can formalize contribution
> (congress+nfv poc).
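To make the shape of that program concrete, here is a minimal sketch of the constraints and objective mentioned above (one server per VM, server capacity, anti-affinity, a cap on the number of migrations, and "minimize the number of servers under 50%"). It uses plain exhaustive search as a stand-in for an LP/ILP solver, so it only works for toy inputs; it is not PuLP and not the SolverScheduler. All names are hypothetical, and treating an empty server as "not under-utilized" (because it could be switched off) is an assumption of this sketch.

```python
from itertools import product


def best_placement(vm_load, capacity, current, ha_groups, max_migrations):
    """Exhaustively search VM-to-server assignments (tiny inputs only).

    vm_load: VM -> load units; capacity: server -> load units;
    current: VM -> server it runs on now; ha_groups: list of sets of VMs
    that must not share a server. Returns ((under, moves), placement) or
    None when no assignment satisfies the constraints.
    """
    vms = sorted(vm_load)
    servers = sorted(capacity)
    best = None
    for choice in product(servers, repeat=len(vms)):
        placement = dict(zip(vms, choice))  # each VM on exactly one server
        load = {s: 0 for s in servers}
        for vm, s in placement.items():
            load[s] += vm_load[vm]
        if any(load[s] > capacity[s] for s in servers):
            continue  # capacity constraint violated
        if any(len({placement[vm] for vm in g}) < len(g) for g in ha_groups):
            continue  # anti-affinity constraint violated
        moves = sum(1 for vm in vms if placement[vm] != current[vm])
        if moves > max_migrations:
            continue  # migration budget exceeded
        # Objective: count servers that are in use but loaded below 50%.
        under = sum(1 for s in servers if 0 < load[s] < 0.5 * capacity[s])
        key = (under, moves)  # fewer under-used servers first, then fewer moves
        if best is None or key < best[0]:
            best = (key, placement)
    return best


vm_load = {"v1": 2, "v2": 3, "v3": 2}
capacity = {"a": 8, "b": 6}
current = {"v1": "a", "v2": "b", "v3": "b"}
key, placement = best_placement(vm_load, capacity, current, [{"v1", "v2"}], 1)
print(key, placement)  # (0, 1) {'v1': 'a', 'v2': 'b', 'v3': 'a'}
```

In an actual linear program the same pieces would become binary assignment variables, linear capacity and migration-count constraints, and an indicator variable per server for the objective; the search above only shows what those pieces are.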
>
> Rgds
> Ruby
>
> From: Tim Hinrichs [mailto:thinri...@vmware.com]
> Sent: Friday, 12 December 2014 19:41
> To: KRISHNASWAMY Ruby IMT/OLPS
> Cc: Ramki Krishnan (r...@brocade.com)
> Subject: Re: Placement and Scheduling via Policy
>
> There’s a ton of good stuff here!
>
> So if we took Ramki’s initial use case and combined it with Ruby’s HA
> constraint, we’d have something like the following policy.
>
> // anti-affinity
> error (server, VM1, VM2) :-
>     same_ha_group(VM1, VM2),
>     nova:location(VM1, server),
>     nova:location(VM2, server)
>
> // server-utilization
> error(server) :-
>     type1_server(server),
>     ceilometer:average_utilization(server, "cpu-util", avg),
>     avg < 50
>
> As a start, this seems plenty complex to me. anti-affinity is great b/c it
> DOES NOT require a sophisticated solver; server-utilization is great because
> it DOES require a linear solver.
>
> Data the solver needs:
> - Ceilometer: cpu-utilization for all the servers
> - Nova: data as to where each VM is located
> - Policy: high-availability groups
>
> Questions:
> 1) Is there any more data the solver needs? Seems like it needs something
> about CPU-load for each VM.
> 2) Which solver should we be using? What does the linear program that we
> feed it look like? How do we translate the results of the linear solver
> into a collection of ‘migrate_VM’ API calls?
>
> Maybe another few emails and then we set up a phone call.
>
> Tim
>
> On Dec 11, 2014, at 1:33 AM, <ruby.krishnasw...@orange.com> wrote:
>
> Hello
>
> A) First a small extension to the use case that Ramki proposes
>
> - Add high availability constraint.
> - Assuming server-a and server-b are of same size and same failure model.
> [Later: Assumption of identical failure rates can be loosened.
> Instead of considering only servers as failure domains, can introduce
> other failure domains ==> not just an anti-affinity policy but a
> calculation from 99.99.. requirement to VM placements, e.g.
> ]
> - For an exemplary maximum usage scenario, 53 physical servers could be
> under peak utilization (100%), 1 server (server-a) could be under partial
> utilization (50%) with 2 instances of type large.3 and 1 instance of type
> large.2, and 1 server (server-b) could be under partial utilization
> (37.5%) with 3 instances of type large.2.
> Call VM.one.large2 the large2 VM in server-a
> Call VM.two.large2 one of the large2 VMs in server-b
>
> - VM.one.large2 and VM.two.large2
> - When one of the large.3 instances mapped to server-a is deleted from
> physical server type 1, Policy 1 will be violated, since the overall
> utilization of server-a falls to 37.5%.
>
> - Various new placement(s) are described below
>
> VM.two.large2 must not be moved. Moving VM.two.large2 breaks the
> anti-affinity constraint.
>
> error (server, VM1, VM2) :-
> node (VM1, server1),
> node (VM2, server2),
> same_ha_group(VM1, VM2),
> equal(server1, server2);
>
> 1) New placement 1: Move 2 instances of large.2 to server-a. Overall
> utilization of server-a - 50%. Overall utilization of server-b -
> 12.5%.
>
> 2) New placement 2: Move 1 instance of large.3 to server-b. Overall
> utilization of server-a - 0%. Overall utilization of server-b -
> 62.5%.
>
> 3) New placement 3: Move 3 instances of large.2 to server-a. Overall
> utilization of server-a - 62.5%. Overall utilization of server-b -
> 0%.
>
> New placements 2 and 3 could be considered optimal, since they
> achieve maximal bin packing and open up the door for turning off
> server-a or server-b and maximizing energy efficiency.
>
> But new placement 3 breaks client policy.
>
> BTW: what happens if a given situation does not allow the policy violation to
> be removed?
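As a rough illustration of what the anti-affinity rule quoted above checks, the following sketch evaluates it over a VM-to-server table. The function name and data layout are hypothetical, and it assumes VM.one.large2 and VM.two.large2 belong to the same HA group, which the thread implies but does not spell out. Unlike a Datalog evaluation, it reports each offending pair once rather than in both orders.

```python
from itertools import combinations


def anti_affinity_errors(location, ha_groups):
    """Return (server, vm1, vm2) for VM pairs of one HA group on one server.

    location: VM -> server (stand-in for the node / nova:location table);
    ha_groups: list of sets of VM names (stand-in for same_ha_group).
    """
    errors = []
    for group in ha_groups:
        for vm1, vm2 in combinations(sorted(group), 2):
            if location[vm1] == location[vm2]:
                errors.append((location[vm1], vm1, vm2))
    return errors


ha_groups = [{"VM.one.large2", "VM.two.large2"}]
before = {"VM.one.large2": "server-a", "VM.two.large2": "server-b"}
# Hypothetical move of VM.two.large2 onto server-a, next to VM.one.large2.
after = dict(before, **{"VM.two.large2": "server-a"})

print(anti_affinity_errors(before, ha_groups))  # []
print(anti_affinity_errors(after, ha_groups))
```

A check like this needs no solver at all, which matches the remark elsewhere in the thread that anti-affinity can be verified without one; a solver only becomes necessary once utilization targets are added.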
>
> B) Ramki’s original use case can itself be extended:
>
> Adding additional constraints to the previous use case due to cases such
> as:
>
> - Server heterogeneity
>
> - CPU “pinning”
>
> - “VM groups” (and allocation
>
> - Application interference
>
> - Refining the statement “instantaneous energy consumption can be
> approximately measured using an overall utilization metric, which is a
> combination of CPU utilization, memory usage, I/O usage, and network usage”
>
> Let me know if this will interest you. Some (e.g. application interference)
> will need some time, e.g. benchmarking / profiling to classify VMs, etc.
>
> C) New placement plan execution
>
> - In Ramki’s original use case, violation is detected at events such as
> VM delete.
> While this is certainly complex enough by itself, we may need to
> consider other triggering cases (periodic, or when multiple VMs are
> deleted/added)
> - In this case, it may not be sufficient to compute the new placement
> plan that brings the system to a configuration that does not break policy;
> we may also need to add other goals
>
> D) Let me know if a use case such as placing “video conferencing servers”
> (geographically distributed clients) would suit you (multi-site scenario)
>
> => Or is it too premature?
>
> Ruby
>
> From: Tim Hinrichs [mailto:thinri...@vmware.com]
> Sent: Wednesday, 10 December 2014 19:44
> To: KRISHNASWAMY Ruby IMT/OLPS
> Cc: Ramki Krishnan (r...@brocade.com)
> Subject: Re: Placement and Scheduling via Policy
>
> Hi Ruby,
>
> Whatever information you think is important for the use case is good.
> Section 3 from one of the docs Ramki sent you covers his use case.
> https://datatracker.ietf.org/doc/draft-krishnan-nfvrg-policy-based-rm-nfviaas/?include_text=1
>
> From my point of view, the key things for the use case are…
>
> - The placement policy (i.e. the conditions under which VMs require
> migration).
>
> - A description of how we want to compute what specific migrations should be
> performed (a sketch of (i) the information that we need about current
> placements, policy violations, etc., (ii) what systems/algorithms/etc. can
> utilize that input to figure out what migrations to perform).
>
> I think we want to focus on the end-user/customer experience (write a policy,
> and watch the VMs move around to obey that policy in response to environment
> changes) and then work out the details of how to implement that experience.
> That’s why I didn’t include things like delays, asynchronous/synchronous,
> architecture, applications, etc. in my 2 bullets above.
>
> Tim
>
> On Dec 10, 2014, at 8:55 AM, <ruby.krishnasw...@orange.com> wrote:
>
> Hi Ramki, Tim
>
> By a “format” for describing use cases, I meant to ask what sets of
> information to provide, for example,
> - what granularity in description of use case?
> - a specific placement policy (and perhaps citing reasons for needing
> such policy)?
> - Specific applications
> - Requirements on the placement manager itself (delay, …)?
> o Architecture as well
> - Specific services from the placement manager (using Congress), such
> as,
> o Violation detection (load, security, …)
> - Adapting (e.g.
> context-aware) of policies used
>
> In any case I will read the documents that Ramki has sent, so as not to
> resend similar things.
>
> Regards
> Ruby
>
> From: Ramki Krishnan [mailto:r...@brocade.com]
> Sent: Wednesday, 10 December 2014 16:59
> To: Tim Hinrichs; KRISHNASWAMY Ruby IMT/OLPS
> Cc: Norival Figueira; Pierre Ettori; Alex Yip; dilik...@in.ibm.com
> Subject: RE: Placement and Scheduling via Policy
>
> Hi Tim,
>
> This sounds like a plan. It would be great if you could add the links below
> to the Congress wiki. I am all for discussing this in the openstack-dev
> mailing list and at this point this discussion is completely open.
>
> IRTF NFVRG Research Group:
> https://trac.tools.ietf.org/group/irtf/trac/wiki/nfvrg
>
> IRTF NFVRG draft on NFVIaaS placement/scheduling (includes system analysis
> for the PoC we are thinking):
> https://datatracker.ietf.org/doc/draft-krishnan-nfvrg-policy-based-rm-nfviaas/?include_text=1
>
> IRTF NFVRG draft on Policy Architecture and Framework (looking forward to
> your comments and thoughts):
> https://datatracker.ietf.org/doc/draft-norival-nfvrg-nfv-policy-arch/?include_text=1
>
> Hi Ruby,
>
> Looking forward to your use cases.
>
> Thanks,
> Ramki
>
> _______________________________________________
>
> This message and its attachments may contain confidential or privileged
> information that may be protected by law;
> they should not be distributed, used or copied without authorisation.
> If you have received this email in error, please notify the sender and delete
> this message and its attachments.
> As emails may be altered, Orange is not liable for messages that have been
> modified, changed or falsified.
> Thank you.
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev