Re: [openstack-dev] [Congress] Re: Placement and Scheduling via Policy
Hi all, I did an analysis a while ago on how to use SolverScheduler with a policy engine: https://docs.google.com/document/d/1RfP7jRsw1mXMjd7in72ARjK0fTrsQv1bqolOriIQB2Y

Basically there should be a plugin that translates the policy into constraints for the solver to solve. This was done using the Policy-Based Scheduler [1], but it works well with Congress.

[1] https://blueprints.launchpad.net/nova/+spec/policy-based-scheduler

----- Original Message -----
From: Tim Hinrichs thinri...@vmware.com
To: ruby krishnaswamy ruby.krishnasw...@orange.com
Cc: Prabhakar Kudva ku...@us.ibm.com, openstack-dev openstack-dev@lists.openstack.org, Gokul B Kandiraju go...@us.ibm.com
Sent: Thursday 18 December 2014 18:24:59
Subject: Re: [openstack-dev] [Congress] Re: Placement and Scheduling via Policy

Hi all, Responses inline.

On Dec 16, 2014, at 10:57 PM, ruby.krishnasw...@orange.com wrote:

Hi Tim, all,

@Tim: I did not reply to openstack-dev. Do you think we could have an OpenStack list specific to "congress" to which anybody may subscribe?

Sending to openstack-dev is the right thing, as long as we put [Congress] in the subject. Everyone I know sets up filters on openstack-dev so they only get the mail they care about. I think you're the only one in the group who isn't subscribed to that list.

1) Enforcement: by this we mean "how will the actions computed by the policy engine be executed by the concerned OpenStack functional module?" In this case, it is better to first work this out for a "simpler" case, e.g. your running example concerning the network/groups. Note: some actions concern only some database (e.g. insert the user within some group).

2) From Prabhakar's mail: "Enforcement. That is, with a large number of constraints in place for placement and scheduling, how does the policy engine communicate and enforce the placement constraints to the nova scheduler?
"

Nova scheduler (current): it assigns VMs to servers based on the policy set by the administrator (through filters and host aggregates). The administrator also configures a scheduling heuristic (implemented as a driver), for example a "round-robin" driver. The computed assignment is then sent back to the requestor (API server), which interacts with nova-compute to provision the VM. The current nova-scheduler has another function: it updates the allocation status of each compute node in the DB (through another indirection called nova-conductor).

So it is correct to re-interpret your statement as follows:
- What is the entity with which the policy engine interacts for either proactive or reactive placement management?
- How will the output from the policy engine (for example the placement matrix) be communicated back?
  o Proactive: this gives the mapping of VMs to hosts
  o Reactive: this gives the new mapping of running VMs to hosts
- How, starting from the placement matrix, will the correct migration plan be executed (in the reactive case)?

3) Currently OpenStack does not have "automated management of reactive placement". Hence if the policy engine is used for reactive placement, then there is a need for another "orchestrator" that can interpret the newly proposed placement configuration (mapping of VMs to servers) and execute the reconfiguration workflow.

4) So with a policy-based "placement engine" that is integrated with external solvers, this engine will replace nova-scheduler? Could we converge on this?

The notes from Yathiraj say that there is already a policy-based Nova scheduler we can use. I suggest we look into that. It could potentially simplify our problem to the point where we need only figure out how to convert a fragment of the Congress policy language into their policy language. But those of you who are experts in placement will know better.
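To make the "plugin that translates policy into constraints" idea concrete, here is a minimal sketch (all names are illustrative, not the SolverScheduler or Congress APIs): a policy fragment is compiled into constraint predicates over a candidate placement, and the "solver" is a brute-force search that only works for tiny inputs.

```python
from itertools import product

# Hypothetical sketch of policy-to-constraint translation: each policy rule
# becomes a predicate over a placement (a dict mapping VM -> host), and the
# solver searches for a placement satisfying every predicate.

def anti_affinity(group):
    """Policy rule: VMs in `group` must land on pairwise distinct hosts."""
    def check(placement):
        hosts = [placement[vm] for vm in group]
        return len(hosts) == len(set(hosts))
    return check

def capacity(host_ram, vm_ram):
    """Policy rule: the RAM of VMs on each host must fit that host's RAM."""
    def check(placement):
        used = {}
        for vm, host in placement.items():
            used[host] = used.get(host, 0) + vm_ram[vm]
        return all(used[h] <= host_ram[h] for h in used)
    return check

def solve(vms, hosts, constraints):
    """Return the first placement satisfying all constraints, or None."""
    for choice in product(hosts, repeat=len(vms)):
        placement = dict(zip(vms, choice))
        if all(c(placement) for c in constraints):
            return placement
    return None

vm_ram = {"vm1": 2, "vm2": 2, "vm3": 4}
host_ram = {"hostA": 4, "hostB": 8}
constraints = [anti_affinity(["vm1", "vm2"]), capacity(host_ram, vm_ram)]
print(solve(list(vm_ram), list(host_ram), constraints))
```

A real solver scheduler would hand the same constraints to an LP/CP solver instead of enumerating; the point here is only the separation between policy translation and solving.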
Re: [openstack-dev] [nova] FFE server-group-quotas
+1 for ServerGroup quotas. It's been a while since this feature was discussed and approved. As a public cloud provider we really want to get ServerGroup into production. However, without quotas it is more harm than gain. Since ServerGroup (and even its novaclient command) was merged in Icehouse, IMO it is reasonable to secure it in Juno. Otherwise it's a waste. And as Phil said, it is not much of a change.

Toan

-----Original Message-----
From: Day, Phil [mailto:philip@hp.com]
Sent: Friday 5 September 2014 14:57
To: OpenStack Development Mailing List (not for usage questions)
Subject: [openstack-dev] [nova] FFE server-group-quotas

Hi, I'd like to ask for a FFE for the 3 patchsets that implement quotas for server groups. Server groups (which landed in Icehouse) provide a really useful anti-affinity filter for scheduling that a lot of customers would like to use, but without some form of quota control to limit the amount of anti-affinity it's impossible to enable it as a feature in a public cloud. The code itself is pretty simple - the number of files touched is a side-effect of having three V2 APIs that report quota information and the need to protect the change in V2 via yet another extension.

https://review.openstack.org/#/c/104957/
https://review.openstack.org/#/c/116073/
https://review.openstack.org/#/c/116079/

Phil

-----Original Message-----
From: Sahid Orentino Ferdjaoui [mailto:sahid.ferdja...@redhat.com]
Sent: 04 September 2014 13:42
To: openstack-dev@lists.openstack.org
Subject: [openstack-dev] [nova] FFE request serial-ports

Hello, I would like to request a FFE for 4 changesets to complete the blueprint serial-ports.

Topic on gerrit: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/serial-ports,n,z
Blueprint on launchpad.net: https://blueprints.launchpad.net/nova/+spec/serial-ports

They have already been approved but didn't get enough time to be merged by the gate.

Sponsored by: Daniel Berrange, Nikola Dipanov

s.
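A rough sketch of the kind of check the quota patches add (illustrative only, not the Nova implementation or its option names): cap both the number of server groups per project and the number of members per group, so an anti-affinity group can't silently claim one physical host per VM without bound.

```python
# Hypothetical server-group quota check. `groups` stands in for the server
# groups table; max_groups and max_group_members are assumed config knobs.

class QuotaExceeded(Exception):
    pass

def check_group_quotas(groups, project_id, new_member_group=None,
                       max_groups=10, max_group_members=4):
    """groups: {group_id: {"project": ..., "members": [...]}}."""
    owned = [g for g in groups.values() if g["project"] == project_id]
    if new_member_group is None:
        # Creating a new group counts against the per-project group quota.
        if len(owned) + 1 > max_groups:
            raise QuotaExceeded("too many server groups")
    else:
        # Adding a member counts against the per-group member quota.
        members = groups[new_member_group]["members"]
        if len(members) + 1 > max_group_members:
            raise QuotaExceeded("server group is full")

groups = {"g1": {"project": "p1", "members": ["vm1", "vm2", "vm3", "vm4"]}}
check_group_quotas(groups, "p1")            # creating a second group: fine
try:
    check_group_quotas(groups, "p1", "g1")  # adding a fifth member: rejected
except QuotaExceeded as e:
    print(e)
```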
_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [gantt] scheduler sub-group meeting agenda 6/3
Dear all, if we have time, I would like to draw your attention to my new patch: Policy-based Scheduling engine, https://review.openstack.org/#/c/97503/

This patch implements the Policy-Based Scheduler blueprint: https://blueprints.launchpad.net/nova/+spec/policy-based-scheduler

I presented its prototype at the Atlanta summit: http://openstacksummitmay2014atlanta.sched.org/event/b4313b37de4645079e3d5506b1d725df#.U43VqPl_tm4

It's a pity that the video of the demo is not yet available on the OpenStack channel. We've contacted the foundation on this topic.

Best regards, Toan

-----Original Message-----
From: Dugger, Donald D [mailto:donald.d.dug...@intel.com]
Sent: Tuesday 3 June 2014 04:38
To: OpenStack Development Mailing List (not for usage questions)
Subject: [openstack-dev] [gantt] scheduler sub-group meeting agenda 6/3

1) Forklift (tasks status)
2) No-db scheduler discussion (BP ref - https://review.openstack.org/#/c/92128/ )
3) Opens

--
Don Dugger
"Censeo Toto nos in Kansa esse decisse." - D. Gale
Ph: 303/443-3786
Re: [openstack-dev] [Nova] Hosts within two Availability Zones : possible or not ?
"Abusive usage": if a user can request anti-affinity VMs, then why doesn't he use that? This will result in the user constantly requesting that all his VMs be in the same anti-affinity group. This makes the scheduler choose one physical host per VM. This will quickly flood the infrastructure and defeat the objectives of the admin (e.g. consolidation that regroups VMs instead of spreading them, spare hosts, etc.); at some point it will be reported back that there is no host available, which appears as a bad experience for the user.

From: Ian Wells [mailto:ijw.ubu...@cack.org.uk]
Sent: Tuesday 8 April 2014 01:02
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Nova] Hosts within two Availability Zones : possible or not ?

On 3 April 2014 08:21, Khanh-Toan Tran khanh-toan.t...@cloudwatt.com wrote:

Otherwise we cannot provide redundancy to clients except using Regions, which are dedicated, network-separated infrastructure, and the anti-affinity filter, which IMO is not pragmatic as it has a tendency toward abusive usage.

I'm sorry, could you explain what you mean here by 'abusive usage'?
--
Ian.
Re: [openstack-dev] [Nova] Hosts within two Availability Zones : possible or not ?
-----Original Message-----
From: Jay Pipes [mailto:jaypi...@gmail.com]
Sent: Tuesday 8 April 2014 15:25
To: openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [Nova] Hosts within two Availability Zones : possible or not ?

On Tue, 2014-04-08 at 10:49, Day, Phil wrote:

On a large cloud you're protected against this to some extent if the number of servers is much larger than the number of instances in the quota. However it does feel that there are a couple of things missing to really provide some better protection:

- A quota value on the maximum size of a server group
- A policy setting so that the ability to use server groups can be controlled on a per-project basis

Alternately, we could just have the affinity filters serve as weighers instead of returning NoValidHosts. That way, a request containing an affinity hint would cause the scheduler to prefer placing the new VM near (or not near) other instances in the server group, but if no hosts exist that meet that criterion, the filter simply finds a host with the most (or fewest, in the case of anti-affinity) instances that meet the affinity criterion.

Best, -jay

The filters guarantee the desired effect, while the weighers just give a preference. Thus it makes sense to have anti-affinity as a filter. Otherwise what is it good for, if users do not know whether their anti-affinity VMs are hosted on different hosts? I prefer the idea of an anti-affinity quota. I may propose that.

From: Khanh-Toan Tran [mailto:khanh-toan.t...@cloudwatt.com]
Sent: 08 April 2014 11:32
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Nova] Hosts within two Availability Zones : possible or not ?

"Abusive usage": if a user can request anti-affinity VMs, then why doesn't he use that? This will result in the user constantly requesting that all his VMs be in the same anti-affinity group. This makes the scheduler choose one physical host per VM.
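The filter-versus-weigher distinction being argued here can be sketched in a few lines (illustrative code, not Nova's actual FilterScheduler classes; `members` is an assumed map of how many instances of the server group each host already runs):

```python
# Filter semantics (today's behavior): a hard constraint that can fail with
# "no valid host". Weigher semantics (Jay's proposal): every host passes,
# but hosts with fewer group members are preferred, so scheduling never
# fails outright - it just degrades the anti-affinity guarantee.

def anti_affinity_filter(hosts, members):
    """Only hosts with no group member pass; the result may be empty."""
    return [h for h in hosts if members.get(h, 0) == 0]

def anti_affinity_weigher(hosts, members):
    """All hosts pass, ordered so hosts with fewer group members come first."""
    return sorted(hosts, key=lambda h: members.get(h, 0))

members = {"hostA": 1, "hostB": 0, "hostC": 2}
hosts = ["hostA", "hostB", "hostC"]
print(anti_affinity_filter(hosts, members))   # hard guarantee
print(anti_affinity_weigher(hosts, members))  # preference only
```

This is exactly Toan's objection in code: the weigher version never tells the user that the anti-affinity request could not actually be honored.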
This will quickly flood the infrastructure and defeat the objectives of the admin (e.g. consolidation that regroups VMs instead of spreading them, spare hosts, etc.); at some point it will be reported back that there is no host available, which appears as a bad experience for the user.
Re: [openstack-dev] [Nova] Hosts within two Availability Zones : possible or not ?
+1 for AZs not sharing hosts, because it's the only mechanism that allows us to segment the datacenter. Otherwise we cannot provide redundancy to clients except using Regions, which are dedicated, network-separated infrastructure, and the anti-affinity filter, which IMO is not pragmatic as it has a tendency toward abusive usage. Why sacrifice this power so that users can select the types of their desired physical hosts? The latter can be exposed using flavor metadata, which is a lot safer and more controllable than using AZs. If someone insists that we really need to let users choose the types of physical hosts, then I suggest creating a new hint and using aggregates with it. Don't sacrifice AZ exclusivity!

Btw, there is a datacenter design called "dual-room" [1] which I think best fits AZs, to make your cloud redundant even with one datacenter.

Best regards, Toan

[1] IBM and Cisco: Together for a World Class Data Center, page 141. http://books.google.fr/books?id=DHjJAgAAQBAJ&pg=PA141#v=onepage&q&f=false

From: Sylvain Bauza [mailto:sylvain.ba...@gmail.com]
Sent: Thursday 3 April 2014 15:52
To: OpenStack Development Mailing List (not for usage questions)
Subject: [openstack-dev] [Nova] Hosts within two Availability Zones : possible or not ?

Hi, I'm currently trying to reproduce [1]. This bug requires having the same host in two different aggregates, each one having an AZ. IIRC, the Nova API prevents hosts from being part of two distinct AZs [2], so IMHO this request should not be possible. That said, there are two flaws where I can identify that no validation is done:

- when specifying an AZ in nova.conf, the host overrides the existing AZ with its own
- when adding a host to an aggregate without an AZ defined, and afterwards updating the aggregate to add an AZ

So, I need direction.
Either we consider it is not possible to share 2 AZs for the same host, and then we need to fix the two above scenarios; or we say it's nice to have 2 AZs for the same host, and then we both remove the validation check in the API and fix the output issue reported in the original bug [1].

Your comments are welcome.

Thanks, -Sylvain

[1] : https://bugs.launchpad.net/nova/+bug/1277230
[2] : https://github.com/openstack/nova/blob/9d45e9cef624a4a972c24c47c7abd57a72d74432/nova/compute/api.py#L3378
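The missing validation Sylvain describes can be sketched as a single invariant check (illustrative code, not the Nova API code referenced at [2]): after any aggregate change, no host may end up with two distinct AZs.

```python
# Hypothetical invariant check for the second flaw: an aggregate with no AZ
# can hold any host, but a late AZ update must not give a host two AZs.

class TwoAZsError(Exception):
    pass

def check_host_azs(aggregates):
    """aggregates: list of {"az": str or None, "hosts": set of host names}."""
    seen = {}
    for agg in aggregates:
        if agg["az"] is None:
            continue  # aggregates without an AZ are unconstrained
        for host in agg["hosts"]:
            if seen.get(host, agg["az"]) != agg["az"]:
                raise TwoAZsError("%s is in AZs %s and %s"
                                  % (host, seen[host], agg["az"]))
            seen[host] = agg["az"]

aggs = [{"az": "AZ1", "hosts": {"hostA"}},
        {"az": None, "hosts": {"hostA"}}]   # no AZ yet: allowed today
check_host_azs(aggs)
aggs[1]["az"] = "AZ2"                       # the flawed late update
try:
    check_host_azs(aggs)
except TwoAZsError as e:
    print(e)
```

Running this check on aggregate update (not only on host add) would close the second loophole.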
Re: [openstack-dev] [Nova] Hosts within two Availability Zones : possible or not ?
+1 for AZs not sharing hosts, because it's the only mechanism that allows us to segment the datacenter. Otherwise we cannot provide redundancy to clients except using Regions, which are dedicated, network-separated infrastructure, and the anti-affinity filter, which IMO is not pragmatic as it has a tendency toward abusive usage. Why sacrifice this power so that users can select the types of their desired physical hosts? The latter can be exposed using flavor metadata, which is a lot safer and more controllable than using AZs. If someone insists that we really need to let users choose the types of physical hosts, then I suggest creating a new hint and using aggregates with it. Don't sacrifice AZ exclusivity!

Btw, there is a datacenter design called "dual-room" [1] which I think best fits AZs, to make your cloud redundant even with one datacenter.

Best regards, Toan

-----Original Message-----
From: Chris Friesen [mailto:chris.frie...@windriver.com]
Sent: Thursday 3 April 2014 16:51
To: openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [Nova] Hosts within two Availability Zones : possible or not ?

On 04/03/2014 07:51 AM, Sylvain Bauza wrote:

Hi, I'm currently trying to reproduce [1]. This bug requires having the same host in two different aggregates, each one having an AZ. IIRC, the Nova API prevents hosts from being part of two distinct AZs [2], so IMHO this request should not be possible.
Currently host aggregates are quite general, but the only ways for an end-user to make use of them are:

1) By making the host aggregate an availability zone (where each host is only supposed to be in one availability zone) and selecting it at instance creation time.
2) By booting the instance using a flavor with appropriate metadata (which can only be set up by an admin).

I would like to see more flexibility available to the end-user, so I think we should either:

A) Allow hosts to be part of more than one availability zone (and allow selection of multiple availability zones when booting an instance), or
B) Allow the instance boot scheduler hints to interact with the host aggregate metadata.

Chris
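Chris's option B can be sketched as a filter that matches boot-time scheduler hints directly against aggregate metadata, without going through an admin-defined flavor (illustrative code and data, not an existing Nova filter):

```python
# Hypothetical hint-to-aggregate-metadata filter: a host passes if the
# union of its aggregates' metadata satisfies every requested hint.

aggregates = {
    "ssd-agg": {"hosts": {"hostA", "hostB"}, "metadata": {"disk": "ssd"}},
    "gpu-agg": {"hosts": {"hostB", "hostC"}, "metadata": {"gpu": "k80"}},
}

def hosts_matching_hints(hosts, hints, aggregates):
    """Keep hosts whose aggregate metadata satisfies every hint key/value."""
    result = []
    for host in hosts:
        metadata = {}
        for agg in aggregates.values():
            if host in agg["hosts"]:
                metadata.update(agg["metadata"])
        if all(metadata.get(k) == v for k, v in hints.items()):
            result.append(host)
    return result

print(hosts_matching_hints(["hostA", "hostB", "hostC"],
                           {"disk": "ssd"}, aggregates))
```

This keeps AZ exclusivity intact while still letting users target host characteristics, which is the trade-off Toan argues for above.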
Re: [openstack-dev] [Nova] Hosts within two Availability Zones : possible or not ?
Dual-room link:

[1] IBM and Cisco: Together for a World Class Data Center, page 141. http://books.google.fr/books?id=DHjJAgAAQBAJ&pg=PA141#v=onepage&q&f=false

-----Original Message-----
From: Khanh-Toan Tran [mailto:khanh-toan.t...@cloudwatt.com]
Sent: Thursday 3 April 2014 17:22
To: OpenStack Development Mailing List (not for usage questions)
Subject: RE: [openstack-dev] [Nova] Hosts within two Availability Zones : possible or not ?
Re: [openstack-dev] [nova][scheduler] Availability Zones and Host aggregates..
----- Original Message -----
From: Sangeeta Singh sin...@yahoo-inc.com
To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org
Sent: Wednesday, March 26, 2014 6:54:18 PM
Subject: Re: [openstack-dev] [nova][scheduler] Availability Zones and Host aggregates..

On 3/26/14, 10:17 AM, Khanh-Toan Tran khanh-toan.t...@cloudwatt.com wrote:

----- Original Message -----
From: Sangeeta Singh sin...@yahoo-inc.com
To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org
Sent: Tuesday, March 25, 2014 9:50:00 PM
Subject: [openstack-dev] [nova][scheduler] Availability Zones and Host aggregates..

Hi, the AvailabilityZoneFilter states that theoretically a compute node can be part of multiple availability zones. I have a requirement where I need to make a compute node part of 2 AZs. When I try to create host aggregates with AZs, I cannot add the node to two host aggregates that have AZs defined. However, if I create a host aggregate without associating an AZ, then I can add the compute node to it. After doing that I can update the host aggregate and associate an AZ. This looks like a bug. I can see the compute node listed in the 2 AZs with the availability-zone-list command.

Yes, it appears a bug to me (apparently the AZ metadata insertion is treated as normal metadata, so no check is done), and so does the message in the AvailabilityZoneFilter. I don't know why you need a compute node that belongs to 2 different availability zones. Maybe I'm wrong, but for me it's logical that availability zones do not share the same compute nodes. Availability zones have the role of partitioning your compute nodes into zones that are physically separated (broadly, this would require separation of physical servers, networking equipment, power sources, etc.).
So when a user deploys 2 VMs in 2 different zones, he knows that these VMs do not land on the same host, and if one zone fails, the others continue working, so the client will not lose all of his VMs. It's smaller than Regions, which ensure total separation at the cost of low-layer connectivity and central management (e.g. scheduling per region). See: http://www.linuxjournal.com/content/introduction-openstack The other purpose, regrouping hosts with the same characteristics, is covered by host aggregates.

The problem that I have is that I can still not boot a VM on the compute node when I do not specify the AZ in the command, though I have set the default availability zone and the default schedule zone in nova.conf. I get the error "ERROR: The requested availability zone is not available". What I am trying to achieve is to have two AZs that the user can select during boot, but then have a default AZ which has the hypervisors from both AZ1 and AZ2, so that when the user does not specify any AZ in the boot command I scatter my VMs across both AZs in a balanced way.

I do not understand your goal. When you create two availability zones and put ALL of your compute nodes into these AZs, then if you don't specify an AZ in your request, the AZFilter will automatically accept all hosts. The default weigher (RAMWeigher) will then distribute the workload fairly among these nodes regardless of the AZ they belong to. Maybe that is what you want?

With Havana that does not happen, as there is a concept of default_scheduler_zone, which is None if not specified; and when we specify one, we can only specify a single AZ, whereas in my case I basically want the 2 AZs that I create both to be considered default zones if nothing is specified.

If you look into the code of the AvailabilityZoneFilter, you'll see that the filter automatically accepts a host if there is NO availability zone in the request, which is the case when the user does not specify an AZ.
This is exactly what I see on my OpenStack platform (Havana stable). FYI, I didn't set up a default AZ in the config. So whenever I create several VMs without specifying an AZ, the scheduler spreads the VMs across all hosts regardless of their AZ. What I think is lacking is that the user cannot select a set of AZs instead of one or none right now.

Any pointers?

Thanks, Sangeeta
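The accept-all behavior described here can be summarized in a small sketch (simplified and illustrative, not the actual AvailabilityZoneFilter source):

```python
# Simplified model of the AvailabilityZoneFilter behavior discussed above:
# a host passes if the request names its AZ, and every host passes when the
# request carries no AZ at all - which is why, without a default AZ, VMs get
# spread across all hosts regardless of their zone.

def az_filter(host_azs, requested_az):
    """host_azs: {host: az}; requested_az: an AZ name or None."""
    if requested_az is None:
        return list(host_azs)          # no AZ in the request: accept all
    return [h for h, az in host_azs.items() if az == requested_az]

host_azs = {"hostA": "AZ1", "hostB": "AZ2", "hostC": "AZ1"}
print(az_filter(host_azs, None))   # all hosts pass
print(az_filter(host_azs, "AZ1"))  # only AZ1 hosts pass
```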
Re: [openstack-dev] [nova][scheduler] Availability Zones and Host aggregates..
No, what I mean is that the user should be able to specify multiple AZs in his request, something like:

nova boot --flavor 2 --image ubuntu --availability-zone AZ1 --availability-zone AZ2 vm1

From: Jérôme Gallard [mailto:gallard.jer...@gmail.com]
Sent: Thursday 27 March 2014 10:51
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [nova][scheduler] Availability Zones and Host aggregates..

Hi Toan, is what you say related to https://blueprints.launchpad.net/nova/+spec/schedule-set-availability-zones ?

2014-03-27 10:37 GMT+01:00 Khanh-Toan Tran khanh-toan.t...@cloudwatt.com:
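Toan's multi-AZ request could be modeled as a small generalization of the filter's current behavior (illustrative sketch only; the flag names and semantics are assumptions, not an implemented Nova feature):

```python
# Hypothetical multi-AZ filter: the boot request carries a set of AZ names,
# a host passes if its AZ is in the set, and an empty set keeps today's
# accept-all behavior for requests with no AZ at all.

def multi_az_filter(host_azs, requested_azs):
    """host_azs: {host: az}; requested_azs: set of AZ names (may be empty)."""
    if not requested_azs:
        return list(host_azs)
    return [h for h, az in host_azs.items() if az in requested_azs]

host_azs = {"hostA": "AZ1", "hostB": "AZ2", "hostC": "AZ3"}
print(multi_az_filter(host_azs, {"AZ1", "AZ2"}))  # hosts in AZ1 or AZ2
```

The scheduler's weighers would then balance the VMs across the surviving hosts, which is roughly the behavior Sangeeta is asking for.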
Re: [openstack-dev] [nova][scheduler] Availability Zones and Host aggregates..
-Message d'origine- De : Sylvain Bauza [mailto:sylvain.ba...@bull.net] Envoyé : jeudi 27 mars 2014 11:05 À : OpenStack Development Mailing List (not for usage questions) Objet : Re: [openstack-dev] [nova][scheduler] Availability Zones and Host aggregates.. Le 27/03/2014 10:37, Khanh-Toan Tran a écrit : - Original Message - From: Sangeeta Singh sin...@yahoo-inc.com To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org Sent: Wednesday, March 26, 2014 6:54:18 PM Subject: Re: [openstack-dev] [nova][scheduler] Availability Zones and Host aggregates.. On 3/26/14, 10:17 AM, Khanh-Toan Tran khanh-toan.t...@cloudwatt.com wrote: - Original Message - From: Sangeeta Singh sin...@yahoo-inc.com To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org Sent: Tuesday, March 25, 2014 9:50:00 PM Subject: [openstack-dev] [nova][scheduler] Availability Zones and Host aggregates.. Hi, The availability Zones filter states that theoretically a compute node can be part of multiple availability zones. I have a requirement where I need to make a compute node part to 2 AZ. When I try to create a host aggregates with AZ I can not add the node in two host aggregates that have AZ defined. However if I create a host aggregate without associating an AZ then I can add the compute nodes to it. After doing that I can update the host-aggregate an associate an AZ. This looks like a bug. I can see the compute node to be listed in the 2 AZ with the availability-zone-list command. Yes it appears a bug to me (apparently the AZ metadata indertion is considered as a normal metadata so no check is done), and so does the message in the AvailabilityZoneFilter. I don't know why you need a compute node that belongs to 2 different availability-zones. Maybe I'm wrong but for me it's logical that availability-zones do not share the same compute nodes. 
The availability zones have the role of partitioning your compute nodes into zones that are physically separated (in the large, this would require separation of physical servers, networking equipment, power sources, etc.), so that when a user deploys 2 VMs in 2 different zones, he knows that these VMs do not land on the same host, and if one zone fails, the others continue working, so the client will not lose all of his VMs. It is smaller than Regions, which ensure total separation at the cost of low-layer connectivity and central management (e.g. scheduling per region). See: http://www.linuxjournal.com/content/introduction-openstack The other purpose, regrouping hosts with the same characteristics, is served by host aggregates. The problem that I have is that I still cannot boot a VM on the compute node when I do not specify the AZ in the command, though I have set the default availability zone and the default schedule zone in nova.conf. I get the error "ERROR: The requested availability zone is not available". What I am trying to achieve is to have two AZs that the user can select during boot, but then have a default AZ which has the hypervisors from both AZ1 AND AZ2, so that when the user does not specify any AZ in the boot command I scatter my VMs over both AZs in a balanced way. I do not understand your goal. When you create two availability zones and put ALL of your compute nodes into these AZs, then if you don't specify the AZ in your request, the AZFilter will automatically accept all hosts. The default weigher (RamWeigher) will then distribute the workload fairly among these nodes regardless of the AZ they belong to. Maybe that is what you want? With Havana that does not happen, as there is a concept of default_schedule_zone, which is none if not specified; and when we specify one, we can only specify a single AZ, whereas in my case I basically want the 2 AZs that I create both to be considered default zones if nothing is specified. 
If you look into the code of the AvailabilityZoneFilter, you'll see that the filter automatically accepts a host if there is NO availability zone in the request, which is the case when the user does not specify an AZ. This is exactly what I see in my OpenStack platform (Havana stable). FYI, I didn't set up a default AZ in the config. So whenever I create several VMs without specifying an AZ, the scheduler spreads the VMs across all hosts regardless of their AZ. What I think is lacking is that the user cannot select a set of AZs instead of one or none right now. That's because it is not the goal of this filter to exclude AZs if none is specified ;-) If you want to isolate, there is another filter responsible for this [1]. IMHO, a filter should stay as simple as possible; it is the combination of filters that should match any need. [1] :https://github.com/openstack/nova/blob
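The pass-through behaviour described in this thread can be sketched in a few lines (a simplified, hypothetical rendering of the AvailabilityZoneFilter logic, not Nova's actual code):

```python
# Simplified sketch of the AvailabilityZoneFilter behaviour discussed above.
# Hypothetical data structures; the real Nova filter reads aggregate metadata.

def az_filter_passes(host_az, requested_az):
    """Accept the host when no AZ was requested, otherwise require a match."""
    if requested_az is None:        # user did not specify an AZ in the request
        return True                 # every host passes -> VMs spread across AZs
    return host_az == requested_az

hosts = {"host1": "az1", "host2": "az2"}
# No AZ in the request: all hosts remain candidates.
assert [h for h, az in hosts.items() if az_filter_passes(az, None)] == ["host1", "host2"]
# AZ specified: only matching hosts remain.
assert [h for h, az in hosts.items() if az_filter_passes(az, "az2")] == ["host2"]
```

With no AZ in the request every host passes the filter, so the weighers alone decide placement, which matches the spreading behaviour Sangeeta observed.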
Re: [openstack-dev] [nova][scheduler] Availability Zones and Host aggregates..
- Original Message - From: Sangeeta Singh sin...@yahoo-inc.com To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org Sent: Tuesday, March 25, 2014 9:50:00 PM Subject: [openstack-dev] [nova][scheduler] Availability Zones and Host aggregates.. Hi, The availability zones filter states that theoretically a compute node can be part of multiple availability zones. I have a requirement where I need to make a compute node part of 2 AZs. When I try to create a host aggregate with an AZ, I cannot add the node to two host aggregates that have AZs defined. However, if I create a host aggregate without associating an AZ, then I can add the compute nodes to it. After doing that, I can update the host aggregate and associate an AZ. This looks like a bug. I can see the compute node listed in the 2 AZs with the availability-zone-list command. Yes, it appears a bug to me (apparently the AZ metadata insertion is considered as normal metadata, so no check is done), and so does the message in the AvailabilityZoneFilter. I don't know why you need a compute node that belongs to 2 different availability zones. Maybe I'm wrong, but to me it's logical that availability zones do not share the same compute nodes. The availability zones have the role of partitioning your compute nodes into zones that are physically separated (in the large, this would require separation of physical servers, networking equipment, power sources, etc.), so that when a user deploys 2 VMs in 2 different zones, he knows that these VMs do not land on the same host, and if one zone fails, the others continue working, so the client will not lose all of his VMs. It is smaller than Regions, which ensure total separation at the cost of low-layer connectivity and central management (e.g. scheduling per region). See: http://www.linuxjournal.com/content/introduction-openstack The other purpose, regrouping hosts with the same characteristics, is served by host aggregates. 
The problem that I have is that I still cannot boot a VM on the compute node when I do not specify the AZ in the command, though I have set the default availability zone and the default schedule zone in nova.conf. I get the error "ERROR: The requested availability zone is not available". What I am trying to achieve is to have two AZs that the user can select during boot, but then have a default AZ which has the hypervisors from both AZ1 AND AZ2, so that when the user does not specify any AZ in the boot command I scatter my VMs over both AZs in a balanced way. I do not understand your goal. When you create two availability zones and put ALL of your compute nodes into these AZs, then if you don't specify the AZ in your request, the AZFilter will automatically accept all hosts. The default weigher (RamWeigher) will then distribute the workload fairly among these nodes regardless of the AZ they belong to. Maybe that is what you want? Any pointers? Thanks, Sangeeta ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Simulating many fake nova compute nodes for scheduler testing
Well, I use an 8-core, 128 GB RAM physical host :) I did not see much CPU consumption for these 100 containers, so I suspect we can use fewer resources. -Original Message- From: David Peraza [mailto:david_per...@persistentsys.com] Sent: Monday, March 3, 2014 8:27 PM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [nova] Simulating many fake nova compute nodes for scheduler testing Thanks Khanh, I see the potential issue with using threads. Thanks for pointing it out. On using containers, that sounds like a cool configuration, but it should have a bigger footprint on the host resources than just a separate service instance like I'm doing. I have to admit that 100 fake computes per physical host is good, though. How big is your physical host? I'm running a 4 GB, 4-CPU VM. I suspect your physical system is much better equipped. Regards, David Peraza | Openstack Solutions Architect david_per...@persistentsys.com | Cell: (305)766-2520 Persistent Systems Inc. | Partners in Innovation | www.persistentsys.com -Original Message- From: Khanh-Toan Tran [mailto:khanh-toan.t...@cloudwatt.com] Sent: Tuesday, February 25, 2014 3:49 AM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [nova] Simulating many fake nova compute nodes for scheduler testing I could do that, but I think I need to be able to scale more without using this many resources. I would like to simulate a cloud of 100, maybe 1000, compute nodes that do nothing (fake driver); this should not take this much memory. Does anyone know of a more efficient way to simulate many computes? I was thinking of changing the fake driver to report many compute services in different threads instead of having to spawn a process per compute service. Any other ideas? I'm not sure using threads is a good idea. We need a dedicated resource pool for each compute. 
If the threads share the same resource pool, then every new VM will change the available resources on all computes, which may lead to unexpected, unpredictable scheduling results. For instance, RamWeigher may return the same compute twice instead of spreading, because each time it finds that the computes have the same free_ram. Using compute inside LXC, I created 100 computes per physical host. Here is what I did, it's very simple:
- Create an LXC container on a logical volume
- Install a fake nova-compute inside the LXC
- Make a boot script that modifies its nova.conf to use its own IP address and starts nova-compute
- Using the LXC above as the master, clone as many computes as you like!
(Note that while cloning the LXC, the nova.conf is copied with the former's IP address; that's why we need the boot script.) Best regards, Toan -Original Message- From: David Peraza [mailto:david_per...@persistentsys.com] Sent: Monday, February 24, 2014 9:13 PM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [nova] Simulating many fake nova compute nodes for scheduler testing Thanks John, I also think it is a good idea to test the algorithm at the unit-test level, but I would like to try it out over AMQP as well, that is, with processes and threads talking to each other over RabbitMQ or Qpid. I'm trying to test performance as well. Regards, David Peraza -Original Message- From: John Garbutt [mailto:j...@johngarbutt.com] Sent: Monday, February 24, 2014 11:51 AM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [nova] Simulating many fake nova compute nodes for scheduler testing On 24 February 2014 16:24, David Peraza david_per...@persistentsys.com wrote: Hello all, I have been trying some new ideas on the scheduler and I think I'm reaching a resource issue. I'm running 6 compute services right on my 4-CPU, 4 GB VM, and I started to get some memory allocation issues. 
Keystone and Nova are already complaining that there is not enough memory. The obvious solution to add more candidates is to get another VM and set up another 6 fake compute services. I could do that, but I think I need to be able to scale more without using this many resources. I would like to simulate a cloud of 100, maybe 1000, compute nodes that do nothing (fake driver); this should not take this much memory. Does anyone know of a more efficient way to simulate many computes? I was thinking of changing the fake driver to report many compute services in different threads instead of having to spawn a process per compute service. Any other ideas? It depends what you want to test, but I was able to look at tuning the filters and weights using the test at the end of this file: https://review.openstack.org/#/c/67855/33/nova/tests/scheduler/test_caching_scheduler.py Cheers, John
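The LXC-cloning recipe in this thread hinges on a boot script that rewrites each clone's nova.conf with the container's own IP address before starting nova-compute. A minimal sketch of the rewrite step (my_ip is the standard Nova option, but the helper function itself is a hypothetical illustration, not Toan's actual script):

```python
import re

def patch_my_ip(conf_text, ip):
    """Replace (or append) the my_ip option in nova.conf-style text.

    Hypothetical helper for the boot script described above: each cloned
    container would run something like this with its own address before
    starting nova-compute, since the clone inherits the master's nova.conf.
    """
    if re.search(r"^my_ip\s*=", conf_text, flags=re.M):
        # An existing my_ip line (the master's address) is overwritten.
        return re.sub(r"^my_ip\s*=.*$", "my_ip = %s" % ip, conf_text, flags=re.M)
    # No my_ip line yet: append one.
    return conf_text.rstrip("\n") + "\nmy_ip = %s\n" % ip

conf = "[DEFAULT]\nmy_ip = 10.0.0.1\n"
assert "my_ip = 10.0.0.42" in patch_my_ip(conf, "10.0.0.42")
```

In the real boot script the address would come from the container itself (e.g. by inspecting its network interfaces) rather than being passed in.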
Re: [openstack-dev] [nova] Simulating many fake nova compute nodes for scheduler testing
I could do that, but I think I need to be able to scale more without using this many resources. I would like to simulate a cloud of 100, maybe 1000, compute nodes that do nothing (fake driver); this should not take this much memory. Does anyone know of a more efficient way to simulate many computes? I was thinking of changing the fake driver to report many compute services in different threads instead of having to spawn a process per compute service. Any other ideas? I'm not sure using threads is a good idea. We need a dedicated resource pool for each compute. If the threads share the same resource pool, then every new VM will change the available resources on all computes, which may lead to unexpected, unpredictable scheduling results. For instance, RamWeigher may return the same compute twice instead of spreading, because each time it finds that the computes have the same free_ram. Using compute inside LXC, I created 100 computes per physical host. Here is what I did, it's very simple:
- Create an LXC container on a logical volume
- Install a fake nova-compute inside the LXC
- Make a boot script that modifies its nova.conf to use its own IP address and starts nova-compute
- Using the LXC above as the master, clone as many computes as you like!
(Note that while cloning the LXC, the nova.conf is copied with the former's IP address; that's why we need the boot script.) Best regards, Toan -Original Message- From: David Peraza [mailto:david_per...@persistentsys.com] Sent: Monday, February 24, 2014 9:13 PM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [nova] Simulating many fake nova compute nodes for scheduler testing Thanks John, I also think it is a good idea to test the algorithm at the unit-test level, but I would like to try it out over AMQP as well, that is, with processes and threads talking to each other over RabbitMQ or Qpid. I'm trying to test performance as well. 
Regards, David Peraza -Original Message- From: John Garbutt [mailto:j...@johngarbutt.com] Sent: Monday, February 24, 2014 11:51 AM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [nova] Simulating many fake nova compute nodes for scheduler testing On 24 February 2014 16:24, David Peraza david_per...@persistentsys.com wrote: Hello all, I have been trying some new ideas on the scheduler and I think I'm reaching a resource issue. I'm running 6 compute services right on my 4-CPU, 4 GB VM, and I started to get some memory allocation issues. Keystone and Nova are already complaining that there is not enough memory. The obvious solution to add more candidates is to get another VM and set up another 6 fake compute services. I could do that, but I think I need to be able to scale more without using this many resources. I would like to simulate a cloud of 100, maybe 1000, compute nodes that do nothing (fake driver); this should not take this much memory. Does anyone know of a more efficient way to simulate many computes? I was thinking of changing the fake driver to report many compute services in different threads instead of having to spawn a process per compute service. Any other ideas? It depends what you want to test, but I was able to look at tuning the filters and weights using the test at the end of this file: https://review.openstack.org/#/c/67855/33/nova/tests/scheduler/test_caching_scheduler.py Cheers, John
Re: [openstack-dev] [Nova] Meetup Summary
Agreed. I'm just thinking about the opportunity of providing a REST API on top of the scheduler RPC API with a 1:1 mapping, so that the Gantt project could stand up by itself. I don't think it's hard stuff, given that I already did that for Climate (providing a Pecan/WSME API). What do you think about it? Even if it's not top priority, it's a quick win. Well, I'm not sure about a quick win, though :) I think that we should focus on the main objective of having a self-contained Gantt working with Nova first. Some of the interaction issues still worry me, especially the host_state host_update queries. These issues will have an impact on the Gantt API (at least for Nova to use), so I'm not sure the current RPC API will hold up either. But I will not discourage any personal effort :) From: Sylvain Bauza [mailto:sylvain.ba...@gmail.com] Sent: Tuesday, February 18, 2014 10:41 PM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [Nova] Meetup Summary Hi Don, 2014-02-18 21:28 GMT+01:00 Dugger, Donald D donald.d.dug...@intel.com: Sylvain- As you can tell from the meeting today, the scheduler sub-group is really not the Gantt group meeting; I try to make sure that messages for things like the agenda and so on include both 'gantt' and 'scheduler' in the subject so it's clear we're talking about the same thing. That's the main reason why I was unable to attend the previous scheduler meetings... Now that I attended today's meeting, that's quite clear to me. I apologize for this misunderstanding, but as I can't dedicate all my time to Gantt/Nova, I have to make sure the time I'm spending on it is worth it. Now that we agreed on a plan for the next steps, I think it's important to put the info on Gantt blueprints, even if most of the changes are related to Nova. The current etherpad is huge, and frightens people who would want to contribute, IMHO. 
Note that our ultimate goal is to create a scheduler that is usable by other projects, not just Nova, but that is a second task. The first task is to create a separate scheduler that will be usable by Nova at a minimum. (World domination will follow later :) Agreed. I'm just thinking about the opportunity of providing a REST API on top of the scheduler RPC API with a 1:1 mapping, so that the Gantt project could stand up by itself. I don't think it's hard stuff, given that I already did that for Climate (providing a Pecan/WSME API). What do you think about it? Even if it's not top priority, it's a quick win. -Sylvain -- Don Dugger Censeo Toto nos in Kansa esse decisse. - D. Gale Ph: 303/443-3786 From: Sylvain Bauza [mailto:sylvain.ba...@gmail.com] Sent: Monday, February 17, 2014 4:26 PM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [Nova] Meetup Summary Hi Russell and Don, 2014-02-17 23:41 GMT+01:00 Russell Bryant rbry...@redhat.com: Greetings, 2) Gantt - We discussed the progress of the Gantt effort. After discussing the problems encountered so far and the other scheduler work going on, the consensus was that we really need to focus on decoupling the scheduler from the rest of Nova while it's still in the Nova tree. Don was still interested in working on the existing gantt tree to learn what he can about the coupling of the scheduler to the rest of Nova. Nobody had a problem with that, but it doesn't sound like we'll be ready to regenerate the gantt tree to be the real gantt tree soon. We probably need another cycle of development before it will be ready. As a follow-up to this, I wonder if we should rename the current gantt repository from openstack/gantt to stackforge/gantt to avoid any possible confusion. We should make it clear that we don't expect the current repo to be used yet. There is currently no precise meeting timeslot for Gantt besides the one with the Nova scheduler subteam. 
Would it be possible to have a status update on the current path for Gantt so that people interested in joining the effort would be able to get in? There is currently a discussion on how Gantt and Nova should interact, in particular regarding HostState and how Nova computes could update their status so that Gantt would be able to filter on them. There are also other discussions about testing, the API, etc., so I'm just wondering how to help and where. On a side note, if Gantt is becoming a Stackforge project planning to support Nova scheduling first, could we also assume that we could implement this service for use by other projects (such as Climate) in parallel with Nova? The current utilization-aware-scheduling blueprint is nearly done, so it could be used for other queries than just Nova scheduling, but unfortunately, as the scheduler is still part of Nova and without a REST API, it can't be leveraged by third-party projects. Thanks, -Sylvain [1] :
Re: [openstack-dev] [Nova][Scheduler] Policy Based Scheduler and Solver Scheduler
Second, there is nothing wrong with booting the instances (or instantiating other resources) as separate commands as long as we support some kind of reservation token. I'm not sure what the reservation token would do; is it some kind of way of informing the scheduler that the resources would not be initiated until later? Let's consider the following example: a user wants to create 2 VMs, a small one with 20 GB RAM and a big one with 40 GB RAM, in a datacenter consisting of 2 hosts: one with 50 GB RAM left and another with 30 GB RAM left, using the Filter Scheduler's default RamWeigher. If we pass the demand as two commands, there is a chance that the small VM arrives first. RamWeigher will put it on the 50 GB RAM host, which will be reduced to 30 GB RAM. Then, when the big VM request arrives, there will be no space left to host it. As a result, the whole demand fails. Now, if we can pass the two VMs in one command, SolverScheduler can put their constraints all together into one big LP as follows (x_uv = 1 if VM u is hosted on host v, 0 if not): 50 GB RAM host constraint: 20*x_11 + 40*x_21 <= 50 30 GB RAM host constraint: 20*x_12 + 40*x_22 <= 30 Small VM presence constraint: x_11 + x_12 = 1 Big VM presence constraint: x_21 + x_22 = 1 From these constraints there is only one solution: x_11 = 0, x_12 = 1, x_21 = 1, x_22 = 0; i.e., the small VM hosted on the 30 GB RAM host and the big VM hosted on the 50 GB RAM host. In conclusion, if we have VMs of multiple flavors to deal with, we cannot give the correct answer if we do not have all the information. Therefore, if by reservation you mean that the scheduler would hold off the scheduling process and save the information until it receives all the necessary information, then I agree. But that is just a workaround for passing the whole demand at once, which would be better handled by an API. That responds to your first point, too. If we don't mind that some VMs are placed and some are not (e.g. 
they belong to different apps), then it's OK to pass them to the scheduler without an Instance Group. However, if the VMs go together (belong to one app), then we have to put them into an Instance Group. -Original Message- From: Chris Friesen [mailto:chris.frie...@windriver.com] Sent: Monday, February 10, 2014 6:45 PM To: openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [Nova][Scheduler] Policy Based Scheduler and Solver Scheduler On 02/10/2014 10:54 AM, Khanh-Toan Tran wrote: Heat may orchestrate the provisioning process, but eventually the instances will be passed to Nova-scheduler (Gantt) as separate commands, which is exactly the problem Solver Scheduler wants to correct. Therefore the Instance Group API is needed, wherever it is used (nova-scheduler/Gantt). I'm not sure that this follows. First, the Instance Group API is totally separate, since we may want to schedule a number of instances simultaneously without them being part of an instance group. Certainly in the case of using instance groups that would be one input into the scheduler, but it's an optional input. Second, there is nothing wrong with booting the instances (or instantiating other resources) as separate commands as long as we support some kind of reservation token. In that model, we would pass a bunch of information about multiple resources to the solver scheduler, have it perform scheduling *and reserve the resources*, then return some kind of resource reservation tokens back to the caller for each resource. The caller could then allocate each resource, passing in the reservation token indicating both that the resources had already been reserved and what the specific reserved resource was (the compute host in the case of an instance, for example). 
Chris
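The four-constraint LP in this thread is small enough to verify by brute force. The sketch below enumerates all binary assignments and confirms that exactly one placement is feasible: the small VM on the 30 GB host and the big VM on the 50 GB host (an illustration only, not SolverScheduler code):

```python
from itertools import product

# Binary assignment x[vm, host] = 1 if VM vm is placed on host.
# RAM demands and capacities taken from the example in the thread.
demand = {"small": 20, "big": 40}
capacity = {"host50": 50, "host30": 30}

feasible = []
for bits in product([0, 1], repeat=4):
    x = {("small", "host50"): bits[0], ("small", "host30"): bits[1],
         ("big", "host50"): bits[2], ("big", "host30"): bits[3]}
    # Presence constraints: each VM placed exactly once.
    if any(sum(x[vm, h] for h in capacity) != 1 for vm in demand):
        continue
    # Capacity constraints: total demand per host within free RAM.
    if any(sum(demand[vm] * x[vm, h] for vm in demand) > capacity[h]
           for h in capacity):
        continue
    feasible.append(x)

assert len(feasible) == 1                      # a single feasible placement
sol = feasible[0]
assert sol["small", "host30"] == 1 and sol["big", "host50"] == 1
```

This also makes the greedy-failure concrete: the assignment with the small VM on host50 forces the big VM onto host30, which violates the 30 GB capacity, so the enumeration rejects it.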
Re: [openstack-dev] [Nova][Scheduler] Policy Based Scheduler and Solver Scheduler
Thanks, I will look at it closely. From: Dina Belova [mailto:dbel...@mirantis.com] Sent: Tuesday, February 11, 2014 4:45 PM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [Nova][Scheduler] Policy Based Scheduler and Solver Scheduler Like a restaurant reservation, it would claim the resources for use by someone at a later date. That way nobody else can use them. That way the scheduler would be responsible for determining where the resource should be allocated from, and for getting a reservation for that resource. It would not have anything to do with actually instantiating the instance/volume/etc. Although I'm quite new to the topic of the Solver Scheduler, it seems to me that in that case you should look at the Climate project. It aims to provide resource reservation to OpenStack clouds (and by resource I mean here instance/compute host/volume/etc.). And the Climate logic is like: create a lease - get resources from the common pool - do something with them when the lease start time comes. I'll say it one more time - I'm not really familiar with this discussion, but it looks like Climate might help here. Thanks Dina On Tue, Feb 11, 2014 at 7:09 PM, Chris Friesen chris.frie...@windriver.com wrote: On 02/11/2014 03:21 AM, Khanh-Toan Tran wrote: Second, there is nothing wrong with booting the instances (or instantiating other resources) as separate commands as long as we support some kind of reservation token. I'm not sure what the reservation token would do; is it some kind of way of informing the scheduler that the resources would not be initiated until later? Like a restaurant reservation, it would claim the resources for use by someone at a later date. That way nobody else can use them. That way the scheduler would be responsible for determining where the resource should be allocated from, and for getting a reservation for that resource. It would not have anything to do with actually instantiating the instance/volume/etc. 
Let's consider the following example: a user wants to create 2 VMs, a small one with 20 GB RAM and a big one with 40 GB RAM, in a datacenter consisting of 2 hosts: one with 50 GB RAM left and another with 30 GB RAM left, using the Filter Scheduler's default RamWeigher. If we pass the demand as two commands, there is a chance that the small VM arrives first. RamWeigher will put it on the 50 GB RAM host, which will be reduced to 30 GB RAM. Then, when the big VM request arrives, there will be no space left to host it. As a result, the whole demand fails. Now, if we can pass the two VMs in one command, SolverScheduler can put their constraints all together into one big LP as follows (x_uv = 1 if VM u is hosted on host v, 0 if not): Yes. So what I'm suggesting is that we schedule the two VMs as one call to the SolverScheduler. The scheduler then gets reservations for the necessary resources and returns them to the caller. This would be sort of like the existing Claim object in nova/compute/claims.py but generalized somewhat to other resources as well. The caller could then boot each instance separately (passing the appropriate reservation/claim along with the boot request). Because the caller has a reservation, the core code would know it doesn't need to schedule or allocate resources; that's already been done. The advantage of this is that the scheduling and resource allocation are done separately from the instantiation. The instantiation API could remain basically as-is except for supporting an optional reservation token. That responds to your first point, too. If we don't mind that some VMs are placed and some are not (e.g. they belong to different apps), then it's OK to pass them to the scheduler without an Instance Group. However, if the VMs go together (belong to one app), then we have to put them into an Instance Group. When I think of an Instance Group, I think of https://blueprints.launchpad.net/nova/+spec/instance-group-api-extension. 
Fundamentally, Instance Groups describe a runtime relationship between different instances. The scheduler doesn't necessarily care about a runtime relationship; it's just trying to allocate resources efficiently. In the above example, there is no need for those two instances to be part of an Instance Group--we just want to schedule them both at the same time to give the scheduler a better chance of fitting them both. More generally, the more instances I want to start up, the more beneficial it can be to pass them all to the scheduler at once in order to give the scheduler more information. Those instances could be parts of completely independent Instance Groups, or not part of an Instance Group at all... the scheduler can still do a better job if it has more information to work with. Chris -- Best regards, Dina Belova
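The reserve-then-boot flow Chris describes (loosely analogous to the Claim object in nova/compute/claims.py that he mentions) could look roughly like this toy model; every name in it is a hypothetical illustration, not an actual Nova or Climate API:

```python
import uuid

class ReservationLedger:
    """Toy model of the reserve-then-instantiate flow discussed above.

    schedule() picks a host, holds the RAM by decrementing the host's free
    pool, and hands back an opaque token; boot() later consumes the token
    without doing any scheduling.  All names here are hypothetical.
    """
    def __init__(self, free_ram):
        self.free_ram = dict(free_ram)   # host -> free RAM (GB)
        self.claims = {}                 # token -> (host, ram)

    def schedule(self, ram):
        # RamWeigher-like spreading: pick the host with the most free RAM.
        host = max(self.free_ram, key=self.free_ram.get)
        if self.free_ram[host] < ram:
            raise RuntimeError("no host can fit %d GB" % ram)
        token = str(uuid.uuid4())
        self.free_ram[host] -= ram       # resources are held by the claim
        self.claims[token] = (host, ram)
        return token

    def boot(self, token):
        # The reservation already names the host; nothing to re-schedule.
        host, _ram = self.claims.pop(token)
        return host

ledger = ReservationLedger({"host50": 50, "host30": 30})
t_big = ledger.schedule(40)      # reserved on host50 (most free RAM)
t_small = ledger.schedule(20)    # host50 now has only 10 free -> host30
assert ledger.boot(t_big) == "host50"
assert ledger.boot(t_small) == "host30"
```

Note how scheduling the big VM first (or jointly, as the solver would) avoids the greedy failure from the earlier example, and the boot calls can then arrive in any order.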
Re: [openstack-dev] [Nova][Scheduler] Policy Based Scheduler and Solver Scheduler
I'm not sure what you mean by "whatever it takes to finally create the instances", but that sounds like what I had assumed everybody meant by orchestration (until I heard that there is no widespread agreement) --- and I think we need to take a properly open approach to that. I think the proper API for cross-service whole-pattern scheduling should primarily focus on conveying the placement problem to the thing that will make the joint decision. After the joint decision is made comes the time to create the individual resources. I think we can NOT mandate one particular agent or language for that. We will have to allow general clients to make calls on Nova, Cinder, etc. to do the individual resource creations (with some sort of reference to the decision that was already made). My original position was that we could use Heat for this, but I think we have gotten push-back saying it is NOT OK to *require* that. For example, note that some people do not want to use Heat at all; they prefer to make individual calls on Nova, Cinder, etc. Of course, we definitely want to support, among others, the people who *do* use Heat. I do not think Heat would be appropriate, either. Heat does not have detailed knowledge of the infrastructure, and it uses the Nova (Gantt) API to pass commands, so if the Nova (Gantt) API does not support multiple-instance provisioning, Heat will not get the joint decision for all VMs as a whole. Heat may orchestrate the provisioning process, but eventually the instances will be passed to Nova-scheduler (Gantt) as separate commands, which is exactly the problem Solver Scheduler wants to correct. Therefore the Instance Group API is needed, wherever it is used (nova-scheduler/Gantt). I think Gantt should be the cross-service joint decision point. Heat still keeps orchestrating processes as it always does, but the provisioning decision has to be made all together as one atomic step in Heat's whole process. 
Here's a more detailed description of our thoughts on how such a protocol might look: https://wiki.openstack.org/wiki/Nova/PlacementAdvisorAndEngine We've concentrated on the Nova scheduler; it would be interesting to see if this protocol aligns with Yathiraj's thoughts on a global scheduler addressing compute+storage+network. Feedback is most welcome. Thank you for the link, I will give it a try. Best regards, Toan - Original Message - From: Mike Spreitzer mspre...@us.ibm.com To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org Sent: Tuesday, February 4, 2014 9:05:22 AM Subject: Re: [openstack-dev] [Nova][Scheduler] Policy Based Scheduler and Solver Scheduler From: Khanh-Toan Tran khanh-toan.t...@cloudwatt.com ... There is an unexpected line break in the middle of the link, so I post it again: https://docs.google.com/document/d/1RfP7jRsw1mXMjd7in72ARjK0fTrsQv1bqolOriIQB2Y The mailing list software keeps inserting that line break. I re-constructed the URL and looked at the document. As you point out at the end, the way you attempt to formulate load balancing as a linear objective does not work. I think load balancing is a non-linear thing. I also doubt that simple load balancing is what cloud providers want; I think cloud providers want to bunch up load, within limits, for example to keep some hosts idle so that they can be powered down to save on costs or left available for future exclusive use. From: Gil Rapaport g...@il.ibm.com ... As Alex Glikson hinted a couple of weekly meetings ago, our approach to this is to think of the driver's work as split between two entities: -- A Placement Advisor, that constructs placement problems for scheduling requests (filter-scheduler and policy-based-scheduler) -- A Placement Engine, that solves placement problems (HostManager in get_filtered_hosts() and solver-scheduler with its LP engine). Yes, I see the virtue in that separation. Let me egg it on a little. 
What Alex and KTT want is more structure in the Placement Advisor, where there is a multiplicity of plugins, each bound to some fraction of the whole system, and a protocol for combining the advice from the plugins. I would also like to remind you of another kind of structure: some of the placement desiderata come from the cloud users, and some from the cloud provider. From: Yathiraj Udupi (yudupi) yud...@cisco.com ... Like you point out, I do agree the two entities of placement advisor, and the placement engine, but I think there should be a third one – the provisioning engine, which should be responsible for whatever it takes to finally create the instances, after the placement decision has been taken. I'm not sure what you mean by whatever it takes to finally create the instances, but that sounds like what I had assumed everybody meant by orchestration (until I heard that there is no widespread agreement
Re: [openstack-dev] [Nova][Scheduler] Policy Based Scheduler and Solver Scheduler
Nice idea. I also think that filters and weighers are FilterScheduler-specific, so it is unnecessary for SolverScheduler to try to translate all filters/weighers into constraints. It would make the transition easier, though. Anyway, we just need some placement logic that can be written as constraints, the same way it is currently represented as filters in FilterScheduler. So yes, we will need a placement advisor here. For the provisioning engine, won't the scheduler manager (or maybe nova-conductor) be the one? I still can't figure out how nova-conductor will interact with Gantt once we have it, or whether we should put its logic into Gantt, too. Another thought would be the need for the Instance Group API [1]. Currently users can only request multiple instances of the same flavor. These requests do not need LP to solve; just placing instances one by one is sufficient. Therefore we need this API so that users can request instances of different flavors, with some relations (constraints) among them. The advantage is that this logic and API will help us add Cinder volumes with ease (not sure how the Cinder stackers think about it, though). Best regards, Toan [1] https://wiki.openstack.org/wiki/InstanceGroupApiExtension - Original Message - From: Yathiraj Udupi (yudupi) yud...@cisco.com To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org Sent: Thu, 30 Jan 2014 18:13:59 - (UTC) Subject: Re: [openstack-dev] [Nova][Scheduler] Policy Based Scheduler and Solver Scheduler It is really good that we are reviving the conversation we started during the last summit in Hong Kong, in one of the scheduler sessions called "Smart resource placement". This is the document we used during that session; you may have seen it before: https://docs.google.com/document/d/1IiPI0sfaWb1bdYiMWzAAx0HYR6UqzOan_Utgml5W1HI/edit The idea is to separate out the logic for the placement decision engine from the actual request and the final provisioning phase.
The placement engine itself can be pluggable, and in the solver scheduler blueprint we show how it fits inside Nova. The discussions at the summit and in our weekly scheduler meetings led us to start the "Smart resource placement" idea inside Nova, and then take it to a unified global level spanning services such as Cinder and Neutron. Like you point out, I do agree with the two entities of placement advisor and placement engine, but I think there should be a third one -- the provisioning engine, which should be responsible for whatever it takes to finally create the instances, after the placement decision has been taken. It is good to take incremental approaches, hence we should try to get patches like these accepted first within Nova, and then slowly split up the logic into separate entities. Thanks, Yathi. On 1/30/14, 7:14 AM, Gil Rapaport g...@il.ibm.com wrote: Hi all, Excellent definition of the issue at hand. The recent blueprints of policy-based-scheduler and solver-scheduler indeed highlight a possible weakness in the current design: despite their completely independent contributions (i.e. which filters to apply per request vs. how to compute a valid placement), their implementation as drivers makes combining them non-trivial. As Alex Glikson hinted a couple of weekly meetings ago, our approach to this is to think of the driver's work as split between two entities: -- A Placement Advisor, that constructs placement problems for scheduling requests (filter-scheduler and policy-based-scheduler) -- A Placement Engine, that solves placement problems (HostManager in get_filtered_hosts() and solver-scheduler with its LP engine). Such modularity should allow developing independent mechanisms that can be combined seamlessly through a unified, well-defined protocol based on constructing placement-problem objects in the placement advisor and then passing them to the placement engine, which returns the solution.
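[Editorial illustration] The advisor/engine split described in this message could be sketched roughly as below. All names here (PlacementProblem, PlacementAdvisor, PlacementEngine) are hypothetical stand-ins, not actual Nova classes; the point is only the protocol shape: the advisor turns a request plus policy-supplied predicates into a problem object, and the engine solves it.

```python
from dataclasses import dataclass, field

@dataclass
class PlacementProblem:
    request: dict                                   # e.g. {"vcpus": 2}
    constraints: list = field(default_factory=list) # predicates over hosts

class PlacementAdvisor:
    def build(self, request, policies):
        # Each policy contributes one constraint (e.g. a CoreFilter-style check).
        return PlacementProblem(request, list(policies))

class PlacementEngine:
    def solve(self, problem, hosts):
        # Return the hosts satisfying every constraint (the filter-scheduler
        # case); a solver-scheduler engine would optimize a cost function
        # over the same problem object instead.
        return [h for h in hosts
                if all(c(h, problem.request) for c in problem.constraints)]

# Usage: one CoreFilter-like constraint supplied by a policy.
enough_cpu = lambda host, req: host["free_vcpus"] >= req["vcpus"]
problem = PlacementAdvisor().build({"vcpus": 2}, [enough_cpu])
result = PlacementEngine().solve(problem, [{"name": "h1", "free_vcpus": 1},
                                           {"name": "h2", "free_vcpus": 4}])
```

The key design point is that the engine never sees filters or weighers directly, only the problem object, so either side can be swapped independently.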
The protocol can be orchestrated by the scheduler manager. As can be seen at this point already, the policy-based-scheduler blueprint can now be positioned as an improvement of the placement advisor. Similarly, the solver-scheduler blueprint can be positioned as an improvement of the placement engine. I'm working on a wiki page that will get into the details. Would appreciate your initial thoughts on this approach. Regards, Gil From: Khanh-Toan Tran khanh-toan.t...@cloudwatt.commailto:khanh-toan.t...@cloudwatt.com To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.orgmailto:openstack-dev@lists.openstack.org, Date: 01/30/2014 01:43 PM Subject: Re: [openstack-dev] [Nova][Scheduler] Policy Based Scheduler and Solver Scheduler Hi Sylvain, 1) Some Filters such as AggregateCoreFilter, AggregateRAMFilter can change its parameters
Re: [openstack-dev] [Nova][Scheduler] Policy Based Scheduler and Solver Scheduler
There is an unexpected line break in the middle of the link, so I post it again: https://docs.google.com/document/d/1RfP7jRsw1mXMjd7in72ARjK0fTrsQv1bqolOriIQB2Y - Original Message - From: Khanh-Toan Tran [mailto:khanh-toan.t...@cloudwatt.com] Sent: Wednesday, January 29, 2014 13:25 To: 'OpenStack Development Mailing List (not for usage questions)' Subject: [openstack-dev] [Nova][Scheduler] Policy Based Scheduler and Solver Scheduler Dear all, As promised in the Scheduler/Gantt meeting, here is our analysis on the connection between Policy Based Scheduler and Solver Scheduler: https://docs.google.com/document/d/1RfP7jRsw1mXMjd7in72ARjK0fTrsQv1bqolOriIQB2Y This document briefs the mechanism of the two schedulers and the possibility of cooperation. It is my personal point of view only. In a nutshell, Policy Based Scheduler allows admins to define policies for different physical resources (an aggregate, an availability zone, or the whole infrastructure) or different (classes of) users. Admins can modify (add/remove/modify) any policy at runtime, and the modification affects only the target (e.g. the aggregate, the users) that the policy is defined for. Solver Scheduler solves the placement of groups of instances simultaneously by putting all the known information into an integer linear system and using an integer-program solver on it. Thus relations between VMs, and between VMs and computes, are all accounted for. Working together, Policy Based Scheduler can supply the filters and weighers following the policy rules defined for different computes. These filters and weighers can be converted into constraints and cost functions for Solver Scheduler to solve. More details can be found in the doc. I look forward to comments and hope that we can work it out.
Best regards, Khanh-Toan TRAN ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [Nova][Scheduler] Policy Based Scheduler needs review approvals
Dear Stackers, The Icehouse-3 deadline is approaching quickly and we still need review approvals for Policy Based Scheduler! So I kindly ask for your attention to this blueprint. The purpose of this blueprint is to manage the scheduling process by policy. With it, admins can define scheduling rules per group of physical resources (an aggregate, an availability zone, or the whole infrastructure), or per (classes of) users. For instance, an admin can define a policy of Load Balancing (distribute workload evenly among the servers) in some aggregates, and Consolidation (concentrate workloads in a minimal number of servers, so that the others can be hibernated) in other aggregates. Admins can also change the policies at runtime and the changes immediately take effect. Among the use cases would be Pclouds: https://blueprints.launchpad.net/nova/+spec/whole-host-allocation where we need a scheduling configuration/decision per Pcloud. It can be done easily by defining a policy for each Pcloud. Future development of the policy system will even allow users to define their own rules in their Pclouds! Best regards, Khanh-Toan Tran
Re: [openstack-dev] [Nova][Scheduler] Policy Based Scheduler and Solver Scheduler
Hi Sylvain, 1) Some filters such as AggregateCoreFilter and AggregateRAMFilter can change their parameters for aggregates. But what if an admin wants to change them for all hosts in an availability zone? Does he have to rewrite all the parameters in all aggregates? Or should we create a new AvailabilityZoneCoreFilter? The Policy Based Scheduler (PBS) blueprint separates the effect (filter according to core count) from its target (all hosts in an aggregate, or in an availability zone). It will benefit all filters, not just CoreFilter or RAMFilter, so that from now on we can avoid creating, for each filter XFilter, an AggregateXFilter and an AvailabilityZoneXFilter. Besides, if an admin wants to apply a filter to some aggregates (or availability zones) and not to others (not calling the filter at all, not just modifying its parameters), he can do it. It helps us avoid running all filters on all hosts. 2) In fact, we are also preparing for a separated scheduler, of which PBS is a very first step; that's why we purposely separate the Policy Based Scheduler from the Policy Based Scheduling Module (PBSM) [1], which is the core of our architecture. If you look at our code, you will see that Policy_Based_Scheduler.py is only slightly different from Filter Scheduler. That is because we just want a link from nova-scheduler to PBSM. We're trying to push some more management into the scheduler without causing too much modification, as you can see in the patch. Thus I'm very happy that Gantt is proposed. As I see it, Gantt is based on nova-scheduler code, with the plan of replacing nova-scheduler in the J release. The separation from Nova will be complicated, but not on the scheduling part. Thus integrating PBS and PBSM into Gantt would not be a problem.
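[Editorial illustration] The effect/target separation described in point 1) could be modelled roughly as follows. All names here are hypothetical, not the real PBS code; the sketch only shows how one filter definition can be bound to different targets instead of cloning AggregateXFilter / AvailabilityZoneXFilter variants.

```python
# A policy binds an "effect" (a filter plus its parameters) to a "target"
# (an aggregate, an availability zone, or the whole infrastructure).
policies = [
    # (target kind,        target name, filter name,  parameters)
    ("aggregate",          "agg1",      "CoreFilter", {"cpu_allocation_ratio": 1.0}),
    ("availability_zone",  "az2",       "CoreFilter", {"cpu_allocation_ratio": 16.0}),
]

def rules_for_host(host, policies):
    """Return the (filter, params) pairs whose target contains this host."""
    selected = []
    for kind, name, filt, params in policies:
        if (kind == "aggregate" and name in host["aggregates"]) or \
           (kind == "availability_zone" and name == host["az"]):
            selected.append((filt, params))
    return selected

# This host is in aggregate "agg1" and zone "az1", so only the first
# policy applies; the same CoreFilter logic serves both target kinds.
host = {"name": "compute1", "aggregates": ["agg1"], "az": "az1"}
rules = rules_for_host(host, policies)
```

Hosts outside every target get no rules at all, which is the "don't call filters at all" case mentioned above.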
Best regards, [1] https://docs.google.com/document/d/1gr4Pb1ErXymxN9QXR4G_jVjLqNOg2ij9oA0JrLwMVRA Toan From: Sylvain Bauza [mailto:sylvain.ba...@gmail.com] Sent: Thursday, January 30, 2014 11:16 To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [Nova][Scheduler] Policy Based Scheduler and Solver Scheduler Hi Khanh-Toan, I only have one comment on your proposal: why are you proposing something new for overcommitments with aggregates while the AggregateCoreFilter [1] and AggregateRAMFilter [2] already exist, which AIUI provide the same feature? I'm also concerned about the scope of changes for the scheduler, as Gantt is currently trying to replace it. Can we imagine such big changes being committed on the Nova side, while it's planned to have a Scheduler service in the near future? -Sylvain [1] https://github.com/openstack/nova/blob/master/nova/scheduler/filters/core_filter.py#L74 [2] https://github.com/openstack/nova/blob/master/nova/scheduler/filters/ram_filter.py#L75 2014-01-30 Khanh-Toan Tran khanh-toan.t...@cloudwatt.com There is an unexpected line break in the middle of the link, so I post it again: https://docs.google.com/document/d/1RfP7jRsw1mXMjd7in72ARjK0fTrsQv1bqolOriIQB2Y - Original Message - From: Khanh-Toan Tran [mailto:khanh-toan.t...@cloudwatt.com] Sent: Wednesday, January 29, 2014 13:25 To: 'OpenStack Development Mailing List (not for usage questions)' Subject: [openstack-dev] [Nova][Scheduler] Policy Based Scheduler and Solver Scheduler Dear all, As promised in the Scheduler/Gantt meeting, here is our analysis on the connection between Policy Based Scheduler and Solver Scheduler: https://docs.google.com/document/d/1RfP7jRsw1mXMjd7in72ARjK0fTrsQv1bqolOriIQB2Y This document briefs the mechanism of the two schedulers and the possibility of cooperation. It is my personal point of view only.
In a nutshell, Policy Based Scheduler allows admins to define policies for different physical resources (an aggregate, an availability zone, or the whole infrastructure) or different (classes of) users. Admins can modify (add/remove/modify) any policy at runtime, and the modification affects only the target (e.g. the aggregate, the users) that the policy is defined for. Solver Scheduler solves the placement of groups of instances simultaneously by putting all the known information into an integer linear system and using an integer-program solver on it. Thus relations between VMs, and between VMs and computes, are all accounted for. Working together, Policy Based Scheduler can supply the filters and weighers following the policy rules defined for different computes. These filters and weighers can be converted into constraints and cost functions for Solver Scheduler to solve. More details can be found in the doc. I look forward to comments and hope that we can work it out. Best regards, Khanh-Toan TRAN
[openstack-dev] [Nova][Scheduler] Policy Based Scheduler and Solver Scheduler
Dear all, As promised in the Scheduler/Gantt meeting, here is our analysis on the connection between Policy Based Scheduler and Solver Scheduler: https://docs.google.com/document/d/1RfP7jRsw1mXMjd7in72ARjK0fTrsQv1bqolOriIQB2Y This document briefs the mechanism of the two schedulers and the possibility of cooperation. It is my personal point of view only. In a nutshell, Policy Based Scheduler allows admins to define policies for different physical resources (an aggregate, an availability zone, or the whole infrastructure) or different (classes of) users. Admins can modify (add/remove/modify) any policy at runtime, and the modification affects only the target (e.g. the aggregate, the users) that the policy is defined for. Solver Scheduler solves the placement of groups of instances simultaneously by putting all the known information into an integer linear system and using an integer-program solver on it. Thus relations between VMs, and between VMs and computes, are all accounted for. Working together, Policy Based Scheduler can supply the filters and weighers following the policy rules defined for different computes. These filters and weighers can be converted into constraints and cost functions for Solver Scheduler to solve. More details can be found in the doc. I look forward to comments and hope that we can work it out. Best regards, Khanh-Toan TRAN
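[Editorial illustration] As a toy version of the "integer linear system" formulation this message describes: choose a host for each VM so that per-host capacity constraints hold and a cost function is minimized. The real Solver Scheduler hands this to an LP/IP solver; here the tiny search space is brute-forced for readability, and all data is made up.

```python
from itertools import product

hosts = {"h1": {"vcpus": 4, "ram": 8}, "h2": {"vcpus": 8, "ram": 16}}
vms = [{"vcpus": 2, "ram": 4}, {"vcpus": 2, "ram": 4}, {"vcpus": 4, "ram": 8}]

def feasible(assign):
    # Constraint: total demand placed on each host must fit its capacity.
    for h, cap in hosts.items():
        used_cpu = sum(vm["vcpus"] for vm, a in zip(vms, assign) if a == h)
        used_ram = sum(vm["ram"] for vm, a in zip(vms, assign) if a == h)
        if used_cpu > cap["vcpus"] or used_ram > cap["ram"]:
            return False
    return True

def cost(assign):
    # Objective: number of distinct hosts used (a Consolidation policy);
    # a Load Balancing policy would use a different cost function.
    return len(set(assign))

# Enumerate every VM->host assignment, keep the feasible ones, minimize cost.
best = min((a for a in product(hosts, repeat=len(vms)) if feasible(a)),
           key=cost)
```

Because all three VMs fit on h2 together (8 vCPUs, 16 GB), the consolidation objective packs them onto one host; swapping in a load-balancing cost would spread them instead, which is exactly the policy/solver cooperation the message proposes.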
Re: [openstack-dev] [gantt] Scheduler sub-group meeting agenda 1/28
Dear all, If we have time, I would like to discuss our new blueprint, Policy-Based-Scheduler: https://blueprints.launchpad.net/nova/+spec/policy-based-scheduler whose code is ready for review: https://review.openstack.org/#/c/61386/ Best regards, Toan - Original Message - From: Donald D Dugger donald.d.dug...@intel.com To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org Sent: Tuesday, January 28, 2014 4:46:12 AM Subject: [openstack-dev] [gantt] Scheduler sub-group meeting agenda 1/28 1) Memcached based scheduler updates 2) Scheduler code forklift 3) Opens -- Don Dugger Censeo Toto nos in Kansa esse decisse. - D. Gale Ph: 303/443-3786
Re: [openstack-dev] Next steps for Whole Host allocation / Pclouds
Exactly - that's why I wanted to start this debate about the way forward for the Pcloud blueprint, which was heading into some kind of middle ground. As per my original post, and it sounds like the three of us are at least aligned, I'm proposing to split this into two streams: i) A new BP that introduces the equivalent of AWS dedicated instances. Why do you want to transform Pcloud into AWS dedicated instances? As I see it, Pcloud is for requesting physical hosts (HostFlavors, as in the Pcloud wiki) on which users can create their own instances (theoretically in unlimited number). Therefore it should be charged per physical server (HostFlavor), not per instance. It is completely different from AWS dedicated instances, which are charged per instance. IMO, Pcloud resembles a GoGrid dedicated server, not an AWS dedicated instance. If you want to provide an AWS-dedicated-instance-type service, then it would not be Pcloud, nor is it a continuation of the WholeHostAllocation blueprint, which, IMO, is damned well designed. It'll be just another scheduler job. Well, I did not say that it's not worth pursuing; I just say that WholeHostAllocation is worth being kept as Pcloud. User - only has to specify at boot time that the instance must be on a host used exclusively by that tenant. Scheduler - either finds a host which matches this constraint or it doesn't. No linkage to aggregates (other than that from other filters), no need for the aggregate to have been pre-configured. Compute Manager - has to check the constraint (as with any other scheduler limit) and add the info that this is a dedicated instance to notification messages. Operator - has to manage capacity as they do for any other such constraint (it is a significant capacity mgmt issue, but no worse in my mind than having flavors that can consume most of a host), and work out how they want to charge for such a model (flat-rate additional charge for the first such instance, charge each time a new host is used, etc).
How about using migration to release compute hosts for new allocations? In a standard configuration, an admin would use Load Balancing for his compute hosts. Thus if we don't have a dedicated resource pool (this comes back to aggregate configuration), then all hosts would be used, which leaves no host empty for hosting dedicated instances. I think there is clear water between this and the existing aggregate-based isolation. I also think this is a different use case from reservations. It's *mostly* like a new scheduler hint, but because it has billing impacts I think it needs to be more than just that - for example, the ability to request a dedicated instance is something that should be controlled by a specific role. Agreed. The billing is rather the problem here. Nova can handle this all right, but how does this new functionality cope with the billing model? Basically, which information is recorded, and where. ii) Leave the concept of private clouds within a cloud to something that can be handled at the region level. I think there are valid use cases here, but it doesn't make sense to try and get this kind of granularity within Nova.
Re: [openstack-dev] [Nova][Scheduler] Volunteers wanted for a modest proposal for an external scheduler in our lifetime
We are also interested in the proposal and would like to contribute whatever we can. Currently we're working on nova-scheduler, and we think that an independent scheduler is needed for OpenStack. We've been engaging in several discussions on this topic in the ML as well as in the Nova meeting, thus we were thrilled to hear your proposal. PS: I wrote a mail expressing our interest in this topic earlier, but I feel it's better to have a more official submission to join the team :) Best regards, Jerome Gallard Khanh-Toan Tran - Original Message - From: Robert Collins [mailto:robe...@robertcollins.net] Sent: Tuesday, December 3, 2013 09:18 To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [Nova][Scheduler] Volunteers wanted for a modest proposal for an external scheduler in our lifetime The team size was a minimum, not a maximum - please add your names. We're currently waiting on the prerequisite blueprint to land before work starts in earnest; and for the blueprint to be approved (he says, without having checked to see if it has been now :)) -Rob
Re: [openstack-dev] [Nova][Scheduler] Volunteers wanted for a modest proposal for an external scheduler in our lifetime
The first stage is technical - move Nova scheduling code from A to B. What do we achieve - not much - we actually complicate things - there is always churn in Nova and we will have duplicate code bases. In addition, the only service that can actually make use of it is Nova. The second stage is defining an API that other modules can use (we have yet to decide if this will be RPC-based or have an interface like Glance, Cinder, etc.). We have yet to even talk about the APIs. The third stage is adding shiny new features and trying to not have the community tar and feather us. Yup; I look forward to our tar-and-feathering overlords. :) Prior to copying code we really need to discuss the APIs. I don't think we do: it's clear that we need to come up with them - it's necessary, and no one has expressed any doubt about the ability to do that. RPC API evolution is fairly well understood - we add a new method and have it do the necessary, then we go to the users and get them using it, then we delete the old one. I agree with Robert. I think that Nova RPC is quite enough for the new scheduler right now. Most of the scheduler work focuses on Nova anyway, so starting from there is reasonable and rather easy for the transition. We can think about enhancing the API later (even creating a REST API, perhaps). This can even be done in parallel if your concern is time and resources. But the point is that we need an API to interface with the service. For a start we can just address the Nova use case. We need to at least address: 1. Scheduling interface 2. Statistics updates 3. APIs for configuring the scheduling policies If by 2. Statistics updates you mean the database issue for the scheduler, then yes, it is a big issue, especially during the transition period when Nova still holds the host state data. Should the scheduler get access to Nova's DB for the time being, and later fork out the DB to the scheduler? According to Boris, Mirantis has already studied the separation of host state from Nova's DB.
I think we can benefit from their experience.
Re: [openstack-dev] [Nova][Scheduler] Volunteers wanted for a modest proposal for an external scheduler in our lifetime
Dear all, I'm very interested in this subject as well. Actually there is also a discussion of the possibility of an independent scheduler in the mailing list: http://lists.openstack.org/pipermail/openstack-dev/2013-November/019518.html Would it be possible to discuss this subject at the next Scheduler meeting, Nov 26th? Best regards, Toan - Original Message - From: Mike Spreitzer mspre...@us.ibm.com To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org Sent: Friday, November 22, 2013 4:58:46 PM Subject: Re: [openstack-dev] [Nova][Scheduler] Volunteers wanted for a modest proposal for an external scheduler in our lifetime I'm still a newbie here, so I cannot claim my Nova skills are even modest. But I'd like to track this, if nothing more. Thanks, Mike
Re: [openstack-dev] [nova][cinder][oslo][scheduler] How to leverage oslo scheduler/filters for nova and cinder
Boris, Is it really OK to drop these tables? Could Nova work without them (e.g. rollback)? And what if Ceilometer asks Nova for host state metrics? Yes it is OK, because now Ceilometer and other projects can ask the scheduler about host state. (I don't see any problems.) IMO, since the scheduler doesn't communicate directly with hypervisors - that's the role of the computes - we should not rely on it for collecting the host state data. I think it should be the inverse, i.e. the scheduler relies on others, such as Ceilometer, for that matter. But this means we have to deal with data synchronization. Alex, By the way, since the relationships between resources are likely to reside in the Heat DB, it could make sense to have this thing as a new Engine under the Heat umbrella (as discussed in a couple of other threads, you are also likely to need orchestration when dealing with groups of resources). I'm not so sure that this scheduler should fall into Heat. Heat does not know *every* compute; it communicates with the Nova API and that's all it knows. The scheduler has complete knowledge of the infrastructure, and answers the question of which compute hosts which VM. Thus whoever the scheduler responds to should be able to communicate with *every* compute. For instance, the scheduler can directly initiate VMs like in the old days, or have some conductor for this task, or some orchestration like you said. Of course, Heat calling this scheduler, which then initiates the VMs, is a sound scenario. But otherwise, having this scheduler integrated into Heat is too intrusive into the infrastructure.
Best regards, Toan From: Alex Glikson [mailto:glik...@il.ibm.com] Sent: Monday, November 18, 2013 09:29 To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [nova][cinder][oslo][scheduler] How to leverage oslo scheduler/filters for nova and cinder Boris Pavlovic bpavlo...@mirantis.com wrote on 18/11/2013 08:31:20 AM: Actually the schedulers in Nova and Cinder are almost the same. Well, this is kind of expected, since the Cinder scheduler started as a copy-paste of the Nova scheduler :-) But they have already started diverging (not sure whether this is necessarily a bad thing or not). So, Cinder (as well as Neutron, and potentially others) would need to be hooked to Nova RPC? As a first step, to prove the approach, yes - but I hope that we won't have a Nova or Cinder scheduler at all. We will have just a scheduler that works well. So, do you envision this code being merged in Nova first, and then moved out? Or started as a new thing from the beginning? Also, when it is separate (probably not in Icehouse?), will the communication continue being over RPC, or would we need to switch to REST? This could be conceptually similar to the communication between cells today, via a separate RPC. By the way, since the relationships between resources are likely to reside in the Heat DB, it could make sense to have this thing as a new Engine under the Heat umbrella (as discussed in a couple of other threads, you are also likely to need orchestration when dealing with groups of resources). Instances of memcached, in an environment with multiple schedulers: I think you mentioned that if we have, say, 10 schedulers, we will also have 10 instances of memcached. Actually we are going to make an implementation based on SQLAlchemy as well. In the case of memcached I just described one possible architecture, where you could run a memcached instance on each server that runs a scheduler service.
But it is not required; you can even have just one memcached instance for all schedulers (but that is not HA). I am not saying that having multiple instances of memcached is wrong - just that it would require some work. It seems that one possible approach could be partitioning -- each scheduler will take care of a subset of the environment (availability zone?). This way data will be naturally partitioned too, and the data in the memcached instances will not need to be synchronized. Of course, making this HA would also require some effort (something like ZooKeeper could be really useful to manage all of this - configuration of each scheduler, ownership of underlying 'zones', leader election, etc). Regards, Alex Best regards, Boris Pavlovic --- Mirantis Inc.
Re: [openstack-dev] [nova] Configure overcommit policy
Step 1: use flavors so Nova can tell between the two workloads, and configure them differently. Step 2: find capacity for your workload given your current cloud usage. At the moment, most of our solutions involve reserving bits of your cloud capacity for different workloads, generally using host aggregates. The issue with claiming back capacity from other workloads is a bit trickier. The issue is I don't think you have defined where you get that capacity back from. Maybe you want to look at giving some workloads a higher priority over the constrained CPU resources? But you will probably starve the little people out at random, which seems bad. Maybe you want to have a concept of spot instances where they can use your spare capacity until you need it, and you can just kill them? But maybe I am misunderstanding your use case; it's not totally clear to me. Yes, currently we can only reserve some hosts for particular workloads. But «reservation» is done by an admin operation, not «on-demand», as I understand it. Anyway, it's just some speculation about what I think Alexander's use case is. Or maybe I misunderstand Alexander? It is interesting to see the development of the CPU entitlement blueprint that Alex mentioned. It was registered in Jan 2013. Any idea whether it is still going on? From: Alex Glikson [mailto:glik...@il.ibm.com] Sent: Thursday, November 14, 2013 16:13 To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [nova] Configure overcommit policy In fact, there is a blueprint which would enable supporting this scenario without partitioning -- https://blueprints.launchpad.net/nova/+spec/cpu-entitlement The idea is to annotate flavors with CPU allocation guarantees, and enable differentiation between instances, potentially running on the same host. The implementation augments the CoreFilter code to factor in the differentiation. Hopefully this will be out for review soon.
Regards, Alex From: John Garbutt j...@johngarbutt.com To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org, Date: 14/11/2013 04:57 PM Subject: Re: [openstack-dev] [nova] Configure overcommit policy On 13 November 2013 14:51, Khanh-Toan Tran khanh-toan.t...@cloudwatt.com wrote: Well, I don't know what John means by modify the over-commit calculation in the scheduler, so I cannot comment. I was talking about this code: https://github.com/openstack/nova/blob/master/nova/scheduler/filters/core_filter.py#L64 But I am not sure that's what you want. The idea of choosing a free host for Hadoop on the fly is rather complicated and contains several operations, namely: (1) assuring the host never gets past 100% CPU load; (2) identifying a host that already has a Hadoop VM running on it, or already has 100% CPU commitment; (3) releasing the host from 100% CPU commitment once the Hadoop VM stops; (4) possibly preventing other applications from using the host (to economize host resources). - You'll need (1) because otherwise your Hadoop VM would come up short of resources after the host gets overloaded. - You'll need (2) because you don't want to restrict a new host while one of your 100% CPU committed hosts still has free resources. - You'll need (3) because otherwise your host would be forever restricted, and that is no longer on the fly. - You may need (4) because otherwise it'd be a waste of resources. The problem with changing CPU overcommit on the fly is that while your Hadoop VM is still running, someone else can add another VM on the same host with a higher CPU overcommit (e.g. 200%), violating (1) and thus affecting your Hadoop VM too. The idea of putting the host in an aggregate can give you (1) and (2). (4) is done by AggregateInstanceExtraSpecsFilter. However, it does not give you (3), which can be done with Pcloud.
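[Editorial illustration] The overcommit calculation in the CoreFilter code linked above boils down to a single inequality; a simplified sketch (not the real filter, which also handles per-aggregate ratio overrides and scheduler limits):

```python
def core_filter_passes(host_vcpus, vcpus_used, requested, cpu_allocation_ratio):
    """A host passes if used vCPUs plus the request fit under the
    overcommitted limit: physical vCPUs times the allocation ratio."""
    limit = host_vcpus * cpu_allocation_ratio
    return vcpus_used + requested <= limit

# With the default 16:1 ratio, a 4-core host accepts up to 64 vCPUs of
# instances; with ratio 1.0 (the no-overcommit Hadoop case discussed in
# this thread) the same host stops at 4 vCPUs.
```

This is why point (1) above breaks when another request arrives carrying a higher ratio: the limit itself depends on the ratio the filter is configured with at request time, not on a property pinned to the host.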
Step 1: use flavors so nova can tell the two workloads apart, and configure them differently. Step 2: find capacity for your workload given your current cloud usage. At the moment, most of our solutions involve reserving bits of your cloud capacity for different workloads, generally using host aggregates. The issue with claiming back capacity from other workloads is a bit trickier: I don't think you have defined where you get that capacity back from. Maybe you want to look at giving some workloads a higher priority over the constrained CPU resources? But then you will probably starve the little people out at random, which seems bad. Maybe you want a concept of spot instances, where they can use your spare capacity until you need it, and you can just kill them? But maybe I am misunderstanding your use case; it's not totally clear to me. John ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
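John's Step 1, telling workloads apart via flavors, is usually done with flavor extra specs matched against aggregate metadata. A simplified sketch of the matching idea behind AggregateInstanceExtraSpecsFilter (not nova's real code; the 'workload' key is invented for illustration):

```python
# Sketch of flavor-vs-aggregate matching: a flavor tagged for a workload
# only lands on hosts whose aggregate metadata carries the same tag.
# The 'workload' key and all values are invented for this example.

def host_matches(flavor_extra_specs, aggregate_metadata):
    """True if every extra spec on the flavor is satisfied by the
    host's aggregate metadata."""
    return all(aggregate_metadata.get(k) == v
               for k, v in flavor_extra_specs.items())

hadoop_flavor = {"workload": "hadoop"}
hadoop_agg = {"workload": "hadoop", "cpu_allocation_ratio": "1.0"}
general_agg = {}

assert host_matches(hadoop_flavor, hadoop_agg)
assert not host_matches(hadoop_flavor, general_agg)
```

Once the Hadoop flavor can only land on the Hadoop aggregate, the aggregate's own allocation ratio does the rest.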
Re: [openstack-dev] [nova] Configure overcommit policy
FYI, by default OpenStack overcommits CPU at 1:16, meaning a host can allocate 16 vCPUs per physical core it possesses. As Alex mentioned, you can change this by enabling AggregateCoreFilter in nova.conf: scheduler_default_filters = <list of your filters, adding AggregateCoreFilter here> and modifying the overcommit ratio by adding: cpu_allocation_ratio=1.0

Just a suggestion: think of isolating the host for the tenant that uses Hadoop so that it will not serve other applications. You have several filters at your disposal: AggregateInstanceExtraSpecsFilter, IsolatedHostsFilter, AggregateMultiTenancyIsolation.

Best regards, Toan

- Original Message - From: Alex Glikson glik...@il.ibm.com To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org Sent: Tuesday, November 12, 2013 3:54:02 PM Subject: Re: [openstack-dev] [nova] Configure overcommit policy

You can consider having a separate host aggregate for Hadoop, and use a combination of AggregateInstanceExtraSpecsFilter (with a special flavor mapped to this host aggregate) and AggregateCoreFilter (overriding cpu_allocation_ratio for this host aggregate to be 1).

Regards, Alex

From: John Garbutt j...@johngarbutt.com To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org Date: 12/11/2013 04:41 PM Subject: Re: [openstack-dev] [nova] Configure overcommit policy

On 11 November 2013 12:04, Alexander Kuznetsov akuznet...@mirantis.com wrote: Hi all, While studying Hadoop performance in a virtual environment, I found an interesting problem with Nova scheduling. In an OpenStack cluster we have an overcommit policy, allowing us to put on one compute node more VMs than there are resources available for them. While that might be suitable for general types of workload, it is definitely not the case for Hadoop clusters, which usually consume 100% of system resources.
Is there any way to tell Nova to schedule specific instances (the ones which consume 100% of system resources) without overcommitting resources on the compute node? You could have a flavor with a no-overcommit extra spec, and modify the over-commit calculation in the scheduler for that case, but I don't remember seeing that in there. John ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
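Putting Toan's and Alex's suggestions from this thread together, the nova.conf changes might look like this (a sketch; the filter names are the real ones discussed above, but adapt the full filter list to your deployment):

```ini
# nova.conf (scheduler options) -- sketch only
scheduler_default_filters = AggregateInstanceExtraSpecsFilter,AggregateCoreFilter,RamFilter,ComputeFilter,AvailabilityZoneFilter
# Global default ratio; AggregateCoreFilter lets a host aggregate override
# it via aggregate metadata (e.g. cpu_allocation_ratio=1.0 set on the
# Hadoop aggregate), giving that aggregate no-overcommit behaviour.
cpu_allocation_ratio = 16.0
```

The per-aggregate override, plus a flavor whose extra specs map it to the Hadoop aggregate, is the combination Alex describes.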
Re: [openstack-dev] When is it okay for submitters to say 'I don't want to add tests' ?
Hey, thanks a lot! - Original Message - From: Clint Byrum cl...@fewbar.com To: openstack-dev openstack-dev@lists.openstack.org Sent: Thursday, October 31, 2013 7:49:55 PM Subject: Re: [openstack-dev] When is it okay for submitters to say 'I don't want to add tests' ?

Excerpts from Khanh-Toan Tran's message of 2013-10-31 07:22:06 -0700: Hi all, As a newcomer to the community, I'm not familiar with unit testing and how to use it here. I've learned that Jenkins runs tests every time we submit some code. But how do we write the tests, and what makes a 'good test' or a 'bad test'? I saw some commits in Gerrit but am unable to say whether the written test is enough to judge the code, since it is the author of the code who writes the test. Is there a framework to follow or some rules/practices to respect? Do you have some links to help me out?

This is a nice synopsis of the concept of test-driven development: http://net.tutsplus.com/tutorials/python-tutorials/test-driven-development-in-python/

In OpenStack we always put tests in _base_module_name_/tests. So if you are working on nova, you can see the unit tests in: nova/tests

You can generally run the tests by installing the 'tox' python module/command on your system and running 'tox' in the root of the git repository. Projects use various testing helpers to make tests easier to read and write. The most common one is testtools. A typical test will look like this:

    import testtools

    from basemodule import submodule


    class TestSubmoduleFoo(testtools.TestCase):
        def test_foo_apple(self):
            self.assertEqual(1, submodule.foo('apple'))

        def test_foo_banana(self):
            self.assertEqual(0, submodule.foo('banana'))

Often unit tests will include mocks and fakes to hide real-world interfacing code from the unit tests. You would do well to read up on how those concepts work as well; google for 'python test mocking' and 'python test fakes'. Good luck, and #openstack-dev is always there to try and help.
:) ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
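For the mocks and fakes Clint points to, the standard tool nowadays is unittest.mock (shipped with Python 3.3+; on the Python 2 of this era it was the separate 'mock' package). A minimal, self-contained illustration of the idea, with all names invented for the example:

```python
import unittest
from unittest import mock


class Notifier:
    """Stand-in for real-world interfacing code we never want tests to hit."""
    def send(self, msg):
        raise RuntimeError("talks to the real world")


def alert_if_overloaded(notifier, cpu_load):
    """Code under test: alert when CPU load crosses a threshold."""
    if cpu_load > 0.9:
        notifier.send("host overloaded")
        return True
    return False


class TestAlert(unittest.TestCase):
    def test_alert_sent_when_overloaded(self):
        fake = mock.Mock(spec=Notifier)  # the mock replaces the real Notifier
        self.assertTrue(alert_if_overloaded(fake, 0.95))
        fake.send.assert_called_once_with("host overloaded")

    def test_no_alert_under_threshold(self):
        fake = mock.Mock(spec=Notifier)
        self.assertFalse(alert_if_overloaded(fake, 0.5))
        fake.send.assert_not_called()


if __name__ == "__main__":
    unittest.main()
```

The spec= argument makes the mock reject attributes the real class doesn't have, which keeps fakes honest.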
Re: [openstack-dev] When is it okay for submitters to say 'I don't want to add tests' ?
Hi all, As a newcomer to the community, I'm not familiar with unit testing and how to use it here. I've learned that Jenkins runs tests every time we submit some code. But how do we write the tests, and what makes a 'good test' or a 'bad test'? I saw some commits in Gerrit but am unable to say whether the written test is enough to judge the code, since it is the author of the code who writes the test. Is there a framework to follow or some rules/practices to respect? Do you have some links to help me out? Thanks, Toan

- Original Message - From: Kyle Mestery (kmestery) kmest...@cisco.com To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org Sent: Thursday, October 31, 2013 3:05:27 PM Subject: Re: [openstack-dev] When is it okay for submitters to say 'I don't want to add tests' ?

On Oct 31, 2013, at 8:56 AM, Clint Byrum cl...@fewbar.com wrote: Excerpts from Mark McLoughlin's message of 2013-10-31 06:30:32 -0700: On Thu, 2013-10-31 at 15:37 +1300, Robert Collins wrote: This is a bit of a social-norms thread. I've been consistently asking for tests in reviews for a while now, and I get the occasional push-back. I think this falls into a few broad camps:
A - there is no test suite at all, adding one is unreasonable
B - this thing cannot be tested in this context (e.g. functional tests are defined in a different tree)
C - this particular thing is very hard to test
D - testing this won't offer benefit
E - other things like this in the project don't have tests
F - submitter doesn't know how to write tests
G - submitter doesn't have time to write tests
Nice breakdown. Now, of these, I think it's fine not to add tests in cases A, B, C in combination with D, and D. I don't think E, F or G are sufficient reasons to merge something without tests when reviewers are asking for them. G in the special case that the project really wants the patch landed - but then I'd expect reviewers to not ask for tests, or to volunteer that they might be optional.
I totally agree with the sentiment, but especially when it's a newcomer to the project, I try to put myself in the shoes of the patch submitter and double-check whether what we're asking is reasonable. Even with a long-time contributor, empathy is an important part of constructing reviews. We could make more robotic things that review for test coverage, but we haven't, because this is a gray area. The role of a reviewer isn't just to get patches merged and stop defects; it is also to grow the other developers.

For example, if someone shows up to Nova with their first OpenStack contribution, it fixes something which is unquestionably a bug - think a typo like raise NotFund('foo') - and testing this code path requires more than adding a simple new scenario to existing tests ... This goes back to my recent suggestion to help the person not with a -1 or a +2, but with an additional patch that fixes it. That, for me, is an example where "-1, we need a test! untested code is broken!" is really shooting the messenger, not valuing the newcomer's contribution, and risks turning that person off the project forever. Reviewers being overly aggressive about this where the project doesn't have full test coverage to begin with really makes us seem unwelcoming.

In cases like that, I'd be of a mind to go "+2 Awesome! Thanks for catching this! It would be great to have a unit test for this, but it's clear the current code is broken, so I'm fine with merging the fix without a test." You could say it's now the reviewer's responsibility to merge a test, but if that requirement then turns off reviewers from even reviewing such a patch, that doesn't help either.

I understand entirely why you choose this, and I think that is fine. I, however, see this as a massive opportunity to teach. That code was only broken because it was allowed to be merged without tests. By letting that situation continue, we only fix it for today.
The next major refactoring now has a high chance of breaking that part of the code again. So, rather than +2, I suggest "-1 with compassion". Engage with the submitter. If you don't know them, take a look at how hard it would be to write a test for the behavior, and give pointers to the exact test suite that would need to be changed, or suggest a new test suite and point at a good example to copy. So, with all of this, let's make sure we don't forget to first appreciate the effort that went into submitting the patch that lacks tests. I'm not going to claim that I've always practiced "-1 with compassion", so thanks for reminding us all that we're not just reviewing code, we are having a dialog with real live people.

I think this is the key thing here; thanks for bringing this up, Clint. At the end of the day, patches are submitted by real people. If we want to grow the committer base and help people to become better reviewers, taking the time to show them
Re: [openstack-dev] [nova][scheduler] Instance Group Model and APIs - Updated document with an example request payload
Hi Yathi, Thank you for your example. I have some remarks concerning the JSON format:
1) Membership of a group is recursive: a member can be a group or an instance. In this case there are two different declaration formats for members, as with http-server-group-1 (name, policy, edge) and Http-Server-1 (name, request_spec, type). Would it be better if a group-typed member also had a type field, to better interpret the member? Like a policy, which has a type field to declare whether it is an edge-typed policy or a group-typed policy.
2) The edge is not clear to me. It seems to me that an edge is just a placeholder for the edge policy. Does it have some particular configuration, like group members do (e.g. a group-typed member is described by its members, edges and policies, while an instance-typed member is described by its request_spec)?
3) Member groups have policy declarations nested in them. Why is the edge-policy declared outside of the edge's declaration?
Anyway, good work. Toan

- Original Message - From: John Garbutt j...@johngarbutt.com To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org Sent: Tuesday, October 29, 2013 12:29:19 PM Subject: Re: [openstack-dev] [nova][scheduler] Instance Group Model and APIs - Updated document with an example request payload

On 29 October 2013 06:46, Yathiraj Udupi (yudupi) yud...@cisco.com wrote: The Instance Group API document is now updated with a simple example request payload of a nested group, and some description of how the API implementation should handle the registration of the components of a nested instance group. https://docs.google.com/document/d/17OIiBoIavih-1y4zzK0oXyI66529f-7JTCVj-BcXURA/edit Hope we will have a good API design session at the summit, and also continue the discussion face-to-face over there.
It's looking good, but I was thinking about a slightly different approach:
* I would like to see instance groups be used to describe all scheduler hints (including "please run on cell X" or "please run on hypervisor Y")
* passing old scheduler hints to the API would just create a new instance group to persist the request
* ensure live-migrate/migrate never lets you violate the rules in the user hints; at least don't allow it to happen by accident
* I was expecting to see hard and soft constraints/hints, like: "try to keep in the same switch, but make sure on separate servers"
* it would be nice to have admin-defined global options, like: "ensure a tenant does not have two servers on the same hypervisor", either hard or soft
* I expected to see the existing boot-server command simply gain a reference to a group, keeping the existing methods of specifying multiple instances
* I agree you can't change a group's spec once you have started some VMs in that group, but you could then simply launch more VMs keeping to the same policy
* the new task API (see summit session) should help with reporting whether the VM actually could be started or not, and the reason why it was not possible
* augment the server details (and group?) with more location information saying where the scheduler actually put things, obfuscated on a per-tenant basis. So imagine nova, cinder and neutron exposing ordered (arbitrarily tagged) location metadata, like nova: ((host_id, foo), (switch_group_id: bar), (power_group: bas))
* the above should help us define the scope of a constraint relative to either a nova, cinder or neutron resource
* consider a constraint that includes constraints about groups, like "must be separate from group X, in the scope of the switch", or something like that
* more thought is needed on constraints between volumes, servers and networks; I don't think edges are the right way to state that. I think it would be better as a cross-group constraint, where the scope of the constraint is related to neutron.
Anyways, just a few thoughts I was having. Does that fit in with what you were thinking? John ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
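Toan's remark (1) above can be made concrete with a hypothetical payload in which every member, group-typed or instance-typed, carries an explicit type field. This is an illustrative sketch only, not the draft document's actual schema; all field names and values are invented:

```python
# Hypothetical nested instance-group payload where both group-typed and
# instance-typed members declare an explicit "type" key, so a parser can
# interpret any member uniformly. Illustrative only.
group = {
    "name": "http-server-group-1",
    "type": "group",
    "policies": [{"type": "group", "name": "anti-affinity"}],
    "members": [
        {
            "name": "Http-Server-1",
            "type": "instance",
            "request_spec": {"flavor": "m1.small", "image": "http-img"},
        },
        {
            "name": "db-group-1",
            "type": "group",   # same "type" key disambiguates this member
            "policies": [{"type": "edge", "name": "network-proximity"}],
            "members": [
                {"name": "DB-1", "type": "instance",
                 "request_spec": {"flavor": "m1.large", "image": "db-img"}},
            ],
        },
    ],
}

# Every member, group or instance, can now be dispatched on its "type".
assert all("type" in m for m in group["members"])
```

With a uniform type key, the recursive member structure needs only one declaration format instead of two.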
Re: [openstack-dev] [scheduler] APIs for Smart Resource Placement - Updated Instance Group Model and API extension model - WIP Draft
I didn't see any command referring to InstanceGroupMemberConnection. What is it exactly? Could you give an example? And how can we create an InstanceGroup? 1) Create an empty group; 2) Add policy, metadata; 3) Add group instances ...? Or does the InstanceGroup POST message already contain a description of all InstanceGroupMembers, Connections, etc.? A (raw) example would be really helpful for understanding the proposition. Best regards, Toan

- Original Message - From: Mike Spreitzer mspre...@us.ibm.com To: Yathiraj Udupi (yudupi) yud...@cisco.com Cc: OpenStack Development Mailing List openstack-dev@lists.openstack.org Sent: Wednesday, October 23, 2013 5:36:25 AM Subject: Re: [openstack-dev] [scheduler] APIs for Smart Resource Placement - Updated Instance Group Model and API extension model - WIP Draft

Yathiraj Udupi (yudupi) yud...@cisco.com wrote on 10/15/2013 03:08:32 AM: I have made some edits to the document: https://docs.google.com/document/d/17OIiBoIavih-1y4zzK0oXyI66529f-7JTCVj-BcXURA/edit?pli=1# ... One other minor thing to discuss in the modeling is metadata. I am not eager to totally gorp up the model, but shouldn't all sorts of things allow metadata? Thanks, Mike ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] support for multiple active scheduler policies/drivers
I'm not sure this is a good moment for it, but I would like to re-open the topic a little bit. Just a small idea: is it OK if we use a file, or a database, as a central point to store the policies and their associated aggregates? The scheduler reads it first, then calls the scheduler drivers listed in the policy file for the associated aggregates. In this case we can get the list of filters and targeted aggregates before actually running the filters, and thus avoid the loop filter -> aggregate -> policy -> filter. Moreover, the admin does not need to populate the flavors' extra_specs or associate them with the aggregates, effectively avoiding defining two different policies in two flavors whose VMs are eventually hosted in the same aggregate. The downside of this method is that it is not API-accessible: at the current state we do not have a policy management system. I would like a policy management system with a REST API, but still, it is no worse than using the nova config. Best regards, Toan

Alex Glikson GLIKSON at il.ibm.com Wed Aug 21 17:25:30 UTC 2013

Just to update those who are interested in this feature but were not able to follow the recent commits: we made good progress converging towards a simplified design based on a combination of aggregates and flavors (both of which are API-driven), addressing some of the concerns expressed in this thread (at least to a certain extent). The current design and a possible usage scenario have been updated at https://wiki.openstack.org/wiki/Nova/MultipleSchedulerPolicies Comments are welcome (as well as code reviews at https://review.openstack.org/#/c/37407/).
Thanks, Alex

From: Joe Gordon joe.gordon0 at gmail.com To: OpenStack Development Mailing List openstack-dev at lists.openstack.org Date: 27/07/2013 01:22 AM Subject: Re: [openstack-dev] [Nova] support for multiple active scheduler policies/drivers

On Wed, Jul 24, 2013 at 6:18 PM, Alex Glikson GLIKSON at il.ibm.com wrote: Russell Bryant rbryant at redhat.com wrote on 24/07/2013 07:14:27 PM: I really like your point about not needing to set things up via a config file. That's fairly limiting since you can't change it on the fly via the API.

True. As I pointed out in another response, the ultimate goal would be to have policies as 'first-class citizens' in Nova, including a DB table, API, etc. Maybe even a separate policy service? But in the meantime, it seems that the approach with a config file is a reasonable compromise in terms of usability, consistency and simplicity.

I do like your idea of making policies first-class citizens in Nova, but I am not sure doing this in nova is enough. Wouldn't we need similar things in Cinder and Neutron? Unfortunately this does tie into how to do good scheduling across multiple services, which is another rabbit hole altogether. I don't like the idea of putting more logic in the config file; as it is, the config files are already too complex, making running any OpenStack deployment require some config-file templating and some metadata magic (like Heat). I would prefer to keep things like this in aggregates, or something else with a REST API. So why not build a tool on top of aggregates to push the appropriate metadata into the aggregates? This will give you a central point to manage policies that can easily be updated on the fly (unlike config files). In the long run I am interested in seeing OpenStack itself have a strong solution for policies as a first-class citizen, but I am not sure if your proposal is the best first step to do that.
Regards, Alex -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
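The central policy store Toan floats at the top of this thread could be as small as the following sketch. Everything here (the repository layout, the policy and filter names, the "*" wildcard convention) is invented for illustration; it is not the blueprint's code:

```python
# Sketch of a central policy store: the scheduler reads the
# policy -> (driver, filters, aggregates) mapping up front, so the filter
# list for a target aggregate is known before any filter runs, breaking
# the filter -> aggregate -> policy -> filter loop. Names are invented.
POLICY_REPO = {
    "no-overcommit": {"driver": "FilterScheduler",
                      "filters": ["AggregateCoreFilter"],
                      "aggregates": ["hadoop-agg"]},
    "default":       {"driver": "FilterScheduler",
                      "filters": ["CoreFilter", "RamFilter"],
                      "aggregates": ["*"]},   # "*" = catch-all, by convention
}


def filters_for_aggregate(agg):
    """Resolve the filter list for an aggregate before running any filter."""
    for policy in POLICY_REPO.values():
        if agg in policy["aggregates"] or "*" in policy["aggregates"]:
            return policy["filters"]


assert filters_for_aggregate("hadoop-agg") == ["AggregateCoreFilter"]
assert filters_for_aggregate("web-agg") == ["CoreFilter", "RamFilter"]
```

Whether this mapping lives in a file, a DB table, or behind a REST API is exactly the usability trade-off debated in this thread.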
Re: [openstack-dev] [nova][scheduler] A new blueprint for Nova-scheduler: Policy-based Scheduler
I like what you proposed in the blueprint. I totally agree that nova-scheduler needs finer granularity in its usage of filters and weighers; our objectives are thus very similar. Our approach is a little different. Since flavors are choices of clients, and aggregates are selected during host selection (which comes after filtering), we choose to separate the policies from flavors and aggregates and put them into a Policy Repository (a database or a simple file). The Policy-based Scheduler then looks at the repository first to know which policy applies to which target (aggregates, tenants, etc.). It is an extensible architecture: it allows customizing policies and plugging in other solutions easily. A policy may be as simple as applying, like in your proposal, a filter (policy -> (filter + aggregate)), a weigher, a combination of them, or a completely new driver, say a new scheduling solution. Currently we're working on an implementation of the blueprint which allows only the admin to set up policies, but I also like the idea of letting clients state their preferences (e.g. preferred availability zone, anti-affinity, a choice between silver-class or gold-class service). It is a question of philosophy. Best regards, Toan

Global archi: https://docs.google.com/document/d/1gr4Pb1ErXymxN9QXR4G_jVjLqNOg2ij9oA0JrLwMVRA

-- Original message -- Subject: Re: [openstack-dev] [nova][scheduler] A new blueprint for Nova-scheduler: Policy-based Scheduler Date: Wed, 16 Oct 2013 14:38:38 +0300 From: Alex Glikson glik...@il.ibm.com Reply-To: OpenStack Development Mailing List openstack-dev@lists.openstack.org To: OpenStack Development Mailing List openstack-dev@lists.openstack.org

This sounds very similar to https://blueprints.launchpad.net/nova/+spec/multiple-scheduler-drivers We worked on it in Havana, learned a lot from the feedback during the review cycle, and hopefully will finalize the details at the summit and be able to finish the implementation in Icehouse. Would be great to collaborate.
Regards, Alex

From: Khanh-Toan Tran khanh-toan.t...@cloudwatt.com To: openstack-dev@lists.openstack.org Date: 16/10/2013 01:42 PM Subject: [openstack-dev] [nova][scheduler] A new blueprint for Nova-scheduler: Policy-based Scheduler

Dear all, I've registered a new blueprint for nova-scheduler. The purpose of the blueprint is to propose a new scheduler that is based on policy: https://blueprints.launchpad.net/nova/+spec/policy-based-scheduler With the current Filter_Scheduler, an admin cannot change his placement policy without restarting nova-scheduler. Neither can he define a local policy for a group of resources (say, an aggregate) or a particular client. Thus we propose this scheduler to provide the admin with the capability of defining/changing his placement policy at runtime. The placement policy can be global (concerning all resources), local (concerning a group of resources), or tenant-specific. Please don't hesitate to contact us for discussion; all your comments are welcome!

Best regards, Khanh-Toan TRAN Cloudwatt Email: khanh-toan.tran[at]cloudwatt.com 892 Rue Yves Kermen 92100 BOULOGNE-BILLANCOURT FRANCE ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
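The global, local (per-aggregate) and tenant-specific policies the blueprint announces suggest a most-specific-wins lookup. A sketch of such resolution (the repository format and all names are invented for illustration; this is not the blueprint's code):

```python
# Sketch of policy resolution for a Policy-based Scheduler: the most
# specific target wins (tenant over aggregate over global). Illustrative
# names only; real storage would be a file or database.
REPO = {
    ("tenant", "acme"):    "anti-affinity",
    ("aggregate", "gold"): "consolidate",
    ("global", None):      "spread",
}


def resolve_policy(tenant=None, aggregate=None):
    """Return the policy for the most specific matching target."""
    for key in (("tenant", tenant), ("aggregate", aggregate), ("global", None)):
        if key in REPO:
            return REPO[key]


assert resolve_policy(tenant="acme", aggregate="gold") == "anti-affinity"
assert resolve_policy(aggregate="gold") == "consolidate"
assert resolve_policy() == "spread"
```

Because the global entry always matches last, changing any row of the repository changes placement behaviour at runtime, with no scheduler restart.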
[openstack-dev] [nova][scheduler] A new blueprint for Nova-scheduler: Policy-based Scheduler
Dear all, I've registered a new blueprint for nova-scheduler. The purpose of the blueprint is to propose a new scheduler that is based on policy: https://blueprints.launchpad.net/nova/+spec/policy-based-scheduler With the current Filter_Scheduler, an admin cannot change his placement policy without restarting nova-scheduler. Neither can he define a local policy for a group of resources (say, an aggregate) or a particular client. Thus we propose this scheduler to provide the admin with the capability of defining/changing his placement policy at runtime. The placement policy can be global (concerning all resources), local (concerning a group of resources), or tenant-specific. Please don't hesitate to contact us for discussion; all your comments are welcome!

Best regards, Khanh-Toan TRAN Cloudwatt Email: khanh-toan.tran[at]cloudwatt.com 892 Rue Yves Kermen 92100 BOULOGNE-BILLANCOURT FRANCE ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev