Hi Brian, comments inline :)

On Tue, Apr 12, 2011 at 12:34 PM, Brian Schott <bfsch...@gmail.com> wrote:
>
> I'm trying to understand how best to implement our architecture-aware
> scheduler for Diablo:
> https://blueprints.launchpad.net/nova/+spec/schedule-instances-on-heterogeneous-architectures
>
> Right now our scheduler is similar in approach to SimpleScheduler with a few
> extra filters on instances and compute_nodes table queries for the cpu_arch
> and xpu_arch fields that we added. For example, for "-t cg1.4xlarge" GPU
> instance type the scheduler reads instance_types.cpu_arch="x86_64" and
> instance_types.xpu_arch = "fermi", then filters the respective compute_node
> and instance fields. http://wiki.openstack.org/HeterogeneousInstanceTypes
>
> That's OK for Cactus, but going beyond that, I'm struggling to reconcile
> these different blueprints:
> https://blueprints.launchpad.net/nova/+spec/advanced-scheduler
> https://blueprints.launchpad.net/nova/+spec/distributed-scheduler
>
> - How is the instance_metadata table used? I see the "cpu_arch, xpu_arch"
> and other fields we added as of the same class of data as vcpus, local_gb, or
> mem_mb fields, which is why I put them in the instances table.
> Virtualization type is of a similar class. I think of meta-data as less
> defined constraints passed to the scheduler like "near vol-12345678".
:( I've brought this up before as well. The term metadata is used incorrectly
to refer to custom key/value attributes of something instead of referring to
data about the data (for instance, the type and length constraints of a data
field).

Unfortunately, because the OpenStack API uses the actual term "metadata" in
the API, that's what the table was named and that's how key/value pairs are
referred to in the code.

We have at least three choices here:

1) Continue to add fields to the instances table (or compute_nodes table) for
   these main attributes like cpu_arch, etc.
2) Use the custom key/value table (instance_metadata) to store these attribute
   names and their values
3) Do both 1) and 2)

I would prefer that we use 1) above for fields that are common to all nodes
(and thus can be NOT NULL fields in the database and be properly indexed), and
that all other attributes that are not common to all nodes use the
instance_metadata table.

Thoughts?

> - Will your capabilities scheduler, constraint scheduler, and/or distributed
> schedulers understand different available hardware resources on compute nodes?

I was assuming they would "understand" different available hardware resources
by querying a database table that housed attributes pertaining to a single
host or a group of hosts (a zone).

> - Should there be an instance_types_metadata table for things like "cpu_arch"
> rather than our current approach?

There could be if those fields were added as main attributes on the instances
table. If those attributes are added to the instance_metadata table as custom
key/value pairs, no, that wouldn't make much sense.

> As long as we can inject a "-t cg1.4xlarge" at one end and have that get
> routed to a compute node with GPU hardware on the other end, we're not tied
> to the centralized database implementation.

I don't see how having the database implementation be centralized or not
affects the above statement. Could you elaborate?
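
For what it's worth, here is a rough sketch of the split I described in my
preference above: common attributes as NOT NULL, indexed columns, and
everything else in a key/value table that the scheduler joins against. This
is only an illustration, not Nova's actual schema or scheduler code. The
compute_node_metadata table, the find_hosts() and make_demo_db() helpers, and
the sample hosts are all made up, sqlite3 is used only to keep the example
self-contained, and putting xpu_arch on the key/value side is an arbitrary
choice for the demo.

```python
import sqlite3

# Hypothetical, simplified schema (NOT the real Nova tables).
# cpu_arch is treated as common to all nodes: NOT NULL and indexed.
# Anything less common lives in a key/value table.
SCHEMA = """
CREATE TABLE compute_nodes (
    id       INTEGER PRIMARY KEY,
    hostname TEXT NOT NULL,
    cpu_arch TEXT NOT NULL
);
CREATE INDEX ix_compute_nodes_cpu_arch ON compute_nodes (cpu_arch);

CREATE TABLE compute_node_metadata (
    compute_node_id INTEGER NOT NULL REFERENCES compute_nodes (id),
    key   TEXT NOT NULL,
    value TEXT NOT NULL,
    PRIMARY KEY (compute_node_id, key)
);
"""


def make_demo_db():
    """Build an in-memory database with three made-up hosts."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO compute_nodes (id, hostname, cpu_arch) VALUES (?, ?, ?)",
        [(1, "node-a", "x86_64"), (2, "node-b", "x86_64"), (3, "node-c", "ia64")],
    )
    # Only node-b advertises a GPU, as a custom key/value attribute.
    conn.executemany(
        "INSERT INTO compute_node_metadata VALUES (?, ?, ?)",
        [(2, "xpu_arch", "fermi")],
    )
    return conn


def find_hosts(conn, cpu_arch, extra=None):
    """Return hostnames matching cpu_arch and every key/value pair in extra."""
    sql = "SELECT n.hostname FROM compute_nodes AS n WHERE n.cpu_arch = ?"
    params = [cpu_arch]
    # One EXISTS subquery per requested key/value attribute.
    for key, value in sorted((extra or {}).items()):
        sql += (
            " AND EXISTS (SELECT 1 FROM compute_node_metadata AS m"
            " WHERE m.compute_node_id = n.id AND m.key = ? AND m.value = ?)"
        )
        params += [key, value]
    sql += " ORDER BY n.hostname"
    return [row[0] for row in conn.execute(sql, params)]
```

With the demo data, find_hosts(conn, "x86_64") matches both x86_64 hosts,
while adding {"xpu_arch": "fermi"} narrows the result to the one host that
carries that key/value pair. The indexed column does the cheap first cut and
the key/value lookups only refine it.
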
> PS: I sent this to the mailing list a week ago and didn't get a reply, now
> can't even find this in the openstack list archive. Anyone else having their
> posts quietly rejected?

I saw the original, if you are referring to this one:

https://lists.launchpad.net/openstack/msg01645.html

Cheers!
-jay

_______________________________________________
Mailing list: https://launchpad.net/~openstack
Post to     : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp