Hi Miguel Angel, comments/answers inline :)

On 04/08/2016 09:17 AM, Miguel Angel Ajo Pelayo wrote:
Hi!,

    In the context of [1] (generic resource pools / scheduling in nova)
and [2] (minimum bandwidth guarantees -egress- in neutron), I had a talk
a few weeks ago with Jay Pipes.

    The idea was leveraging the generic resource pools and scheduling
mechanisms defined in [1] to find the right hosts and track the total
available bandwidth per host (and per host "physical network");
something in neutron (still to be defined where) would notify the new
API about the total amount of "NIC_BW_KB" available on every host/physnet.

Yes, what we discussed was making it per host initially, meaning the host would advertise a single aggregate bandwidth amount covering all the NICs it uses for the data plane.

The other way to track this resource class (NIC_BW_KB) would be to make the NICs themselves be resource providers and then the scheduler could pick a specific NIC to bind the port to based on available NIC_BW_KB on a particular NIC.

The former method makes things conceptually easier at the expense of introducing greater potential for retrying placement decisions (since the specific NIC to bind a port to wouldn't be known until the claim is made on the compute host). The latter method adds complexity to the filtering and scheduling logic in order to make more accurate placement decisions that would result in fewer retries.
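
To make the two options a bit more concrete, here is a minimal sketch of the inventory each approach would advertise. The provider names and dict shapes here are illustrative assumptions on my part, not the actual resource-providers schema:

# Option 1: the compute host is the resource provider and advertises a
# single aggregate amount of NIC_BW_KB for all of its data-plane NICs.
per_host_inventory = {
    'resource_provider': 'compute-node-1',
    'inventories': {
        'NIC_BW_KB': {'total': 20000000},  # sum over all data-plane NICs
    },
}

# Option 2: each NIC is its own resource provider, so the scheduler can
# pick the specific NIC to bind the port to.
per_nic_inventory = [
    {'resource_provider': 'compute-node-1:eth2',
     'inventories': {'NIC_BW_KB': {'total': 10000000}}},
    {'resource_provider': 'compute-node-1:eth3',
     'inventories': {'NIC_BW_KB': {'total': 10000000}}},
]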

    That part is quite clear to me,

    From [1] I'm not sure which blueprint introduces the ability to
schedule based on the resource allocation/availability itself
("resource-providers-scheduler" seems more like an optimization of the
scheduler/DB interaction, right?)

Yes, you are correct about the above blueprint; it's only for moving the Python-side filters to be a DB query.

The resource-providers-allocations blueprint:

https://review.openstack.org/300177

Is the one where we convert the various consumed resource amount fields to live in the single allocations table that may be queried for usage information.
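
As a rough sketch of what that buys us, assuming an allocations table keyed by resource provider, resource class and consumer (the exact column names here are my assumption rather than the final schema), usage rolls up with a single aggregation instead of per-resource fields:

from collections import defaultdict

# Illustrative rows: (resource_provider, resource_class, consumer, used)
allocations = [
    ('compute-node-1', 'VCPU',      'instance-A', 4),
    ('compute-node-1', 'MEMORY_MB', 'instance-A', 8192),
    ('compute-node-1', 'NIC_BW_KB', 'port-X',     2048),
    ('compute-node-1', 'NIC_BW_KB', 'port-Y',     4096),
]

usage = defaultdict(int)
for provider, resource_class, _consumer, used in allocations:
    usage[(provider, resource_class)] += used

# usage[('compute-node-1', 'NIC_BW_KB')] == 6144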

We aim to use the ComputeNode object as a facade that hides the migration of these data fields as much as possible so that the scheduler does not actually need to know that the schema has changed underneath it. Of course, this only works for *existing* resource classes, like vCPU, RAM, etc. It won't work for *new* resource classes like the discussed NIC_BW_KB because, clearly, we don't have an existing field in the instance_extra or other tables that contains that usage amount, and therefore we can't use the ComputeNode object as a facade over a non-existent piece of data.

Eventually, the intent is to change the ComputeNode object to return a new AllocationList object that would contain all of the compute node's resources in a tabular format (mimicking the underlying allocations table):

https://review.openstack.org/#/c/282442/20/nova/objects/resource_provider.py

Once this is done, the scheduler can be fitted to query this AllocationList object to make resource usage and placement decisions in the Python-side filters.
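
A hypothetical sketch of what a Python-side filter could then do with such an AllocationList view (the attribute names 'allocations', 'resource_class' and 'used' are assumptions here, not the real nova.objects API):

def has_enough_nic_bw(compute_node, requested_kb, total_nic_bw_kb):
    # Sum what is already allocated for NIC_BW_KB on this node and check
    # whether the remaining headroom covers the request.
    used_kb = sum(alloc.used
                  for alloc in compute_node.allocations
                  if alloc.resource_class == 'NIC_BW_KB')
    return (total_nic_bw_kb - used_kb) >= requested_kb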

We are still debating on the resource-providers-scheduler-db-filters blueprint:

https://review.openstack.org/#/c/300178/

Whether to change the existing FilterScheduler or create a brand new scheduler driver. I could go either way, frankly. If we made a brand new scheduler driver, it would do a query against the compute_nodes table in the DB directly. The legacy FilterScheduler would manipulate the AllocationList object returned by the ComputeNode.allocations attribute. Either way we get to where we want to go: representing all quantitative resources in a standardized and consistent fashion.
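
For the brand-new-driver option, the direct query could look conceptually like the following; the table and column names are purely illustrative assumptions, not the real schema:

# Illustrative only: find hosts with enough free NIC_BW_KB by comparing
# advertised inventory against summed allocations.
FIND_HOSTS_WITH_BW_SQL = """
SELECT cn.hypervisor_hostname
  FROM compute_nodes cn
  JOIN inventories inv
    ON inv.resource_provider_id = cn.id
   AND inv.resource_class = 'NIC_BW_KB'
  LEFT JOIN (SELECT resource_provider_id, SUM(used) AS used
               FROM allocations
              WHERE resource_class = 'NIC_BW_KB'
              GROUP BY resource_provider_id) alloc
    ON alloc.resource_provider_id = cn.id
 WHERE inv.total - COALESCE(alloc.used, 0) >= :requested_kb
"""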

     And that brings me to another point: at the moment of filtering
hosts, nova, I guess, will have the neutron port information; it has to
somehow identify whether the port is tied to a minimum bandwidth QoS policy.

Yes, Nova's conductor gathers information about the requested networks *before* asking the scheduler where to place the instance:

https://github.com/openstack/nova/blob/stable/mitaka/nova/conductor/manager.py#L362

     That would require identifying that the port has a "qos_policy_id"
attached to it, then asking neutron for the specific QoS policy [3],
then looking for a minimum bandwidth rule (still to be defined),
and extracting the required bandwidth from it.

Yep, exactly correct.
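
Something along these lines, as a rough sketch only, since the minimum bandwidth rule type and its fields were still to be defined; the client method names and the 'minimum_bandwidth'/'min_kbps' keys are assumptions:

def required_bw_kb_for_port(neutron, port_id):
    # Look up the port and see whether a QoS policy is attached.
    port = neutron.show_port(port_id)['port']
    policy_id = port.get('qos_policy_id')
    if policy_id is None:
        return 0  # no QoS policy, no bandwidth requirement

    # Fetch the policy [3] and look for a minimum bandwidth rule.
    policy = neutron.show_qos_policy(policy_id)['policy']
    for rule in policy.get('rules', []):
        if rule.get('type') == 'minimum_bandwidth':
            return rule.get('min_kbps', 0)
    return 0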

    That moves, again, some of the responsibility for examining and
understanding external resources to nova.

Yep, it does. The alternative is more retries for placement decisions because accurate decisions cannot be made until the compute node is already selected and the claim happens on the compute node.

     Could it make sense to make that part pluggable via stevedore? So
we would provide something that takes the "resource id" (for a port in
this case) and returns the requirements translated to resource classes
(NIC_BW_KB in this case).

Not sure Stevedore makes sense in this context. Really, we want *less* extensibility and *more* consistency. So I would rather envision a system where, when Nova has received a port or network ID in the boot request, it would call Neutron before scheduling and ask whether the port or network has any resource constraints on it. Neutron would return a standardized response containing each resource class and the amount requested in a dictionary (or better yet, an os_vif.objects.* object, serialized). Something like:

{
  'resources': {
    '<UUID of port or network>': {
      'NIC_BW_KB': 2048,
      'IPV4_ADDRESS': 1
    }
  }
}

In the case of the NIC_BW_KB resource class, Nova's scheduler would look for compute nodes that had a NIC with that amount of bandwidth still available. In the case of the IPV4_ADDRESS resource class, Nova's scheduler would use the generic-resource-pools interface to find a resource pool of IPV4_ADDRESS resources (i.e. a Neutron routed network or subnet allocation pool) that has available IP space for the request.
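
As a hypothetical sketch of the Nova side, folding such a response into a single set of requested amounts for the scheduler could be as simple as the following (the response shape simply mirrors the example above; none of this is an agreed API):

def merge_port_resource_requests(neutron_response):
    # Sum the requested amounts per resource class across all ports and
    # networks in the boot request.
    requested = {}
    for _port_or_net_id, amounts in neutron_response['resources'].items():
        for resource_class, amount in amounts.items():
            requested[resource_class] = requested.get(resource_class, 0) + amount
    return requested

# For the example above this yields {'NIC_BW_KB': 2048, 'IPV4_ADDRESS': 1}.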

Best,
-jay

Best regards,
Miguel Ángel Ajo


[1]
http://lists.openstack.org/pipermail/openstack-dev/2016-February/086371.html
[2] https://bugs.launchpad.net/neutron/+bug/1560963
[3] http://developer.openstack.org/api-ref-networking-v2-ext.html#showPolicy
