Re: [openstack-dev] [nova][placement][ironic] Progress on custom resource classes

2016-12-05 Thread Jim Rollenhagen
On Fri, Dec 2, 2016 at 12:10 PM, Jay Pipes  wrote:

> Ironic colleagues, heads up, please read the below fully! I'd like your
> feedback on a couple outstanding questions.
>
> tl;dr
> -
>
> Work for custom resource classes has been proceeding well this cycle, and
> we're at a point where reviews from the Ironic community and functional
> testing of a series of patches would be extremely helpful.
>
> https://review.openstack.org/#/q/topic:bp/custom-resource-classes+status:open


\o/ will do sir.


>
>
> History
> ---
>
> As a brief reminder, in Newton, the Ironic community added a
> "resource_class" attribute to the primary Node object returned by the GET
> /nodes/{uuid} API call. This resource class attribute represents the
> "hardware profile" (for lack of a better term) of the Ironic baremetal node.
>
> In Nova-land, we would like to stop tracking Ironic baremetal nodes as
> collections of vCPU, RAM, and disk space -- because an Ironic baremetal
> node is consumed atomically, not piecemeal like a hypervisor node is.
> We'd like to have the scheduler search for an appropriate Ironic baremetal
> node using a simplified search that simply looks for a node that has a
> particular hardware profile [1] instead of searching for nodes that have a
> certain amount of VCPU, RAM, and disk space.
>
> In addition to the scheduling and "boot request" alignment issues, we want
> to fix the reporting and accounting of resources in an OpenStack deployment
> containing Ironic. Currently, Nova reports an aggregate amount of CPU, RAM
> and disk space but doesn't understand that, when Ironic is in the mix, a
> significant chunk of that CPU, RAM and disk isn't "targetable" for
> virtual machines. We would much prefer to have resource reporting look like:
>
>  48 vCPU total, 14 used
>  204800 MB RAM total, 10240 used
>  1340 GB disk total, 100 used
>  250 baremetal profile "A" total, 120 used
>  120 baremetal profile "B" total, 16 used
>
> instead of mixing all the resources together.
>
> Need review and functional testing on a few things
> --
>
> Now that the custom resource classes REST API endpoint is established [2]
> in the placement REST API, we are figuring out an appropriate way of
> migrating the existing inventory and allocation records for Ironic
> baremetal nodes from the "old-style" way of storing inventory for VCPU,
> MEMORY_MB and DISK_GB resources towards the "new-style" way of storing a
> single inventory record of amount 1 for the Ironic node's "resource_class"
> attribute.
>
> The patch that does this online data migration (from within the
> nova-compute resource tracker) is here:
>
> https://review.openstack.org/#/c/404472/
>
> I'd really like to get some Ironic contributor eyeballs on that patch and
> provide me feedback on whether the logic in the
> _cleanup_ironic_legacy_allocations() method is sound.
>
> There are still a couple things that need to be worked out:
>
> 1) Should the resource tracker auto-create custom resource classes in the
> placement REST API when it sees an Ironic node's resource_class attribute
> set to a non-NULL value and there is no record of such a resource class in
> the `GET /resource-classes` placement API call? My gut reaction to this is
> "yes, let's just do it", but I want to check with operators and Ironic devs
> first. The alternative is to ignore errors about "no such resource class
> exists", log a warning, and wait for an administrator to create the custom
> resource classes that match the distinct Ironic node resource classes that
> may exist in the deployment.
>

I tend to agree with Matt: there's no need to make operators do this when
they've already explicitly configured it on the ironic side.


>
> 2) How are we going to modify the Nova baremetal flavors to specify that
> the flavor requires one resource where the resource is of a set of custom
> resource classes? For example, let's say I have an Ironic installation
> with 10 different Ironic node hardware profiles. I've set all my Ironic
> nodes' resource_class attributes to match one of those hardware profiles. I
> now need to set up a Nova flavor that requests one of those ten hardware
> profiles. How do I do that? One solution might be to have a hacky flavor
> extra_spec called "ironic_resource_classes=CUSTOM_METAL_A,CUSTOM_METAL_B..."
> or similar. When we construct the request_spec object that gets sent to the
> scheduler (and later the placement service), we could look for that
> extra_spec and construct a special request to the placement service that
> says "find me a resource provider that has a capacity of 1 for any of the
> following resource classes...". The flavor extra_specs thing is a total
> hack, admittedly, but flavors are the mess that Nova currently has for
> specifying requested resources, and we need to work within that mess,
> unfortunately...
>

As it was explained to me at the beginning of all this, flavors were going
to have some sort of 

Re: [openstack-dev] [nova][placement][ironic] Progress on custom resource classes

2016-12-02 Thread Matt Riedemann

On 12/2/2016 11:10 AM, Jay Pipes wrote:

Ironic colleagues, heads up, please read the below fully! I'd like your
feedback on a couple outstanding questions.

tl;dr
-

Work for custom resource classes has been proceeding well this cycle,
and we're at a point where reviews from the Ironic community and
functional testing of a series of patches would be extremely helpful.

https://review.openstack.org/#/q/topic:bp/custom-resource-classes+status:open


History
---

As a brief reminder, in Newton, the Ironic community added a
"resource_class" attribute to the primary Node object returned by the
GET /nodes/{uuid} API call. This resource class attribute represents the
"hardware profile" (for lack of a better term) of the Ironic baremetal
node.

In Nova-land, we would like to stop tracking Ironic baremetal nodes as
collections of vCPU, RAM, and disk space -- because an Ironic baremetal
node is consumed atomically, not piecemeal like a hypervisor node is.
We'd like to have the scheduler search for an appropriate Ironic
baremetal node using a simplified search that simply looks for a node that
has a particular hardware profile [1] instead of searching for nodes
that have a certain amount of VCPU, RAM, and disk space.

In addition to the scheduling and "boot request" alignment issues, we
want to fix the reporting and accounting of resources in an OpenStack
deployment containing Ironic. Currently, Nova reports an aggregate
amount of CPU, RAM and disk space but doesn't understand that, when
Ironic is in the mix, a significant chunk of that CPU, RAM and disk
isn't "targetable" for virtual machines. We would much prefer to have
resource reporting look like:

 48 vCPU total, 14 used
 204800 MB RAM total, 10240 used
 1340 GB disk total, 100 used
 250 baremetal profile "A" total, 120 used
 120 baremetal profile "B" total, 16 used

instead of mixing all the resources together.

Need review and functional testing on a few things
--

Now that the custom resource classes REST API endpoint is established
[2] in the placement REST API, we are figuring out an appropriate way of
migrating the existing inventory and allocation records for Ironic
baremetal nodes from the "old-style" way of storing inventory for VCPU,
MEMORY_MB and DISK_GB resources towards the "new-style" way of storing a
single inventory record of amount 1 for the Ironic node's
"resource_class" attribute.

The patch that does this online data migration (from within the
nova-compute resource tracker) is here:

https://review.openstack.org/#/c/404472/

I'd really like to get some Ironic contributor eyeballs on that patch
and provide me feedback on whether the logic in the
_cleanup_ironic_legacy_allocations() method is sound.

There are still a couple things that need to be worked out:

1) Should the resource tracker auto-create custom resource classes in
the placement REST API when it sees an Ironic node's resource_class
attribute set to a non-NULL value and there is no record of such a
resource class in the `GET /resource-classes` placement API call? My gut
reaction to this is "yes, let's just do it", but I want to check with
operators and Ironic devs first. The alternative is to ignore errors
about "no such resource class exists", log a warning, and wait for an
administrator to create the custom resource classes that match the
distinct Ironic node resource classes that may exist in the deployment.


Seems to me that if you had to go to the trouble of setting the 
node.resource_class field in Ironic already, then Nova could be helpful 
and just auto-create the custom resource class to match that if it 
doesn't exist. That's one less manual step that operators need to deal 
with to start using this stuff, which seems like goodness.




2) How are we going to modify the Nova baremetal flavors to specify that
the flavor requires one resource where the resource is of a set of
custom resource classes? For example, let's say I have an Ironic
installation with 10 different Ironic node hardware profiles. I've set
all my Ironic nodes' resource_class attributes to match one of those
hardware profiles. I now need to set up a Nova flavor that requests one
of those ten hardware profiles. How do I do that? One solution might be
to have a hacky flavor extra_spec called
"ironic_resource_classes=CUSTOM_METAL_A,CUSTOM_METAL_B..."  or similar.
When we construct the request_spec object that gets sent to the
scheduler (and later the placement service), we could look for that
extra_spec and construct a special request to the placement service that
says "find me a resource provider that has a capacity of 1 for any of
the following resource classes...". The flavor extra_specs thing is a
total hack, admittedly, but flavors are the mess that Nova currently has
for specifying requested resources, and we need to work within that mess,
unfortunately...

The following patch series:

https://review.openstack.org/#/q/topic:bp/custom-resource-classes+status:open

[openstack-dev] [nova][placement][ironic] Progress on custom resource classes

2016-12-02 Thread Jay Pipes
Ironic colleagues, heads up, please read the below fully! I'd like your 
feedback on a couple outstanding questions.


tl;dr
-

Work for custom resource classes has been proceeding well this cycle, 
and we're at a point where reviews from the Ironic community and 
functional testing of a series of patches would be extremely helpful.


https://review.openstack.org/#/q/topic:bp/custom-resource-classes+status:open

History
---

As a brief reminder, in Newton, the Ironic community added a 
"resource_class" attribute to the primary Node object returned by the 
GET /nodes/{uuid} API call. This resource class attribute represents the 
"hardware profile" (for lack of a better term) of the Ironic baremetal node.
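For concreteness, a minimal sketch of reading that attribute out of a node
document (the helper function, UUID, and profile name below are mine and
purely illustrative, not Ironic code):

```python
# Hedged sketch: extracting the hardware profile from the JSON body
# returned by GET /nodes/{uuid}. The helper and sample values are
# illustrative, not part of Ironic itself.
def node_resource_class(node_doc):
    """Return the node's resource_class, or None if the operator never set it."""
    return node_doc.get("resource_class")

# Sample response body, trimmed to the fields relevant here:
node_doc = {
    "uuid": "1be26c0b-03f2-4d2e-ae87-c02d7f33c123",
    "provision_state": "available",
    "resource_class": "baremetal-gold",  # hypothetical profile name
}

profile = node_resource_class(node_doc)  # "baremetal-gold"
```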


In Nova-land, we would like to stop tracking Ironic baremetal nodes as 
collections of vCPU, RAM, and disk space -- because an Ironic baremetal 
node is consumed atomically, not piecemeal like a hypervisor node is.
We'd like to have the scheduler search for an appropriate Ironic 
baremetal node using a simplified search that simply looks for a node that 
has a particular hardware profile [1] instead of searching for nodes 
that have a certain amount of VCPU, RAM, and disk space.


In addition to the scheduling and "boot request" alignment issues, we 
want to fix the reporting and accounting of resources in an OpenStack 
deployment containing Ironic. Currently, Nova reports an aggregate 
amount of CPU, RAM and disk space but doesn't understand that, when 
Ironic is in the mix, a significant chunk of that CPU, RAM and disk 
isn't "targetable" for virtual machines. We would much prefer to have 
resource reporting look like:


 48 vCPU total, 14 used
 204800 MB RAM total, 10240 used
 1340 GB disk total, 100 used
 250 baremetal profile "A" total, 120 used
 120 baremetal profile "B" total, 16 used

instead of mixing all the resources together.

Need review and functional testing on a few things
--

Now that the custom resource classes REST API endpoint is established 
[2] in the placement REST API, we are figuring out an appropriate way of 
migrating the existing inventory and allocation records for Ironic 
baremetal nodes from the "old-style" way of storing inventory for VCPU, 
MEMORY_MB and DISK_GB resources towards the "new-style" way of storing a 
single inventory record of amount 1 for the Ironic node's 
"resource_class" attribute.
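As a sketch of the shape of that migration (the dict layouts and the class
name are illustrative, not placement's actual data model):

```python
# Hedged sketch of the old-style vs. new-style inventory for one Ironic
# node; record shapes here are illustrative, not placement's real schema.
OLD_STYLE = {                     # node tracked piecemeal, like a hypervisor
    "VCPU": {"total": 48},
    "MEMORY_MB": {"total": 204800},
    "DISK_GB": {"total": 1340},
}

def migrate_inventory(old_inventory, resource_class):
    """Collapse the piecemeal records into a single inventory record of
    amount 1 for the node's custom resource class."""
    return {resource_class: {"total": 1}}

NEW_STYLE = migrate_inventory(OLD_STYLE, "CUSTOM_BAREMETAL_GOLD")
# NEW_STYLE == {"CUSTOM_BAREMETAL_GOLD": {"total": 1}}
```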


The patch that does this online data migration (from within the 
nova-compute resource tracker) is here:


https://review.openstack.org/#/c/404472/

I'd really like to get some Ironic contributor eyeballs on that patch 
and provide me feedback on whether the logic in the 
_cleanup_ironic_legacy_allocations() method is sound.


There are still a couple things that need to be worked out:

1) Should the resource tracker auto-create custom resource classes in 
the placement REST API when it sees an Ironic node's resource_class 
attribute set to a non-NULL value and there is no record of such a 
resource class in the `GET /resource-classes` placement API call? My gut 
reaction to this is "yes, let's just do it", but I want to check with 
operators and Ironic devs first. The alternative is to ignore errors 
about "no such resource class exists", log a warning, and wait for an 
administrator to create the custom resource classes that match the 
distinct Ironic node resource classes that may exist in the deployment.
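To make the two options concrete, a toy sketch of the choice (the function
and its arguments are mine, not the resource tracker's actual code):

```python
import logging

LOG = logging.getLogger(__name__)

# Hedged sketch of question 1's two options; names and logic are
# illustrative, not the resource tracker's real implementation.
def handle_node_class(known_classes, node_class, auto_create):
    """Return the updated set of known resource classes for one node."""
    if node_class is None or node_class in known_classes:
        return known_classes
    if auto_create:
        # Option A: just create the custom class in the placement API.
        return known_classes | {node_class}
    # Option B: warn and wait for an administrator to create it.
    LOG.warning("Resource class %s does not exist in placement; "
                "create it before this node can be scheduled.", node_class)
    return known_classes
```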


2) How are we going to modify the Nova baremetal flavors to specify that 
the flavor requires one resource where the resource is of a set of 
custom resource classes? For example, let's say I have an Ironic 
installation with 10 different Ironic node hardware profiles. I've set 
all my Ironic nodes' resource_class attributes to match one of those 
hardware profiles. I now need to set up a Nova flavor that requests one 
of those ten hardware profiles. How do I do that? One solution might be 
to have a hacky flavor extra_spec called 
"ironic_resource_classes=CUSTOM_METAL_A,CUSTOM_METAL_B..."  or similar. 
When we construct the request_spec object that gets sent to the 
scheduler (and later the placement service), we could look for that 
extra_spec and construct a special request to the placement service that 
says "find me a resource provider that has a capacity of 1 for any of 
the following resource classes...". The flavor extra_specs thing is a 
total hack, admittedly, but flavors are the mess that Nova currently has 
for specifying requested resources, and we need to work within that mess, 
unfortunately...
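A toy sketch of how that hacky extra_spec might be parsed when building the
request_spec (the key name "ironic_resource_classes" comes from this
proposal and is not a real Nova extra_spec):

```python
# Hedged sketch: splitting the proposed comma-separated extra_spec into
# the list of acceptable resource classes. Illustrative only.
def requested_classes(extra_specs):
    raw = extra_specs.get("ironic_resource_classes", "")
    return [c.strip() for c in raw.split(",") if c.strip()]

extra_specs = {"ironic_resource_classes": "CUSTOM_METAL_A,CUSTOM_METAL_B"}
classes = requested_classes(extra_specs)
# classes == ["CUSTOM_METAL_A", "CUSTOM_METAL_B"]; the scheduler would then
# ask placement for a provider with capacity 1 in any of these classes.
```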


The following patch series:

https://review.openstack.org/#/q/topic:bp/custom-resource-classes+status:open

contains all the outstanding patches for the custom resource classes 
work. Getting more eyeballs on these patches would be super. If you are 
an Ironic operator that has some time to play with the new code and 
offer feedback and testing, that would be super awesome. Please come 
find me, cdent, bauzas, dans