Re: [openstack-dev] [nova][scheduler] ResourceProvider design issues

2016-10-18 Thread Ed Leafe

> On Oct 17, 2016, at 8:45 PM, Jay Pipes wrote:
> 
> On 10/17/2016 11:14 PM, Ed Leafe wrote:
>> Now that we’re starting to model some more complex resources, it seems that 
>> some of the original design decisions may have been mistaken. One approach 
>> to work around this is to create multiple levels of resource providers. 
>> While that works, it is unnecessarily complicated IMO. I think we need to 
>> revisit some basic assumptions about the design before we dig ourselves a 
>> big design hole that will be difficult to get out of. I’ve tried to 
>> summarize my thoughts in a blog post. I don’t presume that this is the only 
>> possible solution, but I feel it is better than the current approach.
>> 
>> https://blog.leafe.com/virtual-bike-sheds/
> 
> I commented on your blog, but leave it here for posterity:

Likewise, responded on the blog, but following your lead by posting in both 
places.

You didn't include this in your email, but I think you misunderstood my comment 
about how "those of us experienced in OOP" might object to having multiple 
classes that differ solely on a single attribute. Since you are the one 
objecting to multiple class names, I was merely saying that anyone with a 
background in object-oriented programming might have a reflexive aversion to 
slight variations on something with 'Class' in its name. That was the reason I 
said that if they had been named 'ResourceTypes' instead, the aversion might 
not be as strong. Sorry for the misunderstanding. I was in no way trying to 
minimize your understanding of OOPy things.

Regarding your comments on standardization, I'm not sure that I can see the 
difference between what you've described and what I have. In your design, you 
would have a standard class name for the SR-IOV-VF, and standard trait names 
for the networks. So with a two-network deployment, there would need to be 3 
standardized names. With multiple classes, there would need to be 2 
standardized names: not a huge difference. Things are less clear, though, if a 
deployment is more complex than simply 'public' and 'private' networks for 
SR-IOV devices. For anything to be standardized across clouds, the way you 
request a resource has to be standardized. How would the various network names 
be constrained across clouds? Say there are N network types; the same math 
applies: nested providers would need N+1 standard names, and multiple classes 
would need N to distinguish them. If there are no restrictions on network 
names, then both approaches fail on standardization, since a provider could 
call a network whatever they want.
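
To make the counting concrete, here’s a minimal sketch in Python for a 
hypothetical N=2 deployment; every name in it is illustrative, not a proposed 
standard:

    # Nested providers: one standard class plus one standard trait per
    # network type, so N + 1 = 3 names to standardize.
    nested = {
        "resource_classes": {"SRIOV_NET_VF"},
        "traits": {"PHYSNET_PUBLIC", "PHYSNET_PRIVATE"},
    }

    # Multiple classes: one standard class per network type and no traits,
    # so N = 2 names to standardize.
    multi_class = {
        "resource_classes": {"SRIOV_NET_VF_PUBLIC", "SRIOV_NET_VF_PRIVATE"},
        "traits": set(),
    }

    assert len(nested["resource_classes"] | nested["traits"]) == 3
    assert len(multi_class["resource_classes"]) == 2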

As far as NUMA cells and their inventory accounting are concerned, that sounds 
like something where a whiteboard discussion will really help. Most of the 
people working on the placement engine, myself included, have only a passing 
understanding of the intricacies of NUMA arrangements. But even without that, I 
don't see the need to have multiple awkward names for the different NUMA 
resource classes. Based on my understanding, a slightly different approach 
would be sufficient. Instead of having multiple classes, we could remove the 
restriction that a ResourceProvider can only have one of any individual 
ResourceClass. In other words, the host would have two ResourceClass records of 
type NUMA_SOCKET (is that the right class?), one for each NUMA cell, and each 
of those would have its own inventory records. A request for MEMORY_PAGE_1G 
would then involve a ResourceProvider checking whether any of its 
ResourceClass records has enough of that type of inventory available.
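
A rough sketch of the idea in Python; the structures and names here are made 
up for illustration and are not the actual placement models:

    from collections import namedtuple

    # The change would be allowing a provider to hold several inventory
    # records of the SAME resource class, one per NUMA cell.
    Inventory = namedtuple("Inventory", ["resource_class", "total", "used"])

    host_inventories = [
        Inventory("MEMORY_PAGE_1G", total=512, used=400),  # NUMA cell 0
        Inventory("MEMORY_PAGE_1G", total=512, used=100),  # NUMA cell 1
    ]

    def can_satisfy(inventories, resource_class, amount):
        # Satisfiable if ANY single record of the class has enough
        # headroom, assuming a request must be met from one cell.
        return any(inv.total - inv.used >= amount
                   for inv in inventories
                   if inv.resource_class == resource_class)

    assert can_satisfy(host_inventories, "MEMORY_PAGE_1G", 200)  # cell 1 fits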

I think the same approach applies to the NIC bandwidth example you gave. By 
allowing multiple ResourceClass records representing the different NICs, the 
total bandwidth will also be a simple aggregate.
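
Reusing the made-up Inventory tuple from the sketch above, the host-level view 
for a box with four NICs would be a plain sum:

    nic_inventories = [
        Inventory("NET_BW_KB", total=10000000, used=0)
        for _ in range(4)  # one record per physical NIC
    ]

    # The total is a sum over records of the one class; no parsing of
    # per-NIC class names is needed.
    total_bw = sum(inv.total for inv in nic_inventories
                   if inv.resource_class == "NET_BW_KB")
    assert total_bw == 40000000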

Finally, regarding the SQL complexity, I spent years as a SQL DBA and yet I am 
always impressed by how much better your SQL solutions are than the ones I 
might come up with. I'm not saying that the SQL is so complex as to be 
unworkable; I'm simply saying that it is more complex than it needs to be.

In any event, I am looking forward to carrying on these discussions in 
Barcelona with you and the rest of the scheduler subteam.


-- Ed Leafe








Re: [openstack-dev] [nova][scheduler] ResourceProvider design issues

2016-10-17 Thread Jay Pipes

On 10/17/2016 11:14 PM, Ed Leafe wrote:

> Now that we’re starting to model some more complex resources, it seems that 
> some of the original design decisions may have been mistaken. One approach to 
> work around this is to create multiple levels of resource providers. While 
> that works, it is unnecessarily complicated IMO. I think we need to revisit 
> some basic assumptions about the design before we dig ourselves a big design 
> hole that will be difficult to get out of. I’ve tried to summarize my 
> thoughts in a blog post. I don’t presume that this is the only possible 
> solution, but I feel it is better than the current approach.
> 
> https://blog.leafe.com/virtual-bike-sheds/


I commented on your blog, but leave it here for posterity:

First, one of the reasons for the resource providers work was to 
*standardize* as much as possible the classes of resource that a cloud 
provides. Without standardized resource classes, there is no 
interoperability between clouds. The proposed solution of creating 
resource classes for each combination of actual resource class (the 
SR-IOV VF) and the collection of traits that the VF might have (physical 
network tag, speed, product and vendor ID, etc) means there would be no 
interoperable way of referring to a VF resource in one OpenStack cloud as 
providing the same thing in another OpenStack cloud. The fact that a 
VF might be tagged to physical network A or physical network B doesn’t 
change the fundamentals: it’s a virtual function on an SR-IOV-enabled 
NIC that a guest consumes. If I don’t have a single resource class that 
represents a virtual function on an SR-IOV-enabled NIC (and instead I 
have dozens of different resource classes that refer to variations of 
VFs based on network tag and other traits) then I cannot have a 
normalized multi-OpenStack cloud environment because there’s no 
standardization.
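
To make that concrete, here is a hand-wavy Python sketch (the names are 
illustrative only, not settled standards) of how one standard class plus 
traits keeps a VF request portable across clouds:

    # One standard, interoperable resource class...
    VF = "SRIOV_NET_VF"

    # ...with the qualitative variation carried by traits on the provider,
    # not baked into the class name.
    cloud_one_vf_provider = {"inventory": {VF: 8},
                             "traits": {"PHYSNET_A", "SPEED_10GB"}}
    cloud_two_vf_provider = {"inventory": {VF: 8},
                             "traits": {"PHYSNET_B", "SPEED_10GB"}}

    # A request names the standard class plus whatever traits it needs;
    # the same request shape works against any cloud exposing SRIOV_NET_VF.
    request = {"resources": {VF: 1}, "required_traits": {"PHYSNET_A"}}

    def matches(provider, request):
        return (all(provider["inventory"].get(rc, 0) >= amount
                    for rc, amount in request["resources"].items())
                and request["required_traits"] <= provider["traits"])

    assert matches(cloud_one_vf_provider, request)
    assert not matches(cloud_two_vf_provider, request)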


Secondly, the compute host to SR-IOV PF relationship is only one that can 
be represented by nested resource providers. Other relationships that need 
to be represented include:


* Compute host to NUMA cell relations, where a NUMA cell provides VCPU, 
MEMORY_MB, MEMORY_PAGE_2M and MEMORY_PAGE_1G inventories that are separate 
from each other but accounted for in the parent provider (meaning the 
compute host’s MEMORY_MB inventory is logically the aggregate of both NUMA 
cells’ inventories of MEMORY_MB). In your data modeling, how would you 
represent two NUMA cells, each with their own inventories and allocations? 
Would you create resource classes called NUMA_CELL_0_MEMORY_MB and 
NUMA_CELL_1_MEMORY_MB, etc.? See the point above about one of the purposes 
of the resource providers work being the standardization of resource 
classification. (See the sketch after this list.)


* NIC bandwidth and NIC bandwidth per physical network. If I have 4 
physical NICs on a compute host and I want to track network bandwidth as 
a consumable resource on each of those NICs, how would I go about doing 
that? Again, would you suggest auto-creating a set of resource classes 
representing the NICs? So, NET_BW_KB_ENP3S1, NET_BW_KB_ENP4S0, and 
so on? If I wanted to see the total aggregate bandwidth of the compute 
host, the system will now have to have tribal knowledge built into it to 
know that all the NET_BW_KB* resource classes are all describing the 
same exact resource class (network bandwidth in KB) but that the 
resource class names should be interpreted in a certain way. Again, not 
standardizable. In the nested resource providers modeling, we would have 
a parent compute host resource provider and 4 child resource providers — 
one for each of the NICs. Each NIC would have a set of traits 
indicating, for example, the interface name or physical network tag. 
However, the inventory (quantitative) amounts for network bandwidth 
would be a single standardized resource class, say NET_BW_KB. This 
nested resource providers system accurately models the real world setup 
of things that are providing the consumable resource, which is network 
bandwidth.
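
A rough sketch, with made-up structures rather than the real models, of that 
tree shape and the aggregation for both examples above:

    # Parent compute host with child providers: two NUMA cells and, for
    # brevity, two of the four NICs. Quantities use standard classes; the
    # qualitative bits (interface name, physnet tag) live in traits.
    compute_host = {
        "children": [
            {"name": "numa0",
             "inventory": {"VCPU": 8, "MEMORY_MB": 32768,
                           "MEMORY_PAGE_1G": 16},
             "traits": set()},
            {"name": "numa1",
             "inventory": {"VCPU": 8, "MEMORY_MB": 32768,
                           "MEMORY_PAGE_1G": 16},
             "traits": set()},
            {"name": "enp3s1",
             "inventory": {"NET_BW_KB": 10000000},
             "traits": {"PHYSNET_A"}},
            {"name": "enp4s0",
             "inventory": {"NET_BW_KB": 10000000},
             "traits": {"PHYSNET_B"}},
        ],
    }

    def host_aggregate(host, resource_class):
        # The host-level view of MEMORY_MB or NET_BW_KB is just a sum over
        # the children; no name-parsing tribal knowledge is required.
        return sum(child["inventory"].get(resource_class, 0)
                   for child in host["children"])

    assert host_aggregate(compute_host, "MEMORY_MB") == 65536
    assert host_aggregate(compute_host, "NET_BW_KB") == 20000000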


Finally, I think you are overstating the complexity of the SQL that is 
involved in the placement queries.  I’ve tried to design the DB schema 
with an eye to efficient and relatively simple SQL queries — and keeping 
quantitative and qualitative things decoupled in the schema was a big 
part of that efficiency. I’d like to see specific examples of how you 
would solve the above scenarios by combining the qualitative and 
quantitative aspects into a single resource type but still manage to 
have some interoperable standards that multiple OpenStack clouds can expose.
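
For anyone without the schema handy, the decoupling looks roughly like the 
following; a simplified SQLAlchemy sketch with most columns trimmed, not the 
actual migrations:

    from sqlalchemy import Column, Integer, MetaData, Table

    metadata = MetaData()

    # Quantitative side: one row per (provider, resource class), holding
    # only amounts.
    inventories = Table(
        "inventories", metadata,
        Column("resource_provider_id", Integer, nullable=False),
        Column("resource_class_id", Integer, nullable=False),
        Column("total", Integer, nullable=False),
    )

    # Qualitative side: traits are a plain provider-to-trait association
    # with no quantities attached.
    resource_provider_traits = Table(
        "resource_provider_traits", metadata,
        Column("resource_provider_id", Integer, nullable=False),
        Column("trait_id", Integer, nullable=False),
    )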


Best,
-jay



[openstack-dev] [nova][scheduler] ResourceProvider design issues

2016-10-17 Thread Ed Leafe
Now that we’re starting to model some more complex resources, it seems that 
some of the original design decisions may have been mistaken. One approach to 
work around this is to create multiple levels of resource providers. While that 
works, it is unnecessarily complicated IMO. I think we need to revisit some 
basic assumptions about the design before we dig ourselves a big design hole 
that will be difficult to get out of. I’ve tried to summarize my thoughts in a 
blog post. I don’t presume that this is the only possible solution, but I feel 
it is better than the current approach.

https://blog.leafe.com/virtual-bike-sheds/


-- Ed Leafe





