Re: [openstack-dev] [nova][scheduler] ResourceProvider design issues

Jay Pipes Mon, 17 Oct 2016 18:49:58 -0700

On 10/17/2016 11:14 PM, Ed Leafe wrote:

Now that we’re starting to model some more complex resources, it seems that 
some of the original design decisions may have been mistaken. One approach to 
work around this is to create multiple levels of resource providers. While that 
works, it is unnecessarily complicated IMO. I think we need to revisit some 
basic assumptions about the design before we dig ourselves a big design hole 
that will be difficult to get out of. I’ve tried to summarize my thoughts in a 
blog post. I don’t presume that this is the only possible solution, but I feel 
it is better than the current approach.


https://blog.leafe.com/virtual-bike-sheds/


I commented on your blog, but I'll leave it here for posterity:

First, one of the reasons for the resource providers work was to *standardize* as much as possible the classes of resource that a cloud provides. Without standardized resource classes, there is no interoperability between clouds. The proposed solution of creating resource classes for each combination of actual resource class (the SR-IOV VF) and the collection of traits that the VF might have (physical network tag, speed, product and vendor ID, etc) means there would be no interoperable way of referring to a VF resource in one OpenStack cloud as providing the same thing as in another OpenStack cloud. The fact that a VF might be tagged to physical network A or physical network B doesn’t change the fundamentals: it’s a virtual function on an SR-IOV-enabled NIC that a guest consumes. If I don’t have a single resource class that represents a virtual function on an SR-IOV-enabled NIC (and instead I have dozens of different resource classes that refer to variations of VFs based on network tag and other traits) then I cannot have a normalized multi-OpenStack cloud environment because there’s no standardization.
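To make the combinatorial point concrete, here is a minimal sketch. It is an illustration only: the trait names, the values, and the `SRIOV_NET_VF` class name are assumptions chosen for the example, not taken from any cloud's actual configuration.

```python
from itertools import product

# Hypothetical trait values a VF might carry (made up for illustration).
physnets = ["PHYSNET_A", "PHYSNET_B"]
speeds = ["SPEED_1G", "SPEED_10G"]
vendors = ["VENDOR_X", "VENDOR_Y"]

# Approach 1: bake every trait combination into the resource class name.
# Each cloud ends up with its own, non-comparable set of class names.
combined_classes = [
    "SRIOV_NET_VF_" + "_".join(combo)
    for combo in product(physnets, speeds, vendors)
]

# Approach 2: one standardized resource class, with traits kept separately.
standard_class = "SRIOV_NET_VF"
traits = set(physnets) | set(speeds) | set(vendors)

print(len(combined_classes), "cloud-specific classes vs 1 class +", len(traits), "traits")
```

With only three trait dimensions of two values each, the first approach already needs eight class names, and the count multiplies with every new dimension, while the second approach keeps a single class that means the same thing in every cloud.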

Secondly, the compute host to SR-IOV PF is only one relationship that can be represented by nested resource providers. Other relationships that need to be represented include:

* Compute host to NUMA cell relations where a NUMA cell provides VCPU, MEMORY_MB, MEMORY_PAGE_2M and MEMORY_PAGE_1G inventories that are separate from each other but accounted for in the parent provider (meaning the compute host’s MEMORY_MB inventory is logically the aggregate of both NUMA cells’ inventories of MEMORY_MB). In your data modeling, how would you represent two NUMA cells, each with their own inventories and allocations? Would you create resource classes called NUMA_CELL_0_MEMORY_MB and NUMA_CELL_1_MEMORY_MB etc? See point above about one of the purposes of the resource providers work being the standardization of resource classification.

* NIC bandwidth and NIC bandwidth per physical network. If I have 4 physical NICs on a compute host and I want to track network bandwidth as a consumable resource on each of those NICs, how would I go about doing that? Again, would you suggest auto-creating a set of resource classes representing the NICs? So, NET_BW_KB_ENP3S1, NET_BW_KB_ENP4S0, and so on? If I wanted to see the total aggregate bandwidth of the compute host, the system would now need tribal knowledge built into it to know that all the NET_BW_KB* resource classes describe the same exact resource class (network bandwidth in KB) but that the resource class names should be interpreted in a certain way. Again, not standardizable. In the nested resource providers modeling, we would have a parent compute host resource provider and 4 child resource providers, one for each of the NICs. Each NIC would have a set of traits indicating, for example, the interface name or physical network tag. However, the inventory (quantitative) amounts for network bandwidth would be a single standardized resource class, say NET_BW_KB. This nested resource providers system accurately models the real-world setup of things that are providing the consumable resource, which is network bandwidth.
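The two relationships above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not code from Nova or placement: the `Provider` class, its methods, the provider names, the `PHYSNET_*` traits and all amounts are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Provider:
    """Hypothetical stand-in for a resource provider (not the placement class)."""
    name: str
    inventories: Dict[str, int] = field(default_factory=dict)  # resource class -> amount
    traits: Set[str] = field(default_factory=set)              # qualitative tags
    children: List["Provider"] = field(default_factory=list)

    def total(self, resource_class: str) -> int:
        """Sum one resource class over this provider and all of its descendants."""
        own = self.inventories.get(resource_class, 0)
        return own + sum(child.total(resource_class) for child in self.children)

    def with_trait(self, trait: str) -> List["Provider"]:
        """Return the direct children carrying the given trait."""
        return [child for child in self.children if trait in child.traits]


# Two NUMA cells, each with its own inventories (amounts are made up).
numa_cells = [
    Provider("numa0", {"VCPU": 8, "MEMORY_MB": 65536}),
    Provider("numa1", {"VCPU": 8, "MEMORY_MB": 65536}),
]

# Four NICs sharing one resource class; the differences live in traits.
nics = [
    Provider("nic0", {"NET_BW_KB": 1_000_000}, {"PHYSNET_A"}),
    Provider("nic1", {"NET_BW_KB": 1_000_000}, {"PHYSNET_A"}),
    Provider("nic2", {"NET_BW_KB": 10_000_000}, {"PHYSNET_B"}),
    Provider("nic3", {"NET_BW_KB": 10_000_000}, {"PHYSNET_B"}),
]

host = Provider("compute1", children=numa_cells + nics)

host_memory_mb = host.total("MEMORY_MB")   # aggregate of both NUMA cells
host_net_bw_kb = host.total("NET_BW_KB")   # aggregate of all four NICs
physnet_a_nics = [p.name for p in host.with_trait("PHYSNET_A")]
```

The aggregate for the host falls out of a plain sum over one resource class name, with no need to parse per-NIC or per-cell class names, and selecting NICs by physical network is a separate, qualitative lookup.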

Finally, I think you are overstating the complexity of the SQL that is involved in the placement queries. 🙂 I’ve tried to design the DB schema with an eye to efficient and relatively simple SQL queries, and keeping quantitative and qualitative things decoupled in the schema was a big part of that efficiency. I’d like to see specific examples of how you would solve the above scenarios by combining the qualitative and quantitative aspects into a single resource type but still manage to have some interoperable standards that multiple OpenStack clouds can expose.
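As a rough illustration of what a decoupled query can look like, here is a sketch using SQLite. The three-table layout, the column names and the data are all assumptions invented for this example; this is not the actual placement schema or its queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: quantitative data (inventories, allocations) is kept
# apart from qualitative data (traits).
conn.executescript("""
CREATE TABLE inventories (provider TEXT, resource_class TEXT, total INTEGER);
CREATE TABLE allocations (provider TEXT, resource_class TEXT, used INTEGER);
CREATE TABLE traits (provider TEXT, trait TEXT);
""")
conn.executemany(
    "INSERT INTO inventories VALUES (?, ?, ?)",
    [("nic0", "NET_BW_KB", 1000), ("nic1", "NET_BW_KB", 1000), ("nic2", "NET_BW_KB", 1000)],
)
conn.executemany(
    "INSERT INTO allocations VALUES (?, ?, ?)",
    [("nic0", "NET_BW_KB", 900), ("nic1", "NET_BW_KB", 200)],
)
conn.executemany(
    "INSERT INTO traits VALUES (?, ?)",
    [("nic0", "PHYSNET_A"), ("nic1", "PHYSNET_A"), ("nic2", "PHYSNET_B")],
)

# Providers with enough free capacity of one resource class AND a required trait.
QUERY = """
SELECT i.provider
FROM inventories AS i
LEFT JOIN (
    SELECT provider, resource_class, SUM(used) AS used
    FROM allocations
    GROUP BY provider, resource_class
) AS a ON a.provider = i.provider AND a.resource_class = i.resource_class
JOIN traits AS t ON t.provider = i.provider
WHERE i.resource_class = ?
  AND t.trait = ?
  AND i.total - COALESCE(a.used, 0) >= ?
ORDER BY i.provider
"""
rows = conn.execute(QUERY, ("NET_BW_KB", "PHYSNET_A", 500)).fetchall()
matches = [row[0] for row in rows]
```

In this toy data only `nic1` qualifies: `nic0` has the right trait but too little free bandwidth, and `nic2` has the capacity but sits on a different physical network. The capacity check and the trait check stay independent conditions on one standardized resource class name.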


Best,
-jay

