On 05/31/2016 01:06 PM, Chris Dent wrote:
On Tue, 31 May 2016, Jay Pipes wrote:
Kinda. What the compute node needs is an InventoryList object
containing all inventory records for all resource classes both local
to it as well as associated to it via any aggregate-resource-pool
mapping.

Okay, that mostly makes sense. A bit different from what I've proved
out so far, but plenty of room to make it go that way.

Understood, and not a problem. I will provide more in-depth code examples in code review comments.

The SQL for generating this InventoryList is the following:

Presumably this would be a method on the InventoryList object
itself?

InventoryList.get_by_compute_node() would be my suggestion. :)
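
As a rough sketch (the table and column names and the session access below are assumptions for illustration, not the actual schema or final code), such a method could pull the compute node's own inventory plus anything reachable through a shared aggregate:

    # Rough sketch only: the table/column names and the session access
    # are assumptions, not the actual schema or Nova API.
    from sqlalchemy import text

    INV_FOR_NODE_SQL = text("""
        SELECT inv.resource_provider_id, inv.resource_class_id,
               inv.total, inv.reserved, inv.allocation_ratio
          FROM inventories AS inv
         WHERE inv.resource_provider_id = :rp_id
            OR inv.resource_provider_id IN (
               -- providers (e.g. shared storage pools) that share an
               -- aggregate with the compute node's provider
               SELECT other.resource_provider_id
                 FROM resource_provider_aggregates AS mine
                 JOIN resource_provider_aggregates AS other
                   ON mine.aggregate_id = other.aggregate_id
                WHERE mine.resource_provider_id = :rp_id
                  AND other.resource_provider_id != :rp_id)
    """)

    def get_by_compute_node(context, compute_node_rp_id):
        # In practice this would be a classmethod on InventoryList that
        # turns the rows into Inventory objects; that part is elided.
        return context.session.execute(
            INV_FOR_NODE_SQL, {'rp_id': compute_node_rp_id}).fetchall()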

We can deal with multiple shared storage pools per aggregate at a
later time. Just take the first resource provider in the list of
inventory records returned from the above SQL query that corresponds
to the DISK_GB resource class and that is resource provider you will
deduct from.

So this seems rather fragile and pretty user-hostile. We're creating an
opportunity for people to easily replace their existing bad tracking of
disk usage with a different style of bad tracking of disk usage.

I'm not clear why the new way of tracking disk usage would be "bad tracking"? The new way is correct -- i.e. the total amount of DISK_GB will be correct instead of multiplied by the number of compute nodes using that shared storage.

If we assign two different shared disk resource pools to the same
aggregate we've got a weird situation (unless we explicitly order
the resource providers by something).

Sure, but I'm saying that, for now, this isn't something I think we need to be concerned about. Deployers cannot *currently* have multiple shared storage pools used for providing VM ephemeral disk resources. So there is no danger -- outside of a deployer deliberately sabotaging things -- of a compute node having >1 DISK_GB inventory record, as long as we have a standard process for deployers that use shared storage to create their resource pools for DISK_GB and assign compute nodes to that resource pool.

Maybe that's fine for now, but it seems we need to be aware of it, not
only for ourselves, but also in the documentation when we tell people
how to start using resource pools: oh, by the way, for now, just
associate one shared disk pool with an aggregate.

Sure, absolutely.

Assume only a single resource provider of DISK_GB. It will be either a
compute node's resource provider ID or a resource pool's resource
provider ID.
For this initial work, my idea was to have some code that, on creation
of a resource pool and its association with an aggregate, checks
whether that resource pool has an inventory record with a
resource_class of DISK_GB and, if so, removes any DISK_GB inventory
records for the (local) resource provider ID of any compute node
associated with that aggregate. This way we ensure the existing
behaviour that a compute node either has local disk or it uses shared
storage, but not both.

So let me translate that to make sure I get it:

* node X exists, has inventory of DISK_GB
* node X is in aggregate Y
* resource pool A is created
* two possible paths now: first associating the aggregate to the pool,
   or first adding inventory to the pool
* in either case, when aggregate Y is associated, if the pool has
   DISK_GB, traverse the nodes in aggregate Y and drop the disk
   inventory

Correct.
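
In code, that flow would be roughly the following -- the helper names here are invented for illustration, not real Nova APIs:

    # Illustrative sketch of the proposal; the helper names are invented.
    def on_pool_aggregate_association(context, pool_rp, aggregate):
        pool_inv = get_inventory_for_provider(context, pool_rp)
        if not any(inv.resource_class == 'DISK_GB' for inv in pool_inv):
            # The pool doesn't provide shared disk; nothing to scrub.
            return
        for node_rp in compute_node_providers_in_aggregate(context,
                                                           aggregate):
            # Drop the node-local DISK_GB record so the node is tracked
            # only against the shared storage pool.
            delete_inventory(context, node_rp, resource_class='DISK_GB')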

So, effectively, any time we associate an aggregate we need to
inspect its nodes?

Yeah.... good point. :(

I suppose the alternative would be to "deal" with the multiple resource providers by just having the resource tracker pick whichever one appears first for a resource class (and order by the resource provider ID...). This might actually be a better alternative long-term, since then all we would need to do is change the ordering logic to take into account multiple resource providers of the same resource class instead of dealing with all this messy validation and conversion.
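
In other words, something like this sketch, with assumed attribute names:

    # Sketch of the ordering idea; attribute names are assumed. Keep one
    # provider per resource class, lowest resource provider ID first.
    def providers_by_resource_class(inventory_list):
        chosen = {}
        ordered = sorted(inventory_list,
                         key=lambda inv: inv.resource_provider_id)
        for inv in ordered:
            # setdefault keeps the first (lowest-ID) provider seen for
            # each resource class and ignores any later ones.
            chosen.setdefault(inv.resource_class,
                              inv.resource_provider_id)
        return chosen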

What happens if we ever disassociate an aggregate from a resource pool?
Do the nodes in the aggregate have some way to get their local Inventory
back, or are we going to assume that the switch to shared is one-way?

OK, yeah, you've sold me that my solution isn't good. By just allowing multiple providers and picking the "first" that appears, we limit ourselves to just needing to do the scrubbing of compute node local DISK_GB inventory records -- which we can do in an online data migration -- and we don't have to worry about the disassociate/associate aggregate problems.
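
The migration itself would be fairly mechanical -- roughly the shape below, with invented helper names:

    # Rough shape of the online data migration; every helper here is
    # invented for illustration.
    def scrub_local_disk_inventory(context, max_count):
        scrubbed = 0
        for node_rp in compute_node_providers(context, limit=max_count):
            # If a provider sharing an aggregate with this node also
            # reports DISK_GB, the node-local record is redundant.
            if shared_disk_providers_for(context, node_rp):
                delete_inventory(context, node_rp,
                                 resource_class='DISK_GB')
                scrubbed += 1
        return scrubbed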

In my scribbles when I was thinking this through (that led to the
start of this thread) I had imagined that rather than finding both
the resource pool and compute node resource providers when finding
available disk we'd instead see if there was a resource pool, use it
if it was there, and if not, just use the compute node. Therefore if
the resource pool was ever disassociated, we'd be back to where we
were before without needing to reset the state in the artifact
world.

That would work too, yes. And it seems simpler to reason about... but it has the potential to leave bad inventory records in the inventories table for "local" DISK_GB resources that will never be used.
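
For comparison, a minimal sketch of that "use the pool if it's there, else the compute node" selection, with assumed attribute names:

    # Sketch of the prefer-shared-pool selection; attribute names are
    # assumed.
    def disk_provider_for(node_rp_id, inventory_list):
        for inv in inventory_list:
            if (inv.resource_class == 'DISK_GB'
                    and inv.resource_provider_id != node_rp_id):
                # A shared storage pool provides DISK_GB; use it.
                return inv.resource_provider_id
        # No shared pool in the list; fall back to the node-local
        # provider.
        return node_rp_id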

Best,
-jay
