On Tue, 31 May 2016, Jay Pipes wrote:
>> So this seems rather fragile and pretty user-hostile. We're creating
>> an opportunity for people to easily replace their existing bad
>> tracking of disk usage with a different style of bad tracking of
>> disk usage.
> I'm not clear why the new way of tracking disk usage would be "bad
> tracking"? The new way is correct -- i.e. the total amount of DISK_GB
> will be correct instead of multiplied by the number of compute nodes
> using that shared storage.
The issue is not with the new way, but rather that unless we either
protect against multiple pools of the same resource class being
associated with the same aggregate _or_ teach the scheduler and
resource tracker to choose the right one when recording allocations,
we end up with pools being updated unpredictably.
But the solutions below ought to deal with it, so: under control.
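To make the double-counting concrete, here's a rough sketch (made-up
names and numbers, nothing like the actual Nova objects) of the old
and new accounting, and of where the ambiguity creeps in:

    # Ten compute nodes all mount the same 1000 GB NFS share.
    SHARED_POOL_GB = 1000
    compute_nodes = ['node%d' % i for i in range(10)]

    # Old-style tracking: every node reports the shared capacity as if
    # it were local disk, so the scheduler sees 10x the real total.
    old_total = sum(SHARED_POOL_GB for _ in compute_nodes)  # 10000 GB, wrong

    # Resource-pool tracking: the share is a single provider with one
    # DISK_GB inventory record, counted exactly once.
    new_total = SHARED_POOL_GB  # 1000 GB, correct

    # The hazard above: two DISK_GB pools associated with one aggregate
    # leave the resource tracker no rule for which inventory to consume.
    pools_for_aggregate = ['nfs-pool-a', 'nfs-pool-b']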
> Sure, but I'm saying that, for now, this isn't something I think we
> need to be concerned about. Deployers cannot *currently* have multiple
> shared storage pools used for providing VM ephemeral disk resources.
> So there is no danger -- outside of a deployer deliberately sabotaging
> things -- of a compute node having more than one DISK_GB inventory
> record, provided we have a standard process for deployers that use
> shared storage to create their resource pools for DISK_GB and assign
> compute nodes to that resource pool.
I'm not sure I would categorize "just happened to add an aggregate
to a resource pool" as "deliberately sabotaging things". That's all
I'm getting at with this particular concern.
And if we do this:
>> Maybe that's fine, for now, but it seems we need to be aware of it,
>> not only for ourselves but also in the documentation when we tell
>> people how to start using resource pools: oh, by the way, for now,
>> just associate one shared disk pool with an aggregate.
Then we get this:
> Sure, absolutely.
so it's probably okay enough.
> I suppose the alternative would be to "deal" with the multiple
> resource providers by just having the resource tracker pick whichever
> one appears first for a resource class (and order by the resource
> provider ID...). This might actually be a better alternative
> long-term, since then all we would need to do is change the ordering
> logic to take into account multiple resource providers of the same
> resource class instead of dealing with all this messy validation and
> conversion.
That's kind of what I was thinking. Get the multiple providers, sort
them by something arbitrary now, something smarter later. I can think
of three strategies off the top of my head: least used, most used,
random.
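Something like this, as a sketch (hypothetical names, nothing like the
real resource tracker code): pick one provider from the candidates
with a swappable sort key, your order-by-provider-ID rule as the
default, smarter keys later.

    import random

    # Each candidate is a made-up (provider_id, total_gb, used_gb) tuple.
    def pick_provider(candidates, strategy='provider_id'):
        if strategy == 'random':
            return random.choice(candidates)
        keys = {
            'provider_id': lambda p: p[0],        # lowest ID first
            'least_used': lambda p: p[2] / p[1],  # lowest utilization first
            'most_used': lambda p: -(p[2] / p[1]),
        }
        return min(candidates, key=keys[strategy])

    candidates = [(1, 1000, 400), (2, 2000, 500)]
    pick_provider(candidates, 'least_used')  # -> (2, 2000, 500), 25% used

Swapping strategies is then just a config choice rather than a schema
change, which is the appeal.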
>> In my scribbles when I was thinking this through (that led to the
>> start of this thread) I had imagined that, rather than finding both
>> the resource pool and compute node resource providers when finding
>> available disk, we'd instead see if there was a resource pool, use
>> it if it was there, and if not, just use the compute node. Therefore
>> if the resource pool was ever disassociated, we'd be back to where
>> we were before without needing to reset the state in the artifact
>> world.
> That would work too, yes. And seems simpler to reason about... but
> has the potential of leaving bad inventory records in the inventories
> table for "local" DISK_GB resources that will never be used.
Well, presumably we're still going to need some way for a node to
update its inventory (after an upgrade and reboot), so that
functionality ought to take care of it: if the node hasn't been
rebooted, we assume the representation of reality in the inventory is
correct; if there's been a reboot, it gets updated?
Dunno, riffing.
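For what it's worth, the fallback I had in mind is about this much
code (again with hypothetical names, including probe_local_disk_gb):

    def disk_provider(compute_node, shared_pools):
        # Prefer a pool associated via the aggregate; if the pool is
        # later disassociated, the next call just falls through to the
        # compute node -- nothing to reset in the artifact world.
        if shared_pools:
            return shared_pools[0]
        return compute_node

    # And the refresh-on-reboot idea: only rewrite the node's local
    # inventory when the node has restarted (e.g. after an upgrade);
    # otherwise trust the existing record.
    def maybe_refresh_inventory(node, inventory, rebooted):
        if rebooted:
            inventory[node.id] = node.probe_local_disk_gb()  # hypothetical
        return inventory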
--
Chris Dent (╯°□°)╯︵┻━┻ http://anticdent.org/
freenode: cdent tw: @anticdent