Re: [RFC master] Initial Storage Pools design doc

Constantinos Venetsanopoulos Thu, 14 Feb 2013 09:08:44 -0800

Hello everybody,

firing up the thread again :)


During the merge of the ExtStorage patch series we had a very interesting
discussion with Iustin, and it seems that we are getting close to bridging
the remaining gaps, after Guido's comments on this thread too.

Please see here:

https://groups.google.com/forum/?fromgroups=#!searchin/ganeti-devel/Multiple$20ExtStorage$20Providers$20and$20ext-params/ganeti-devel/d8lE_JwHJ70/lfOwS2LlrCwJ

so that we can all be in sync. I think it combines and merges more or less
everybody's comments.

Comments inline, with reference to the above mail.

On 09/14/2012 11:47 AM, Guido Trotter wrote:

On Thu, Sep 13, 2012 at 4:31 PM, Constantinos Venetsanopoulos
<[email protected]> wrote:

Yes, I agree on the mix! So, I think the best would be to have a cluster-level
dict that defines each pool, but not in the previous format. The template
would be just one of the pool's attributes. And there, we should also define
the template-specific params, by setting only the ones that we want to
override from the default cluster-level disk-params dict. So it would look
like this:

storage_pools: {
   "default_drbd_pool": {
     "template": "drbd",
     "template_params": {}
   },
   "second_drbd_pool": {
     "template": "drbd",
     "template_params": {
       "metavg": "different_metavg_than_the_default"
     }
   },
   "default_lvm_pool": {
     "template": "plain",
     "template_params": {}
   },
   "second_lvm_pool": {
     "template": "plain",
     "template_params": {
       "stripes": 2
     }
   },
   "radoscluster1": {
     "template": "rbd",
     "template_params": {}
   },
   "radoscluster2": {
     "template": "rbd",
     "template_params": {
       "pool": "ganetipool"
     }
   },

This sounds good. Just confirming with the Haskell overlords: is this
the best data structure, for parsing and handling in a strongly typed
language?
I believe so, since the "differentiation" between what template_params
should be is defined at the external level in the "template"
parameter. Do you confirm?

   "nas1": {
     "template": "ext",
     "template_params": {},
     "provider": "emc"
   },
   "nas2": {
     "template": "ext",
     "template_params": {},
     "provider": "netapp"
   }
}

Here we might have to put "provider" inside the params, and perhaps if
needed have an extra level of params there, for the reason above.
(unless all pools support a provider, in which case it might be that
they support only one)


ACK. See reference email. We could set 'provider' as a template
parameter and also mark it as modifiable for template 'ext'.
Then we could just ensure that, when creating an 'ext' Storage
Pool, this specific parameter is always set at Storage
Pool level. I think that solves both our problems.
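
A minimal sketch of the check described above, assuming 'provider' lives
inside "template_params" for 'ext' pools. The function name validate_pool and
the error wording are hypothetical and not part of any existing Ganeti API:

```python
def validate_pool(name, pool):
    """Reject an 'ext' pool that does not set 'provider' (hypothetical check)."""
    params = pool.get("template_params", {})
    # Assumption: 'provider' is a template parameter, mandatory only for 'ext'.
    if pool.get("template") == "ext" and not params.get("provider"):
        raise ValueError("pool %r: template 'ext' requires 'provider'" % name)
    return pool


# Illustrative pool definitions, following the dict layout quoted earlier.
storage_pools = {
    "nas1": {"template": "ext", "template_params": {"provider": "emc"}},
    "radoscluster1": {"template": "rbd", "template_params": {}},
}

for pool_name, pool_def in storage_pools.items():
    validate_pool(pool_name, pool_def)
```

Pools of other templates pass unchanged, so the rule only constrains 'ext'.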

If we do that, then indeed we do not need parameters to be specified at
nodegroup level. There, we will only define the connected_storage_pools
attribute as a list of available pools for the specific nodegroup, defining
the mobility domain (exactly as you proposed).
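
To illustrate, a small sketch of how connected_storage_pools could select the
pools usable from one nodegroup. The helper pools_for_nodegroup and the plain
dict layout of the nodegroup are assumptions made only for this example:

```python
def pools_for_nodegroup(nodegroup, storage_pools):
    """Return the pool definitions a nodegroup is connected to (hypothetical)."""
    names = nodegroup.get("connected_storage_pools", [])
    # A nodegroup may only reference pools defined at cluster level.
    missing = [n for n in names if n not in storage_pools]
    if missing:
        raise KeyError("undefined storage pools: %s" % ", ".join(missing))
    return {n: storage_pools[n] for n in names}


storage_pools = {
    "default_drbd_pool": {"template": "drbd", "template_params": {}},
    "radoscluster1": {"template": "rbd", "template_params": {}},
}
group = {"connected_storage_pools": ["default_drbd_pool"]}
available = pools_for_nodegroup(group, storage_pools)
```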


Note that in my proposal it was still possible to override "some"
parameters. An example of this would be instances in the same pool that
change dynamic parameters, such as resync speed, in different
nodegroups (due to different hardware capabilities of the nodes in each
group).
If we don't do this we must have an "easy" way to move instances
between pools, without copying the data if the template is the same,
which sounds more tricky! :/

Another thing to do is to make the "vg" a disk parameter for templates
"plain" and "drbd", to unify the design and align with metavg.

Yes, vg and metavg should be template level parameters. I think
removing the possibility of configuring those separately for each
instance and just tying them to the template is fine.
In the future we might have parameters overridden at the instance
level to solve this as well.


ACK.

We can also drop the disk-params dict on nodegroup level, which
complicates the inheritance issues and doesn't seem to add any
functionality if we proceed with the above design. The only thing I can
think of is: if a parameter is set at nodegroup level, use that to override
the value found inside the "template_params". However, this seems a bit
ugly to me, because it complicates the flow of the inheritance a lot.

No, I'm not sure we can do that, see above. It actually is a needed
functionality to differentiate what drbd does in different nodegroups.


ACK. See reference mail. Inheritance at cluster/node group level
stays intact.
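
A rough sketch of one possible resolution order, purely as an illustration.
It assumes the precedence cluster defaults < pool "template_params" <
nodegroup overrides, which follows the override idea quoted above but is not
confirmed by this thread. The function name and the sample values are
hypothetical:

```python
def resolve_disk_params(template, cluster_params, pool, nodegroup_params=None):
    """Merge disk params for one template; later layers win (assumed order)."""
    # 1. Start from the cluster-level defaults for this template.
    resolved = dict(cluster_params.get(template, {}))
    # 2. Apply the overrides defined by the storage pool.
    resolved.update(pool.get("template_params", {}))
    # 3. Apply nodegroup-level overrides, if any (assumed to win last).
    resolved.update((nodegroup_params or {}).get(template, {}))
    return resolved


cluster = {"drbd": {"metavg": "xenvg", "resync-rate": 1024}}
pool = {"template": "drbd", "template_params": {"metavg": "fastvg"}}
group = {"drbd": {"resync-rate": 4096}}

params = resolve_disk_params("drbd", cluster, pool, group)
```

With these sample values, the pool changes only metavg and the nodegroup
changes only the resync rate, so both kinds of override can coexist.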

With the above design, we probably also overcome the 'aliases' problem that
you correctly pointed out above. You won't be able to have two different pools
with the same template params. However, this needs a little more thinking,
because maybe it is a feature to be able to have, let's say, two different
isolated RADOS clusters with the same settings, each one connected to a
different nodegroup. Or the same names for vg/metavg in two different
nodegroups, with the first nodegroup being "exclusive_storage" and the second
not partitioned? Maybe we would want to have two different pools with the same
settings in this case.

Yes, but only for some of these. For rados, for example, I wouldn't
allow it, as moving between the two requires copying data and
otherwise it wouldn't. Since a data copy is needed, the pool is
different.
For drbd I would, as a data copy is needed anyway.

Finally, in a second round of implementation we could add more attributes for
each pool such as "total_space", "free_space" etc. and combine the storage
reporting too.

No, these must be calculated, not added as parameters, I believe. They
might be cached, but they belong in a different data structure anyway.


ACK.

ACK: in our idea the IDISK_VG becomes the pool, and we had a metapool as well,
for drbd disks, but we can discuss how best to do this: currently it's
easy to have many instances with vg and metavg "quite varied", while with this
idea you'd have to create a pool each time you wanted vg and metavg to be
different. Is this restriction OK?


Yes, you'd have to create different pools. But I don't think this is a
drawback. Actually, it is one of the design's concepts, to allow
for abstracted layers. The admin sets and configures the storage
backend (storage pools and their parameters) and the user just
chooses from them when adding new disks.
I think it is a fair tradeoff in favor of simplicity and clarity of disk
options.

Sounds good, actually.


ACK.

From our point of view, it is not necessary. If this makes things
very hard for you, we can leave them both and see how it goes.

I believe we can try to do without and just use different pools.


ACK.

+ # gnt-instance add -o debootstrap+default
+                    --disk 0:adopt="/dev/disk/by-uuid/example_disk",
+                             pool=new_blockdev_pool
+

In this case I'd say the blockdev pool just doesn't have a target name or
namespace, so do we really need different pools for it? (In our view, in which
the pool is a parameter of the disk template, we don't. In yours, we can't do
without.)


Yes. This is true. We can talk about this further if you want, but I think
it is a corner case wrt the first implementation.

So for blockdev I'd say you can also adopt in an existing pool.
Same for drbd or file *as long as the file or lvm device is in the
right location* (dir or vg).


ACK.

ACK, except if in the future you want "volatile" disks; then it could, with the
disk being recreated. Also, KVM supports "migrating" disk images and Xen is
going to implement it, so this restriction might soon go away.


ACK, we can rediscuss such features after the first implementation.
It's an interesting idea.

Absolutely, perhaps let's just mention this in "future work".

Not sure. This would mean that nowadays an instance can have two drbd disks on
different LVs and after this it couldn't, so it's not a good transition point,
unless we still allow that "in between", somehow.


Makes sense after the previous conversation. Once we conclude on the design,
I'm sure we can find a transition point that will be convenient for everybody.
I will rethink how we can regroup the implementation steps, and come back
on this.

ACK.

Thanks,

Guido



Regards,
Constantinos
