On Wed, 4 Sep 2013, Josh Durgin wrote:
> On 09/04/2013 09:22 AM, Sage Weil wrote:
> > Hi Mike,
> >
> > On Wed, 4 Sep 2013, Mike Dawson wrote:
> > > Thanks for the feature request Mark!
> > >
> > > To illustrate where my understanding ends, here is a patch to stub in this
> > > functionality:
> > >
> > > From 137ebc2d5326ca710a5e99e2899fd851b02c10f7 Mon Sep 17 00:00:00 2001
> > > From: Mike Dawson <[email protected]>
> > > Date: Wed, 4 Sep 2013 11:39:05 -0400
> > > Subject: [PATCH] Update config.cc
> > >
> > > ---
> > > src/common/config.cc | 4 +++-
> > > 1 file changed, 3 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/src/common/config.cc b/src/common/config.cc
> > > index 5c64f4e..89ac0ee 100644
> > > --- a/src/common/config.cc
> > > +++ b/src/common/config.cc
> > > @@ -929,7 +929,7 @@ int md_config_t::set_val_raw(const char *val, const
> > > config_option *opt)
> > > }
> > >
> > > static const char *CONF_METAVARIABLES[] =
> > > - { "cluster", "type", "name", "host", "num", "id", "pid" };
> > > + { "cluster", "type", "name", "host", "num", "id", "pid", "image-name"
> > > };
> > > static const int NUM_CONF_METAVARIABLES =
> > > (sizeof(CONF_METAVARIABLES) / sizeof(CONF_METAVARIABLES[0]));
> > >
> > > @@ -1007,6 +1007,8 @@ bool md_config_t::expand_meta(std::string &origval)
> > > const
> > > out += name.get_id().c_str();
> > > else if (var == "pid")
> > > out += stringify(getpid());
> > > + else if (var == "image-name")
> > > + out += "TBD"
> > > else
> > > assert(0); // unreachable
> > > found_meta = true;
> > > --
> > > 1.8.4
> > >
> > >
> > > Obviously, the I'm stuck at "TBD". I'll finish it if someone could point
> > > me in
> > > the right direction.
> >
> > The challenge here is that the image name is currently parsed by things
> > like rbd.cc or not at all. And more significantly, a single librbd
> > instance can map any number of images, but the config context has only one
> > 'image-name' slot and log.
>
> If these were set in librbd, which knows the image name, it would need
> to delay opening or reopen the files in the new location, since they're
> initially created when rados_connect() calls common_init_finish(). This
> happens before librbd gets involved, so the image name isn't available
> yet.
>
> > An alternative approach would be to stick some other unique identification
> > of the librbd 'instance'. It would be gibberish but would give you
> > different instances of the asok file (and/or logs).
>
> This would be simplest (it could be done in librados with the unique
> gibberish determined during rados_connect(), perhaps using the nonce
> already generated there). This would work with qemu since qemu creates
> a new rados_cluster_t for each rbd image. I'd rather avoid the gibberish
> aspect though.
>
> > Or, since qemu is the one creating one librbd instance per mapped image,
> > it could explicitly set the log file and admin socket to something
> > appropriate. I think that in general having these enabled by default is a
> > good thing (especially the admin socket).
>
> I think this is a good place to replace rbd-specific metavariables - in
> things like rbd.cc and qemu that use librbd. This can be done via
> rados_conf_get() and rados_conf_set() on log_file and admin_socket
> before calling rados_connect().
>
> To avoid duplication, it might be best to provide a librbd method that
> does this. It could also include any new settings that would need
> metavariable expansion in the future.
>
> Something like:
>
> void rbd_expand_config_vars(rados_t cluster,
> const char *pool,
> const char *namespace,
> const char *image,
> const char *snapshot)
>
> This would expand $pool, $namespace, $image, and $snapshot via
> rados_conf_get() and rados_conf_set() on log_file, admin_socket,
> and others in the future (e.g. persistent cache path).
What about something more generic:
void rados_conf_add_meta(rados_t cluster,
const char *var,
const char *value);
and call it 4 times? That will things much prettier when we add the 5th
one.
Is it possible that even with /var/log/ceph/rbd.$pool.$ns.$image.$snap.log
we'd need some extra gibberish for uniqueness?
> One downside to this is that it wouldn't affect config options
> that are redefined via admin socket commands. This could probably
> be worked around, but I don't think it's a big enough issue to
> warrant the extra complexity. Admin socket users can specify a
> fully-expanded path easily enough.
The method above could register the metavar internally and reapply the
expansion as appropriate later... then we'd have a
void rados_conf_remove_meta(rados_t cluster, const char *var);
for completeness?
sage
>
> Josh
>
> > > On 9/4/2013 11:07 AM, Mark Nelson wrote:
> > > > On 09/04/2013 09:46 AM, Mike Dawson wrote:
> > > > > There are currently a few metavariables available for use in
> > > > > ceph.conf:
> > > > > http://ceph.com/docs/master/rados/configuration/ceph-conf/#metavariables
> > > > >
> > > > > In addition to those listed in that document, $pid was added in
> > > > > Bobtail.
> > > > > That allows me to get RBD admin sockets for libvirt/qemu guests by
> > > > > specifying
> > > > >
> > > > > [client]
> > > > > admin socket = /var/run/ceph/rbd-$pid.asok
> > > > >
> > > > > in /etc/ceph/ceph.conf. That works well for VMs with a single volume,
> > > > > but it does not work for VMs with multiple volumes. With multiple
> > > > > volumes, the admin only lists data on the last volume specified,
> > > > > leaving
> > > > > me unable to monitor stats on any volumes other than the last.
> > > > >
> > > > > I imagine you could specify a unique admin socket for each volume on
> > > > > the
> > > > > command-line that libvirt uses to spawn qemu, but that seems like a
> > > > > pain.
> > > > >
> > > > > Is it possible instead to add a metavariable that would expand to the
> > > > > image name? This way, my ceph.conf would look like
> > > > >
> > > > >
> > > > > [client]
> > > > > admin socket = /var/run/ceph/$imagename.asok
> > > > >
> > > > > and I would end up with admin sockets like:
> > > > >
> > > > > /var/run/ceph/rbd-volume-0025b693-6230-443e-ad6a-3aa43efdd648.asok
> > > > > /var/run/ceph/rbd-volume-009228f3-b8d7-4f3c-8458-0cdbd4809dc6.asok
> > > > >
> > > > > # rbd -p volumes ls
> > > > > volume-0025b693-6230-443e-ad6a-3aa43efdd648
> > > > > volume-009228f3-b8d7-4f3c-8458-0cdbd4809dc6
> > > > >
> > > > > I'd be happy to provide a patch if someone could give me sufficient
> > > > > guidance on how/where to add this functionality.
> > > > >
> > > >
> > > > Created a feature request in the tracker for this:
> > > >
> > > > http://tracker.ceph.com/issues/6228
> > > >
> > > > Mark
>
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html