On 14/01/15 05:46 -0700, Boden Russell wrote:


On 1/14/15 1:38 AM, Flavio Percoco wrote:
On 13/01/15 21:24 -0500, Jay Pipes wrote:
On 01/13/2015 04:55 PM, Boden Russell wrote:
Looking for some feedback from the glance dev team on a potential BP…

This is the solution that I would recommend. Frankly, this kind of
replication should be an async out-of-band process similar to
bittorrent. Just have bittorrent or rsync or whatever replicate the
image bits to a set of target locations and then call the
glanceclient.v2.client.images.add_location() method:

https://github.com/openstack/python-glanceclient/blob/master/glanceclient/v2/images.py#L211


to add the URI of the replicated image bits.
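
Something like this untested sketch; the endpoint, token, image ID and
replica URIs below are all placeholders:

  # After rsync/bittorrent has copied the image bits to each replica
  # store, register each replica URI on the existing image record.
  from glanceclient.v2 import client as glance_client

  glance = glance_client.Client(endpoint='http://glance.example.com:9292',
                                token='<auth-token>')

  image_id = '<image-uuid>'
  replica_uris = [
      'http://replica-a.example.com/images/<image-uuid>',
      'http://replica-b.example.com/images/<image-uuid>',
  ]

  for uri in replica_uris:
      # Each call appends an entry to the image's 'locations' list.
      glance.images.add_location(image_id, uri, metadata={})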

An async workers engine recently landed in Glance that allows for this
kind of thing to exist. For instance, it'll be used for image
introspection to extract information from images after they have been
*imported* into glance.

The right hooks to trigger these async workers may still need to be
defined better, but the infrastructure is there. Once that's more
solid, you'll be able to write your own plugin that does that job on
every glance image import.
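
To give a rough idea, here's an untested sketch of driving that
machinery from the client side with the v2 tasks API. The 'import'
task type is what glance ships today; a replication task type like
the one discussed here would be a custom plugin, so treat the
endpoint, token and input below as placeholders:

  from glanceclient.v2 import client as glance_client

  glance = glance_client.Client(endpoint='http://glance.example.com:9292',
                                token='<auth-token>')

  # Create an async task; the workers engine picks it up and executes it.
  task = glance.tasks.create(
      type='import',
      input={'import_from': 'http://example.com/image.qcow2',
             'import_from_format': 'qcow2',
             'image_properties': {'disk_format': 'qcow2',
                                  'container_format': 'bare'}})

  # A task moves from 'pending' to 'processing' to 'success' or
  # 'failure', so its progress can be polled.
  task = glance.tasks.get(task.id)
  print(task.status)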


While I understand the motivation for suggesting the "out of band"
approach (async workers or a separate process), my major concern here is
the additional processing required. In my particular scenario, the
out-of-band process would have to pull the image bits back down from
the remote location and then push them back up to each replication
location. If the image is of any significant size, that's a fairly
expensive operation. Moreover, an out-of-band process (IMO) makes for a
less than optimal user experience, since users would have to query the
image's locations metadata to know whether the image has replicated yet
(see the sketch below). Perhaps async workers improve the experience a
bit (worker status can be queried), but it still seems cleaner (IMO) to
have the replication happen in-line with the image create flow.
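
A hypothetical client-side poll illustrating that concern; the
expected replica count isn't a real glance field, and 'locations' is
only exposed when the deployment enables show_multiple_locations:

  import time

  def wait_for_replicas(glance, image_id, expected, interval=10):
      # Poll the image's 'locations' metadata until every replica
      # shows up.
      while True:
          image = glance.images.get(image_id)
          locations = getattr(image, 'locations', []) or []
          if len(locations) >= expected:
              return image
          time.sleep(interval)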

That's one valid point of view, yes. However, you could also see it
this way: why should a user have to wait for the image to be available
in all three locations before he/she can use it? The point of
replicating it - afaict, based on what you said - is to have it
redundant and accessible from locations that are closer to some users.
If you make replication part of the image creation process, you'd have
to wait until the image is fully replicated before you can actually use
it (which is not nice for users). If you instead use async workers,
you can make the image available as soon as one of the locations is
ready to serve it.

As far as the bit-transfer problem goes, the best way to avoid it is
to upload the bits offline (assuming you're using stores like http,
which don't have support for internal replication) and then add the
locations to the image. This way, you'd upload the image bits N times,
where N is the number of replicas you want, instead of N+1 times,
where the extra upload is the one to a glance node. In code, that flow
would look roughly like the sketch below.
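
This is untested, and the names, endpoint, token and URIs are
placeholders:

  from glanceclient.v2 import client as glance_client

  glance = glance_client.Client(endpoint='http://glance.example.com:9292',
                                token='<auth-token>')

  # Create the image record without uploading any bits through glance.
  image = glance.images.create(name='replicated-image',
                               disk_format='qcow2',
                               container_format='bare')

  # The bits were already pushed to each replica out of band (N
  # uploads), so we only register the locations; there is no N+1th
  # upload to glance.
  for uri in ['http://replica-a.example.com/image',
              'http://replica-b.example.com/image']:
      glance.images.add_location(image.id, uri, metadata={})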

I'm not saying the above is the ultimate solution or that glance
won't ever support this. However, it's worth noting that such
solutions are not considered bad practice whatsoever.

All that being said, it'd be very nice if you could open a spec on
this topic so we can discuss it over the spec review, and one of us
(or you) can implement it if we reach consensus.

Cheers,
Flavio



In a prototype I implemented #1, which can be done with no impact
outside of the store driver code itself.

I'm not entirely sure how you did that, considering the http storage
backend is read-only. Are you saying you implemented the add() method
for the glance_store._drivers.http.Store class?

I was trying to generalize my use case to other glance store drivers,
but my generalization using the http store driver was obviously a poor
choice... My interest and PoC are based on the VMware datastore driver.


Let me ask more directly -- if we wanted to enhance the VMware datastore
driver to support replication (as I described in approach #1 of my
initial email), is this something the community would consider
(assuming the changes are contained to the VMware datastore driver), or
would such an enhancement be an uphill battle to get reviewed / merged?


--
@flaper87
Flavio Percoco
